1. INTRODUCTION

The term Video Graphics Array (VGA) refers specifically to the display hardware first introduced with the IBM PS/2 line of computers in 1987. VGA is referred to as an "array" instead of an "adapter" because it was implemented from the start as a single chip, replacing dozens of discrete logic chips.

The VGA Controller provides a simple interface between a host processor and any VGA-compatible monitor.

Taking a processor-generated (pixelated) picture from memory space, the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, so that the picture is displayed correctly on a connected monitor.

2. HISTORY

The term Video Graphics Array (VGA) refers specifically to the display hardware first introduced with the IBM PS/2 line of computers in 1987, but through its widespread adoption it has also come to mean either an analog computer display standard, the 15-pin D-subminiature VGA connector, or the 640×480 resolution itself. While this resolution has been superseded in the personal computer market, it is becoming a popular resolution on mobile devices.

VGA was the last graphical standard introduced by IBM that the majority of PC clone manufacturers conformed to, making it today (as of 2009) the lowest common denominator that all PC graphics hardware supports before a device-specific driver is loaded into the computer. For example, the MS-Windows splash screen appears while the machine is still operating in VGA mode, which is the reason that this screen always appears in reduced resolution and color depth.

VGA was officially superseded by IBM's XGA standard, but in reality it was superseded by numerous slightly different extensions to VGA made by clone manufacturers, which came to be known collectively as "Super VGA".

2.1 VGA compared to other standard resolutions

VGA is referred to as an "array" instead of an "adapter" because it was implemented from the start as a single chip (an ASIC), replacing the Motorola 6845 and dozens of discrete logic chips that covered the full-length ISA boards of the MDA, CGA and EGA. Its single-chip implementation also allowed the VGA to be placed directly on a PC's motherboard with a minimum of difficulty (it only required video memory, timing crystals and an external RAMDAC), and the first IBM PS/2 models were equipped with VGA on the motherboard.

2.2 The VGA specifications are as follows:

• 256 KB video RAM (the very first cards could be ordered with 64 KB or 128 KB of RAM, at the cost of losing some video modes)
• 16-color and 256-color modes
• 262,144-value color palette (six bits each for red, green and blue)
• Selectable 25.175 MHz or 28.322 MHz master clock
• Maximum of 800 horizontal pixels
• Maximum of 600 lines
• Refresh rates at up to 70 Hz
• Vertical blank interrupt
• Packed-pixel mode: 256 colors
• Hardware smooth scrolling support

The VGA supports both All Points Addressable graphics modes and alphanumeric text modes. Standard graphics modes are:

• 640×480 in 16 colors
• 640×350 in 16 colors
• 320×200 in 16 colors
• 320×200 in 256 colors
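The figures in the specification can be cross-checked with a little arithmetic. The sketch below is illustrative only: the helper name `frame_bytes` is hypothetical, and it assumes the simplest packing, in which each pixel takes just enough bits to index its color count (4 bits for 16 colors, 8 bits for 256 colors).

```python
VIDEO_RAM_BYTES = 256 * 1024          # 256 KB of video RAM
PALETTE_SIZE = 2 ** (3 * 6)           # six bits each for red, green and blue

# (width, height, colors) for the standard graphics modes listed above
STANDARD_MODES = [(640, 480, 16), (640, 350, 16), (320, 200, 16), (320, 200, 256)]

def frame_bytes(width, height, colors):
    """Bytes needed for one frame, assuming minimal bits per pixel (hypothetical helper)."""
    bits_per_pixel = (colors - 1).bit_length()
    return width * height * bits_per_pixel // 8

for width, height, colors in STANDARD_MODES:
    size = frame_bytes(width, height, colors)
    print(f"{width}x{height} in {colors} colors: {size} bytes, "
          f"fits in video RAM: {size <= VIDEO_RAM_BYTES}")
print("palette entries:", PALETTE_SIZE)
```

Under these assumptions every standard mode fits in the 256 KB of video RAM, and the palette size agrees with the 262,144 values quoted in the list.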

3. DESCRIPTION ARCHITECTURE

3.1 VGA display device

The VGA display device is used to display the images taken from the processor in an exact format. It can be used to display the output of any project; analog graphs, etc., can be displayed directly on the VGA display monitor.

3.1 VGA (Video Graphics Array) Controller

The VGA controller controls the image so that it is displayed exactly on the display device. It acts as an interface between the processing system and the display device, and is therefore called an interface controller.

[Figure: block diagram. Image or video processing system → VGA Controller → Display]

4. CONTROLLER DESIGN

To design the controller, we first need to know how the display device works. Consider the image below.

[Figure: original 640x480 image, and the pixels of the image]

The display device also has pixels. The display resolution (640x480) is a standard display resolution.

Pixel

[Figure: an image with a portion greatly enlarged, in which the individual pixels are rendered as little squares and can easily be seen]

4.2 A photograph of sub-pixel display elements on a laptop's LCD screen

In digital imaging, a pixel (or picture element [1]) is a single point in a raster image. The pixel is the smallest addressable screen element; it is the smallest unit of picture which can be controlled. Each pixel has its own address. The address of a pixel corresponds to its coordinate. Pixels are normally arranged in a 2-dimensional grid, and are often represented using dots or squares. Each pixel is a sample of an original image, where more samples typically provide more-accurate representations of the original. The intensity of each pixel is variable. In color image systems, a color is typically represented by three or four component intensities such as red, green and blue, or cyan, magenta, yellow and black.

Color refreshing rate

Color refreshing rate = 60 Hz

The color refreshing rate is the number of frames that are transmitted per second. The VGA controller must transmit at least 60 frames per second.

5. Hardware Description

5.1 Block Diagram

5.1 VGA Synchronization Unit

The Synchronization Unit provides the horizontal and vertical synchronization signals, HSYNC and VSYNC, that are required to correctly display a picture frame within the confines of a monitor's display area.

These synchronization signals are used as control inputs by the monitor's horizontal and vertical deflection circuits. These circuits deflect the electrons emitted by the three primary color electron guns (Red, Green, Blue) left to right and from top to bottom, respectively. HSYNC provides the start and stop times for the horizontal deflection circuit, so that a line of pixels is correctly drawn across the screen display. VSYNC provides the start and stop times for the vertical deflection circuit, so that the lines of a frame are correctly drawn from the top to the bottom of the screen display.

The resolution for the display is defined by the level on the RESOLUTION input. If High, the 640x480 resolution is used (VGA). If Low, the 800x600 resolution (SVGA) is used.

Although the resolution determines the area of a monitor's screen within which an image can be displayed, the full extents of the chosen resolution do not have to be used. The actual extents of the image display area on the screen can be controlled by the use of the DISPSIZE_H and DISPSIZE_V inputs. These inputs determine the total number of pixels to be used in a line and the total number of lines in a frame, respectively.

5.2 Blank pixel generation

The total number of viewable pixels in a line and viewable lines in a frame is determined by the display resolution chosen through the RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the chosen display resolution is used or not, the areas of the monitor screen to the top, bottom, left and right of the viewable frame area are blanked by putting black pixels at the required line-pixel positions. This has the effect of centering the image on the screen.

The color generated for a pixel in the Pixel Unit depends on whether the particular pixel requires to be blanked or not. The Synchronization Unit provides a signal to the Pixel Unit for this very reason. This is the line display enable signal, en. The signal is checked on each rising edge of the external clock signal (CLK) and is set as follows:

    If (HCOUNT >= PixTotal) or (VCOUNT >= LinTotal) then
        en = 0   (pixel requires to be blanked: set color to be black)
    Else
        en = 1   (pixel is a viewable pixel: generate RGB color accordingly)
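The rule for the en signal can be modelled in a few lines of software. This is only a behavioural sketch of the condition quoted above, not the Controller's actual implementation; the function name `display_enable` is hypothetical.

```python
def display_enable(hcount, vcount, pix_total, lin_total):
    """Return 1 for a viewable pixel and 0 for a pixel that must be blanked."""
    if hcount >= pix_total or vcount >= lin_total:
        return 0   # outside the viewable area: color is forced to black
    return 1       # inside the viewable area: RGB color is generated


# Example for a full 640x480 frame
print(display_enable(0, 0, 640, 480))      # first viewable pixel
print(display_enable(640, 0, 640, 480))    # first pixel past the end of a line
print(display_enable(10, 480, 640, 480))   # first line past the end of the frame
```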

6. SIGNALS IN VGA CONTROLLER

For the most common VGA mode (640x480, 60 Hz, non-interlaced) the horizontal timings are:

Parameter                   Value      Unit
Clock frequency             25.175     MHz
Horizontal pixels           640
Horizontal sync polarity    Negative
Total time for each line    31.77      µs
Front porch (A)             0.94       µs
Sync pulse length (B)       3.77       µs
Back porch (C)              1.89       µs
Active video (D)            25.17      µs

(Total horizontal sync time: 6.60 µs)

6.1 The vertical timings are:

Parameter                   Value      Unit
Vertical lines              480
Vertical sync polarity      Negative
Vertical frequency          59.94      Hz
Front porch (E)             0.35       ms
Sync pulse length (F)       0.06       ms
Back porch (G)              1.02       ms
Active video (H)            15.25      ms

(Total vertical sync time: 1.43 ms)

• 640 x 400 @ 70 Hz is the video mode used for booting most x86 personal computers.
• 640 x 480 @ 60 Hz is the default MS-Windows graphics mode with 16 colors.

It should be noted that the actual timings vary slightly. For example, for 640x480 @ 60 fps, a 25.17 µs active video time with a pixel frequency of 25.174 MHz gives 633 pixels rather than the expected 640 pixels.
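The rounding effect noted above is easy to reproduce. The following sketch converts the tabulated times into pixel clocks and lines; it uses only the values from the two timing tables, and the helper name `to_clocks` is hypothetical.

```python
PIXEL_CLOCK_MHZ = 25.175

# Horizontal timings in microseconds and vertical timings in milliseconds,
# copied from the tables above.
HORIZONTAL_US = {"front porch": 0.94, "sync pulse": 3.77,
                 "back porch": 1.89, "active video": 25.17}
VERTICAL_MS = {"front porch": 0.35, "sync pulse": 0.06,
               "back porch": 1.02, "active video": 15.25}

def to_clocks(duration_us, clock_mhz=PIXEL_CLOCK_MHZ):
    """Number of pixel clocks that fit into a duration given in microseconds."""
    return duration_us * clock_mhz

line_us = sum(HORIZONTAL_US.values())      # total time for one line
frame_ms = sum(VERTICAL_MS.values())       # total time for one frame

for name, duration in HORIZONTAL_US.items():
    print(f"{name:13s} {duration:6.2f} us = {to_clocks(duration):6.1f} clocks")
print(f"line  : {line_us:.2f} us = {to_clocks(line_us):.1f} clocks")
print(f"frame : {frame_ms:.2f} ms = {frame_ms * 1000 / line_us:.1f} lines, "
      f"{1000 / frame_ms:.2f} Hz")
```

The active video time comes out slightly short of 640 clocks because the tabulated times are rounded to two decimal places, while the line and frame totals land close to the 800 clocks per line and 525 lines per frame commonly quoted for this mode.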

6.2 Generation of the horizontal synchronization signal (HSYNC)

The HSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated on each rising edge of the external clock signal (CLK).

The state of the HSYNC signal depends on the value stored in the horizontal counter, and is driven low when:

    HCOUNT >= (PixTotal + BlankingLeft)

and remains low while:

    HCOUNT < (PixTotal + BlankingLeft + TLSync)

[Figure: Vertical (Frame) Period]

6.3 Generation of the vertical synchronization signal (VSYNC)

The VSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated after every line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

The state of the VSYNC signal depends on the value stored in the vertical counter, and is driven low when:

    VCOUNT >= (LinTotal + BlankingLeft)

and remains low while:

    VCOUNT < (LinTotal + BlankingLeft + TFSync)
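Both sync signals follow the same pattern: an active-low pulse that starts once the counter passes the viewable area plus a blanking offset, and lasts for the sync length. A minimal software model is sketched below. The function name `sync_level` is hypothetical, and the numbers in the example (16 clocks of blanking, 96 clocks of sync) are illustrative values for a 640-pixel line, not values taken from this document.

```python
def sync_level(count, total, blanking, sync_length):
    """Active-low sync: 0 while total + blanking <= count < total + blanking + sync_length."""
    start = total + blanking
    if start <= count < start + sync_length:
        return 0   # sync pulse is being driven low
    return 1       # inactive (High)


# HSYNC for PixTotal = 640, with assumed BlankingLeft = 16 and TLSync = 96
for hcount in (0, 655, 656, 751, 752):
    print(hcount, sync_level(hcount, 640, 16, 96))
```

The same function models VSYNC when it is called with VCOUNT, LinTotal, the vertical blanking offset and TFSync.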

Address Counter (MEMADDR)

The address counter is used to store the position of the next consecutive pixel in the frame. Its value is passed to the Pixel Unit on the internal bus signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain the next pixel from picture memory.

The counter is reset to zero when the VGA Controller receives an external reset signal (RST). The size of the counter depends on the values chosen for DISPSIZE_H and DISPSIZE_V, as the range is simply:

    0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199
• 800x600: 0 to 479999

The counter has 19-bit resolution.

While the value in the horizontal counter (HCOUNT) is less than the total number of viewable pixels in a line (PixTotal, the integer value of DISPSIZE_H), the counter is incremented on the rising edge of the external clock signal (CLK). Pixel addressing within the frame is consecutive. When the counter reaches the last pixel in a line, its incremented value is the first pixel in the next line down.

The address counter will continue to be incremented until the value in the vertical counter (VCOUNT) is greater than or equal to the total number of viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this point, it will be rolled over to zero.
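The 19-bit width and the increment and rollover rules can be checked with a short model. This is a behavioural sketch based on the description above; the function names `address_range` and `next_address` are hypothetical.

```python
def address_range(pix_total, lin_total):
    """Return (last address, number of bits needed to hold it)."""
    last = pix_total * lin_total - 1
    return last, last.bit_length()

def next_address(addr, hcount, vcount, pix_total, lin_total):
    """Value of the address counter after one rising edge of CLK."""
    if vcount >= lin_total:
        return 0            # past the last viewable line: roll over to zero
    if hcount < pix_total:
        return addr + 1     # viewable pixel: move to the next consecutive address
    return addr             # horizontal blanking: hold the current value


print(address_range(640, 480))
print(address_range(800, 600))
```

Both supported resolutions need 19 bits, since 2^18 = 262,144 is too small for either range and 2^19 = 524,288 covers both.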

7. VGA Controller

The VGA Controller provides a simple 8-bit interface between a host microcontroller and any VGA-compatible monitor. This variant of the Controller provides six modes of display, depending on the resolution chosen (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black & White, 16 Colors or 64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum refresh rate of 60 Hz:

    1 sec = 60 frames
    1 frame = max 1/60 sec
    1 pixel = (1/60 sec) / 307200 ≈ 54 ns

   Part of each frame is spent in the blanking intervals (see Section 6), so the time budget is taken as:

    Max time for 1 pixel = 40 ns

2) To send information for each and every pixel of a frame by means of the RGB signals.

3) After one row of information has been completely displayed, the column must be reset and the row incremented. This is done by the Hsync signal.

4) After one frame of information has been completed, a new frame has to be started. For this, both row and column are reset. This is done by the Vsync signal.

=> One clock is used to give the information for one pixel (and to control all operations):

    1 clock = 1 pixel
    1 clock cycle period = 40 ns
    Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display the information for one pixel on each and every clock, we can attain the minimum 60 Hz refresh rate. Do not use a clock frequency of less than 25 MHz.
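The per-pixel time budget can be worked through numerically. The sketch below is illustrative: the function name `pixel_time_ns` is hypothetical, and the figure of 800 clocks per line and 525 lines per frame (including blanking) is an assumption based on the commonly used 640x480 @ 60 Hz timing, not a value stated in this section.

```python
REFRESH_HZ = 60

def pixel_time_ns(clocks_per_frame, refresh_hz=REFRESH_HZ):
    """Time available for one pixel clock, in nanoseconds."""
    return 1e9 / (refresh_hz * clocks_per_frame)

visible_only_ns = pixel_time_ns(640 * 480)     # viewable pixels only
with_blanking_ns = pixel_time_ns(800 * 525)    # assumed totals including blanking

clock_period_ns = 40
clock_mhz = 1e3 / clock_period_ns              # frequency = 1 / period

print(f"viewable pixels only : {visible_only_ns:.2f} ns per pixel")
print(f"including blanking   : {with_blanking_ns:.2f} ns per pixel")
print(f"40 ns clock period   : {clock_mhz:.0f} MHz")
```

With blanking included, the budget drops from about 54 ns to just under 40 ns per pixel, which is why a clock of about 25 MHz is needed to reach 60 frames per second.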

Ex.:

Clock      Time period                         No. of frames/sec
15 MHz     More                                Less (decreases; unable to display 60 frames/sec)
25 MHz     (effectively utilized frequency)
30 MHz     Less                                More (increases)

Signals used for image display:

• R (Red)
• G (Green)
• B (Blue)
• HSYNC (Horizontal Synchronous signal)
• VSYNC (Vertical Synchronous signal)

All of the above signals are used to control the image data so that it is correctly displayed on the VGA display monitor.

7.2 Timings for front porch, sync pulse and back porch

               Horizontal sync    Vertical sync
Front porch    0.94 µs            0.35 ms
Sync pulse     3.77 µs            0.063 ms
Back porch     1.89 µs            1.02 ms

8. FUNCTIONAL DESCRIPTION

8.1 Symbol

[Figure: VGA Controller Symbol]

Pin description

The pin description is shown in the table below. Each entry gives the signal name followed by (Type, Polarity, Bus size) and then the description.

Table: VGA Pin description

Global Control Signals

CLK (I, Rising, -)
    Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows:
        640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz
        800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz

RST (I, High, -)
    Global system reset.

VGA Input Settings Signals

RESOLUTION (I, High/Low, -)
    This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD (I, -, 2)
    This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H (I, -, 10)
    This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V (I, -, 10)
    This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD (O, High, -)
    This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal en, generated by the Synchronization Unit of the Controller.

DATA (I, -, 8)
    Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette, as follows:
        Black & White mode: 8 1-bit pixels
        16 Colors mode: 2 4-bit pixels
        64 Colors mode: 1 6-bit pixel (DATA[5..0])
    Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL (O, -, 19)
    Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC (O, Falling, -)
    Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC (O, Falling, -)
    Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R1, R0 (O, High/Low, -)
    Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G1, G0 (O, High/Low, -)
    Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B1, B0 (O, High/Low, -)
    Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9. VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode: one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode: one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode: one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

    Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
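As a worked example of the formula above, the sketch below computes the picture memory for each palette at both supported resolutions. The mode keys and the function name `picture_bytes` are hypothetical labels chosen for this illustration.

```python
# Pixels stored in each byte, per color palette
PIXELS_PER_BYTE = {"black_white": 8, "16_colors": 2, "64_colors": 1}

def picture_bytes(pix_total, lin_total, mode):
    """Bytes of picture memory needed, rounded up to a whole byte."""
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-(pix_total * lin_total) // per_byte)   # ceiling division


for resolution in ((640, 480), (800, 600)):
    for mode in PIXELS_PER_BYTE:
        print(resolution, mode, picture_bytes(*resolution, mode), "bytes")
```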

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt and right shifting the contents by three:

    ADDR_PIXEL = "000" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt and right shifting the contents by one:

    ADDR_PIXEL = '0' & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

    ADDR_PIXEL = hvcnt[18..0]
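In software terms, the three addressing rules are right shifts of the 19-bit pixel counter. A minimal sketch, with hypothetical mode keys and function name:

```python
# Right shift applied to hvcnt, per color palette
ADDRESS_SHIFT = {"black_white": 3, "16_colors": 1, "64_colors": 0}

def addr_pixel(hvcnt, mode):
    """Byte address in picture memory for the pixel numbered hvcnt."""
    hvcnt &= 0x7FFFF                      # hvcnt is a 19-bit value
    return hvcnt >> ADDRESS_SHIFT[mode]   # upper bits are filled with zeros


# Pixel number 17 lives in byte 2 (B&W), byte 8 (16 Colors) or byte 17 (64 Colors)
for mode in ADDRESS_SHIFT:
    print(mode, addr_pixel(17, mode))
```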

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register, PIXREG, is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

    If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or
       ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
        enex = 0
    Else
        enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
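The enex condition can be written as a small function to see exactly where reads are disabled. This is a behavioural sketch of the rule above; the function name `enex` mirrors the signal name, and the line period of 800 and frame period of 525 used in the example are assumed values for illustration.

```python
def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Return 0 when memory reads are disabled, 1 otherwise (RD follows this value)."""
    in_h_blank = pix_total + 1 < hcount < line_period
    in_v_blank = lin_total + 1 < vcount < frame_period
    return 0 if (in_h_blank or in_v_blank) else 1


# 640x480 frame with an assumed line period of 800 and frame period of 525
for hcount, vcount in ((0, 0), (641, 0), (642, 0), (0, 481), (0, 482)):
    print(hcount, vcount, enex(hcount, vcount, 640, 480, 800, 525))
```

Note that, compared with the en signal, reads stay enabled for two counts past the end of the viewable line and frame.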

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a 0.

For the currently loaded byte, the active pixel is always in the low order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
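The shifting behaviour of PIXREG can be illustrated by unpacking one byte into its pixels, lowest-order pixel first. This is a software sketch of the description above rather than the register logic itself; the mode keys and the function name `pixels_from_byte` are hypothetical.

```python
# (bits per pixel, pixels per byte) for each color palette
PIXEL_FORMAT = {"black_white": (1, 8), "16_colors": (4, 2), "64_colors": (6, 1)}

def pixels_from_byte(byte, mode):
    """Return the pixels held in one byte, in the order they become active."""
    width, count = PIXEL_FORMAT[mode]
    mask = (1 << width) - 1
    register = byte & 0xFF
    pixels = []
    for _ in range(count):
        pixels.append(register & mask)   # active pixel sits in the lowest bits
        register >>= width               # shift the next pixel into position
    return pixels


print(pixels_from_byte(0b10110001, "black_white"))
print(pixels_from_byte(0xA5, "16_colors"))
print(pixels_from_byte(0xFF, "64_colors"))
```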

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register, RGBREG, is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital to analog converter circuitry, located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1: Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
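The count of 64 colors follows from three channels with four levels each. The sketch below also prints the analog level for each 2-bit code, under the assumption that the four levels are evenly spaced between 0 V and 0.7 V; the text above only gives the two end points, so the intermediate values are illustrative. The function name `dac_voltage` is hypothetical.

```python
from itertools import product

FULL_SCALE_VOLTS = 0.7

def dac_voltage(code):
    """Analog level for a 2-bit code, assuming evenly spaced (linear) steps."""
    return FULL_SCALE_VOLTS * (code & 0b11) / 3

# Every combination of 2-bit red, green and blue codes
all_colors = list(product(range(4), repeat=3))

for code in range(4):
    print(f"code {code:02b} -> {dac_voltage(code):.3f} V")
print("distinct colors:", len(all_colors))
```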

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
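To relate the block RAM capacity to the picture memory discussed earlier, the following sketch estimates how many blocks a frame would occupy. It assumes that only the 16K data bits of each block are used and that pixels are packed with no wasted bits; the function name `blocks_needed` is hypothetical.

```python
DATA_BITS_PER_BLOCK = 16 * 1024   # 16K data bits per block RAM (parity bits ignored)

def blocks_needed(pix_total, lin_total, bits_per_pixel):
    """Number of block RAMs needed to hold one frame, rounded up."""
    total_bits = pix_total * lin_total * bits_per_pixel
    return -(-total_bits // DATA_BITS_PER_BLOCK)   # ceiling division


print("640x480, 1 bit per pixel :", blocks_needed(640, 480, 1), "blocks")
print("640x480, 4 bits per pixel:", blocks_needed(640, 480, 4), "blocks")
```

Under these assumptions a full 640x480 Black & White frame needs 19 blocks, while a 16 Colors frame needs 75, more than the 36 blocks of the largest device in the table, so larger pictures would need external memory or a reduced display size.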

10. HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

• Interoperability
• Technology independence
• Design reuse
• Several levels of abstraction
• Readability
• Standard language
• Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

• Design specification language
• Design entry language
• Design simulation language
• Design documentation language
• An alternative to schematics

10.1 Brief History

• VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.
• Funded by the U.S. Department of Defense.
• The first publicly available version was released in 1985.
• In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.
• In 1987, standardization => IEEE 1076-1987.
• An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

• IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.
• Soon after IEEE 1076-1987 was released, simulator companies began using their own, non-standard types => VHDL was becoming non-standard.
• The IEEE 1164 standard was developed by the IEEE.
• IEEE 1164 contains definitions for a nine-valued data type, std_logic.
• IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.
• It defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

• A black box with an interface definition
• Defines the inputs/outputs of a component (defines pins)
• A way to represent modularity in VHDL
• Similar to a symbol in a schematic
• The entity declaration describes the entity

E.g.

    entity Comparator is
        port (A, B : in  std_logic_vector(7 downto 0);
              EQ   : out std_logic);
    end Comparator;

Ports

• Provide channels of communication between the component and its environment
• Each port must have a name, a direction and a type
• An entity may have NO port declaration

Port directions

• In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.
• Out: Assignments can be made to a port, but data from the port cannot be read. Multiple assignments are allowed.
• Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
• Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

• Every entity has at least one architecture
• One entity can have several architectures
• Architectures can describe a design using: Behavior, Structure, Dataflow
• Architectures can describe a design on many levels: Gate level, RTL (Register Transfer Level), Behavioral level
• A configuration declaration links an architecture to an entity

E.g.

    architecture Comparator1 of Comparator is
    begin
        EQ <= '1' when (A = B) else '0';
    end Comparator1;

Configurations

• Link the entity declaration and architecture body together
• The concept of a default configuration is a bit messy in VHDL '87: the last architecture analyzed links to the entity
• Can be used to change simulation behavior without re-analyzing the VHDL source
• Complex configuration declarations are ignored in synthesis
• Some entities can have, e.g., a gate-level architecture and a behavioral architecture
• Configurations are always optional

Packages

Packages contain information common to many design units.

1. Package declaration
    • constant declarations
    • type and subtype declarations
    • function and procedure declarations
    • global signal declarations
    • file declarations
    • component declarations

2. Package body
    • is not necessarily needed
    • function bodies
    • procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units. A package consists of a declaration part and an optional body part.

A package declaration can contain:
    • type and subtype declarations
    • subprograms
    • constants
    • alias declarations
    • global signal declarations
    • file declarations
    • component declarations

A package body consists of:
    • subprogram declarations and bodies
    • type and subtype declarations
    • deferred constants
    • file declarations

Libraries

A collection of VHDL design units (a database):

1. Packages
    • package declaration
    • package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

A library is usually a directory in the UNIX file system, but it can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware. It is possible to describe a circuit in a number of ways:

    Structural  -------
    Dataflow    -------  Higher level of abstraction
    Behavioral  -------

Structural VHDL description

• The circuit is described in terms of its components
• From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)
• For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system. In the dataflow style you describe how information flows between the registers in the system. The combinational logic is described at a relatively high level, while the placement and operation of the registers is specified quite precisely. The behavior of the system over time is defined by the registers.

There are no built-in registers in the VHDL language:

- either a lower-level description
- or a behavioral description of the sequential elements is needed

The lower-level register descriptions must be created or obtained. If there are no third-party models for the registers, you must write the behavioral description of the registers yourself. The behavioral description can be provided in the form of subprograms (functions or procedures).
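The following sketch illustrates the point about registers, under the assumption of a single-bit data path: the combinational part is written as a concurrent (dataflow) assignment, while the register has to be described behaviorally in a clocked process because the language has no register primitive. The entity name reg_xor and its ports are hypothetical.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity reg_xor is
  port (clk, rstn : in  std_logic;
        a, b      : in  std_logic;
        q         : out std_logic);
end entity reg_xor;

architecture dataflow of reg_xor is
  signal d : std_logic;
begin
  -- Combinational logic as a concurrent signal assignment.
  d <= a xor b;

  -- Register described behaviorally: asynchronous active-low reset,
  -- data captured on the rising clock edge.
  reg : process (clk, rstn)
  begin
    if rstn = '0' then
      q <= '0';
    elsif rising_edge(clk) then
      q <= d;
    end if;
  end process reg;
end architecture dataflow;
```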

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time. The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions. The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns). If no actual delay is used, only the order of the sequential operations is defined.

At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications. The actual timing results depend on the implementation technology and the efficiency of the synthesis tool. There are a few tools for behavioral synthesis.
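The delay example from the text can be placed in a complete design unit as follows. This is a simulation-oriented sketch with a hypothetical entity name (delay_buf); as noted above, a synthesis tool would typically ignore the after clause.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity delay_buf is
  port (b : in std_logic; a : out std_logic);
end entity delay_buf;

architecture behavioral of delay_buf is
begin
  -- In simulation, a follows b 10 ns later (inertial delay).
  a <= b after 10 ns;
end architecture behavioral;
```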

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL.
A VHDL description can always be broken up into interconnected processes.
Quite similar to a Unix process.

Process keyword in VHDL

A process statement is a concurrent statement.
Statements inside a process statement are sequential statements.
A process must contain either a sensitivity list or wait statement(s), but NOT both.
The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
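The two permitted forms of a process can be sketched side by side. In this hypothetical example (entity process_forms, not part of the project), both processes describe the same flip-flop: the first is woken by its sensitivity list, the second by a wait statement.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity process_forms is
  port (clk : in std_logic; d : in std_logic; q1, q2 : out std_logic);
end entity process_forms;

architecture behavioral of process_forms is
begin
  -- Form 1: sensitivity list, no wait statement.
  with_list : process (clk)
  begin
    if rising_edge(clk) then
      q1 <= d;
    end if;
  end process with_list;

  -- Form 2: wait statement, no sensitivity list.
  with_wait : process
  begin
    wait until rising_edge(clk);
    q2 <= d;
  end process with_wait;
end architecture behavioral;
```

The listings later in this document all use the first form.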


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic);
  END component;

  component sync_unit
    port(
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  -- horizontal sync pulse (row)
      vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer);
  end component;

  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the input clock by two to obtain the pixel clock.
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical Enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse (vertical sync, front porch, back porch)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- One-clock delayed copy of the vertical enable
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
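The counter limits in the synchronization block fix the line and frame rates. As a rough worked check, assuming the input clock really is 50 MHz (suggested only by the port name clk50m) and is halved by the divider in the top-level architecture:

```latex
\begin{align*}
f_{\text{pixel}} &= \frac{50\ \text{MHz}}{2} = 25\ \text{MHz} \\
N_{\text{clocks/line}} &= 793 + 1 = 794, \qquad
N_{\text{lines/frame}} = 515 + 1 = 516 \\
f_{\text{line}} &= \frac{25 \times 10^{6}}{794} \approx 31.49\ \text{kHz} \\
f_{\text{frame}} &= \frac{25 \times 10^{6}}{794 \times 516} \approx 61.0\ \text{Hz}
\end{align*}
```

For comparison, the industry-standard 640x480 mode uses a 25.175 MHz pixel clock with 800 clocks per line and 525 lines per frame, which gives about 59.94 Hz, so the counters in the listing land close to the nominal 60 Hz refresh rate.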


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of the vertical enable: restart at the first pixel.
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the image window: read memory and drive all three colors.
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last stored pixel.
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;
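The wrap-around constant in the pixel generator can be checked against the window limits in the two listings. This is a nominal count only: it assumes that h_en, being registered, first appears high when c_hs is 144 (one clock after the comparison with 143), and it does not analyse the exact behaviour on the first and last edges of the window.

```latex
\begin{align*}
N_{\text{columns}} &= 655 - 144 + 1 = 512 \\
N_{\text{rows}} &= 514 - 35 + 1 = 480 \\
N_{\text{pixels}} &= 512 \times 480 = 245760 \\
\texttt{111011111111111111}_2 &= 2^{18} - 1 - 2^{14} = 245759 = N_{\text{pixels}} - 1 \\
640 \times 480 &= 307200 > 2^{18} = 262144
\end{align*}
```

On this reading the memory holds a 512 x 480 one-bit image, which is consistent with the 18-bit address bus: a full 640 x 480 frame at one location per pixel would not fit in 18 address bits.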


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array, the XC2064, in 1985. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: first commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

- IC costs are rising aggressively
- ASIC complexity has increased development time and costs
- R&D resources and headcount are decreasing
- Revenue losses for slow time-to-market are increasing
- Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software, and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

Fig. 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
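The logic block just described can be modelled in a few lines of VHDL. This is a behavioral sketch for illustration, not a vendor primitive: the entity name logic_block and the generics INIT (the 16-entry truth table) and REGISTERED (the output selection) are hypothetical.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

entity logic_block is
  generic (
    INIT       : std_logic_vector(15 downto 0) := x"8000";  -- x"8000" = 4-input AND
    REGISTERED : boolean := false                            -- select flip-flop output
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);
    o   : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out, ff_out : std_logic;
begin
  -- 4-input LUT: the inputs address one bit of the truth table.
  lut_out <= INIT(to_integer(unsigned(i)));

  -- D flip-flop on the LUT output.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Single output: registered or unregistered LUT output.
  o <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

Changing INIT changes the logic function without changing the structure, which is how a configuration bitstream programs a real LUT; for example, x"6996" would give a 4-input XOR.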

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
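The RTL simulation step can be sketched with a small self-checking test bench. The device under test here is a hypothetical divide-by-two block (div2) written in the same style as the clock divider in the VGA controller listing; the test bench div2_tb, its 20 ns clock period and its check points are assumptions made for this example.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical device under test: divide-by-two with active-low reset.
entity div2 is
  port (clk_in, rstn : in std_logic; clk_out : out std_logic);
end entity div2;

architecture rtl of div2 is
  signal q : std_logic;
begin
  process (clk_in, rstn)
  begin
    if rstn = '0' then
      q <= '0';
    elsif rising_edge(clk_in) then
      q <= not q;
    end if;
  end process;
  clk_out <= q;
end architecture rtl;

library IEEE;
use IEEE.std_logic_1164.all;

entity div2_tb is
end entity div2_tb;

architecture sim of div2_tb is
  signal clk_in  : std_logic := '0';
  signal rstn    : std_logic := '0';
  signal clk_out : std_logic;
  signal done    : boolean := false;
begin
  dut : entity work.div2
    port map (clk_in => clk_in, rstn => rstn, clk_out => clk_out);

  -- 20 ns stimulus clock, stopped once the checks have finished.
  clk_in <= not clk_in after 10 ns when not done else '0';

  stimulus : process
  begin
    wait for 25 ns;
    rstn <= '1';                     -- release reset between clock edges
    wait until rising_edge(clk_in);  -- first edge after reset
    wait for 1 ns;
    assert clk_out = '1' report "output should toggle high" severity error;
    wait until rising_edge(clk_in);
    wait for 1 ns;
    assert clk_out = '0' report "output should toggle low" severity error;
    done <= true;
    wait;
  end process stimulus;
end architecture sim;
```

The same test bench could in principle be rerun against the post-synthesis and post-layout netlists, which is the repetition the design flow above describes.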

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
Antifuse - One-time programmable. CMOS.
EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs, together with early software and firmware co-design, has become mainstream, and prototyping SoC and ASIC designs on FPGAs has become a good method for doing this.

Reasons why prototyping is important:

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This compares with designers relying only on software simulations to verify that their hardware design is sound; simulation speed and modeling accuracy limitations hinder this development.
2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and for testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be caught early.
3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.
4. A prototype has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL
Fig. VGA CONTROLLER

GATE LEVEL
Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion tracking systems
Image processing systems
Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated picture (pixelated) from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array
2. Eduardo Sanchez, "A VGA display controller"
3. Hinner, "VGA Interface and video signal documents"

Page 2: vga

2 HISTORY

The term Video Graphics Array (VGA) refers specifically to the display

hardware first introduced with the IBM PS2 line of computers in 1987 but

through its widespread adoption has also come to mean either an analog

computer display standard the 15-pin D-subminiature VGA connector or the

640times480 resolution itself While this resolution has been superseded in the

personal computer market it is becoming a popular resolution on mobile

devices

Video Graphics Array (VGA) was the last graphical standard introduced by

IBM that the majority of PC clone manufacturers conformed to making it

today (as of 2009) the lowest common denominator that all PC graphics

hardware supports before a device-specific driver is loaded into the computer

For example the MS-Windows splash screen appears while the machine is still

operating in VGA mode which is the reason that this screen always appears in

reduced resolution and color depth

VGA was officially superseded by IBMs XGA standard but in reality it was

superseded by numerous slightly different extensions to VGA made by clone

manufacturers that came to be known collectively as Super VGA

21 VGA compared to other standard resolutions

VGA is referred to as an array instead of an adapter because it was

implemented from the start as a single chip (an ASIC) replacing the Motorola

6845 and dozens of discrete logic chips that covered the full-length ISA boards

of the MDA CGA and EGA Its single-chip implementation also allowed the

VGA to be placed directly on a PCs motherboard with a minimum of

difficulty (it only required video memory timing crystals and an external

RAMDAC) and the first IBM PS2 models were equipped with VGA on the

motherboard

2

22 The VGA specifications are as follows

256 KB Video RAM (The very first cards could be ordered with 64KB or

128KB of RAM at the cost of losing some video modes)

16-color and 256-color modes

262144-value color palette (six bits each for red green and blue)

Selectable 25175 MHz or 28322 MHz master clock

Maximum of 800 horizontal pixels

Maximum of 600 lines

Refresh rates at up to 70 Hz

Vertical blank interrupt

Packed-pixel mode 256 colors

Hardware smooth scrolling support

The VGA supports both All Points Addressable graphics modes and

alphanumeric text modes Standard graphics modes are

640times480 in 16 colors

640times350 in 16 colors

320times200 in 16 colors

320times200 in 256 colors

3

3 DESCRIPTION ARCHITECTURE

31 VGA display device

VGA display device used for displaying the images taken from the

processor in an exact format It can be used for displaying the output of any

projects Analog graphs etchellip can be directly displayed on the VGA display

monitor

31 VGA (Video Graphics Array) Controller

VGA controller controls the image to be exactly displayed on the display

device It acts as an interface between processing system and display device so

called as interface controller

4

Image or video processing system

VGA Controller

Display

4 CONTROLLER DESIGN

For this first we should know how the display device works Consider an

image below

640x480

Original image

5

Pixels of image

Display device also have pixels Display resolution (640480) is a

standard resolution of display

Pixel

This article is about the picture element For other uses see Pixel

(disambiguation)

This example shows an image with a portion greatly enlarged in which the

individual pixels are rendered as little squares and can easily be seen

6

42 A photograph of sub-pixel display elements on a laptops LCD screen

In digital imaging, a pixel (or picture element [1]) is a single point in a raster image. The pixel is the smallest addressable screen element; it is the smallest unit of picture which can be controlled. Each pixel has its own address. The address of a pixel corresponds to its coordinate. Pixels are normally arranged in a 2-dimensional grid, and are often represented using dots or squares. Each pixel is a sample of an original image, where more samples typically provide more accurate representations of the original. The intensity of each pixel is variable. In color image systems, a color is typically represented by three or four component intensities such as red, green, and blue, or cyan, magenta, yellow, and black.

Color refreshing rate

Color refreshing rate = 60 Hz

The color refreshing rate represents the number of frames that are transmitted per second. The minimum number of frames that can be transmitted per second by the VGA controller is 60.


5 Hardware Description

5.1 Block Diagram

5.1 VGA Synchronization Unit

The Synchronization Unit provides the horizontal and vertical synchronization signals, HSYNC and VSYNC, that are required to correctly display a picture frame within the confines of a monitor's display area.

These synchronization signals are used as control inputs by the monitor's horizontal and vertical deflection circuits. These circuits deflect the electrons emitted by the three primary color electron guns (Red, Green, Blue) left to right and from top to bottom, respectively. HSYNC provides the start and stop times for the horizontal deflection circuit, so that a line of pixels is correctly drawn across the screen display. VSYNC provides the start and stop times for the vertical deflection circuit, so that the lines of a frame are correctly drawn from the top to the bottom of the screen display.

The resolution for the display is defined by the level on the RESOLUTION input. If High, the 640x480 resolution is used (VGA). If Low, the 800x600 resolution (SVGA) is used.

Although the resolution determines the area of a monitor's screen within which an image can be displayed, the full extents of the chosen resolution do not have to be used. The actual extents of the image display area on the screen can be controlled by the use of the DISPSIZE_H and DISPSIZE_V inputs. These inputs determine the total number of pixels to be used in a line and the total number of lines in a frame, respectively.

5.2 Blank pixel generation

The total number of viewable pixels in a line and viewable lines in a frame is determined by the display resolution chosen through the RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the chosen display resolution is used or not, the areas of the monitor screen to the top, bottom, left, and right of the viewable frame area are blanked by putting black pixels at the required line-pixel positions. This has the effect of centering the image on the screen.

The color generated for a pixel in the Pixel Unit depends on whether the particular pixel requires to be blanked or not. The Synchronization Unit provides a signal to the Pixel Unit for this very reason. This is the line display enable signal, en. The signal is checked on each rising edge of the external clock signal (CLK) and is set as follows:

If (HCOUNT >= PixTotal) or (VCOUNT >= LinTotal) then
en = 0 (pixel requires to be blanked; set color to be black)
Else
en = 1 (pixel is a viewable pixel; generate RGB color accordingly)
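
The rule above could be sketched in VHDL roughly as follows. This is a minimal sketch and not the Controller's actual source: the entity name blank_enable, the port widths, and the use of numeric_std unsigned types are assumptions made for illustration. Only the names CLK, HCOUNT, VCOUNT, PixTotal, LinTotal, and en come from the text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: the name and port widths are illustrative assumptions.
entity blank_enable is
  port (
    clk       : in  std_logic;
    hcount    : in  unsigned(10 downto 0);  -- current pixel position in the line
    vcount    : in  unsigned(9 downto 0);   -- current line position in the frame
    pix_total : in  unsigned(9 downto 0);   -- viewable pixels per line (DISPSIZE_H)
    lin_total : in  unsigned(9 downto 0);   -- viewable lines per frame (DISPSIZE_V)
    en        : out std_logic               -- '1' = viewable pixel, '0' = blanked
  );
end entity blank_enable;

architecture rtl of blank_enable is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if (hcount >= pix_total) or (vcount >= lin_total) then
        en <= '0';  -- outside the viewable area: pixel is blanked (black)
      else
        en <= '1';  -- inside the viewable area: generate the RGB color
      end if;
    end if;
  end process;
end architecture rtl;
```

The comparison is registered, so en changes one clock after the counters do; a real design would need to account for that one-cycle latency when aligning en with the pixel data.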

6 SIGNALS IN VGA CONTROLLER

For the most common VGA mode, 640x480 at 60 Hz non-interlaced, the horizontal timings are:

Parameter | Value | Unit
Clock frequency | 25.175 | MHz
Horizontal pixels | 640 |
Horizontal sync polarity | Negative |
Total time for each line | 31.77 | µs
Front porch (A) | 0.94 | µs
Sync pulse length (B) | 3.77 | µs
Back porch (C) | 1.89 | µs
Active video (D) | 25.17 | µs

(Total horizontal sync time: 6.60 µs)


6.1 The vertical timings are:

Parameter | Value | Unit
Vertical lines | 480 |
Vertical sync polarity | Negative |
Vertical frequency | 59.94 | Hz
Front porch (E) | 0.35 | ms
Sync pulse length (F) | 0.06 | ms
Back porch (G) | 1.02 | ms
Active video (H) | 15.25 | ms

(Total vertical sync time: 1.43 ms)

640x400 at 70 Hz is the video mode used for booting most x86 personal computers.

640x480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

It should be noted that the actual timings vary slightly. For example, for 640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of 25.174 MHz gives 633 pixels rather than the expected 640 pixels.
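
The discrepancy can be checked with a short worked calculation. This is only an arithmetic illustration using the figures quoted above; the symbols t_active and f_pix are introduced here for convenience.

```latex
\begin{align*}
N_{\text{pixels}} &= t_{\text{active}} \times f_{\text{pix}}
  = 25.17\,\mu\text{s} \times 25.174\,\text{MHz} \approx 633.6 \\
t_{640} &= \frac{640}{25.175\,\text{MHz}} \approx 25.42\,\mu\text{s}
\end{align*}
```

So displaying a full 640 pixels at the nominal 25.175 MHz clock would need an active video time of about 25.42 µs, slightly longer than the tabulated 25.17 µs.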

6.2 Generation of the horizontal synchronization signal - HSYNC

The HSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated on each rising edge of the external clock signal (CLK).

The state of the HSYNC signal depends on the value stored in the horizontal counter and is driven low when:

HCOUNT >= (PixTotal + BlankingLeft)

and remains low while:

HCOUNT < (PixTotal + BlankingLeft + TLSync)
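
A minimal VHDL sketch of this comparison is given below. It is not the Controller's actual implementation: the entity name hsync_gen, the generics, and the counter width are assumptions. The default values for BLANKING_LEFT and TL_SYNC are assumed approximations obtained by converting the 0.94 µs front porch and 3.77 µs sync pulse to clock counts at 25.175 MHz.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; names, widths and default values are assumptions.
entity hsync_gen is
  generic (
    PIX_TOTAL     : natural := 640;  -- viewable pixels per line
    BLANKING_LEFT : natural := 24;   -- assumed: about 0.94 us at 25.175 MHz
    TL_SYNC       : natural := 95    -- assumed: about 3.77 us at 25.175 MHz
  );
  port (
    clk    : in  std_logic;
    rst    : in  std_logic;               -- active-high reset (RST)
    hcount : in  unsigned(10 downto 0);   -- horizontal counter (HCOUNT)
    hsync  : out std_logic
  );
end entity hsync_gen;

architecture rtl of hsync_gen is
begin
  process (clk, rst)
  begin
    if rst = '1' then
      hsync <= '1';                       -- High (inactive) after reset
    elsif rising_edge(clk) then
      if (hcount >= PIX_TOTAL + BLANKING_LEFT) and
         (hcount <  PIX_TOTAL + BLANKING_LEFT + TL_SYNC) then
        hsync <= '0';                     -- inside the sync pulse window
      else
        hsync <= '1';
      end if;
    end if;
  end process;
end architecture rtl;
```

The same pattern, with line counts in place of pixel counts, would apply to a vertical sync generator.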

Vertical (Frame) Period

6.3 Generation of the vertical synchronization signal - VSYNC

The VSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated after every line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

The state of the VSYNC signal depends on the value stored in the vertical counter and is driven low when:

VCOUNT >= (LinTotal + BlankingLeft)

and remains low while:

VCOUNT < (LinTotal + BlankingLeft + TFSync)

Address Counter (MEMADDR)

The address counter is used to store the position of the next consecutive pixel in the frame. Its value is passed to the Pixel Unit on the internal bus signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain the next pixel from picture memory.

The counter is reset to zero when the VGA Controller receives an external reset signal (RST). The size of the counter depends on the values chosen for DISPSIZE_H and DISPSIZE_V, as the range is simply:

0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199
• 800x600: 0 to 479999

The counter has 19-bit resolution.
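
The 19-bit width follows directly from the larger of the two ranges, as this short check shows:

```latex
\begin{align*}
2^{18} = 262144 \;<\; 307200 \;<\; 480000 \;\le\; 2^{19} = 524288
\quad\Longrightarrow\quad
\lceil \log_2 480000 \rceil = 19
\end{align*}
```

Both full-size frames exceed what an 18-bit counter can address, so 19 bits are needed even at 640x480 when the whole frame is used.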

While the value in the horizontal counter (HCOUNT) is less than the total number of viewable pixels in a line (PixTotal, the integer value of DISPSIZE_H), the counter is incremented on the rising edge of the external clock signal (CLK). Pixel addressing within the frame is consecutive. When the counter reaches the last pixel in a line, its incremented value is the first pixel in the next line down.

The address counter will continue to be incremented until the value in the vertical counter (VCOUNT) is greater than or equal to the total number of viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this point, it will be rolled over to zero.
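
The counting behavior described above might be sketched in VHDL as follows. This is an illustrative sketch under stated assumptions: the entity name addr_counter and all port names and widths other than the 19-bit address are hypothetical, and the real Controller may order its conditions differently.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; only the 19-bit address width comes from the text.
entity addr_counter is
  port (
    clk       : in  std_logic;
    rst       : in  std_logic;               -- active-high reset (RST)
    hcount    : in  unsigned(10 downto 0);   -- horizontal counter (HCOUNT)
    vcount    : in  unsigned(9 downto 0);    -- vertical counter (VCOUNT)
    pix_total : in  unsigned(9 downto 0);    -- integer value of DISPSIZE_H
    lin_total : in  unsigned(9 downto 0);    -- integer value of DISPSIZE_V
    memaddr   : out unsigned(18 downto 0)    -- position of the next pixel
  );
end entity addr_counter;

architecture rtl of addr_counter is
  signal addr : unsigned(18 downto 0) := (others => '0');
begin
  memaddr <= addr;

  process (clk, rst)
  begin
    if rst = '1' then
      addr <= (others => '0');
    elsif rising_edge(clk) then
      if vcount >= lin_total then
        addr <= (others => '0');   -- past the last viewable line: roll over
      elsif hcount < pix_total then
        addr <= addr + 1;          -- viewable pixel: advance to the next address
      end if;                      -- horizontal blanking: hold the address
    end if;
  end process;
end architecture rtl;
```

Because the address is held during horizontal blanking, the first pixel of each line follows directly after the last pixel of the previous line, which matches the consecutive addressing described in the text.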


7 VGA Controller

The VGA Controller provides a simple 8-bit interface between a host microcontroller and any VGA-compatible monitor. This variant of the Controller provides six modes of display, depending on the resolution chosen (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black & White, 16 Colors, or 64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum 60 Hz refresh rate:

1 sec = 60 frames
1 frame = max 1/60 sec
1 pixel = (1/60 sec) / 307200 ≈ 54 ns
Allowing for the blanking intervals in each line and frame, max time for 1 pixel ≈ 40 ns

2) To send information for each and every pixel of a frame by means of the RGB signals.

3) After completely displaying one row of information, the column must be reset and the row incremented; this is done by the Hsync signal.

4) After completing one frame of information, a new frame has to be started; for this, the row and column are both reset, which is done by the Vsync signal.


=> To give the information for 1 pixel we use 1 clock (for controlling all operations):

1 clock = 1 pixel
1 clock cycle period = 40 ns
Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display the information for 1 pixel on each and every clock, we can attain a minimum 60 Hz refresh rate.

Do not use a clock frequency of less than 25 MHz.
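
The clock budget can be written out as a worked calculation. This is an illustrative check only; the symbols T_frame, T_clk, and f_clk are introduced here and do not appear in the original.

```latex
\begin{align*}
T_{\text{frame}} &= \frac{1}{60\,\text{Hz}} \approx 16.67\,\text{ms} \\
\frac{T_{\text{frame}}}{640 \times 480} &= \frac{16.67\,\text{ms}}{307200} \approx 54.25\,\text{ns} \\
f_{\text{clk}} &= \frac{1}{T_{\text{clk}}} = \frac{1}{40\,\text{ns}} = 25\,\text{MHz} \\
\frac{f_{\text{clk}}}{60\,\text{Hz}} &= \frac{25 \times 10^{6}}{60} \approx 416667\ \text{clocks per frame}
\end{align*}
```

At 25 MHz each frame therefore has roughly 416667 clock cycles available, of which 307200 carry visible pixels; the remaining cycles, about 109467, are what is left over for the front porch, sync pulse, and back porch intervals.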

Ex:

Clock frequency | Time period | No. of frames/sec
15 MHz | More (increases) | Less (decreases; unable to display 60 frames/sec)
25 MHz | Effectively utilized frequency |
30 MHz | Less (decreases) | More (increases)

Signals used for image display:

R (Red)
G (Green)
B (Blue)
HSYNC (Horizontal Synchronization signal)
VSYNC (Vertical Synchronization signal)

All the above signals are used to control the image data so that it is correctly displayed on the VGA display monitor.

7.2 Timings for front porch, sync pulse, back porch

Parameter | Horizontal sync | Vertical sync
Front porch | 0.94 µs | 0.35 ms
Sync pulse | 3.77 µs | 0.063 ms
Back porch | 1.89 µs | 1.02 ms


8 FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below.

Table: VGA Pin description

Name | Type | Polarity | Bus size | Description

Global Control Signals

CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows. 640x480: CLK = 25 MHz gives Refresh = 60 Hz; CLK = 30 MHz gives Refresh = 72 Hz. 800x600: CLK = 40 MHz gives Refresh = 60 Hz; CLK = 50 MHz gives Refresh = 72 Hz.
RST | I | High | - | Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).
CMOD | I | - | 2 | This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.
DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.
DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.
DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette, as follows. Black & White mode: 8 1-bit pixels. 16 Colors mode: 2 4-bit pixels. 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: in 64 Colors mode, bits 7 and 6 of each byte are not used.
ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive; once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.
VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.
R0, R1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.
G0, G1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.
B0, B1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.


9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).


9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode, one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode, one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode, one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
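
As a worked example of this formula, the full-frame memory sizes for the two resolutions and three palettes come out as follows. The symbol M is introduced here for the memory size in bytes.

```latex
\begin{align*}
\text{640x480, Black \& White:} \quad M &= \frac{640 \times 480}{8} = 38400\ \text{bytes} \\
\text{640x480, 16 Colors:}      \quad M &= \frac{640 \times 480}{2} = 153600\ \text{bytes} \\
\text{640x480, 64 Colors:}      \quad M &= \frac{640 \times 480}{1} = 307200\ \text{bytes} \\
\text{800x600, Black \& White:} \quad M &= \frac{800 \times 600}{8} = 60000\ \text{bytes} \\
\text{800x600, 16 Colors:}      \quad M &= \frac{800 \times 600}{2} = 240000\ \text{bytes} \\
\text{800x600, 64 Colors:}      \quad M &= \frac{800 \times 600}{1} = 480000\ \text{bytes}
\end{align*}
```

Smaller values of DISPSIZE_H and DISPSIZE_V reduce these figures proportionally.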

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt and right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt and right shifting the contents by one:

ADDR_PIXEL = '0' & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
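
The three address mappings could be expressed in a single selected assignment, as in the sketch below. The entity name pixel_addr is hypothetical, and the sketch uses the numeric_std shift_right function, which zero-fills to the full 19-bit width, rather than the explicit concatenations shown above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; CMOD encoding follows the pin description
-- (00 = Black & White, 01 = 16 Colors, 10 = 64 Colors).
entity pixel_addr is
  port (
    cmod       : in  std_logic_vector(1 downto 0);
    hvcnt      : in  unsigned(18 downto 0);  -- pixel position from MEMADDR
    addr_pixel : out unsigned(18 downto 0)   -- byte address in picture memory
  );
end entity pixel_addr;

architecture rtl of pixel_addr is
begin
  with cmod select
    addr_pixel <= shift_right(hvcnt, 3) when "00",    -- 8 pixels per byte
                  shift_right(hvcnt, 1) when "01",    -- 2 pixels per byte
                  hvcnt                 when others;  -- 1 pixel per byte
end architecture rtl;
```

Shifting right by three divides the pixel position by eight, which is why every group of eight consecutive pixels maps to the same byte in Black & White mode.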

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register, PIXREG, is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
enex = 0
Else
enex = 1

When enex is Low, read access from memory is disabled (RD = 0).

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors, or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a 0.

For the currently loaded byte, the active pixel is always in the low order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
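
The load-and-shift behavior of PIXREG in Black & White mode could look roughly like the sketch below. It covers only that one mode, and the entity name pixreg_bw and its port names are assumptions made for illustration; the 16 Colors case would shift by four bits when the lowest bit of hvcnt is 1 instead.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity showing Black & White mode only.
entity pixreg_bw is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;                     -- active-high reset (RST)
    rd    : in  std_logic;                     -- memory read enable (RD)
    hvcnt : in  unsigned(18 downto 0);         -- pixel position
    data  : in  std_logic_vector(7 downto 0);  -- byte read from picture memory
    pixel : out std_logic                      -- active pixel
  );
end entity pixreg_bw;

architecture rtl of pixreg_bw is
  signal pixreg : std_logic_vector(7 downto 0) := (others => '0');
begin
  pixel <= pixreg(0);  -- active pixel is always the lowest bit

  process (clk, rst)
  begin
    if rst = '1' then
      pixreg <= (others => '0');
    elsif rising_edge(clk) then
      if rd = '1' and hvcnt(2 downto 0) = "000" then
        pixreg <= data;                        -- load the next 8 pixels
      else
        pixreg <= '0' & pixreg(7 downto 1);    -- shift the next pixel into bit 0
      end if;
    end if;
  end process;
end architecture rtl;
```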

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green, and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register, RGBREG, is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.


The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0, and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital to analog converter circuitry, located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
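
The figure of 64 colors, and one plausible set of analog levels, can be worked out as follows. The evenly spaced levels are an assumption made here for illustration; the text only gives the 0 V and 0.7 V end points, and the actual DAC resistor network may produce different intermediate values.

```latex
\begin{align*}
N_{\text{colors}} &= 4 \times 4 \times 4 = 4^{3} = 64 \\
V_k &= k \cdot \frac{0.7\,\text{V}}{3}, \quad k = 0, 1, 2, 3
  \;\Rightarrow\; V_k \approx 0\,\text{V},\ 0.233\,\text{V},\ 0.467\,\text{V},\ 0.7\,\text{V}
\end{align*}
```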

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.


Table 2 Block RAM Available in Spartan-3E Devices

Device | RAM Columns | RAM Blocks per Column | Total RAM Blocks | Total RAM Bits | Total RAM Kbits
XC3S100E | 1 | 4 | 4 | 73,728 | 72
XC3S250E | 2 | 6 | 12 | 221,184 | 216
XC3S500E | 2 | 10 | 20 | 368,640 | 360
XC3S1200E | 2 | 14 | 28 | 516,096 | 504
XC3S1600E | 2 | 18 | 36 | 663,552 | 648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits are allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
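
As a rough sizing example, the number of block RAMs needed to hold one full 640x480 frame at 1 bit per pixel can be estimated as below. This is an illustrative estimate that counts only the 16K data bits of each block and ignores any packing overhead introduced by the chosen aspect ratio.

```latex
\begin{align*}
\text{bits per frame} &= 640 \times 480 \times 1 = 307200 \\
\text{blocks needed} &= \left\lceil \frac{307200}{16384} \right\rceil = \lceil 18.75 \rceil = 19
\end{align*}
```

On that estimate, such a frame would fit in a device with at least 19 block RAMs, for example the 20 blocks of the XC3S500E.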


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.
Funded by the US Department of Defense.
The first publicly available version was released in 1985.
In 1986, IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.
In 1987, standardization => IEEE 1076-1987.
An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.
Soon after IEEE 1076-1987 was released, simulator companies began using their own, non-standard types => VHDL was becoming non-standard.
The IEEE 1164 standard was developed by the IEEE in response. IEEE 1164 contains definitions for a nine-valued data type, std_logic.
IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.
It defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library

Entities

A black box with an interface definition
Defines the inputs/outputs of a component (defines pins)
A way to represent modularity in VHDL
Similar to a symbol in a schematic
An entity declaration describes the entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment
Each port must have a name, a direction, and a type
An entity may have NO port declaration

Port directions

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.
Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.
In out: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).

Architectures

Every entity has at least one architecture
One entity can have several architectures
Architectures can describe a design using:
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
A configuration declaration links an architecture to an entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and architecture body together
The concept of a default configuration is a bit messy in VHDL '87:
- The last architecture analyzed links to the entity
Can be used to change simulation behavior without re-analyzing the VHDL source
Complex configuration declarations are ignored in synthesis
Some entities can have, e.g., a gate-level architecture and a behavioral architecture
Are always optional

Packages

Packages contain information common to many design units

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units
A package consists of a declaration part and an optional body part

A package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

Collection of VHDL design units (database)

1. Packages
- package declaration
- package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in a UNIX file system
Can also be any other kind of database

Levels of Abstraction

VHDL supports many possible styles of design description which differ

primarily in how closely they relate to the HW

It is possible to describe a circuit in a number of ways

Structural-------

Dataflow ------- Higher level of abstraction

Behavioral -------

Structural VHDL description

Circuit is described in terms of its components

From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)
For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system
In the dataflow style, you describe how information flows between registers in the system
The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely
The behavior of the system over time is defined by registers
There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed
The lower-level register descriptions must be created or obtained
If there are no 3rd-party models for registers => you must write the behavioral description of the registers
The behavioral description can be provided in the form of subprograms (functions or procedures)

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time
Representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions
The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)
If no actual delay is used, the order of sequential operations is defined
In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool
There are a few tools for behavioral synthesis

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL
A VHDL description can always be broken up into interconnected processes
Quite similar to a Unix process

Process keyword in VHDL

A process statement is a concurrent statement
Statements inside process statements are sequential statements
A process must contain either a sensitivity list or wait statement(s), but NOT both
The sensitivity list or wait statement(s) contain the signals which wake the process up

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
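
As a concrete instance of this general format, the sketch below shows a process with a sensitivity list describing a D flip-flop with asynchronous reset. The entity dff and its port names are hypothetical and are not part of the VGA Controller code; they are used only to illustrate the syntax.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity illustrating the process format.
entity dff is
  port (
    clk : in  std_logic;
    rst : in  std_logic;  -- asynchronous, active-high reset
    d   : in  std_logic;
    q   : out std_logic
  );
end entity dff;

architecture behavioral of dff is
begin
  -- The sensitivity list (clk, rst) names the signals that wake the process up.
  process (clk, rst)
  begin
    -- Statements inside the process are executed sequentially.
    if rst = '1' then
      q <= '0';
    elsif rising_edge(clk) then
      q <= d;
    end if;
  end process;
end architecture behavioral;
```

Because the process has a sensitivity list, it contains no wait statement, in line with the rule stated above.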


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;  -- 50 MHz board clock
    rstn   : in  std_logic;  -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0)
  );
end vga_controller;

architecture Behavioral of vga_controller is

  -- Picture memory (1 bit per pixel)
  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic
    );
  END component;

  -- Port modes match the sync_unit entity declaration
  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;
      c_hs       : out integer
    );
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic
    );
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the 50 MHz input clock by two to obtain the 25 MHz pixel clock
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
    --end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;   -- active-low asynchronous reset
    hsync      : out std_logic;   -- horizontal sync pulse (row)
    vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;   -- v_en delayed by one clock
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is
    signal h_en_s, v_en_s       : std_logic;
    signal line_count_s, c_hs_s : integer;
begin
    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- hsync counter: counts pixel clocks 0..793 within one line
    process(clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        else
            if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                    c_hs_s <= 0;
                else
                    c_hs_s <= c_hs_s + 1;
                end if;
            end if;
        end if;
    end process;

    -- vsync counter (vertical line counter): counts lines 0..515,
    -- advancing on each rising edge of h_en_s
    process(h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            if (line_count_s = 515) then
                line_count_s <= 0;   -- last assignment wins: wrap to zero
            end if;
        end if;
    end process;

    -- hsync pulse: low while c_hs_s is 0..95
    process(clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
    end process;

    -- horizontal enable (row information is sent in this period)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
    end process;

    -- vertical enable (data is valid in this region)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35 and line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
    end process;

    -- vertical sync, front porch, back porch: vsync is low for lines 0 and 1
    process(clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
    end process;

    -- registered copy of v_en_s, used to detect the start of the active region
    process(clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;
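As a rough cross-check of the counter limits in the synchronization block, the Python sketch below (not part of the original VHDL) computes the line rate and frame rate they imply. The 25 MHz clock is an assumption taken from the controller task description elsewhere in this report, and the function name is illustrative only.

```python
def sync_rates(clk_hz=25_000_000, clocks_per_line=794, lines_per_frame=516):
    """Return (line_rate_hz, frame_rate_hz) for the sync_unit counter limits.

    c_hs_s counts 0..793 (794 clocks per line) and line_count_s counts
    0..515 (516 lines per frame). clk_hz is an assumed pixel clock; the
    VHDL itself does not fix the clock frequency.
    """
    line_rate = clk_hz / clocks_per_line
    frame_rate = line_rate / lines_per_frame
    return line_rate, frame_rate


line_rate, frame_rate = sync_rates()
print(f"line rate  = {line_rate / 1e3:.2f} kHz")
print(f"frame rate = {frame_rate:.2f} Hz")
```

Under that assumption the frame rate comes out slightly above 60 Hz, which is consistent with the refresh-rate target stated for the controller.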

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin
    addr <= addr_s;

    process(clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- rising edge of v_en (start of a frame): restart the address
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- picture area: read memory and drive all three colors
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        -- right of the picture: green
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    -- below the picture: white
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- wrap the address after the last pixel of the picture
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                -- blanking: black
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
    end process;

end behv;
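The wrap constant in the pixel generator can be checked against the comparisons that gate the memory read. The Python sketch below is only an interpretation of the listing: the read window (c_hs = 144..655, assuming h_en lags c_hs by one clock, and lines 35..514) is my reading of the code, not something the report states.

```python
# Constant that pixelgen compares addr_s against before wrapping to zero.
wrap_value = int("111011111111111111", 2)

# Assumed read window (an interpretation of the comparisons in the VHDL):
#   h_en is registered, so with c_hs <= 655 reads happen for c_hs = 144..655;
#   v_en covers lines 35..515 and line_count <= 514 leaves lines 35..514.
pixels_per_line = 655 - 144 + 1
visible_lines = 514 - 35 + 1

print("last address before wrap:", wrap_value)
print("pixels read per frame   :", pixels_per_line * visible_lines)
```

If this reading is right, the picture memory holds a 512-pixel by 480-line, one-bit-per-pixel image, and the 18-bit address wraps exactly at its last pixel.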

12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be
configured by the customer or designer after manufacturing, hence the name
"field-programmable". FPGAs are programmed using a logic circuit diagram or
source code in a hardware description language (HDL) to specify how the chip
will work. They can be used to implement any logical function that an
application-specific integrated circuit (ASIC) could perform, but the ability
to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks can
be configured to perform complex combinational functions, or merely simple
logic gates like AND and XOR. In most FPGAs, the logic blocks also include
memory elements, which may be simple flip-flops or more complete blocks of
memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of being
programmed in batches in a factory or in the field (field programmable);
however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first
commercially viable field-programmable gate array in 1985, the XC2064. The
XC2064 had programmable gates and programmable interconnects between gates,
the beginnings of a new technology and market. The XC2064 boasted a mere 64
configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More
than 20 years later, Freeman was entered into the National Inventors Hall of
Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable
logic arrays, gates, and logic blocks are founded in patents awarded to David
W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement 600,000
reprogrammable gates. Casselman was successful, and the system was awarded a
patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s,
when competitors sprouted up, eroding significant market share. By 1993, Actel
was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in
volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs had found
their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a
step further by combining the logic blocks and interconnects of traditional
FPGAs with embedded microprocessors and related peripherals to form a complete
"system on a programmable chip". This work mirrors the architecture by Ron
Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a
reconfigurable CPU architecture on a single chip called the SB24. That work
was done in 1982. Examples of such hybrid technologies can be found in the
Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC
processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is
another such device, which uses an AVR processor in combination with Atmel's
programmable logic architecture.

An alternative to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic.

Many modern FPGAs can be reprogrammed at run time, and this is leading to the
idea of reconfigurable computing or reconfigurable systems: CPUs that
reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor
from Mitrionics is an example of a reconfigurable soft processor implemented
on FPGAs. However, it does not support dynamic reconfiguration at run time,
but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid
approach by providing an array of processor cores and FPGA-like programmable
cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively.
ASIC complexity has increased development time and costs.
R&D resources and headcount are decreasing.
Revenue losses for slow time-to-market are increasing.
Financial constraints in a poor economy are driving low-cost technologies.

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have historically been used for, to
which the company attributes the growing number of FPGA design starts (see
History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has
a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers) and
embedded memories, as well as the ability of logic blocks to implement
decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4 to
8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware cost per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit into
the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or
the width of one column, as shown below. All the routing channels have the
same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

Typical logic block

There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
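The behaviour of such a logic block can be illustrated with a small Python model. This is a toy sketch under simple assumptions (a 16-entry truth table indexed by the four inputs, and a single D flip-flop); the class and method names are illustrative and do not correspond to any vendor's primitives.

```python
class LogicBlock:
    """Toy model of a 4-input LUT followed by an optional D flip-flop."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs 16 configuration bits")
        self.truth_table = list(truth_table)
        self.registered = registered
        self.q = 0  # flip-flop state, cleared at start-up

    def lut(self, a, b, c, d):
        """Combinational LUT output: the inputs select one stored bit."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Rising clock edge: the flip-flop captures the LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Single block output: registered (q) or unregistered (LUT)."""
        return self.q if self.registered else self.lut(a, b, c, d)


# Configure the LUT as a 4-input XOR (parity of the input bits).
xor_table = [bin(i).count("1") % 2 for i in range(16)]
comb = LogicBlock(xor_table)
reg = LogicBlock(xor_table, registered=True)
```

The same 16 configuration bits can encode any Boolean function of four inputs, which is why the LUT is the basic building block; the `registered` flag plays the role of the output multiplexer.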

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans
only one logic block before it terminates in a switch box. By turning on some
of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use longer
routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number one connects only to wires in track
number one in adjacent channel segments, wires in track number 2 connect only
to other wires in track number 2, and so on. The figure below illustrates the
connections in a switch box.
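The planar topology described above can be enumerated in a few lines of Python. This is a sketch under the stated rule only (a wire in track t reaches track t on each of the other three sides); the side names and function names are hypothetical.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_box(channel_width):
    """Return the programmable switches of a planar switch box.

    Each switch is a frozenset of two (side, track) endpoints. Tracks never
    cross domains: track t only ever connects to track t.
    """
    switches = set()
    for track in range(channel_width):
        for i, side_a in enumerate(SIDES):
            for side_b in SIDES[i + 1:]:
                switches.add(frozenset({(side_a, track), (side_b, track)}))
    return switches


def reachable(switches, side, track):
    """List the wire ends that (side, track) can be switched onto."""
    here = (side, track)
    return sorted(next(iter(s - {here})) for s in switches if here in s)


box = planar_switch_box(channel_width=4)
print(len(box), "switches")
print(reachable(box, "north", 1))
```

Each track contributes six switches (one per pair of sides), so the switch count grows linearly with the channel width W.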

Switch box topology

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to just specify
them numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's proprietary
software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device like
an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar
license), and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally, the design is laid out in the FPGA, at
which point propagation delays can be added and the simulation run again with
these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices can
be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased, even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages.
Some, but not all, flash devices can be in-system programmed. Usually, a flash
cell is smaller than an equivalent EEPROM cell and is therefore less expensive
to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has
become mainstream, and prototyping SoC and ASIC designs on an FPGA has become
a good method for doing this.

Reasons why prototyping is important:

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct, compared with designers relying only on software
simulations to verify that their hardware design is sound. Simulation speed
and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows much
more time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear due
to today's array of operating systems, applications, and hardware, to be
found.

3. Prototyping also allows the developer to ensure that all IP technologies on
the system work well together, off the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features than
would otherwise have been possible.

13. SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14. APPLICATIONS

Motion tracking systems
Image processing systems
Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for each
pixel, as well as horizontal and vertical synchronization signals, in order to
correctly display the picture on a connected monitor.

REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array
2. Eduardo Sanchez, "A VGA display controller"
3. Hinner, "VGA Interface and video signal documents"


2.2 The VGA specifications are as follows:

256 KB video RAM (the very first cards could be ordered with 64 KB or 128 KB
of RAM, at the cost of losing some video modes)
16-color and 256-color modes
262,144-value color palette (six bits each for red, green, and blue)
Selectable 25.175 MHz or 28.322 MHz master clock
Maximum of 800 horizontal pixels
Maximum of 600 lines
Refresh rates at up to 70 Hz
Vertical blank interrupt
Packed-pixel mode: 256 colors
Hardware smooth scrolling support

The VGA supports both All Points Addressable graphics modes and alphanumeric
text modes. Standard graphics modes are:

640×480 in 16 colors
640×350 in 16 colors
320×200 in 16 colors
320×200 in 256 colors
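The palette size and the memory needed by each standard mode follow from simple arithmetic, sketched below in Python. The byte counts assume tightly packed pixels with no padding, which is an assumption made for illustration; the function and variable names are not from the report.

```python
# Six bits each for red, green and blue give the full palette size.
palette_size = 2 ** (6 * 3)

VIDEO_RAM_BYTES = 256 * 1024  # 256 KB video RAM from the specification list


def frame_bytes(width, height, colors):
    """Bytes for one frame, assuming tightly packed pixels (no padding)."""
    bits_per_pixel = (colors - 1).bit_length()  # 16 -> 4 bits, 256 -> 8 bits
    return width * height * bits_per_pixel // 8


modes = [(640, 480, 16), (640, 350, 16), (320, 200, 16), (320, 200, 256)]
sizes = {mode: frame_bytes(*mode) for mode in modes}

for (width, height, colors), size in sizes.items():
    fits = "fits" if size <= VIDEO_RAM_BYTES else "does not fit"
    print(f"{width}x{height} in {colors} colors: {size} bytes ({fits})")
```

Under this assumption every listed mode fits in 256 KB, with 640×480 in 16 colors being the largest.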

3. DESCRIPTION ARCHITECTURE

3.1 VGA display device

The VGA display device is used for displaying the images received from the
processor in an exact format. It can be used for displaying the output of any
project. Analog graphs etc. can be displayed directly on the VGA display
monitor.

3.1 VGA (Video Graphics Array) Controller

The VGA controller controls the image so that it is displayed exactly on the
display device. It acts as an interface between the processing system and the
display device, and so is called an interface controller.

Image or video processing system -> VGA Controller -> Display

4. CONTROLLER DESIGN

To design the controller, we first need to know how the display device works.
Consider the image below.

640x480

Original image

Pixels of image

The display device also has pixels. The display resolution (640x480) is a
standard display resolution.

Pixel

This example shows an image with a portion greatly enlarged, in which the
individual pixels are rendered as little squares and can easily be seen.

4.2 A photograph of sub-pixel display elements on a laptop's LCD screen

In digital imaging, a pixel (or picture element [1]) is a single point in a
raster image. The pixel is the smallest addressable screen element; it is the
smallest unit of picture which can be controlled. Each pixel has its own
address. The address of a pixel corresponds to its coordinate. Pixels are
normally arranged in a 2-dimensional grid, and are often represented using
dots or squares. Each pixel is a sample of an original image, where more
samples typically provide more accurate representations of the original. The
intensity of each pixel is variable. In color image systems, a color is
typically represented by three or four component intensities such as red,
green, and blue, or cyan, magenta, yellow, and black.

Color refreshing rate

Color refreshing rate = 60 Hz

The color refreshing rate represents the number of frames that are transmitted
per second. The minimum number of frames that can be transmitted per second by
the VGA controller is 60.

5. Hardware Description

5.1 Block Diagram

5.1 VGA Synchronization Unit

The Synchronization Unit provides the horizontal and vertical synchronization
signals, HSYNC and VSYNC, that are required to correctly display a picture
frame within the confines of a monitor's display area.

These synchronization signals are used as control inputs by the monitor's
horizontal and vertical deflection circuits. These circuits deflect the
electrons emitted by the three primary color electron guns (Red, Green, Blue)
left to right and from top to bottom, respectively. HSYNC provides the start
and stop times for the horizontal deflection circuit, so that a line of pixels
is correctly drawn across the screen display. VSYNC provides the start and
stop times for the vertical deflection circuit, so that the lines of a frame
are correctly drawn from the top to the bottom of the screen display.

The resolution for the display is defined by the level on the RESOLUTION
input. If High, the 640x480 resolution (VGA) is used. If Low, the 800x600
resolution (SVGA) is used.

Although the resolution determines the area of a monitor's screen within which
an image can be displayed, the full extents of the chosen resolution do not
have to be used. The actual extents of the image display area on the screen
can be controlled by the use of the DISPSIZE_H and DISPSIZE_V inputs. These
inputs determine the total number of pixels to be used in a line and the total
number of lines in a frame, respectively.

5.2 Blank pixel generation

The total number of viewable pixels in a line and viewable lines in a frame is
determined by the display resolution chosen through the RESOLUTION input
(1 = 640x480, 0 = 800x600) and the values received on the DISPSIZE_H and
DISPSIZE_V buses. Whether the full extent of the chosen display resolution is
used or not, the areas of the monitor screen to the top, bottom, left and
right of the viewable frame area are blanked by putting black pixels at the
required line-pixel positions. This has the effect of centering the image on
the screen.

The color generated for a pixel in the Pixel Unit depends on whether the
particular pixel requires to be blanked or not. The Synchronization Unit
provides a signal to the Pixel Unit for this very reason. This is the line
display enable signal, en. The signal is checked on each rising edge of the
external clock signal (CLK) and is set as follows:

If (HCOUNT >= PixTotal) or (VCOUNT >= LinTotal) then
    en = 0 (pixel requires to be blanked: set color to be black)
Else
    en = 1 (pixel is a viewable pixel: generate RGB color accordingly)
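The blanking rule above translates directly into a small Python function. This is a sketch for illustration: the argument names mirror HCOUNT, VCOUNT, PixTotal and LinTotal from the text, while `pixel_colour` is a hypothetical helper showing how the Pixel Unit might use the enable.

```python
def display_enable(hcount, vcount, pix_total, lin_total):
    """Return 1 for a viewable pixel and 0 for a pixel that must be blanked."""
    if hcount >= pix_total or vcount >= lin_total:
        return 0
    return 1


def pixel_colour(rgb, en):
    """Pass the colour through when enabled, otherwise output black."""
    return rgb if en else (0, 0, 0)


# Full 640x480 frame: the last viewable pixel is (639, 479).
en_last = display_enable(639, 479, pix_total=640, lin_total=480)
en_right = display_enable(640, 100, pix_total=640, lin_total=480)
print(pixel_colour((1, 0, 1), en_last), pixel_colour((1, 0, 1), en_right))
```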

6. SIGNALS IN VGA CONTROLLER

For the most common VGA mode, 640x480, 60 Hz, non-interlaced, the horizontal
timings are:

Parameter                       Value      Unit
Clock frequency                 25.175     MHz
Horizontal pixels               640
Horizontal sync polarity        Negative
Total time for each line        31.77      µs
Front porch (A)                 0.94       µs
Sync pulse length (B)           3.77       µs
Back porch (C)                  1.89       µs
Active video (D)                25.17      µs

(Total horizontal sync time: 6.60 µs)

6.1 The vertical timings are:

Parameter                       Value      Unit
Vertical lines                  480
Vertical sync polarity          Negative
Vertical frequency              59.94      Hz
Front porch (E)                 0.35       ms
Sync pulse length (F)           0.06       ms
Back porch (G)                  1.02       ms
Active video (H)                15.25      ms

(Total vertical sync time: 1.43 ms)

640 x 400 @ 70 Hz is the video mode used for booting most x86 personal
computers.

640 x 480 @ 60 Hz is the default MS-Windows graphics mode with 16 colors.

It should be noted that the actual timings vary slightly. For example, for
640x480 @ 60 fps, a 25.17 µs active video time with a pixel frequency of
25.174 MHz gives 633 pixels rather than the expected 640 pixels.
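The discrepancy noted above can be reproduced by converting the tabulated horizontal times into pixel-clock counts. The Python sketch below uses the 25.175 MHz clock and the rounded microsecond values from the horizontal timing table; the dictionary and function names are illustrative.

```python
PIXEL_CLOCK_MHZ = 25.175

# Horizontal timings from the table, in microseconds.
horizontal_us = {
    "front porch": 0.94,
    "sync pulse": 3.77,
    "back porch": 1.89,
    "active video": 25.17,
}


def to_clocks(time_us, clock_mhz=PIXEL_CLOCK_MHZ):
    """Convert a duration to pixel-clock cycles (microseconds x MHz)."""
    return time_us * clock_mhz


clocks = {name: to_clocks(t) for name, t in horizontal_us.items()}
total_clocks = sum(clocks.values())
active_us_for_640 = 640 / PIXEL_CLOCK_MHZ

for name, count in clocks.items():
    print(f"{name:13s} {count:7.2f} clocks")
print(f"total         {total_clocks:7.2f} clocks")
print(f"640 pixels need {active_us_for_640:.2f} us of active video")
```

The rounded 25.17 µs figure yields only about 633 whole pixels, while 640 pixels need slightly more active time, which is why the text says the actual timings vary slightly.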

6.2 Generation of the horizontal synchronization signal - HSYNC

The HSYNC signal is High (inactive) after an external reset signal (RST) is
received by the VGA Controller. The signal is updated on each rising edge of
the external clock signal (CLK).

The state of the HSYNC signal depends on the value stored in the horizontal
counter and is driven low when:

HCOUNT >= (PixTotal + BlankingLeft)

and remains low while:

HCOUNT < (PixTotal + BlankingLeft + TLSync)

Vertical (Frame) Period

6.3 Generation of the vertical synchronization signal - VSYNC

The VSYNC signal is High (inactive) after an external reset signal (RST) is
received by the VGA Controller. The signal is updated after every line of
pixels is completed (i.e. on each rising edge of the HSYNC signal).

The state of the VSYNC signal depends on the value stored in the vertical
counter and is driven low when:

VCOUNT >= (LinTotal + BlankingLeft)

and remains low while:

VCOUNT < (LinTotal + BlankingLeft + TFSync)
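Both sync signals follow the same window rule, so one Python function can model either. This is a sketch only: the parameter names stand in for PixTotal or LinTotal, BlankingLeft, and TLSync or TFSync, and the numeric values in the example (16 and 96) are illustrative assumptions, not values given in this report.

```python
def sync_level(count, total, blanking, sync_len):
    """Return 0 (active low) inside the sync window and 1 outside it.

    The signal goes low when count >= total + blanking and stays low
    while count < total + blanking + sync_len.
    """
    start = total + blanking
    return 0 if start <= count < start + sync_len else 1


# Illustrative horizontal example: 640 viewable pixels, an assumed
# blanking of 16 clocks and an assumed sync length of 96 clocks.
hsync_trace = [sync_level(h, total=640, blanking=16, sync_len=96)
               for h in range(800)]
print("HSYNC low for", hsync_trace.count(0), "clocks,",
      "starting at count", hsync_trace.index(0))
```

The same function applied to VCOUNT with LinTotal and TFSync models VSYNC, the only difference being that it would be evaluated once per line rather than once per clock.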

Address Counter (MEMADDR)

The address counter is used to store the position of the next consecutive
pixel in the frame. Its value is passed to the Pixel Unit on the internal bus
signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain
the next pixel from picture memory.

The counter is reset to zero when the VGA Controller receives an external
reset signal (RST). The size of the counter depends on the values chosen for
DISPSIZE_H and DISPSIZE_V, as the range is simply:

0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each of
the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199
• 800x600: 0 to 479999

The counter has 19-bit resolution.
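The 19-bit figure can be confirmed from the counter ranges with a short Python check. The function name is illustrative; the arithmetic uses only the resolutions given above.

```python
def counter_range(pix_total, lin_total):
    """Return (maximum counter value, bits needed to hold it)."""
    max_value = pix_total * lin_total - 1
    return max_value, max_value.bit_length()


vga_max, vga_bits = counter_range(640, 480)
svga_max, svga_bits = counter_range(800, 600)
print(f"640x480: 0 to {vga_max}, {vga_bits} bits")
print(f"800x600: 0 to {svga_max}, {svga_bits} bits")
```

Both maxima lie between 2^18 and 2^19, so a single 19-bit counter covers either resolution.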

While the value in the horizontal counter (HCOUNT) is less than the total
number of viewable pixels in a line (PixTotal, the integer value of
DISPSIZE_H), the counter is incremented on the rising edge of the external
clock signal (CLK). Pixel addressing within the frame is consecutive. When the
counter reaches the last pixel in a line, its incremented value is the first
pixel in the next line down.

The address counter will continue to be incremented until the value in the
vertical counter (VCOUNT) is greater than or equal to the total number of
viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this
point, it will be rolled over to zero.
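The increment and roll-over behaviour can be walked through with a small Python generator. This is a behavioural sketch, not the controller's implementation: `h_total` and `v_total` (the full counts per line and per frame, including blanking) are assumed parameters introduced only so the loop has somewhere to spend the blanking time.

```python
def frame_addresses(pix_total, lin_total, h_total, v_total):
    """Yield the address used for each viewable pixel of one frame.

    The counter increments only while HCOUNT < PixTotal and is rolled
    over to zero once VCOUNT >= LinTotal.
    """
    addr = 0
    for vcount in range(v_total):
        if vcount >= lin_total:
            addr = 0          # past the last viewable line: roll over
            continue
        for hcount in range(h_total):
            if hcount < pix_total:
                yield addr
                addr += 1     # consecutive addressing across lines


# Tiny frame for illustration: 4 viewable pixels x 3 viewable lines.
addresses = list(frame_addresses(pix_total=4, lin_total=3,
                                 h_total=6, v_total=5))
print(addresses)
```

Because the counter pauses during horizontal blanking instead of resetting, the first pixel of each new line simply continues from the last address of the line above.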

7. VGA Controller

The VGA Controller provides a simple 8-bit interface between a host
microcontroller and any VGA-compatible monitor. This variant of the Controller
provides six modes of display, depending on the resolution chosen (640x480
(VGA) or 800x600 (SVGA)) and the color palette (either Black & White, 16
Colors or 64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum refresh rate of 60 Hz.

For 1 sec = 60 frames
1 frame = max 1/60 sec
1 pixel = (1/60 sec) / 307200 ≈ 54 ns (roughly 50 ns)
Max time for 1 pixel = 40 ns

2) To send information for each and every pixel of a frame by means of the RGB
signals.

3) After one row of information has been completely displayed, the column must
be reset and the row incremented; this is done by the HSYNC signal.

4) After one frame of information has been completed, a new frame has to be
started; for this, both row and column are reset, which is done by the VSYNC
signal.

=> One clock is used to deliver the information for 1 pixel (the clock
controls all operations).

1 clock = 1 pixel
1 clock cycle period = 40 ns
Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz
1 clock frequency = 25 MHz

So if we use a 25 MHz clock and display one pixel of information on each and
every clock, we can attain a refresh rate of at least 60 Hz. Do not use a
clock frequency below 25 MHz.

Ex

15MHz 25MHz

30MHz

Time period = more effectively utilized frequency

Less

No of framessec = less

More

Decreases

Increases

(It is unable to display 60framessec)
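The arithmetic above can be sketched in Python. The totals of 800 clocks per line and 525 lines per frame (visible area plus blanking) are an assumption taken from the commonly used 640x480 timing, not from this report; the function names are illustrative only.

```python
def pixel_time_ns(h_pixels: int, v_lines: int, refresh_hz: float) -> float:
    """Time available per visible pixel if blanking is ignored."""
    return 1e9 / (refresh_hz * h_pixels * v_lines)


def refresh_rate_hz(clk_hz: float, clocks_per_line: int, lines_per_frame: int) -> float:
    """Frames per second when one clock is spent on every pixel slot."""
    return clk_hz / (clocks_per_line * lines_per_frame)


if __name__ == "__main__":
    print(pixel_time_ns(640, 480, 60))          # visible pixels only
    for clk in (15e6, 25e6, 30e6):
        # 800 x 525 total slots per frame is an assumed, typical value
        print(clk, refresh_rate_hz(clk, 800, 525))
```

With these assumed totals, a 15 MHz clock falls well short of 60 frames per second, while 25 MHz lands close to it and 30 MHz exceeds it, matching the comparison above.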

Signals used for image display:

R (Red)

G (Green)

B (Blue)

HSYNC (Horizontal synchronization signal)

VSYNC (Vertical synchronization signal)

All of the above signals are used to control the image data so that it is displayed correctly on the VGA monitor.

7.2 Timings for front porch, sync pulse and back porch

Interval | Horizontal sync | Vertical sync

Front porch | 0.94 µs | 0.35 ms

Sync pulse | 3.77 µs | 0.063 ms

Back porch | 1.89 µs | 1.02 ms
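To relate these durations to counter values, they can be converted into clock cycles (horizontal) and lines (vertical). The sketch below assumes a 25 MHz pixel clock and a line length of 794 clocks, the value used by the horizontal counter in the code listing of this report; the helper names are illustrative.

```python
CLK_HZ = 25e6            # assumed pixel clock
CLOCKS_PER_LINE = 794    # assumed from the counter limit in the code listing


def to_clocks(duration_s: float, clk_hz: float = CLK_HZ) -> int:
    """Horizontal interval expressed in pixel clocks (rounded)."""
    return round(duration_s * clk_hz)


def to_lines(duration_s: float, clocks_per_line: int = CLOCKS_PER_LINE,
             clk_hz: float = CLK_HZ) -> int:
    """Vertical interval expressed in scan lines (rounded)."""
    line_time_s = clocks_per_line / clk_hz
    return round(duration_s / line_time_s)


if __name__ == "__main__":
    print("hsync pulse clocks:", to_clocks(3.77e-6))
    print("h back porch clocks:", to_clocks(1.89e-6))
    print("vsync pulse lines:", to_lines(0.063e-3))
    print("v back porch lines:", to_lines(1.02e-3))
    print("v front porch lines:", to_lines(0.35e-3))
```

The results are close to the thresholds used in the synchronization code (sync low for roughly 96 clocks, vertical sync low for 2 lines), which suggests the code constants were derived from this table.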


8. FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below.

Table: VGA pin description

Name | Type | Polarity | Bus size | Description

Global Control Signals

CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows. 640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz. 800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz.

RST | I | High | - | Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used. 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD | I | - | 2 | This input selects the color palette to be used. 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD | O | High | - | This is the enable signal asserted when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal en generated by the Synchronization Unit of the Controller.

DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows. Black & White mode: 8 1-bit pixels. 16 Colors mode: 2 4-bit pixels. 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: in 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R1, R0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G1, G0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B1, B0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9. VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode, one byte of data in RAM contains 8 1-bit pixels

• in 16 Colors mode, one byte of data in RAM contains 2 4-bit pixels

• in 64 Colors mode, one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V) and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.
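As a worked instance of the memory formula, the following sketch computes the number of bytes needed for each palette. The mode labels ("bw", "16", "64") and the function name are illustrative assumptions; the pixels-per-byte values are the ones listed above.

```python
# Pixels packed into one byte for each color palette (from the text above)
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_bytes(pix_total: int, lin_total: int, mode: str) -> int:
    """Bytes of picture memory for a pix_total x lin_total frame."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-pixels // per_byte)  # ceiling division


if __name__ == "__main__":
    for mode in ("bw", "16", "64"):
        print(mode, picture_bytes(640, 480, mode))
```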

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt, which is equivalent to right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt, which is equivalent to right shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
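The three addressing rules amount to a right shift of the pixel counter by 3, 1 or 0 bits. A minimal Python model, assuming hvcnt is a 19-bit unsigned value and using illustrative mode labels:

```python
# Right-shift amount per palette: log2(pixels per byte)
SHIFT = {"bw": 3, "16": 1, "64": 0}


def addr_pixel(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory that holds pixel number hvcnt."""
    hvcnt &= 0x7FFFF          # keep the 19 counter bits
    return hvcnt >> SHIFT[mode]


if __name__ == "__main__":
    for mode in ("bw", "16", "64"):
        print(mode, addr_pixel(17, mode))
```

Pixel 17, for example, lives in byte 2 in Black & White mode, byte 8 in 16 Colors mode and byte 17 in 64 Colors mode.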

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then

enex = 0

Else

enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
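The enable condition can be modelled directly. In this sketch the line period (800) and frame period (525) used in the example calls are assumed values for illustration; the report does not state them here.

```python
def enex(hcount: int, vcount: int, pix_total: int, lin_total: int,
         line_period: int, frame_period: int) -> int:
    """Return 0 while in horizontal or vertical blanking, else 1."""
    h_blank = (hcount > pix_total + 1) and (hcount < line_period)
    v_blank = (vcount > lin_total + 1) and (vcount < frame_period)
    return 0 if (h_blank or v_blank) else 1


if __name__ == "__main__":
    # 640x480 visible area; 800 and 525 are assumed totals
    print(enex(0, 0, 640, 480, 800, 525))     # inside the visible area
    print(enex(700, 0, 640, 480, 800, 525))   # horizontal blanking
    print(enex(10, 500, 640, 480, 800, 525))  # vertical blanking
```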

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data is loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data is loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a "0".

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data is loaded into the register on every rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
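The load-and-shift behaviour described for the three modes determines the order in which pixels leave a byte. A small Python model of that order (the function name and mode labels are illustrative assumptions):

```python
def unpack_pixels(byte: int, mode: str) -> list:
    """Pixels in display order for one byte held in the Pixel register."""
    byte &= 0xFF
    if mode == "bw":
        # active pixel is bit 0; register shifts right by 1 each clock
        return [(byte >> i) & 0x1 for i in range(8)]
    if mode == "16":
        # active pixel is the low nibble; one shift right by 4
        return [byte & 0xF, (byte >> 4) & 0xF]
    if mode == "64":
        # one 6-bit pixel per byte; bits 7 and 6 are unused
        return [byte & 0x3F]
    raise ValueError("unknown mode: " + mode)


if __name__ == "__main__":
    print(unpack_pixels(0b10110001, "bw"))
    print(unpack_pixels(0xA5, "16"))
    print(unpack_pixels(0xFF, "64"))
```

Note that the least significant bit (or nibble) is displayed first, so the host must pack pixels accordingly when writing the image.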

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded depends on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry, located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1: Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
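The count of 64 colors follows from 4 levels on each of 3 channels. The sketch below also lists the four voltage levels under the assumption that the DAC spaces them evenly between 0 V and 0.7 V; the text gives only the end points, so the intermediate values are illustrative.

```python
from itertools import product


def dac_voltage(code: int, v_max: float = 0.7) -> float:
    """Analog level for a 2-bit code, assuming evenly spaced steps."""
    if not 0 <= code <= 3:
        raise ValueError("code must be a 2-bit value (0..3)")
    return v_max * code / 3


def all_colors() -> list:
    """Every (R, G, B) combination of 2-bit codes."""
    return list(product(range(4), repeat=3))


if __name__ == "__main__":
    print([round(dac_voltage(c), 3) for c in range(4)])
    print(len(all_colors()))
```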

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan-3 Generation FPGAs provide plentiful, efficient SelectRAM memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator system or via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device | RAM Columns | RAM Blocks per Column | Total RAM Blocks | Total RAM Bits | Total RAM Kbits

XC3S100E | 1 | 4 | 4 | 73,728 | 72

XC3S250E | 2 | 6 | 12 | 221,184 | 216

XC3S500E | 2 | 10 | 20 | 368,640 | 360

XC3S1200E | 2 | 14 | 28 | 516,096 | 504

XC3S1600E | 2 | 18 | 36 | 663,552 | 648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
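A quick way to see which devices can hold a frame buffer on chip is to compare the picture size with the data capacity of the block RAMs. The sketch below uses the block counts from Table 2 and the 16K data bits per block stated above; treating a 640x480 one-bit-per-pixel image as the requirement is an assumption made for illustration.

```python
DATA_BITS_PER_BLOCK = 16 * 1024   # 16K data bits per block RAM (parity excluded)

# Total RAM blocks per device, from Table 2
RAM_BLOCKS = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def data_bits(device: str) -> int:
    """Data bits available in all block RAMs of a device."""
    return RAM_BLOCKS[device] * DATA_BITS_PER_BLOCK


def devices_that_fit(required_bits: int) -> list:
    """Devices whose block RAM data capacity covers the requirement."""
    return [d for d in RAM_BLOCKS if data_bits(d) >= required_bits]


if __name__ == "__main__":
    frame_bits = 640 * 480 * 1    # assumed: 1 bit per pixel
    print(frame_bits, devices_that_fit(frame_bits))
```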


10. HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

Defines e.g. two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.

Defines the inputs/outputs of a component (defines pins).

A way to represent modularity in VHDL.

Similar to a symbol in a schematic.

An entity declaration describes an entity.

E.g.:

    entity Comparator is
      port (A, B : in  std_logic_vector(7 downto 0);
            EQ   : out std_logic);
    end Comparator;

Ports

Provide channels of communication between the component and its environment.

Each port must have a name, a direction and a type.

An entity may have NO port declaration.

Port directions

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from the port cannot be read. Multiple assignments are allowed.

In out: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using:

- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels:

- Gate level
- RTL (Register Transfer Level)
- Behavioral level

A configuration declaration links an architecture to an entity.

E.g.:

    architecture Comparator1 of Comparator is
    begin
      EQ <= '1' when (A = B) else '0';
    end Comparator1;

Configurations

Link an entity declaration and an architecture body together.

The concept of a default configuration is a bit messy in VHDL '87:

- The last architecture analyzed links to the entity.

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate-level architecture and a behavioral architecture.

Are always optional.

Packages

Packages contain information common to many design units.

1. Package declaration

- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body

- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data that can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain:

- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of:

- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

A collection of VHDL design units (a database):

1. Packages: package declaration, package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in a UNIX file system.

Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

Structural

Dataflow

Behavioral

Structural VHDL description

The circuit is described in terms of its components.

This ranges from a low-level description (e.g. a transistor-level description) to a high-level description (e.g. a block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:

- either a lower-level description
- or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers => you must write the behavioral description of the registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, only the order of sequential operations is defined.

At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

Process keyword in VHDL

A process statement is a concurrent statement.

Statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format:

    process [(sensitivity_list)]
      process_declarative_part
    begin
      process_statements
      [wait_statement]
    end process;

11. VGA CONTROLLER CODE

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;
    use IEEE.numeric_std.ALL;

    ---- Uncomment the following library declaration if instantiating
    ---- any Xilinx primitives in this code.
    --library UNISIM;
    --use UNISIM.VComponents.all;

    entity vga_controller is
      port (
        clk50m : in  std_logic;
        rstn   : in  std_logic;
        hsync  : out std_logic;
        vsync  : out std_logic;
        red    : out std_logic_vector(0 downto 0);
        green  : out std_logic_vector(0 downto 0);
        blue   : out std_logic_vector(0 downto 0));
    end vga_controller;

    architecture Behavioral of vga_controller is

      -- Picture memory: 18-bit address, 1 bit per pixel
      component memory
        port (
          addr  : in  std_logic_vector(17 downto 0);
          clk   : in  std_logic;
          dout  : out std_logic_vector(0 downto 0);
          en    : in  std_logic;
          sinit : in  std_logic);
      end component;

      component sync_unit
        port (
          clk        : in    std_logic;
          rstn       : in    std_logic;
          hsync      : out   std_logic;  -- horizontal sync pulse (row)
          vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
          v_en_reg   : out   std_logic;
          h_en       : out   std_logic;
          v_en       : out   std_logic;
          line_count : inout integer;
          c_hs       : inout integer);
      end component;

      component pixelgen
        port (
          clk        : in  std_logic;
          rstn       : in  std_logic;
          hsync      : in  std_logic;
          vsync      : in  std_logic;
          h_en       : in  std_logic;
          v_en       : in  std_logic;
          v_en_reg   : in  std_logic;
          data_in    : in  std_logic_vector(0 downto 0);
          line_count : in  integer;
          c_hs       : in  integer;
          red        : out std_logic_vector(0 downto 0);
          green      : out std_logic_vector(0 downto 0);
          blue       : out std_logic_vector(0 downto 0);
          addr       : out std_logic_vector(17 downto 0);
          rd         : out std_logic);
      end component;

      signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
      signal line_count_s, c_hs_s, addr : integer;
      signal data_s : std_logic_vector(0 downto 0);
      signal addr_s : std_logic_vector(17 downto 0);

    begin

      --addr_s <= std_logic_vector(to_unsigned(addr,18));

      -- Divide the input clock (clk50m) by two to obtain the pixel clock
      process(clk50m, rstn)
      begin
        if (rstn = '0') then
          clk <= '0';
        elsif (clk50m = '1' and clk50m'event) then
          clk <= not clk;
        end if;
      end process;

      hsync <= hsync_s;
      vsync <= vsync_s;

      sync : sync_unit port map(
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
      );

      pixel : pixelgen port map(
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
      );

      mem12 : memory port map(
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
      );

    end Behavioral;

SYNCHRONIZATION BLOCK CODE

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;

    entity sync_unit is
      port (
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : out std_logic;  -- horizontal sync pulse (row)
        vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
        v_en_reg   : out std_logic;
        h_en       : out std_logic;
        v_en       : out std_logic;
        line_count : out integer;
        c_hs       : out integer);
    end sync_unit;

    architecture arch_sync_unit of sync_unit is
      signal h_en_s, v_en_s : std_logic;
      signal line_count_s, c_hs_s : integer;
    begin

      c_hs       <= c_hs_s;
      line_count <= line_count_s;
      h_en       <= h_en_s;
      v_en       <= v_en_s;

      -- hsync counter: counts pixel clocks within a line (0 to 793)
      process(clk, rstn)
      begin
        if (rstn = '0') then
          c_hs_s <= 0;
        else
          if (clk = '1' and clk'event) then
            if (c_hs_s = 793) then
              c_hs_s <= 0;
            else
              c_hs_s <= c_hs_s + 1;
            end if;
          end if;
        end if;
      end process;

      -- vsync counter
      -- vertical line counter: advances on each rising edge of h_en_s (0 to 515)
      process(h_en_s, rstn)
      begin
        if (rstn = '0') then
          line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
          line_count_s <= line_count_s + 1;
          if (line_count_s = 515) then
            line_count_s <= 0;
          end if;
        end if;
      end process;

      -- hsync pulse: low while c_hs_s is 0 to 95
      process(clk, rstn)
      begin
        if (rstn = '0') then
          hsync <= '1';
        elsif (clk = '1' and clk'event) then
          if (c_hs_s <= 95) then
            hsync <= '0';
          else
            hsync <= '1';
          end if;
        end if;
      end process;

      -- Horizontal enable (row information is sent in this period)
      process(clk, rstn)
      begin
        if (rstn = '0') then
          h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
          if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
            h_en_s <= '1';
          else
            h_en_s <= '0';
          end if;
        end if;
      end process;

      -- Vertical enable (data is valid in this region)
      process(clk, rstn)
      begin
        if (rstn = '0') then
          v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
          if ((line_count_s >= 35 and line_count_s <= 515)) then
            v_en_s <= '1';
          else
            v_en_s <= '0';
          end if;
        end if;
      end process;

      -- Vertical sync, front porch, back porch
      process(clk, rstn)
      begin
        if (rstn = '0') then
          vsync <= '1';
        elsif (clk = '1' and clk'event) then
          if (line_count_s < 2) then
            vsync <= '0';
          else
            vsync <= '1';
          end if;
        end if;
      end process;

      -- One-clock delayed copy of v_en, used to detect the start of a frame
      process(clk)
      begin
        if (clk = '1' and clk'event) then
          v_en_reg <= v_en_s;
        end if;
      end process;

    end arch_sync_unit;
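The counter limits in the synchronization block fix the line and frame timing. The following Python sketch derives them, assuming that clk50m is a 50 MHz input (as its name suggests) so that the divided pixel clock is 25 MHz; the constant names are illustrative.

```python
CLK_HZ = 50e6 / 2          # assumed: 50 MHz input divided by two
CLOCKS_PER_LINE = 794      # c_hs_s counts 0 .. 793
LINES_PER_FRAME = 516      # line_count_s counts 0 .. 515
HSYNC_LOW_CLOCKS = 96      # hsync is low while c_hs_s <= 95
H_ENABLE_CLOCKS = 778 - 143 + 1   # comparison window for h_en_s

line_time_us = CLOCKS_PER_LINE / CLK_HZ * 1e6
hsync_low_us = HSYNC_LOW_CLOCKS / CLK_HZ * 1e6
refresh_hz = CLK_HZ / (CLOCKS_PER_LINE * LINES_PER_FRAME)

if __name__ == "__main__":
    print("line time (us):", line_time_us)
    print("hsync low (us):", hsync_low_us)
    print("refresh (Hz):", refresh_hz)
    print("h enable window (clocks):", H_ENABLE_CLOCKS)
```

Under these assumptions the refresh rate comes out slightly above 60 Hz and the horizontal sync pulse is close to the 3.77 µs given in the timing table.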


PIXEL GENERATOR

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;

    entity pixelgen is
      port (
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : in  std_logic;
        vsync      : in  std_logic;
        h_en       : in  std_logic;
        v_en       : in  std_logic;
        v_en_reg   : in  std_logic;
        data_in    : in  std_logic_vector(0 downto 0);
        line_count : in  integer;
        c_hs       : in  integer;
        red        : out std_logic_vector(0 downto 0);
        green      : out std_logic_vector(0 downto 0);
        blue       : out std_logic_vector(0 downto 0);
        addr       : out std_logic_vector(17 downto 0);
        rd         : out std_logic
      );
    end pixelgen;

    architecture behv of pixelgen is
      signal addr_s : std_logic_vector(17 downto 0);
    begin

      addr <= addr_s;

      process(clk, rstn)
      begin
        if (rstn = '0') then
          rd     <= '0';
          red    <= "0";
          green  <= "0";
          blue   <= "0";
          addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
          -- Rising edge of v_en: start of a new frame, restart addressing
          if ((v_en = '1') and (v_en_reg = '0')) then
            addr_s <= (others => '0');
          end if;
          if (v_en = '1' and h_en = '1') then
            if (line_count <= 514) then
              if (c_hs <= 655) then
                -- Inside the picture: read memory and drive all three colors
                rd     <= '1';
                red    <= data_in;
                green  <= data_in;
                blue   <= data_in;
                addr_s <= addr_s + "000000000000000001";
              else
                -- Right of the picture: green border
                rd    <= '0';
                red   <= "0";
                green <= "1";
                blue  <= "0";
              end if;
            else
              -- Below the picture: white
              rd    <= '0';
              red   <= "1";
              green <= "1";
              blue  <= "1";
            end if;
            -- Wrap the address after the last pixel of the picture
            if (addr_s = "111011111111111111") then
              addr_s <= (others => '0');
            end if;
          else
            -- Blanking: drive black
            red   <= "0";
            green <= "0";
            blue  <= "0";
          end if;
        end if;
      end process;

    end behv;
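The constants in the pixel generator hint at the size of the stored picture. The sketch below decodes the address wrap value and the horizontal window. It assumes that h_en is first seen high at c_hs = 144, one clock after the comparison at 143 in the synchronization block; that register delay is an interpretation of the code, not something the report states.

```python
WRAP_ADDRESS = int("111011111111111111", 2)   # last address before wrap-around
PIXELS_IN_PICTURE = WRAP_ADDRESS + 1

FIRST_VISIBLE_C_HS = 144   # assumed: h_en lags the c_hs >= 143 comparison by one clock
LAST_PICTURE_C_HS = 655    # from the c_hs <= 655 test
PICTURE_WIDTH = LAST_PICTURE_C_HS - FIRST_VISIBLE_C_HS + 1

if __name__ == "__main__":
    print("pixels in picture:", PIXELS_IN_PICTURE)
    print("picture width:", PICTURE_WIDTH)
    print("picture height:", PIXELS_IN_PICTURE // PICTURE_WIDTH)
```

Under that assumption the numbers are consistent with a 512 x 480 one-bit-per-pixel picture, which also fits the 18-bit address bus of the memory component.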


12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and in the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic (see "Soft processors" below).

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing the final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

• IC costs are rising aggressively
• ASIC complexity has increased development time and costs
• R&D resources and headcount are decreasing
• Revenue losses for slow time-to-market are increasing
• Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, which the company credits for the growing number of FPGA design starts.

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA considered here is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
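
The logic block described above can be modelled in a few lines of Python. This is only an illustrative sketch: the names `make_lut4` and `LogicBlock`, and the choice to store the truth table as a 16-bit integer, are assumptions made for this example and do not come from any vendor tool.

```python
def make_lut4(truth_table: int):
    """Build a 4-input LUT. Bit i of truth_table is the output for input value i."""
    def lut(a: int, b: int, c: int, d: int) -> int:
        index = (d << 3) | (c << 2) | (b << 1) | a  # pack the 4 inputs into 0..15
        return (truth_table >> index) & 1
    return lut


class LogicBlock:
    """A 4-input LUT followed by a flip-flop, with a selectable output."""

    def __init__(self, truth_table: int, registered: bool):
        self.lut = make_lut4(truth_table)
        self.registered = registered  # True: output comes from the flip-flop
        self.q = 0                    # flip-flop state

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Rising clock edge: the flip-flop captures the current LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """Single block output: registered or unregistered LUT value."""
        return self.q if self.registered else self.lut(a, b, c, d)


and4 = make_lut4(0x8000)   # only input 1111 produces 1
xor4 = make_lut4(0x6996)   # parity of the four inputs
```

Changing the 16-bit truth table is all it takes to make the same block compute a different function, which is the essence of LUT-based programmable logic.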

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.
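
The planar rule described above can be written down as a small Python sketch. The side names and the function `planar_targets` are hypothetical, chosen only to illustrate that a wire keeps its track number when it passes through the switch box.

```python
SIDES = ("north", "east", "south", "west")


def planar_targets(side: str, track: int) -> list[tuple[str, int]]:
    """Wires reachable from the wire entering on `side` in track `track`.

    Planar (domain-based) topology: three programmable switches, one towards
    each of the other three sides, always staying on the same track number.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    return [(other, track) for other in SIDES if other != side]


targets = planar_targets("west", 2)
```

Because the track number never changes, a signal that starts on track 2 stays on track 2 for its whole route, which is why this topology is also called domain-based.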

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

• SRAM: based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
• Antifuse: one-time programmable. CMOS.
• EPROM: Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
• EEPROM: Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
• Flash: Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
• Fuse: one-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying hardware designs alongside early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why prototyping is important:

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be uncovered early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Fig: VGA Controller (gate level)

Fig: Sync Unit (gate level)

Fig: Technology schematic

14 APPLICATIONS

• Motion tracking systems
• Image processing systems
• Display systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, a processor-generated (pixelated) picture is taken from memory space and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. Eduardo Sanchez, "A VGA display controller"

3. Hinner, "VGA Interface and video signal documents"

3 DESCRIPTION ARCHITECTURE

3.1 VGA display device

The VGA display device is used for displaying the images taken from the processor in an exact format. It can be used for displaying the output of any project. Analog graphs, etc. can be displayed directly on the VGA display monitor.

3.1 VGA (Video Graphics Array) Controller

The VGA controller controls the image so that it is displayed exactly on the display device. It acts as an interface between the processing system and the display device, and so is called an interface controller.

Image or video processing system -> VGA Controller -> Display

4 CONTROLLER DESIGN

To design the controller, we should first know how the display device works. Consider the image below.

Original image (640x480)

Pixels of image

The display device also has pixels. 640x480 is a standard display resolution.

Pixel

This example shows an image with a portion greatly enlarged, in which the individual pixels are rendered as little squares and can easily be seen.

4.2 A photograph of sub-pixel display elements on a laptop's LCD screen

In digital imaging, a pixel (or picture element [1]) is a single point in a raster image. The pixel is the smallest addressable screen element; it is the smallest unit of picture which can be controlled. Each pixel has its own address. The address of a pixel corresponds to its coordinate. Pixels are normally arranged in a 2-dimensional grid, and are often represented using dots or squares. Each pixel is a sample of an original image, where more samples typically provide more accurate representations of the original. The intensity of each pixel is variable. In color image systems, a color is typically represented by three or four component intensities, such as red, green and blue, or cyan, magenta, yellow and black.

Color refreshing rate

Color refreshing rate = 60 Hz

The color refreshing rate (refresh rate) represents the number of frames that are transmitted per second. The VGA controller must transmit a minimum of 60 frames per second.

5 Hardware Description

5.1 Block Diagram

5.1 VGA Synchronization Unit

The Synchronization Unit provides the horizontal and vertical synchronization signals, HSYNC and VSYNC, that are required to correctly display a picture frame within the confines of a monitor's display area.

These synchronization signals are used as control inputs by the monitor's horizontal and vertical deflection circuits. These circuits deflect the electrons emitted by the three primary color electron guns (Red, Green, Blue) left to right and from top to bottom, respectively. HSYNC provides the start and stop times for the horizontal deflection circuit, so that a line of pixels is correctly drawn across the screen display. VSYNC provides the start and stop times for the vertical deflection circuit, so that the lines of a frame are correctly drawn from the top to the bottom of the screen display.

The resolution for the display is defined by the level on the RESOLUTION input. If High, the 640x480 resolution (VGA) is used. If Low, the 800x600 resolution (SVGA) is used.

Although the resolution determines the area of a monitor's screen within which an image can be displayed, the full extents of the chosen resolution do not have to be used. The actual extents of the image display area on the screen can be controlled by the use of the DISPSIZE_H and DISPSIZE_V inputs. These inputs determine the total number of pixels to be used in a line and the total number of lines in a frame, respectively.

5.2 Blank pixel generation

The total number of viewable pixels in a line and viewable lines in a frame is determined by the display resolution chosen through the RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the chosen display resolution is used or not, the areas of the monitor screen to the top, bottom, left and right of the viewable frame area are blanked by putting black pixels at the required line-pixel positions. This has the effect of centering the image on the screen.

The color generated for a pixel in the Pixel Unit depends on whether the particular pixel needs to be blanked or not. The Synchronization Unit provides a signal to the Pixel Unit for this very reason: the line display enable signal, en. The signal is checked on each rising edge of the external clock signal (CLK) and is set as follows:

If (HCOUNT ≥ PixTotal) or (VCOUNT ≥ LinTotal) then

    en = 0 (pixel requires to be blanked; set color to black)

Else

    en = 1 (pixel is a viewable pixel; generate RGB color accordingly)
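
The blanking rule above translates directly into code. The following Python sketch is illustrative only; the function names `display_enable` and `blanked_rgb` are assumptions for this example and are not signal names from the Controller.

```python
def display_enable(hcount: int, vcount: int, pix_total: int, lin_total: int) -> int:
    """Return en: 0 when the pixel must be blanked, 1 when it is viewable."""
    if hcount >= pix_total or vcount >= lin_total:
        return 0
    return 1


def blanked_rgb(rgb: int, en: int) -> int:
    """Pass the 6-bit RGB value through when en = 1, force black (0) otherwise."""
    return rgb & 0x3F if en else 0


# Example with the full 640x480 area in use (PixTotal = 640, LinTotal = 480).
en_inside = display_enable(639, 479, 640, 480)
en_right_of_frame = display_enable(640, 100, 640, 480)
en_below_frame = display_enable(10, 480, 640, 480)
```

The last viewable pixel of the frame (639, 479) is still enabled, while the first count beyond either limit is blanked.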

6 SIGNALS IN VGA CONTROLLER

For the most common VGA mode, 640x480 at 60 Hz non-interlaced, the horizontal timings are:

Parameter                  Value      Unit
Clock frequency            25.175     MHz
Horizontal pixels          640
Horizontal sync polarity   Negative
Total time for each line   31.77      µs
Front porch (A)            0.94       µs
Sync pulse length (B)      3.77       µs
Back porch (C)             1.89       µs
Active video (D)           25.17      µs

(Total horizontal sync time: 6.60 µs)

6.1 The vertical timings are:

Parameter                  Value      Unit
Vertical lines             480
Vertical sync polarity     Negative
Vertical frequency         59.94      Hz
Front porch (E)            0.35       ms
Sync pulse length (F)      0.06       ms
Back porch (G)             1.02       ms
Active video (H)           15.25      ms

(Total vertical sync time: 1.43 ms)

640x400 at 70 Hz is the video mode used for booting most x86 personal computers.

640x480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

It should be noted that the actual timings vary slightly. For example, for 640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of 25.174 MHz gives 633 pixels rather than the expected 640 pixels.
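
The timing figures quoted above can be cross-checked with a few lines of Python. This is a sketch for checking the arithmetic only; the dictionary and function names are made up for the example.

```python
# Horizontal segments in microseconds, vertical segments in milliseconds,
# copied from the timing figures for 640x480 at 60 Hz.
H_US = {"front_porch": 0.94, "sync": 3.77, "back_porch": 1.89, "active": 25.17}
V_MS = {"front_porch": 0.35, "sync": 0.06, "back_porch": 1.02, "active": 15.25}

line_us = sum(H_US.values())            # total time for each line
h_blank_us = line_us - H_US["active"]   # total horizontal sync (blanking) time
frame_ms = sum(V_MS.values())           # total time for each frame
v_blank_ms = frame_ms - V_MS["active"]  # total vertical sync (blanking) time
refresh_hz = 1000.0 / frame_ms          # frames per second


def whole_pixels(time_us: float, clock_mhz: float) -> int:
    """Whole pixel periods that fit in time_us (microseconds x MHz = cycles)."""
    return int(time_us * clock_mhz)


active_pixels = whole_pixels(H_US["active"], 25.174)
```

The segments add up to the quoted 31.77 µs line time and 6.60 µs / 1.43 ms blanking totals, the frame period of 16.68 ms corresponds to a rate just under 60 Hz, and the rounded 25.17 µs active time yields 633 whole pixel periods, which is the discrepancy noted above.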

6.2 Generation of the horizontal synchronization signal (HSYNC)

The HSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated on each rising edge of the external clock signal (CLK).

The state of the HSYNC signal depends on the value stored in the horizontal counter, and is driven low when:

HCOUNT ≥ (PixTotal + BlankingLeft)

and remains low while:

HCOUNT < (PixTotal + BlankingLeft + TLSync)

Vertical (Frame) Period

6.3 Generation of the vertical synchronization signal (VSYNC)

The VSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated after every line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

The state of the VSYNC signal depends on the value stored in the vertical counter, and is driven low when:

VCOUNT ≥ (LinTotal + BlankingLeft)

and remains low while:

VCOUNT < (LinTotal + BlankingLeft + TFSync)
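
Both sync signals follow the same window comparison, so one helper can model either of them. The sketch below is a minimal Python illustration; `sync_level` is a hypothetical name, and the numbers used in the example (a blanking offset of 16 and a sync length of 96 pixel clocks) are assumed typical values for 640x480, not values taken from this report.

```python
def sync_level(count: int, total: int, blanking: int, sync_len: int) -> int:
    """Active-low sync output for a counter value.

    Low (0) while total + blanking <= count < total + blanking + sync_len,
    High (1, inactive) everywhere else.
    """
    start = total + blanking
    return 0 if start <= count < start + sync_len else 1


# HSYNC example: PixTotal = 640, BlankingLeft = 16 (assumed), TLSync = 96 (assumed).
hsync_trace = {h: sync_level(h, 640, 16, 96) for h in (0, 655, 656, 751, 752)}
```

For VSYNC the same function would be called with VCOUNT, LinTotal and TFSync, and evaluated once per line instead of once per pixel clock.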

Address Counter (MEMADDR)

The address counter is used to store the position of the next consecutive pixel in the frame. Its value is passed to the Pixel Unit on the internal bus signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain the next pixel from picture memory.

The counter is reset to zero when the VGA Controller receives an external reset signal (RST). The size of the counter depends on the values chosen for DISPSIZE_H and DISPSIZE_V, as the range is simply:

0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199
• 800x600: 0 to 479999

The counter has 19-bit resolution.
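
The 19-bit figure follows from the largest value the counter has to hold. A short Python check, with `counter_bits` as an illustrative helper name:

```python
def counter_bits(pix_total: int, lin_total: int) -> int:
    """Bits needed to count from 0 to (pix_total x lin_total) - 1."""
    return (pix_total * lin_total - 1).bit_length()


bits_vga = counter_bits(640, 480)    # maximum count 307199
bits_svga = counter_bits(800, 600)   # maximum count 479999
```

Both maxima lie between 2^18 = 262144 and 2^19 = 524288, so 19 bits cover either resolution.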

While the value in the horizontal counter (HCOUNT) is less than the total number of viewable pixels in a line (PixTotal, the integer value of DISPSIZE_H), the counter is incremented on the rising edge of the external clock signal (CLK). Pixel addressing within the frame is consecutive. When the counter reaches the last pixel in a line, its incremented value is the first pixel in the next line down.

The address counter will continue to be incremented until the value in the vertical counter (VCOUNT) is greater than or equal to the total number of viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this point, it will be rolled over to zero.

7 VGA Controller

The VGA Controller provides a simple 8-bit interface between a host microcontroller and any VGA-compatible monitor. This variant of the Controller provides six modes of display, depending on the resolution chosen (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black & White, 16 Colors or 64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum 60 Hz refresh rate.

   1 sec = 60 frames
   1 frame = max 1/60 sec
   1 pixel = (1/60 sec) / 307200 ≈ 54 ns (counting only the 640 x 480 viewable pixels)
   Max time for 1 pixel = 40 ns (chosen below this bound to leave time for the blanking intervals)

2) To send information for each and every pixel of a frame by means of the RGB signals.

3) After one complete row has been displayed, the column must be reset and the row incremented; this is done by the Hsync signal.

4) After one complete frame has been displayed, a new frame has to be started; for this, the row and column are both reset, which is done by the Vsync signal.

To give the information for one pixel we use one clock cycle (the clock controls all operations):

   1 clock = 1 pixel
   1 clock cycle period = 40 ns
   Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display one pixel on each and every clock cycle, we can attain the minimum 60 Hz refresh rate. Do not use a clock frequency of less than 25 MHz.

Ex:

Clock frequency      15 MHz       25 MHz                            30 MHz
Time period          More         Effectively utilized frequency    Less
No. of frames/sec    Less         -                                 More
Frame rate           Decreases    -                                 Increases

(At 15 MHz it is unable to display 60 frames/sec.)
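
The clock arithmetic in this section can be reproduced with a short Python sketch. One value here is an assumption that does not appear in this report: a total of 800 clocks per line and 525 lines per frame (viewable area plus blanking), which is the usual figure for 640x480 at 60 Hz. The function names are illustrative only.

```python
VISIBLE_PIXELS = 640 * 480      # 307200 viewable pixels per frame
CLOCKS_PER_FRAME = 800 * 525    # assumed total including blanking (not from this report)


def max_pixel_time_ns(refresh_hz: float, pixels: int) -> float:
    """Longest time per pixel (ns) that still fits `pixels` into one frame period."""
    return 1e9 / (refresh_hz * pixels)


def clock_mhz(period_ns: float) -> float:
    """Clock frequency in MHz for a clock period given in ns."""
    return 1e3 / period_ns


def frames_per_second(clock_hz: float, clocks_per_frame: int = CLOCKS_PER_FRAME) -> float:
    """Frame rate when one pixel slot is emitted on every clock cycle."""
    return clock_hz / clocks_per_frame


visible_bound_ns = max_pixel_time_ns(60, VISIBLE_PIXELS)              # about 54 ns
rates = {mhz: frames_per_second(mhz * 1e6) for mhz in (15, 25, 30)}   # frames/sec
```

Under this assumption a 15 MHz clock gives roughly 36 frames/sec, 25 MHz gives roughly 60, and 30 MHz roughly 71, which matches the trend in the comparison above: lowering the clock lowers the frame rate below the 60 Hz target.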

Signals used for image display:

• R (Red)
• G (Green)
• B (Blue)
• HSYNC (horizontal synchronization signal)
• VSYNC (vertical synchronization signal)

All the above signals are used to control the image data so that it is correctly displayed on the VGA display monitor.

7.2 Timings for front porch, sync pulse and back porch

               Horizontal sync    Vertical sync
Front porch    0.94 µs            0.35 ms
Sync pulse     3.77 µs            0.063 ms
Back porch     1.89 µs            1.02 ms

8 FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below.

Table: VGA Pin description

Name | Type | Polarity | Bus size | Description

Global Control Signals

CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows. 640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz. 800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz.

RST | I | High | - | Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used. 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD | I | - | 2 | This input selects the color palette to be used. 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal en generated by the Synchronization Unit of the Controller.

DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows. Black & White mode: 8 1-bit pixels. 16 Colors mode: 2 4-bit pixels. 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: in 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode, one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode, one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode, one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V) and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
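
As a worked example of the memory formula, the Python sketch below evaluates it for each color palette. The mode keys and the name `picture_memory_bytes` are illustrative choices for this example; rounding up to a whole byte is also an assumption, which only matters if the pixel count is not a multiple of the pixels per byte.

```python
PIXELS_PER_BYTE = {"bw": 8, "16colors": 2, "64colors": 1}


def picture_memory_bytes(pix_total: int, lin_total: int, mode: str) -> int:
    """Bytes of picture memory for pix_total x lin_total pixels in `mode`."""
    per_byte = PIXELS_PER_BYTE[mode]
    pixels = pix_total * lin_total
    return (pixels + per_byte - 1) // per_byte   # round up to a whole byte


vga_sizes = {m: picture_memory_bytes(640, 480, m) for m in PIXELS_PER_BYTE}
svga_sizes = {m: picture_memory_bytes(800, 600, m) for m in PIXELS_PER_BYTE}
```

A full 640x480 frame therefore needs 38400 bytes in Black & White mode, 153600 bytes in 16 Colors mode and 307200 bytes in 64 Colors mode.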

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt and right shifting the contents by three:

ADDR_PIXEL = 00 & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt and right shifting the contents by one:

ADDR_PIXEL = 0 & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
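
Dropping the low bits of hvcnt and padding with leading zeros is the same as an integer right shift, which the following Python sketch illustrates. The mode keys and the function name `addr_pixel` are hypothetical and used only for this example.

```python
ADDRESS_SHIFT = {"bw": 3, "16colors": 1, "64colors": 0}
HVCNT_MASK = (1 << 19) - 1   # hvcnt is a 19-bit value


def addr_pixel(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory for pixel number hvcnt in the given mode."""
    return (hvcnt & HVCNT_MASK) >> ADDRESS_SHIFT[mode]


# Pixels 0..7 share byte 0 in Black & White mode; pixel 8 starts byte 1.
bw_addresses = [addr_pixel(n, "bw") for n in (0, 7, 8, 307199)]
```

The shift amounts are simply log2 of the number of pixels per byte (8, 2 and 1), so the last pixel of a 640x480 frame lands in byte 38399 in Black & White mode.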

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then

    enex = 0

Else

    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal (received from the Synchronization Unit) are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal (received from the Synchronization Unit) is a 0.

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
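The load and shift behavior described for the three modes can be modelled as one clocked process. The following is a sketch under stated assumptions: the entity and port names and the CMOD encoding are hypothetical, the reset is modelled as asynchronous (the text does not say which kind it is), and the gating by RD is left out for brevity.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch of PIXREG: names and the CMOD encoding are assumptions.
entity pixreg_sketch is
  port (
    clk      : in  std_logic;
    rst      : in  std_logic;                     -- active-high reset (RST)
    cmod     : in  std_logic_vector(1 downto 0);  -- assumed: "00" B&W, "01" 16 Colors, others 64 Colors
    hvcnt_lo : in  std_logic_vector(2 downto 0);  -- three lowest bits of hvcnt
    data     : in  std_logic_vector(7 downto 0);  -- byte read from picture memory
    pixreg   : out std_logic_vector(7 downto 0)   -- active pixel sits in the low-order bits
  );
end entity pixreg_sketch;

architecture rtl of pixreg_sketch is
  signal reg : std_logic_vector(7 downto 0);
begin
  pixreg <= reg;

  process (clk, rst)
  begin
    if rst = '1' then
      reg <= (others => '0');
    elsif rising_edge(clk) then
      case cmod is
        when "00" =>                          -- Black & White: 8 pixels per byte
          if hvcnt_lo = "000" then
            reg <= data;                      -- load the next byte
          else
            reg <= '0' & reg(7 downto 1);     -- shift right by one bit
          end if;
        when "01" =>                          -- 16 Colors: 2 pixels per byte
          if hvcnt_lo(0) = '0' then
            reg <= data;                      -- load the next byte
          else
            reg <= "0000" & reg(7 downto 4);  -- shift right by one nibble
          end if;
        when others =>                        -- 64 Colors: 1 pixel per byte
          reg <= data;                        -- load on every clock
      end case;
    end if;
  end process;
end architecture rtl;
```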

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.


The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the Nano Board itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
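The figure of 64 colors follows directly from the bit widths. The intermediate voltages shown below additionally assume that the converter spaces its four levels evenly between 0 V and 0.7 V, which the text does not state explicitly.

```latex
\[
\underbrace{2^{2}}_{\text{levels per color}} = 4,
\qquad
4 \times 4 \times 4 = 4^{3} = 2^{6} = 64 \text{ colors}
\]
\[
V_k = \frac{k}{3} \times 0.7\ \text{V}, \quad k = 0, 1, 2, 3
\;\;\Rightarrow\;\;
V_k \approx 0\ \text{V},\ 0.23\ \text{V},\ 0.47\ \text{V},\ 0.7\ \text{V}
\]
```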

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large, on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
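A rough sizing check shows how these figures constrain the choice of device. The pixel generator listing later in this report uses an 18-bit address into a 1-bit-wide memory and wraps the address at binary 111011111111111111 (decimal 245,759), so the picture occupies 245,760 bits. Assuming only the 16K data bits of each block are used and ignoring any overhead added by the memory generator, the number of blocks needed is approximately:

```latex
\[
N_{\text{blocks}} = \left\lceil \frac{245\,760 \text{ bits}}{16\,384 \text{ bits per block}} \right\rceil = 15
\]
```

By the block counts listed for the Spartan-3E family, such a picture would not fit in an XC3S250E (12 blocks) but would fit in an XC3S500E (20 blocks) or larger.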


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987: standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

It defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.
Defines the inputs/outputs of a component (defines pins).
A way to represent modularity in VHDL.
Similar to a symbol in a schematic.
The entity declaration describes the entity.

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.
Each port must have a name, a direction and a type.
An entity may have NO port declaration.

Port directions

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

In out: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).
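Because an out port cannot be read back under these rules, a common way to avoid buffer ports is to keep the value in an internal signal and drive the port from it. The counter below is a minimal sketch of that idiom; the entity and signal names are hypothetical. (The VHDL-2008 revision relaxes this restriction, but the idiom works in every revision.)

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: a 4-bit counter with a plain "out" port.
entity count4_sketch is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;                      -- synchronous, active-high
    count : out std_logic_vector(3 downto 0)
  );
end entity count4_sketch;

architecture rtl of count4_sketch is
  -- Internal copy that can be both read and assigned.
  signal count_i : unsigned(3 downto 0) := (others => '0');
begin
  -- The out port is only ever assigned, never read.
  count <= std_logic_vector(count_i);

  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        count_i <= (others => '0');
      else
        count_i <= count_i + 1;   -- reads the internal signal, not the port
      end if;
    end if;
  end process;
end architecture rtl;
```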

Architectures

Every entity has at least one architecture.
One entity can have several architectures.
Architectures can describe a design using:
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and architecture body together.
The concept of a default configuration is a bit messy in VHDL '87:
- the last architecture analyzed links to the entity.
Can be used to change simulation behavior without re-analyzing the VHDL source.
Complex configuration declarations are ignored in synthesis.
Some entities can have, e.g., a gate-level architecture and a behavioral architecture.
Configurations are always optional.
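To make the idea concrete, the sketch below gives the Comparator entity two architectures and uses a configuration declaration to choose between them for one instance. Comparator2, the test bench Comparator_tb, the instance label uut and the configuration name Comparator_cfg are all hypothetical names introduced for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

architecture Comparator1 of Comparator is   -- dataflow style
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

architecture Comparator2 of Comparator is   -- hypothetical behavioral alternative
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture Comparator2;

library ieee;
use ieee.std_logic_1164.all;

entity Comparator_tb is                      -- hypothetical test bench, no ports
end entity Comparator_tb;

architecture bench of Comparator_tb is
  component Comparator
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
  end component;
  signal a_s, b_s : std_logic_vector(7 downto 0) := (others => '0');
  signal eq_s     : std_logic;
begin
  uut : Comparator port map (A => a_s, B => b_s, EQ => eq_s);
end architecture bench;

-- Selects which architecture the instance "uut" is bound to.
configuration Comparator_cfg of Comparator_tb is
  for bench
    for uut : Comparator
      use entity work.Comparator(Comparator2);
    end for;
  end for;
end configuration Comparator_cfg;
```

Changing Comparator2 to Comparator1 in the configuration switches the simulated implementation; only the configuration unit needs to be analyzed again.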

Packages

Packages contain information common to many design units.

1 Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2 Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

The package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

The package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
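A small package showing several of the items listed above might look as follows. The package name, the constant values and the function are hypothetical and chosen only to illustrate the split between declaration and body, including a deferred constant.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the publicly visible interface.
package vga_types_pkg is                          -- hypothetical package name
  constant H_PIXELS : natural := 640;             -- assumed example value
  subtype  pixel_t  is std_logic_vector(5 downto 0);
  constant BLACK    : pixel_t;                    -- deferred constant, value given in the body
  function is_black (p : pixel_t) return boolean;
end package vga_types_pkg;

-- Package body: implementation details hidden from users of the package.
package body vga_types_pkg is
  constant BLACK : pixel_t := (others => '0');

  function is_black (p : pixel_t) return boolean is
  begin
    return p = BLACK;
  end function is_black;
end package body vga_types_pkg;
```

A design unit would make these names visible with "use work.vga_types_pkg.all;", assuming the package was compiled into the working library.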

Libraries

A collection of VHDL design units (a database):

1 Packages
- package declaration
- package body

2 Entities (entity declaration)

3 Architectures (architecture body)

4 Configurations (configuration declarations)

Usually a directory in a UNIX file system.
Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:
- Structural
- Dataflow
- Behavioral

Structural VHDL description

The circuit is described in terms of its components.

The description can range from a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no 3rd-party models for registers => you must write the behavioral description of the registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).
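Since the language has no built-in register, a register is usually written as a short clocked process. The following is one possible behavioral description of an 8-bit register; the entity name, the width and the asynchronous active-high reset are illustrative assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register described behaviorally.
entity reg8_sketch is
  port (
    clk : in  std_logic;
    rst : in  std_logic;                      -- asynchronous, active-high (assumption)
    d   : in  std_logic_vector(7 downto 0);
    q   : out std_logic_vector(7 downto 0)
  );
end entity reg8_sketch;

architecture behv of reg8_sketch is
begin
  process (clk, rst)
  begin
    if rst = '1' then
      q <= (others => '0');      -- clear immediately on reset
    elsif rising_edge(clk) then
      q <= d;                    -- capture the input on each rising clock edge
    end if;
  end process;
end architecture behv;
```

Synthesis tools generally recognize this template and infer flip-flops from it.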

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, only the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

The process keyword in VHDL:

A process statement is a concurrent statement.

Statements inside process statements are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
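The two permitted forms, sensitivity list or wait statement, can be compared side by side. The entity below is a hypothetical two-input AND gate written both ways; in simulation the two processes behave the same.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: the same logic written in both process forms.
entity and_gate_sketch is
  port (
    a, b   : in  std_logic;
    y1, y2 : out std_logic
  );
end entity and_gate_sketch;

architecture behv of and_gate_sketch is
begin
  -- Form 1: sensitivity list, no wait statement.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, explicit wait at the end.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;   -- suspend until a or b changes
  end process with_wait;
end architecture behv;
```

A sensitivity list is equivalent to a single "wait on" statement placed as the last statement of the process, which is why the wait in the second form sits at the end.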


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- The top level only wires the blocks together, so the arithmetic
-- packages listed in the original file are not needed here.

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;                      -- 50 MHz board clock
    rstn   : in  std_logic;                      -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  -- Picture memory: 18-bit address, one bit per pixel
  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic);
  END component;

  -- Port modes below match the sync_unit entity declaration
  component sync_unit
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;   -- horizontal sync pulse (row)
      vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;
      c_hs       : out integer);
  end component;

  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  -- Divide the 50 MHz input clock by two to obtain the 25 MHz pixel clock
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  -- Note: sinit is tied to the active-low reset as in the original listing;
  -- this presumes the memory core treats sinit as active-low.
  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;   -- horizontal sync pulse (row)
    vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- Horizontal counter: counts pixel clocks 0..793 within one line
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    elsif (clk = '1' and clk'event) then
      if (c_hs_s = 793) then
        c_hs_s <= 0;
      else
        c_hs_s <= c_hs_s + 1;
      end if;
    end if;
  end process;

  -- Vertical line counter: advances on each rising edge of h_en_s, range 0..515
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: low while the horizontal counter is in 0..95
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information is sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse: low during the first two lines of the frame
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en delayed by one clock, used to detect the start of the visible region
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
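The counter limits in this listing fix the line and frame rates. Taking the pixel clock as 25 MHz (the 50 MHz input divided by two in the top-level code), a horizontal counter running 0 to 793 and a line counter running 0 to 515, the timing works out approximately as follows. This is a back-of-the-envelope check, not a measured result.

```latex
\[
T_{\text{line}} = \frac{794}{25\ \text{MHz}} = 31.76\ \mu\text{s}
\quad\Rightarrow\quad
f_{\text{line}} \approx 31.49\ \text{kHz}
\]
\[
T_{\text{frame}} = 516 \times 31.76\ \mu\text{s} \approx 16.39\ \text{ms}
\quad\Rightarrow\quad
f_{\text{frame}} \approx 61.0\ \text{Hz}
\]
\[
T_{\text{hsync}} = \frac{96}{25\ \text{MHz}} = 3.84\ \mu\text{s}
\]
```

For comparison, the standard 640 x 480 at 60 Hz mode uses 800 pixel clocks per line and 525 lines per frame with a 25.175 MHz pixel clock, so the values here are close to, but not identical with, the standard timing.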


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Restart at address 0 when the vertical enable has just gone high
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: read memory and show the pixel on all three guns
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Right of the picture: green fill
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the picture: white fill
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address at the end of the picture memory
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: drive black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic (see Soft processors below).

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at "run time", and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have been historically used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities and speed increased, they began to take over larger and larger functions, to the state where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attack of cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
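The classic logic block just described can be modelled in a few lines. This is a simplified sketch, not any vendor's primitive: the entity name, the INIT generic holding the truth table and the REGISTERED generic selecting the output are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a logic block: 4-input LUT followed by a flip-flop.
entity logic_block_sketch is
  generic (
    INIT       : std_logic_vector(15 downto 0) := x"8000";  -- truth table; x"8000" is a 4-input AND
    REGISTERED : boolean := true                             -- choose registered or unregistered output
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);   -- the four LUT inputs
    o   : out std_logic                       -- the single block output
  );
end entity logic_block_sketch;

architecture rtl of logic_block_sketch is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16 x 1 memory addressed by the four inputs.
  lut_out <= INIT(to_integer(unsigned(i)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output multiplexer: registered or unregistered LUT output.
  o <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

With the default INIT value only bit 15 of the table is set, so the LUT output is '1' only when all four inputs are '1'. Any other 4-input function is obtained by changing the 16-bit table.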

For this example architecture the locations of the FPGA logic block pins are

shown below

Logic Block Pin Locations

Each input is accessible from one side of the logic block while the output pin

can connect to routing wires in both the channel to the right and the channel

below the logic block

Each logic block output pin can connect to any of the wiring segments in the

channels adjacent to it

Similarly an IO pad can connect to any one of the wiring segments in the

channel adjacent to it For example an IO pad at the top of the chip can

connect to any of the W wires (where W is the channel width) in the horizontal

channel immediately below it

Generally the FPGA routing is unsegmented That is each wiring segment

spans only one logic block before it terminates in a switch box By turning on

60

some of the programmable switches within a switch box longer paths can be

constructed For higher speed interconnect some FPGA architectures use

longer routing lines that span multiple logic blocks

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology
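The planar topology rule can be expressed as a small function. This is an illustrative sketch: the side names and the function `planar_connections` are hypothetical, chosen only to show that a wire keeps its track number and reaches exactly three other wires.

```python
SIDES = ("north", "east", "south", "west")


def planar_connections(side, track):
    """Wires reachable from a wire entering a planar switch box.

    A wire on a given track can only connect to the same track number
    on each of the other three sides of the switch box.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    return [(other, track) for other in SIDES if other != side]
```

For example, a wire entering from the west on track 2 can reach track 2 on the north, east and south sides, and nothing else.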

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why prototyping is important

1 Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2 Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear because of today's array of operating systems, applications and hardware, to be found early.

3 Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4 Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL

Fig: VGA CONTROLLER

GATE LEVEL

Fig: SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the VGA Controller takes a processor-generated picture (pixelated) from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

REFERENCES

1 http://en.wikipedia.org/wiki/Video_Graphics_Array

2 Eduardo Sanchez, "A VGA display controller"

3 Hinner, "VGA Interface and video signal documents"

Page 5: vga

4 CONTROLLER DESIGN

To design the controller, we should first know how the display device works. Consider the image below.

640x480

Original image

Pixels of image

A display device also has pixels. The display resolution (640x480) is a standard display resolution.

Pixel

This example shows an image with a portion greatly enlarged, in which the individual pixels are rendered as little squares and can easily be seen.

4.2 A photograph of sub-pixel display elements on a laptop's LCD screen

In digital imaging, a pixel (or picture element [1]) is a single point in a raster image. The pixel is the smallest addressable screen element; it is the smallest unit of a picture that can be controlled. Each pixel has its own address. The address of a pixel corresponds to its coordinates. Pixels are normally arranged in a 2-dimensional grid, and are often represented using dots or squares. Each pixel is a sample of an original image, where more samples typically provide more accurate representations of the original. The intensity of each pixel is variable. In color image systems, a color is typically represented by three or four component intensities, such as red, green and blue, or cyan, magenta, yellow and black.

Color refreshing rate

Color refreshing rate = 60 Hz

The color refreshing rate represents the number of frames that are transmitted per second.

The VGA controller must transmit a minimum of 60 frames per second.

5 Hardware Description

5.1 Block Diagram

5.1 VGA Synchronization Unit

The Synchronization Unit provides the horizontal and vertical synchronization signals, HSYNC and VSYNC, that are required to correctly display a picture frame within the confines of a monitor's display area.

These synchronization signals are used as control inputs by the monitor's horizontal and vertical deflection circuits. These circuits deflect the electrons emitted by the three primary color electron guns (Red, Green, Blue) left to right and from top to bottom, respectively. HSYNC provides the start and stop times for the horizontal deflection circuit, so that a line of pixels is correctly drawn across the screen display. VSYNC provides the start and stop times for the vertical deflection circuit, so that the lines of a frame are correctly drawn from the top to the bottom of the screen display.

The resolution for the display is defined by the level on the RESOLUTION input. If High, the 640x480 resolution (VGA) is used. If Low, the 800x600 resolution (SVGA) is used.

Although the resolution determines the area of a monitor's screen within which an image can be displayed, the full extents of the chosen resolution do not have to be used. The actual extents of the image display area on the screen can be controlled by the use of the DISPSIZE_H and DISPSIZE_V inputs. These inputs determine the total number of pixels to be used in a line and the total number of lines in a frame, respectively.

5.2 Blank pixel generation

The total number of viewable pixels in a line and viewable lines in a frame is determined by the display resolution chosen through the RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the chosen display resolution is used or not, the areas of the monitor screen to the top, bottom, left and right of the viewable frame area are blanked by putting black pixels at the required line-pixel positions. This has the effect of centering the image on the screen.

The color generated for a pixel in the Pixel Unit depends on whether the particular pixel requires to be blanked or not. The Synchronization Unit provides a signal to the Pixel Unit for this very reason. This is the line display enable signal, en. The signal is checked on each rising edge of the external clock signal (CLK) and is set as follows:

If (HCOUNT ≥ PixTotal) or (VCOUNT ≥ LinTotal) then
    en = 0 (pixel requires to be blanked - set color to be black)
Else
    en = 1 (pixel is a viewable pixel - generate RGB color accordingly)
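The blanking rule above translates directly into a small function. This is a behavioral sketch only; the function name `display_enable` and its argument names are chosen here for illustration and simply mirror HCOUNT, VCOUNT, PixTotal and LinTotal.

```python
def display_enable(hcount, vcount, pix_total, lin_total):
    """Line display enable signal 'en'.

    Returns 0 when the pixel must be blanked (black) and 1 when the
    pixel lies inside the viewable area.
    """
    if hcount >= pix_total or vcount >= lin_total:
        return 0
    return 1
```

For a full 640x480 frame, the last viewable pixel is at HCOUNT = 639, VCOUNT = 479; one count further in either direction and the pixel is blanked.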

6 SIGNALS IN VGA CONTROLLER

For the most common VGA mode, 640x480 at 60 Hz non-interlaced, the horizontal timings are:

Parameter                  Value      Unit
Clock frequency            25.175     MHz
Horizontal pixels          640
Horizontal sync polarity   Negative
Total time for each line   31.77      µs
Front porch (A)            0.94       µs
Sync pulse length (B)      3.77       µs
Back porch (C)             1.89       µs
Active video (D)           25.17      µs

(Total horizontal sync time: 6.60 µs)

6.1 The vertical timings are:

Parameter                  Value      Unit
Vertical lines             480
Vertical sync polarity     Negative
Vertical frequency         59.94      Hz
Front porch (E)            0.35       ms
Sync pulse length (F)      0.06       ms
Back porch (G)             1.02       ms
Active video (H)           15.25      ms

(Total vertical sync time: 1.43 ms)

640 x 400 at 70 Hz is the video mode used for booting most x86 personal computers.

640 x 480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

It should be noted that the actual timings vary slightly. For example, for 640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of 25.174 MHz gives 633 pixels rather than the expected 640 pixels.
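The figures in the two timing tables can be cross-checked with a little arithmetic. The sketch below uses only the table values; the variable names are illustrative, and small rounding differences against the quoted 59.94 Hz are expected because the table entries are themselves rounded.

```python
# Horizontal timings in microseconds and vertical timings in milliseconds,
# taken from the tables above.
H_US = {"front_porch": 0.94, "sync": 3.77, "back_porch": 1.89, "active": 25.17}
V_MS = {"front_porch": 0.35, "sync": 0.06, "back_porch": 1.02, "active": 15.25}
PIXEL_CLOCK_MHZ = 25.175

# A + B + C + D gives the total line time; E + F + G + H the frame time.
line_time_us = sum(H_US.values())
frame_time_ms = sum(V_MS.values())

# Refresh rate is the reciprocal of the frame time.
refresh_hz = 1000.0 / frame_time_ms

# Clock cycles per line and pixels drawn during the active video time.
clocks_per_line = line_time_us * PIXEL_CLOCK_MHZ
active_pixels = H_US["active"] * PIXEL_CLOCK_MHZ

print(f"line {line_time_us:.2f} us, frame {frame_time_ms:.2f} ms, "
      f"refresh {refresh_hz:.2f} Hz, clocks/line {clocks_per_line:.1f}, "
      f"active pixels {active_pixels:.1f}")
```

The active-pixel result lands a little under 640, which is the slight timing discrepancy noted above.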

6.2 Generation of the horizontal synchronization signal - HSYNC

The HSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated on each rising edge of the external clock signal (CLK).

The state of the HSYNC signal depends on the value stored in the horizontal counter, and is driven low when:

HCOUNT ≥ (PixTotal + BlankingLeft)

and remains low while:

HCOUNT < (PixTotal + BlankingLeft + TLSync)

Vertical (Frame) Period

6.3 Generation of the vertical synchronization signal - VSYNC

The VSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated after every line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

The state of the VSYNC signal depends on the value stored in the vertical counter, and is driven low when:

VCOUNT ≥ (LinTotal + BlankingLeft)

and remains low while:

VCOUNT < (LinTotal + BlankingLeft + TFSync)
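Both sync conditions have the same shape: the signal is low while the counter lies inside a window that starts after the visible region plus a blanking interval. A minimal sketch, assuming a generic helper `sync_level` (a hypothetical name) that serves for either HSYNC or VSYNC; the numeric values in the usage note are illustrative assumptions, not taken from the Controller's documentation.

```python
def sync_level(count, visible_total, blanking, sync_length):
    """Active-low sync signal level for a given counter value.

    The signal goes low when count >= visible_total + blanking and
    stays low while count < visible_total + blanking + sync_length.
    """
    start = visible_total + blanking
    if start <= count < start + sync_length:
        return 0  # sync pulse active (low)
    return 1      # inactive (high)
```

For HSYNC, `count` is HCOUNT and the arguments are PixTotal, BlankingLeft and TLSync. With assumed values of 640, 16 and 96, `sync_level` is low for counts 656 to 751. For VSYNC the same function applies with VCOUNT, LinTotal and TFSync.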

Address Counter (MEMADDR)

The address counter is used to store the position of the next consecutive pixel in the frame. Its value is passed to the Pixel Unit on the internal bus signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain the next pixel from picture memory.

The counter is reset to zero when the VGA Controller receives an external reset signal (RST). The size of the counter depends on the values chosen for DISPSIZE_H and DISPSIZE_V, as the range is simply:

0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199

• 800x600: 0 to 479999

The counter has 19-bit resolution.

While the value in the horizontal counter (HCOUNT) is less than the total number of viewable pixels in a line (PixTotal, the integer value of DISPSIZE_H), the counter is incremented on the rising edge of the external clock signal (CLK). Pixel addressing within the frame is consecutive. When the counter reaches the last pixel in a line, its incremented value is the first pixel in the next line down.

The address counter will continue to be incremented until the value in the vertical counter (VCOUNT) is greater than or equal to the total number of viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this point, it will be rolled over to zero.
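The counter range and its 19-bit width can be checked numerically, and the increment and roll-over rules can be modelled per clock edge. This is a sketch under the description above; the function names are hypothetical and the model ignores reset.

```python
def address_counter_range(pix_total, lin_total):
    """Return (last address, number of bits needed to hold it)."""
    last = pix_total * lin_total - 1
    return last, last.bit_length()


def next_address(addr, hcount, vcount, pix_total, lin_total):
    """Address counter value after one rising edge of CLK."""
    if vcount >= lin_total:
        return 0          # past the last viewable line: roll over
    if hcount < pix_total:
        return addr + 1   # inside the viewable part of a line
    return addr           # horizontal blanking: hold the value
```

Both supported resolutions need 19 bits, since 2^18 = 262144 is too small for either range and 2^19 = 524288 covers both.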

7 VGA Controller

The VGA Controller provides a simple 8-bit interface between a host microcontroller and any VGA-compatible monitor. This variant of the Controller provides six modes of display, depending on the resolution chosen (640x480 (VGA) or 800x600 (SVGA)) and the color palette (Black & White, 16 Colors or 64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum 60 Hz refresh rate:

1 sec = 60 frames

1 frame = max 1/60 sec

1 pixel = (1/60 sec) / 307200 ≈ 54 ns (roughly 50 ns)

This figure counts only the 640 x 480 = 307200 visible pixels. The blanking intervals (front porch, sync pulse and back porch) also consume clock cycles in every line and every frame, so the time available per pixel is shorter:

Max time for 1 pixel = 40 ns

2) To send information for each and every pixel of a frame by means of the RGB signals.

3) After one row of information has been completely displayed, the column must be reset and the row incremented; this is done by the HSYNC signal.

4) After one frame of information has been completed, a new frame has to be started; for this, both row and column are reset; this is done by the VSYNC signal.

=> To give the information for 1 pixel we use 1 clock (for controlling all operations):

1 clock = 1 pixel

1 clock cycle period = 40 ns

Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display the information for one pixel on each and every clock, we can attain the minimum 60 Hz refresh rate.

Do not use a clock frequency of less than 25 MHz.

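Ex:

Clock frequency   Time period                      No. of frames/sec
15 MHz            More                             Less (it is unable to display 60 frames/sec)
25 MHz            Effectively utilized frequency
30 MHz            Less                             More

As the clock frequency increases, the time period decreases and the number of frames per second increases.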
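The pixel-time arithmetic can be reproduced as follows. The totals of 800 clocks per line and 525 lines per frame are assumptions based on the commonly quoted 640x480 at 60 Hz timing (they are consistent with the 31.77 µs line time given earlier); the function name is illustrative.

```python
def pixel_period_ns(refresh_hz, clocks_per_line, lines_per_frame):
    """Time available per pixel clock, in nanoseconds."""
    return 1e9 / (refresh_hz * clocks_per_line * lines_per_frame)


# Counting only the visible 640 x 480 pixels gives roughly 54 ns.
visible_only = pixel_period_ns(60, 640, 480)

# Including blanking (assumed 800 x 525 total) gives just under 40 ns.
with_blanking = pixel_period_ns(60, 800, 525)

# The corresponding clock frequency in MHz, close to the 25 MHz used here.
clock_mhz = 1e3 / with_blanking
```

This shows why the usable per-pixel time is about 40 ns rather than the 50 ns or so suggested by the visible pixel count alone.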

Signals used for image display:

R (Red)

G (Green)

B (Blue)

HSYNC (Horizontal Synchronization signal)

VSYNC (Vertical Synchronization signal)

All the above signals are used to control the image data so that it is correctly displayed on the VGA display monitor.

7.2 Timings for front porch, sync pulse and back porch

               Horizontal sync   Vertical sync
Front porch    0.94 µs           0.35 ms
Sync pulse     3.77 µs           0.063 ms
Back porch     1.89 µs           1.02 ms

8 FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below.

Table: VGA pin description

Name | Type | Polarity | Bus size | Description

Global Control Signals

CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows. 640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz. 800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz.

RST | I | High | - | Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD | I | - | 2 | This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.

DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows. Black & White mode: 8 1-bit pixels. 16 Colors mode: 2 4-bit pixels. 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: in 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R1, R0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G1, G0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B1, B0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode - one byte of data in RAM contains 8 1-bit pixels

• in 16 Colors mode - one byte of data in RAM contains 2 4-bit pixels

• in 64 Colors mode - one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V) and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
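As a worked example of the formula above, the following sketch computes the picture memory size in bytes for each color palette. The mode keys (`"bw"`, `"16_colors"`, `"64_colors"`) and the function name are labels invented here for illustration.

```python
# Pixels packed into one byte of picture memory, per color palette.
PIXELS_PER_BYTE = {"bw": 8, "16_colors": 2, "64_colors": 1}


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes needed to store one frame (rounded up to whole bytes)."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-pixels // per_byte)  # ceiling division
```

A full 640x480 frame therefore needs 38400 bytes in Black & White mode, 153600 bytes in 16 Colors mode and 307200 bytes in 64 Colors mode.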

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt and right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt and right shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
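The three addressing rules amount to dividing the pixel index by the number of pixels per byte, which in hardware is a right shift. A minimal sketch, modelling hvcnt as a 19-bit integer so that the vacated upper bits are zero-filled; the mode keys and function name are illustrative.

```python
# Right-shift amount applied to hvcnt in each color mode.
ADDRESS_SHIFT = {"bw": 3, "16_colors": 1, "64_colors": 0}


def addr_pixel(hvcnt, mode):
    """Byte address in picture memory for the pixel counted by hvcnt."""
    hvcnt &= 0x7FFFF  # keep the 19 counter bits
    return hvcnt >> ADDRESS_SHIFT[mode]
```

In Black & White mode, pixel indices 0 to 7 all map to byte 0 and index 8 maps to byte 1. In 16 Colors mode, indices 2 and 3 both map to byte 1.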

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register, PIXREG, is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
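The enex condition can be written out as a function to make the two blanking windows explicit. This is a behavioral sketch; the function and parameter names are illustrative, and the line and frame periods used in the usage note (800 and 525) are assumed example values.

```python
def external_enable(hcount, vcount, pix_total, lin_total,
                    line_period, frame_period):
    """External line display enable signal 'enex' (RD follows it)."""
    # Horizontal window in which memory reads are disabled.
    h_blank = pix_total + 1 < hcount < line_period
    # Vertical window in which memory reads are disabled.
    v_blank = lin_total + 1 < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1
```

With PixTotal = 640 and an assumed line period of 800, enex is still 1 at HCOUNT = 641 and drops to 0 from HCOUNT = 642 onwards, so reads continue slightly past the last viewable pixel.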

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a 0.

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode, each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
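The shifting behavior described for the three modes determines the order in which pixels leave a byte: lowest-order bits first. The sketch below unpacks one byte in that order; it models the result of the shifts rather than the clocked register itself, and the mode keys and function name are illustrative.

```python
def unpack_pixels(byte, mode):
    """Pixel values held in one data byte, in the order they are displayed."""
    if mode == "bw":
        # Active pixel is bit 0; the register shifts right by one each clock.
        return [(byte >> i) & 0x1 for i in range(8)]
    if mode == "16_colors":
        # Low nibble first, then the high nibble after a shift by four.
        return [byte & 0xF, (byte >> 4) & 0xF]
    if mode == "64_colors":
        # One 6-bit pixel per byte; bits 7 and 6 are not used.
        return [byte & 0x3F]
    raise ValueError(f"unknown mode: {mode}")
```

For instance, the byte 0xA3 in 16 Colors mode yields pixel value 3 followed by pixel value 10.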

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register, RGBREG, is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
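The 64-color figure follows from three channels with four levels each (4 x 4 x 4). The sketch below illustrates this under two stated assumptions that the text does not confirm: the four DAC levels are taken to be evenly spaced between 0 V and 0.7 V, and the 6-bit code is assumed to be ordered as red, green, blue from the most significant bits down.

```python
FULL_SCALE_V = 0.7  # maximum brightness level from the text


def dac_level(code):
    """Analog level for a 2-bit code, assuming evenly spaced steps."""
    if not 0 <= code <= 3:
        raise ValueError("code must be a 2-bit value (0..3)")
    return FULL_SCALE_V * code / 3


def split_rgb(rgb6):
    """Split a 6-bit RGB code into (r, g, b) 2-bit fields.

    The RRGGBB bit order is an assumption made for this example.
    """
    return (rgb6 >> 4) & 0x3, (rgb6 >> 2) & 0x3, rgb6 & 0x3


# Every 6-bit code gives a distinct (r, g, b) triple: 64 colors in total.
all_colors = {split_rgb(code) for code in range(64)}
```

Code 000000 maps to 0 V on all three channels (black), matching the reset value of RGBREG.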

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large, on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73728            72
XC3S250E     2             6                       12                 221184           216
XC3S500E     2             10                      20                 368640           360
XC3S1200E    2             14                      28                 516096           504
XC3S1600E    2             18                      36                 663552           648

Each block RAM contains 18432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
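To relate these block RAM figures to the VGA Controller's picture memory, the following sketch estimates how many blocks a frame would occupy and which devices have enough. It counts only the 16K data bits of each block and ignores the parity bits and any aspect-ratio constraints, which is a simplifying assumption; the function names are illustrative.

```python
BLOCK_DATA_BITS = 16 * 1024  # data bits per block RAM (parity not counted)

# Total RAM blocks per device, from the Spartan-3E block RAM table.
BLOCKS_AVAILABLE = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def blocks_needed(pix_total, lin_total, bits_per_pixel):
    """Block RAMs needed to hold one frame, rounded up."""
    total_bits = pix_total * lin_total * bits_per_pixel
    return -(-total_bits // BLOCK_DATA_BITS)  # ceiling division


def devices_that_fit(pix_total, lin_total, bits_per_pixel):
    """Devices with enough block RAM for the frame."""
    need = blocks_needed(pix_total, lin_total, bits_per_pixel)
    return [name for name, blocks in BLOCKS_AVAILABLE.items() if blocks >= need]
```

Under these assumptions, a 640x480 Black & White frame (1 bit per pixel) needs 19 blocks, so it fits only in the XC3S500E and larger devices, while a full 640x480 frame at 4 bits per pixel does not fit in the block RAM of any listed device.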


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own, non-standard types => VHDL was becoming nonstandard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.

Defines the inputs/outputs of a component (defines pins).

A way to represent modularity in VHDL.

Similar to a symbol in a schematic.

The entity declaration describes the entity.

E.g.

library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.

Each port must have a name, a direction and a type.

An entity may have NO port declaration.

Port directions

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using:
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and the architecture body together.

The concept of a default configuration is a bit messy in VHDL '87:
- The last architecture analyzed links to the entity.

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate-level architecture and a behavioral architecture.

Configurations are always optional.
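A configuration declaration for the Comparator example could look roughly like the sketch below. The entity and architecture are repeated so the fragment can be analyzed on its own; the configuration name Comparator_cfg is an assumption made for this illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Explicitly select architecture Comparator1 for entity Comparator.
-- Comparator_cfg is a hypothetical name.
configuration Comparator_cfg of Comparator is
  for Comparator1
  end for;
end Comparator_cfg;
```

If a second architecture were added later, only the configuration would need to change to switch between them in simulation.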

Packages

Packages contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
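The split between package declaration and package body can be sketched as follows. The package name vga_types, the subtype addr_t, the constant FRAME_END and the function parity are all hypothetical and only serve to show a constant, a subtype, a deferred constant and a function in one place.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: what other design units can see.
package vga_types is
  constant ADDR_WIDTH : natural := 18;
  subtype addr_t is std_logic_vector(ADDR_WIDTH - 1 downto 0);
  constant FRAME_END : addr_t;  -- deferred constant, value given in the body
  function parity (v : std_logic_vector) return std_logic;
end vga_types;

-- Package body: values of deferred constants and subprogram bodies.
package body vga_types is
  constant FRAME_END : addr_t := "111011111111111111";

  function parity (v : std_logic_vector) return std_logic is
    variable p : std_logic := '0';
  begin
    -- XOR all bits together: '1' when the number of set bits is odd.
    for i in v'range loop
      p := p xor v(i);
    end loop;
    return p;
  end parity;
end vga_types;
```

A design unit would make these names visible with "use work.vga_types.all;", assuming the package was analyzed into library work.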

Libraries

A collection of VHDL design units (a database):

1. Packages (package declaration, package body)
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in a UNIX file system.

Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:
- Structural
- Dataflow
- Behavioral

Structural VHDL description

The circuit is described in terms of its components.

Ranges from a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers => you must write the behavioral description of the registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).
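One way to provide such a register description as a subprogram is a procedure with signal parameters, called concurrently. The sketch below is a simulation-oriented illustration only; reg_pkg, dff and delay2 are hypothetical names, and whether a synthesis tool accepts this style depends on the tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package reg_pkg is
  -- Behavioral D flip-flop: q takes the value of d on each rising clk edge.
  procedure dff (signal clk : in  std_logic;
                 signal d   : in  std_logic;
                 signal q   : out std_logic);
end reg_pkg;

package body reg_pkg is
  procedure dff (signal clk : in  std_logic;
                 signal d   : in  std_logic;
                 signal q   : out std_logic) is
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end dff;
end reg_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.reg_pkg.all;

-- Two registers in series: dout is din delayed by two clock cycles.
entity delay2 is
  port (clk, din : in  std_logic;
        dout     : out std_logic);
end delay2;

architecture dataflow of delay2 is
  signal stage1 : std_logic;
begin
  -- Concurrent procedure calls: each behaves like a small process that is
  -- sensitive to its "in" signal parameters (clk and d).
  r1 : dff(clk, din, stage1);
  r2 : dff(clk, stage1, dout);
end dataflow;
```

The architecture then reads as a description of how data moves from register to register, which is the point of the dataflow style.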

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, only the order of sequential operations is defined.

At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

The process keyword in VHDL:

A process statement is a concurrent statement.

Statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
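The two process forms can be compared side by side in a small simulation-only sketch. The entity process_forms and all signal names are invented for this illustration; y1 and y2 should always carry the same value.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical test entity without ports (simulation only).
entity process_forms is
end process_forms;

architecture sim of process_forms is
  signal a, b   : std_logic := '0';
  signal y1, y2 : std_logic;
begin
  -- Form 1: sensitivity list. The process runs whenever a or b changes.
  p1 : process (a, b)
  begin
    y1 <= a and b;
  end process p1;

  -- Form 2: equivalent process using a wait statement instead.
  p2 : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process p2;

  -- Stimulus using explicit delays.
  stim : process
  begin
    a <= '1' after 10 ns;
    b <= '1' after 20 ns;
    wait;  -- suspend this process forever
  end process stim;
end sim;
```

Placing the wait statement at the end of p2 mirrors the implicit behavior of a sensitivity list, which is why the two forms are interchangeable here.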


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;
-- Note: STD_LOGIC_ARITH and numeric_std both declare signed/unsigned;
-- this top level uses neither type directly.

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;

  -- Port modes of line_count and c_hs match the sync_unit entity (out).
  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;
      c_hs       : out integer);
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Clock divider: clk toggles on every rising edge of clk50m,
  -- so it runs at half the input clock frequency.
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;

begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts clock cycles 0..793 within each line
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s (0..515)
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;  -- this later assignment wins at the wrap
      end if;
    end if;
  end process;

  -- hsync pulse: low while the horizontal counter is in 0..95
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information will be sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse (vertical sync, front porch, back porch):
  -- low during the first two lines of the frame
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en_reg is v_en_s delayed by one clock cycle
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
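A simple way to exercise the synchronization block in simulation is a self-checking test bench. The sketch below is only an illustration: the names sync_unit_tb, CLK_PERIOD and running are made up, and the 40 ns clock period (25 MHz) is an assumption about the divided clock. It checks that hsync repeats every 794 clock cycles, which is what the 0..793 horizontal counter implies.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity sync_unit_tb is
end sync_unit_tb;

architecture sim of sync_unit_tb is
  constant CLK_PERIOD : time := 40 ns;  -- assumed 25 MHz pixel clock

  signal clk     : std_logic := '0';
  signal rstn    : std_logic := '0';
  signal running : boolean   := true;
  signal hsync, vsync, v_en_reg, h_en, v_en : std_logic;
  signal line_count, c_hs : integer;
begin
  -- Free-running clock; it stops toggling once the check has finished,
  -- so the simulation ends by itself.
  clk <= not clk after CLK_PERIOD / 2 when running else '0';

  -- Hold the active-low reset for a few clock cycles, then release it.
  rstn <= '1' after 5 * CLK_PERIOD;

  dut : entity work.sync_unit
    port map (
      clk        => clk,
      rstn       => rstn,
      hsync      => hsync,
      vsync      => vsync,
      v_en_reg   => v_en_reg,
      h_en       => h_en,
      v_en       => v_en,
      line_count => line_count,
      c_hs       => c_hs);

  check : process
    variable t0 : time;
  begin
    -- Measure the time between two consecutive falling edges of hsync.
    wait until falling_edge(hsync);
    t0 := now;
    wait until falling_edge(hsync);
    assert now - t0 = 794 * CLK_PERIOD
      report "unexpected hsync period: " & time'image(now - t0)
      severity error;
    running <= false;
    wait;
  end process check;
end sim;
```

The same structure could be extended to check the vsync period, at the cost of simulating a full frame.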


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is

  signal addr_s : std_logic_vector(17 downto 0);

begin

  addr <= addr_s;

  process (clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: restart the address at the top of the frame.
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture area: read memory and show the pixel.
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last stored pixel.
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and in the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

Many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:
- IC costs are rising aggressively
- ASIC complexity has increased development time and costs
- R&D resources and headcount are decreasing
- Revenue losses for slow time-to-market are increasing
- Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

Fig 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
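The logic block described above can be modeled behaviorally in a few lines of VHDL. This is a conceptual sketch only, not a vendor primitive: the entity name logic_block and the generics INIT and USE_FF are assumptions made for illustration. INIT holds the 16-entry truth table, and USE_FF selects between the registered and the unregistered output.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Conceptual model of a classic logic block: 4-input LUT plus flip-flop.
entity logic_block is
  generic (
    INIT   : std_logic_vector(15 downto 0) := x"8000";  -- x"8000" = 4-input AND
    USE_FF : boolean := true);                          -- registered output?
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);
    o   : out std_logic);
end logic_block;

architecture rtl of logic_block is
  signal lut_out, ff_out : std_logic;
begin
  -- The four inputs form an index into the 16-entry truth table.
  lut_out <= INIT(to_integer(unsigned(i)));

  -- Optional register on the LUT output.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Single output: either the registered or the unregistered LUT value.
  o <= ff_out when USE_FF else lut_out;
end rtl;
```

Changing INIT changes the implemented function without touching the structure, which mirrors how configuration bits program a real LUT.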

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or a similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

- SRAM: based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
- Antifuse: one-time programmable. CMOS.
- EPROM: Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
- EEPROM: Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
- Flash: flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
- Fuse: one-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

FPGA prototyping has become a mainstream verification method for hardware design and for early software and firmware co-design.

Reasons why prototyping is important:

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be caught.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL

Fig: VGA CONTROLLER

GATE LEVEL

Fig: SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated picture (pixelated) is taken from memory space and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. Eduardo Sanchez, "A VGA display controller"

3. Hinner, "VGA Interface and video signal documents"


Pixels of image

Display devices also have pixels. The display resolution (640x480) is a standard display resolution.

Pixel

This example shows an image with a portion greatly enlarged, in which the individual pixels are rendered as little squares and can easily be seen.

4.2 A photograph of sub-pixel display elements on a laptop's LCD screen

In digital imaging, a pixel (or picture element) is a single point in a raster image. The pixel is the smallest addressable screen element; it is the smallest unit of the picture which can be controlled. Each pixel has its own address. The address of a pixel corresponds to its coordinates. Pixels are normally arranged in a 2-dimensional grid, and are often represented using dots or squares. Each pixel is a sample of an original image, where more samples typically provide more accurate representations of the original. The intensity of each pixel is variable. In color image systems, a color is typically represented by three or four component intensities such as red, green, and blue, or cyan, magenta, yellow, and black.

Color refreshing rate

Color refreshing rate = 60 Hz

The color refreshing rate (refresh rate) is the number of frames that are
transmitted per second. The minimum number of frames that the VGA controller
must transmit per second is 60.

5. Hardware Description

Fig. 5.1 Block Diagram

5.1 VGA Synchronization Unit

The Synchronization Unit provides the horizontal and vertical
synchronization signals, HSYNC and VSYNC, that are required to correctly
display a picture frame within the confines of a monitor's display area.

These synchronization signals are used as control inputs by the monitor's
horizontal and vertical deflection circuits. These circuits deflect the
electrons emitted by the three primary color electron guns (Red, Green, Blue)
left to right and from top to bottom, respectively. HSYNC provides the start
and stop times for the horizontal deflection circuit, so that a line of pixels
is correctly drawn across the screen display. VSYNC provides the start and
stop times for the vertical deflection circuit, so that the lines of a frame
are correctly drawn from the top to the bottom of the screen display.

The resolution for the display is defined by the level on the RESOLUTION
input. If High, the 640x480 resolution (VGA) is used. If Low, the 800x600
resolution (SVGA) is used.

Although the resolution determines the area of a monitor's screen within
which an image can be displayed, the full extents of the chosen resolution do
not have to be used. The actual extents of the image display area on the
screen can be controlled by the use of the DISPSIZE_H and DISPSIZE_V inputs.
These inputs determine the total number of pixels to be used in a line and the
total number of lines in a frame, respectively.

5.2 Blank pixel generation

The total number of viewable pixels in a line and viewable lines in a frame
is determined by the display resolution chosen through the RESOLUTION input
(1 = 640x480, 0 = 800x600) and the values received on the DISPSIZE_H and
DISPSIZE_V buses. Whether the full extent of the chosen display resolution is
used or not, the areas of the monitor screen to the top, bottom, left and
right of the viewable frame area are blanked by putting black pixels at the
required line-pixel positions. This has the effect of centering the image on
the screen.

The color generated for a pixel in the Pixel Unit depends on whether the
particular pixel needs to be blanked or not. The Synchronization Unit provides
a signal to the Pixel Unit for this very reason: the line display enable
signal, en. The signal is checked on each rising edge of the external clock
signal (CLK) and is set as follows:

If (HCOUNT >= PixTotal) or (VCOUNT >= LinTotal) then
    en = 0 (pixel requires to be blanked: set color to be black)
Else
    en = 1 (pixel is a viewable pixel: generate RGB color accordingly)
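The blanking rule above can be modelled in a few lines of Python. This is only an illustrative sketch, not the Controller's HDL: the function name `display_enable` and the sample values chosen for PixTotal and LinTotal (a full 640x480 viewable area) are assumptions made for the example.

```python
def display_enable(hcount: int, vcount: int, pix_total: int, lin_total: int) -> int:
    """Return 1 for a viewable pixel and 0 for a pixel that must be blanked."""
    if hcount >= pix_total or vcount >= lin_total:
        return 0  # outside the viewable area: the pixel is forced to black
    return 1      # inside the viewable area: the RGB colour is generated


# Assumed example: the full 640x480 area is viewable.
PIX_TOTAL, LIN_TOTAL = 640, 480

print(display_enable(0, 0, PIX_TOTAL, LIN_TOTAL))     # first viewable pixel
print(display_enable(640, 0, PIX_TOTAL, LIN_TOTAL))   # first blanked position in a line
print(display_enable(10, 480, PIX_TOTAL, LIN_TOTAL))  # first blanked line
```

Smaller values of PixTotal or LinTotal shrink the viewable window in the same way that DISPSIZE_H and DISPSIZE_V do in the text.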

6. SIGNALS IN VGA CONTROLLER

For the most common VGA mode, 640x480 at 60 Hz non-interlaced, the horizontal
timings are:

Parameter                   Value      Unit
Clock frequency             25.175     MHz
Horizontal pixels           640
Horizontal sync polarity    Negative
Total time for each line    31.77      µs
Front porch (A)             0.94       µs
Sync pulse length (B)       3.77       µs
Back porch (C)              1.89       µs
Active video (D)            25.17      µs

(Total horizontal sync time: 6.60 µs)

6.1 The vertical timings are:

Parameter                   Value      Unit
Vertical lines              480
Vertical sync polarity      Negative
Vertical frequency          59.94      Hz
Front porch (E)             0.35       ms
Sync pulse length (F)       0.06       ms
Back porch (G)              1.02       ms
Active video (H)            15.25      ms

(Total vertical sync time: 1.43 ms)

640 x 400 at 70 Hz is the video mode used for booting most x86 personal
computers.

640 x 480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

It should be noted that the actual timings vary slightly. For example, for
640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of
25.174 MHz gives 633 pixels rather than the expected 640 pixels.
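The arithmetic behind these tables can be checked with a short Python sketch. The timing values are the ones listed above; the helper name `to_clocks` and the choice to truncate to whole pixels are assumptions made for this illustration.

```python
# Horizontal timings in microseconds and vertical timings in milliseconds,
# as listed in the tables.
horizontal_us = {"front porch": 0.94, "sync pulse": 3.77,
                 "back porch": 1.89, "active video": 25.17}
vertical_ms = {"front porch": 0.35, "sync pulse": 0.06,
               "back porch": 1.02, "active video": 15.25}


def to_clocks(time_us: float, clock_mhz: float) -> int:
    """Number of whole pixel clocks that fit into time_us at clock_mhz."""
    return int(time_us * clock_mhz)


line_time_us = sum(horizontal_us.values())                      # total time per line
h_blanking_us = line_time_us - horizontal_us["active video"]    # porches plus sync pulse
frame_time_ms = sum(vertical_ms.values())                       # total time per frame
v_blanking_ms = frame_time_ms - vertical_ms["active video"]

print(f"line time   = {line_time_us:.2f} us, blanking = {h_blanking_us:.2f} us")
print(f"frame time  = {frame_time_ms:.2f} ms, blanking = {v_blanking_ms:.2f} ms")
print(f"frame rate  = {1000.0 / frame_time_ms:.2f} Hz")
print(f"active pixels at 25.174 MHz = {to_clocks(25.17, 25.174)}")
```

The last line reproduces the 633-pixel figure quoted in the text, which is why the listed times should be read as rounded values rather than exact ones.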

6.2 Generation of the horizontal synchronization signal, HSYNC

The HSYNC signal is High (inactive) after an external reset signal (RST) is
received by the VGA Controller. The signal is updated on each rising edge of
the external clock signal (CLK).

The state of the HSYNC signal depends on the value stored in the horizontal
counter and is driven low when:

HCOUNT >= (PixTotal + BlankingLeft)

and remains low while:

HCOUNT < (PixTotal + BlankingLeft + TLSync)

Fig. Vertical (Frame) Period

6.3 Generation of the vertical synchronization signal, VSYNC

The VSYNC signal is High (inactive) after an external reset signal (RST) is
received by the VGA Controller. The signal is updated after every line of
pixels is completed (i.e. on each rising edge of the HSYNC signal).

The state of the VSYNC signal depends on the value stored in the vertical
counter and is driven low when:

VCOUNT >= (LinTotal + BlankingLeft)

and remains low while:

VCOUNT < (LinTotal + BlankingLeft + TFSync)
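Both sync signals follow the same pattern: the output is low only while the counter lies inside a window that starts after the viewable area plus a blanking interval. A minimal Python sketch of that comparison follows; the function name `sync_level` and the numeric values used for BlankingLeft and TLSync are assumptions for illustration, since the text does not give them.

```python
def sync_level(count: int, total: int, blanking: int, sync_len: int) -> int:
    """Return 0 (active low) while count is inside the sync window, else 1."""
    start = total + blanking
    return 0 if start <= count < start + sync_len else 1


# Assumed example values for a 640-pixel line (not taken from the text).
PIX_TOTAL, BLANKING_LEFT, TL_SYNC = 640, 16, 96

for hcount in (0, 655, 656, 751, 752):
    print(hcount, sync_level(hcount, PIX_TOTAL, BLANKING_LEFT, TL_SYNC))
```

The same function models VSYNC when it is called with VCOUNT, LinTotal, the vertical blanking value and TFSync.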

Address Counter (MEMADDR)

The address counter is used to store the position of the next consecutive
pixel in the frame. Its value is passed to the Pixel Unit on the internal bus
signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain
the next pixel from picture memory.

The counter is reset to zero when the VGA Controller receives an external
reset signal (RST). The size of the counter depends on the values chosen for
DISPSIZE_H and DISPSIZE_V, as the range is simply:

0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each
of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199
• 800x600: 0 to 479999

The counter has 19-bit resolution.
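The 19-bit figure follows directly from the largest address the counter must hold. The Python sketch below recomputes it; the function name `counter_range` is an assumption introduced for this example.

```python
def counter_range(pix_total: int, lin_total: int) -> tuple:
    """Return (largest counter value, number of bits needed to hold it)."""
    max_value = pix_total * lin_total - 1
    return max_value, max_value.bit_length()


for width, height in ((640, 480), (800, 600)):
    max_value, bits = counter_range(width, height)
    print(f"{width}x{height}: 0 to {max_value}, {bits} bits")
```

Both resolutions need 19 bits, because 2^18 = 262144 is smaller than either maximum while 2^19 = 524288 exceeds both.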

While the value in the horizontal counter (HCOUNT) is less than the total
number of viewable pixels in a line (PixTotal, the integer value of
DISPSIZE_H), the counter is incremented on the rising edge of the external
clock signal (CLK). Pixel addressing within the frame is consecutive. When the
counter reaches the last pixel in a line, its incremented value is the first
pixel in the next line down.

The address counter will continue to be incremented until the value in the
vertical counter (VCOUNT) is greater than or equal to the total number of
viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this
point, it will be rolled over to zero.

7. VGA Controller

The VGA Controller provides a simple 8-bit interface between a host
microcontroller and any VGA-compatible monitor. This variant of the Controller
provides six modes of display, depending on the resolution chosen (640x480
(VGA) or 800x600 (SVGA)) and the color palette (Black & White, 16 Colors or
64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum refresh rate of 60 Hz:

   1 sec = 60 frames
   1 frame = max 1/60 sec
   1 pixel = (1/60 sec) / 307200 ≈ 54 ns (640 x 480 = 307200 viewable pixels)

   Part of every line and frame is spent in the blanking intervals, so the
   time available for a viewable pixel is shorter:

   Max time for 1 pixel = 40 ns

2) To send information for each and every pixel of a frame by means of the
   RGB signals.

3) After completely displaying one row of information, the column must be
   reset and the row incremented. This is done by the HSYNC signal.

4) After completing one frame of information, a new frame has to be started.
   For this, both row and column are reset. This is done by the VSYNC signal.

=> To give the information for 1 pixel we use 1 clock (for controlling all
operations):

   1 clock = 1 pixel
   1 clock cycle period = 40 ns
   Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display the information for one pixel on each
and every clock, we can attain the minimum 60 Hz refresh rate. Do not use a
clock frequency of less than 25 MHz.

Example:

Clock frequency      15 MHz (decreases)         25 MHz                    30 MHz (increases)
Time period          More                       Effectively utilized      Less
No. of frames/sec    Less (it is unable to      frequency                 More
                     display 60 frames/sec)
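The clock budget can be reproduced numerically. In the Python sketch below, the 640 x 480 pixel count comes from the text, while the total of 800 x 525 clock slots per frame (viewable area plus blanking) is an assumption based on the usual 640x480 timing and is not stated in this document; the function names are likewise hypothetical.

```python
VISIBLE_PIXELS = 640 * 480        # 307200, from the text
ASSUMED_TOTAL_CLOCKS = 800 * 525  # assumption: usual 640x480 totals including blanking


def max_pixel_time_ns(pixels_per_frame: int, refresh_hz: float = 60.0) -> float:
    """Longest time per pixel slot (ns) that still gives refresh_hz frames per second."""
    return 1e9 / (refresh_hz * pixels_per_frame)


def refresh_rate_hz(clock_hz: float, clocks_per_frame: int = ASSUMED_TOTAL_CLOCKS) -> float:
    """Frames per second when one pixel slot is sent on every clock."""
    return clock_hz / clocks_per_frame


print(f"viewable pixels only: {max_pixel_time_ns(VISIBLE_PIXELS):.2f} ns per pixel")
print(f"including blanking:   {max_pixel_time_ns(ASSUMED_TOTAL_CLOCKS):.2f} ns per pixel")
for mhz in (15, 25, 30):
    print(f"{mhz} MHz -> {refresh_rate_hz(mhz * 1e6):.1f} frames/s")
```

Under these assumptions a 15 MHz clock falls well short of 60 frames per second, 25 MHz lands very close to it, and 30 MHz exceeds it, which matches the qualitative comparison in the example above.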

Signals used for image display:

R (Red)
G (Green)
B (Blue)
HSYNC (horizontal synchronization signal)
VSYNC (vertical synchronization signal)

All of the above signals are used to control the image data so that it is
correctly displayed on the VGA monitor.

7.2 Timings for front porch, sync pulse and back porch

               Horizontal sync    Vertical sync
Front porch    0.94 µs            0.35 ms
Sync pulse     3.77 µs            0.063 ms
Back porch     1.89 µs            1.02 ms

8. FUNCTIONAL DESCRIPTION

8.1 Symbol

Fig. VGA Controller symbol

Pin description

The pin description is shown in the table below.

Table: VGA pin description

Name | Type | Polarity | Bus size | Description

Global Control Signals

CLK | I | Rising | - |
    Global system clock. This clock determines the maximum rate at which
    pixels can be sent to the monitor. The frequency of the clock signal
    determines the refresh rate as follows:
    640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz
    800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz

RST | I | High | - |
    Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | - |
    This input selects the screen resolution to be used:
    1 = 640x480 (VGA), 0 = 800x600 (SVGA)

CMOD | I | - | 2 |
    This input selects the color palette to be used:
    00 = Black & White, 01 = 16 Colors, 10 = 64 Colors

DISPSIZE_H | I | - | 10 |
    This input determines the number of viewable pixels to be displayed in
    each line of a frame, and is therefore used to control the horizontal
    extents of the visible display area.

DISPSIZE_V | I | - | 10 |
    This input determines the number of lines to be displayed in a frame, and
    is therefore used to control the vertical extents of the visible display
    area.

Data Memory Control Signals

RD | O | High | - |
    This is the enable signal when data is required to be read from the
    memory space. This signal is controlled by, and follows, the internal line
    enable signal en generated by the Synchronization Unit of the Controller.

DATA | I | - | 8 |
    Data input from picture memory space. Data is stored in memory in bytes,
    the content of which depends on the chosen color palette as follows:
    Black & White mode: 8 1-bit pixels
    16 Colors mode: 2 4-bit pixels
    64 Colors mode: 1 6-bit pixel (DATA[5..0])
    Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL | O | - | 19 |
    Specifies the address of the next pixel in picture memory. Addresses are
    consecutive: once the end of the current line has been reached, the next
    address is that of the pixel at the start of the next line down in the
    frame.

VGA Monitor Control Signals

HSYNC | O | Falling | - |
    Horizontal synchronization signal. This signal is used to control the
    horizontal deflection circuit in the VGA monitor, so that the start and
    end of a line of pixels is correctly displayed across the visible display
    area of the screen. The horizontal size of the display area is controlled
    by the DISPSIZE_H input to the Controller.

VSYNC | O | Falling | - |
    Vertical synchronization signal. This signal is used to control the
    vertical deflection circuit in the VGA monitor, so that the start and end
    of a frame (of lines) is correctly displayed between the top and bottom
    edges of the visible display area of the screen. The vertical size of the
    display area is controlled by the DISPSIZE_V input to the Controller.

R0, R1 | O | High/Low | - |
    Provides the 2-bit digital signal for the intensity of red used in
    composing a pixel's displayed color. These two signals are inputs to a
    simple 2-bit DAC (external to the Controller) that provides the analog
    signal required by the VGA monitor.

G0, G1 | O | High/Low | - |
    Provides the 2-bit digital signal for the intensity of green used in
    composing a pixel's displayed color. These two signals are inputs to a
    simple 2-bit DAC (external to the Controller) that provides the analog
    signal required by the VGA monitor.

B0, B1 | O | High/Low | - |
    Provides the 2-bit digital signal for the intensity of blue used in
    composing a pixel's displayed color. These two signals are inputs to a
    simple 2-bit DAC (external to the Controller) that provides the analog
    signal required by the VGA monitor.

9. VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels
and a frame of lines. The actual time values differ according to the
resolution selected (640x480 or 800x600), the processor-defined values for
DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal
(CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external
picture memory, reading in data a byte at a time and formatting each pixel to
be displayed. For each pixel, the 6-bit RGB value required for the control of
the monitor's three primary color electron guns is generated, so that the
pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host
microcontroller into external memory space (RAM). This memory space can be
located anywhere (e.g. using a block of RAM within the FPGA design, or using
the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a
byte depends on which color palette is being used for the image (selected by
the CMOD input):

• in Black & White mode, one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode, one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode, one byte of data in RAM contains 1 6-bit pixel
  (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total
number of viewable pixels in a line (determined by DISPSIZE_H), the total
number of viewable lines in a frame (determined by DISPSIZE_V) and the number
of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
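As a worked example of the formula above, the Python sketch below computes the picture memory for each palette at both supported resolutions. The dictionary keys and the function name `picture_bytes` are assumptions introduced for the example; the pixels-per-byte figures are those given in the text.

```python
# Pixels packed into one byte for each colour palette, as stated in the text.
PIXELS_PER_BYTE = {"black_white": 8, "16_colors": 2, "64_colors": 1}


def picture_bytes(pix_total: int, lin_total: int, palette: str) -> int:
    """Bytes of picture memory for a pix_total x lin_total image (rounded up)."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[palette]
    return -(-pixels // per_byte)  # ceiling division


for width, height in ((640, 480), (800, 600)):
    for palette in PIXELS_PER_BYTE:
        print(f"{width}x{height} {palette}: {picture_bytes(width, height, palette)} bytes")
```

A full 640x480 frame therefore ranges from 38400 bytes in Black & White mode to 307200 bytes in 64 Colors mode.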

The address in RAM where the next pixel is stored is determined using an
internal signal provided by the Synchronization Unit, hvcnt, which reflects
the current contents of the MEMADDR register. The exact addressing is
described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels)
is determined by using bits 18..3 of hvcnt and right shifting the contents by
three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels)
is determined by using bits 18..1 of hvcnt and right shifting the contents by
one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is
determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
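The three addressing rules amount to dividing the pixel position by the number of pixels per byte. The Python sketch below models them as right shifts of the 19-bit hvcnt value; the function name `addr_pixel` and the palette labels are assumptions made for this illustration.

```python
# Right shift applied to hvcnt in each palette (log2 of the pixels per byte).
ADDRESS_SHIFT = {"black_white": 3, "16_colors": 1, "64_colors": 0}


def addr_pixel(hvcnt: int, palette: str) -> int:
    """Byte address in picture memory for the pixel position held in hvcnt."""
    hvcnt &= 0x7FFFF  # hvcnt is a 19-bit value
    return hvcnt >> ADDRESS_SHIFT[palette]


# Pixels 0..7 share byte 0 in Black & White mode; pixel 8 starts byte 1.
print([addr_pixel(n, "black_white") for n in range(10)])
# Pixels 0 and 1 share byte 0 in 16 Colors mode.
print([addr_pixel(n, "16_colors") for n in range(6)])
# Every pixel has its own byte in 64 Colors mode.
print([addr_pixel(n, "64_colors") for n in range(4)])
```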

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the
current address in memory space. The register PIXREG is reset to zero when the
VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be
read from the memory space as long as the RD signal is active (High). The RD
signal is itself controlled by the external line display enable signal, enex.
This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or
   ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).

The point at which data is loaded from memory into PIXREG depends on the
particular color palette that is chosen: Black & White, 16 Colors or 64
Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest
three bits of the hvcnt signal, received from the Synchronization Unit, are
"000".

For the currently loaded byte, the active pixel is always in the lowest bit
position of the Pixel register. Each pixel in the data byte is moved into this
active pixel position by shifting the contents of the register right by one
bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest
bit of the hvcnt signal, received from the Synchronization Unit, is a "0".

For the currently loaded byte, the active pixel is always in the low order
nibble of the Pixel register. Remember that in this mode, each byte of data
contains two pixels. The second pixel is moved into this active pixel position
by shifting the contents of the register right by four bits on the rising edge
of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of
the external system clock signal (CLK). In this mode, the read of pixel data
does not depend on the status of the hvcnt signal received from the
Synchronization Unit.
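The order in which pixels leave the Pixel register can be illustrated in Python. This is a sketch of the shifting behaviour described above, not of the register hardware; the function name `unpack_pixels` and the palette labels are assumptions.

```python
def unpack_pixels(byte: int, palette: str) -> list:
    """Return the pixels of one memory byte in the order they become active."""
    byte &= 0xFF
    if palette == "black_white":
        # Active pixel is bit 0; the register shifts right by one bit per clock.
        return [(byte >> shift) & 0x1 for shift in range(8)]
    if palette == "16_colors":
        # Active pixel is the low nibble; one right shift by four bits follows.
        return [byte & 0xF, (byte >> 4) & 0xF]
    if palette == "64_colors":
        # One 6-bit pixel per byte; bits 7 and 6 are not used.
        return [byte & 0x3F]
    raise ValueError(f"unknown palette: {palette}")


print(unpack_pixels(0b10110001, "black_white"))
print(unpack_pixels(0xA5, "16_colors"))
print(unpack_pixels(0xFF, "64_colors"))
```

In Black & White mode the least significant bit of each byte is therefore displayed first, and in 16 Colors mode the low nibble precedes the high nibble.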

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving
the red, green and blue color guns of the target monitor. When the chosen
color palette is either Black & White or 16 Colors, these six bits are
obtained by mapping the value of the active pixel to a predefined RGB code.
When the chosen color palette is 64 Colors, the actual pixel value is used
directly.

The register RGBREG is reset to zero (000000) when the VGA Controller
receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value
loaded is dependent on the state of the line display enable signal, en. When
en is Low, blanking is required and RGBREG is loaded with the code for black
(000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a
4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA
Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0
and B1).

The monitor itself expects analog signals as inputs to its electron gun
control circuits. This is achieved by using 2-bit digital to analog converter
circuitry, located on the NanoBoard itself, as shown in Figure 9.1.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be
converted into 4 distinct analog levels. These levels specify the intensity of
each of the three primary colors to use when displaying the pixel on the
monitor's screen. The levels range from 0 V (total darkness) to 0.7 V
(maximum brightness).

With each analog input being one of four possible levels, the monitor can
display each pixel on the screen with one of 64 different color permutations.
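The count of 64 colors and the four analog levels can be checked with a short Python sketch. Two things here are assumptions rather than facts from the text: the DAC is modelled with equally spaced levels between 0 V and 0.7 V, and the 6-bit code is split with red in the top two bits and blue in the bottom two. The function names are hypothetical.

```python
V_MAX = 0.7  # volts, maximum brightness level given in the text


def dac_voltage(code: int, v_max: float = V_MAX) -> float:
    """Analog level for a 2-bit code, assuming four equally spaced steps."""
    if not 0 <= code <= 3:
        raise ValueError("code must be a 2-bit value (0..3)")
    return v_max * code / 3


def split_rgb(rgb6: int) -> tuple:
    """Split a 6-bit colour code into 2-bit (R, G, B) fields (assumed bit order)."""
    return (rgb6 >> 4) & 0b11, (rgb6 >> 2) & 0b11, rgb6 & 0b11


levels = [round(dac_voltage(code), 3) for code in range(4)]
colors = {split_rgb(code) for code in range(64)}

print("levels per gun:", levels)
print("distinct colours:", len(colors))
```

Four levels on each of three guns give 4 x 4 x 4 = 64 combinations, which is the figure quoted above.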

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan-3 Generation
FPGAs provide plentiful, efficient SelectRAM memory blocks. Using various
configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up
tables, data width converters, circular buffers and shift registers, each
supporting various data widths and depths. This application note describes the
features and capabilities of block SelectRAM and illustrates how to specify
the various options using the Xilinx CORE Generator system or via VHDL or
Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which
are allocated to data storage and, in some memory configurations, an
additional 2K bits allocated to parity or additional data bits. Physically,
the block RAM memory has two completely independent access ports, labeled
Port A and Port B. The structure is fully symmetrical, and both ports are
interchangeable and both ports support data read and write operations. Each
memory port is synchronous, with its own clock, clock enable and write
enable. Read operations are also synchronous and require a clock edge and
clock enable. Though physically a dual-port memory, block RAM can emulate a
single-port memory in an application, as shown in Figure 1. Furthermore, each
block memory supports multiple configurations or aspect ratios. Table 3
summarizes the essential SelectRAM features. Multiple block RAMs can be
cascaded to create deeper and wider memory organizations, with a minimal
timing penalty incurred through specialized routing resources.
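To relate the block RAM figures to a frame buffer, the Python sketch below estimates how many blocks a picture would occupy and which of the listed devices could hold it. It is a rough sizing aid under stated assumptions: only the 16K data bits of each block are counted, the packing is assumed to be perfect, and the function name `blocks_needed` is hypothetical.

```python
BLOCK_TOTAL_BITS = 18432      # bits per block RAM, from the text
BLOCK_DATA_BITS = 16 * 1024   # data bits per block RAM (parity bits excluded)

# Total RAM blocks per device, from the table of Spartan-3E devices.
SPARTAN_3E_BLOCKS = {
    "XC3S100E": 4, "XC3S250E": 12, "XC3S500E": 20,
    "XC3S1200E": 28, "XC3S1600E": 36,
}


def blocks_needed(width: int, height: int, bits_per_pixel: int) -> int:
    """Block RAMs required for a width x height picture (rounded up)."""
    bits = width * height * bits_per_pixel
    return -(-bits // BLOCK_DATA_BITS)  # ceiling division


need = blocks_needed(640, 480, 1)
fits = [name for name, blocks in SPARTAN_3E_BLOCKS.items() if blocks >= need]

print("blocks for 640x480 at 1 bit per pixel:", need)
print("devices with enough block RAM:", fits)
```

Under these assumptions a 1-bit 640x480 picture needs 19 blocks, so it fits in the XC3S500E and larger devices, while deeper color palettes quickly exceed the on-chip capacity.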


10. HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

• Interoperability
• Technology independence
• Design reuse
• Several levels of abstraction
• Readability
• Standard language
• Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

• Design specification language
• Design entry language
• Design simulation language
• Design documentation language
• An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that
involved large circuits and multiple teams of engineers.

It was funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.)
was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE standard
1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and
high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their
own non-standard types => VHDL was becoming nonstandard.

The IEEE 1164 standard was developed in response. IEEE 1164 contains
definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate
to actual hardware. It defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a
library.

Entities

A black box with an interface definition.
Defines the inputs/outputs of a component (defines the pins).
A way to represent modularity in VHDL.
Similar to a symbol in a schematic.
An entity declaration describes the entity.

E.g.

entity Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.
Each port must have a name, a direction and a type.
An entity may have NO port declaration.

Port directions

In: The value of a port can be read inside the component, but cannot be
assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read.
Multiple assignments are allowed.

In out: Bi-directional; assignments can be made and data can be read. Multiple
assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment
(buffer ports are not recommended).

Architectures

Every entity has at least one architecture.
One entity can have several architectures.
Architectures can describe a design using behavior, structure or dataflow.
Architectures can describe a design on many levels: gate level, RTL (Register
Transfer Level) or behavioral level.
A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
    EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and the architecture body together.
The concept of a default configuration is a bit messy in VHDL '87:
- The last architecture analyzed links to the entity.
Can be used to change simulation behavior without re-analyzing the VHDL
source.
Complex configuration declarations are ignored in synthesis.
Some entities can have, e.g., a gate-level architecture and a behavioral
architecture.
Configurations are always optional.

Packages

Packages contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among
several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

A collection of VHDL design units (a database):

1. Packages
   - package declaration
   - package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

A library is usually a directory in the UNIX file system.
It can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ
primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here in
order of increasing level of abstraction:

- Structural
- Dataflow
- Behavioral

Structural VHDL description

The circuit is described in terms of its components.
This ranges from a low-level description (e.g. a transistor-level description)
to a high-level description (e.g. a block diagram).
For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.
In the dataflow style, you describe how information flows between registers in
the system.
The combinational logic is described at a relatively high level; the placement
and operation of registers is specified quite precisely.
The behavior of the system over time is defined by registers.
There are no built-in registers in the VHDL language:
- either a lower level description
- or a behavioral description of sequential elements is needed.
The lower level register descriptions must be created or obtained.
If there are no third-party models for registers, you must write the
behavioral description of the registers.
The behavioral description can be provided in the form of subprograms
(functions or procedures).

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.
The representation might include, e.g., state diagrams, timing diagrams and
algorithmic descriptions.
The concept of time may be expressed precisely using delays (e.g. A <= B
after 10 ns).
If no actual delay is used, the order of sequential operations is defined.
In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed
timing specifications.
The actual timing results depend on the implementation technology and the
efficiency of the synthesis tool.
There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL.
A VHDL description can always be broken up into interconnected processes.
Quite similar to a Unix process.

Process keyword in VHDL

A process statement is a concurrent statement.
Statements inside process statements are sequential statements.
A process must contain either a sensitivity list or wait statement(s), but NOT
both.
The sensitivity list or wait statement(s) contain the signals which wake the
process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;

11. VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

-- Top level: connects the sync unit, the pixel generator and the
-- picture memory, and derives the pixel clock from clk50m.
entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

-- Picture memory: one bit per pixel, 18-bit address.
component memory
    port (
        addr  : IN  std_logic_VECTOR(17 downto 0);
        clk   : IN  std_logic;
        dout  : OUT std_logic_VECTOR(0 downto 0);
        en    : IN  std_logic;
        sinit : IN  std_logic);
END component;

component sync_unit port(
    clk        : in    std_logic;
    rstn       : in    std_logic;
    hsync      : out   std_logic;  ---horizontal synch pulse(row)
    vsync      : out   std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out   std_logic;
    h_en       : out   std_logic;
    v_en       : out   std_logic;
    line_count : inout integer;
    c_hs       : inout integer);
end component;

component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- Divide clk50m by two to obtain the pixel clock clk.
process(clk50m, rstn)
begin
    if (rstn = '0') then
        clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
        clk <= not clk;
    end if;
    --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

-- Synchronization unit: counters, sync pulses and enable signals.
sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
);

-- Pixel generator: drives RGB and the picture memory address.
pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

-- Picture memory, read when rd_s is high.
mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

-- Drive the output ports from the internal copies of the signals.
c_hs <= c_hs_s;
line_count <= line_count_s;
h_en <= h_en_s;
v_en <= v_en_s;

-- hsync counter: counts the clocks within one line (0 to 793)
process(clk, rstn)
begin
    if (rstn = '0') then
        c_hs_s <= 0;
    else
        if (clk = '1' and clk'event) then
            if (c_hs_s = 793) then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s + 1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter: advances on each rising edge of h_en_s (0 to 515)
process(h_en_s, rstn)
begin
    if (rstn = '0') then
        line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515) then
            line_count_s <= 0;
        end if;
    end if;
    --end if;
end process;

--hsync pulse: low for the first 96 clocks of each line
process(clk, rstn)
begin
    if (rstn = '0') then
        hsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (c_hs_s <= 95) then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
    --end if;
end process;

----Horizontal Enable (Row Information Will be Sent in this period)
process(clk, rstn)
begin
    if (rstn = '0') then
        h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

---vysnc pulse

process(clkrstn) -- Vertical Enable (Data is Valid in this Region)

begin

if(rstn = 0)then

48

v_en_s lt= 0

elsif(clk = 1 and clkevent)then

if((line_count_sgt=35 and line_count_slt=515))then

v_en_s lt= 1

else

v_en_s lt= 0

end if

end if

--end if

end process

process(clkrstn) --Vertical Sync Front porch Back Porch

begin

if(rstn = 0)then

vsync lt= 1

elsif(clk = 1 and clkevent)then

if(line_count_s lt2)then

vsync lt= 0

else

vsync lt= 1

end if

end if

--end if

end process

process(clk)

begin

if(clk = 1 and clkevent)then

v_en_reglt=v_en_s

end if

end process

end arch_sync_unit
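
As a rough cross-check of the counter limits in the synchronization listing above, the following Python sketch works out the line time and frame rate they imply. The counter limits (793, 515, 95, 143, 778) are read from the listing; the 25 MHz clock frequency is an assumption taken from the clock discussion elsewhere in this report, and the variable names are purely illustrative.

```python
# Counter limits copied from the sync_unit listing; CLK_HZ is an assumed 25 MHz clock.
CLK_HZ = 25_000_000
H_LAST = 793   # c_hs_s wraps to 0 after reaching 793
V_LAST = 515   # line_count_s wraps to 0 after reaching 515

clocks_per_line = H_LAST + 1        # counter values 0..793
lines_per_frame = V_LAST + 1        # counter values 0..515
hsync_low_clocks = 95 + 1           # hsync is driven low while c_hs_s <= 95
h_en_clocks = 778 - 143 + 1         # h_en_s is high for counts 143..778

line_time_us = clocks_per_line / CLK_HZ * 1e6
frame_rate_hz = CLK_HZ / (clocks_per_line * lines_per_frame)

print(f"{clocks_per_line} clocks/line, {line_time_us:.2f} us per line")
print(f"{lines_per_frame} lines/frame, {frame_rate_hz:.2f} frames per second")
print(f"hsync low for {hsync_low_clocks} clocks, h_en high for {h_en_clocks} clocks")
```

Under these assumptions the line time comes out close to the 31.77 µs quoted for the standard mode, and the frame rate lands slightly above 60 Hz.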


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin
    addr <= addr_s;

    process(clk, rstn)
    begin
        if(rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- start of the visible frame: restart the memory address
            if((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if(v_en = '1' and h_en = '1') then
                if(line_count <= 514) then
                    if(c_hs <= 655) then
                        -- inside the stored picture: read the next pixel
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                if(addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
        -- end if;
        --end if;
    end process;

end behv;
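
The wrap-around constant in the pixel generator can be sanity-checked with a few lines of Python. This is only a sketch of one reading of the listings: it assumes that h_en, being registered in the synchronization unit, is seen high by the pixel generator for c_hs values 144 to 779, so that memory reads occur for c_hs from 144 to 655. The variable names are illustrative and not part of the design.

```python
# Constant compared against addr_s in the pixelgen listing (18-bit address).
WRAP_ADDR = int("111011111111111111", 2)
ADDR_BITS = 18

pixels_stored = WRAP_ADDR + 1            # addresses 0..WRAP_ADDR are read per frame

# Assumption: h_en lags the 143..778 comparison by one clock, so reads
# happen while c_hs is in 144..655 (the listing stops reading above 655).
pixels_per_line = 655 - 144 + 1
lines_covered, remainder = divmod(pixels_stored, pixels_per_line)

print(pixels_stored, pixels_per_line, lines_covered, remainder)
```

On this reading the memory holds a 512 x 480 one-bit image, which fits in the 18-bit address space.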


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History: The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field-programmable); however, the programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

• IC costs are rising aggressively
• ASIC complexity has increased development time and costs
• R&D resources and headcount are decreasing
• Revenue losses for slow time-to-market are increasing
• Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA described here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
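
To make the logic block description concrete, here is a minimal behavioral sketch in Python of a 4-input LUT followed by an optional flip-flop. It is a toy model under stated assumptions, not any vendor's actual cell: the class name, the input ordering (input a is the least significant index bit), and the `registered` switch are all hypothetical choices made for illustration.

```python
class LogicBlock:
    """Toy model: 4-input LUT plus one D flip-flop (names are illustrative)."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 truth-table bits")
        self.truth_table = list(truth_table)
        self.registered = registered  # select registered or unregistered output
        self.q = 0                    # flip-flop state

    def lut(self, a, b, c, d):
        # Assumption: input a is the least significant bit of the table index.
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Model a rising clock edge: the flip-flop captures the LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        return self.q if self.registered else self.lut(a, b, c, d)


# A 4-input AND gate: only table entry 15 (all inputs high) is 1.
and4 = LogicBlock([0] * 15 + [1])

# A registered 4-input XOR (parity): the output changes only after a clock edge.
xor4 = LogicBlock([bin(i).count("1") & 1 for i in range(16)], registered=True)
```

The same 16-entry table can hold any Boolean function of four inputs, which is why the LUT is the basic configurable element.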

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology
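
The planar (domain-based) topology described above can be sketched in a few lines of Python. This is an illustrative model only: the side names and the representation of a wire as a (side, track) pair are assumptions introduced here, not taken from any FPGA tool.

```python
def planar_switch_box(width):
    """Return the programmable switches of a planar switch box.

    Each switch joins two wires on different sides that share the same
    track number; a wire is modelled as a (side, track) pair.
    """
    sides = ("left", "right", "top", "bottom")
    switches = set()
    for track in range(width):
        for i, side_a in enumerate(sides):
            for side_b in sides[i + 1:]:
                switches.add(((side_a, track), (side_b, track)))
    return switches


def neighbours(switches, wire):
    """List the wires reachable from `wire` through a single switch."""
    return sorted(b if a == wire else a for a, b in switches if wire in (a, b))


box = planar_switch_box(width=4)
reachable = neighbours(box, ("left", 1))
print(len(box), reachable)
```

Each wire has exactly three switches, one to the same-numbered track on each of the other three sides, which matches the description of the planar topology.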

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

• SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
• Antifuse - One-time programmable. CMOS.
• EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
• EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
• Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
• Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why prototyping is important:

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.
2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows for much more time in the area of software development and testing at the software-hardware integration stage. This helps expose many unforeseen software bugs that appear due to today's array of operating systems, applications, and hardware.
3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together off the simulation stage and in actual form.
4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL
Fig. VGA CONTROLLER

GATE LEVEL
Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

• Motion tracking systems
• Image processing systems
• Display systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, the Controller takes a processor-generated picture (pixelated) from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array
2. "A VGA display controller" by Eduardo Sanchez
3. Hinner, "VGA Interface and video signal documents"

Page 7: vga

4.2 A photograph of sub-pixel display elements on a laptop's LCD screen

In digital imaging, a pixel (or picture element [1]) is a single point in a raster image. The pixel is the smallest addressable screen element; it is the smallest unit of a picture which can be controlled. Each pixel has its own address. The address of a pixel corresponds to its coordinates. Pixels are normally arranged in a 2-dimensional grid and are often represented using dots or squares. Each pixel is a sample of an original image, where more samples typically provide more accurate representations of the original. The intensity of each pixel is variable. In color image systems, a color is typically represented by three or four component intensities, such as red, green, and blue, or cyan, magenta, yellow, and black.

Color refreshing rate

Color refreshing rate = 60 Hz

The color refreshing rate represents the number of frames that are transmitted per second. The minimum number of frames that the VGA controller transmits per second is 60.

5 Hardware Description

5.1 Block Diagram

5.1 VGA Synchronization Unit

The Synchronization Unit provides the horizontal and vertical synchronization signals, HSYNC and VSYNC, that are required to correctly display a picture frame within the confines of a monitor's display area.

These synchronization signals are used as control inputs by the monitor's horizontal and vertical deflection circuits. These circuits deflect the electrons emitted by the three primary color electron guns (Red, Green, Blue) left to right and from top to bottom, respectively. HSYNC provides the start and stop times for the horizontal deflection circuit, so that a line of pixels is correctly drawn across the screen display. VSYNC provides the start and stop times for the vertical deflection circuit, so that the lines of a frame are correctly drawn from the top to the bottom of the screen display.

The resolution for the display is defined by the level on the RESOLUTION input. If High, the 640x480 resolution (VGA) is used. If Low, the 800x600 resolution (SVGA) is used.

Although the resolution determines the area of a monitor's screen within which an image can be displayed, the full extents of the chosen resolution do not have to be used. The actual extents of the image display area on the screen can be controlled by the use of the DISPSIZE_H and DISPSIZE_V inputs. These inputs determine the total number of pixels to be used in a line and the total number of lines in a frame, respectively.

5.2 Blank pixel generation

The total number of viewable pixels in a line and viewable lines in a frame is determined by the display resolution chosen through the RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the chosen display resolution is used or not, the areas of the monitor screen to the top, bottom, left, and right of the viewable frame area are blanked by putting black pixels at the required line-pixel positions. This has the effect of centering the image on the screen.

The color generated for a pixel in the Pixel Unit depends on whether the particular pixel needs to be blanked or not. The Synchronization Unit provides a signal to the Pixel Unit for this very reason. This is the line display enable signal, en. The signal is checked on each rising edge of the external clock signal (CLK) and is set as follows:

If (HCOUNT >= PixTotal) or (VCOUNT >= LinTotal) then
    en = 0 (pixel requires to be blanked: set color to be black)
Else
    en = 1 (pixel is a viewable pixel: generate RGB color accordingly)
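
The enable rule above translates directly into a small Python function. This is only a software sketch of the comparison, using the HCOUNT, VCOUNT, PixTotal, and LinTotal names from the text as lower-case parameters; the example values 640 and 480 are assumed for the full VGA frame.

```python
def display_enable(hcount, vcount, pix_total, lin_total):
    """Return 1 for a viewable pixel and 0 for a pixel that must be blanked."""
    if hcount >= pix_total or vcount >= lin_total:
        return 0  # outside the viewable area: the pixel is drawn black
    return 1      # inside the viewable area: RGB is generated from memory


# Example with a full 640x480 viewable area (assumed values).
print(display_enable(0, 0, 640, 480))      # first pixel of the frame
print(display_enable(640, 0, 640, 480))    # first blanked pixel of a line
```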

6 SIGNALS IN VGA CONTROLLER

For the most common VGA mode, 640x480 at 60 Hz non-interlaced, the horizontal timings are:

Parameter                  Value      Unit
Clock frequency            25.175     MHz
Horizontal pixels          640
Horizontal sync polarity   Negative
Total time for each line   31.77      µs
Front porch (A)            0.94       µs
Sync pulse length (B)      3.77       µs
Back porch (C)             1.89       µs
Active video (D)           25.17      µs

(Total horizontal sync time 6.60 µs)

6.1 The vertical timings are:

Parameter                  Value      Unit
Vertical lines             480
Vertical sync polarity     Negative
Vertical frequency         59.94      Hz
Front porch (E)            0.35       ms
Sync pulse length (F)      0.06       ms
Back porch (G)             1.02       ms
Active video (H)           15.25      ms

(Total vertical sync time 1.43 ms)

640 x 400 at 70 Hz is the video mode used for booting most x86 personal computers.

640 x 480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

It should be noted that the actual timings vary slightly. For example, for 640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of 25.174 MHz gives 633 pixels rather than the expected 640 pixels.
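
The conversion from the horizontal timing figures to pixel-clock counts can be reproduced with a short Python sketch. The durations are the ones listed in the horizontal timing table and the clock is the table's 25.175 MHz; the dictionary layout and variable names are illustrative only.

```python
PIXEL_CLOCK_MHZ = 25.175  # clock frequency from the horizontal timing table

# Durations in microseconds, as listed in the horizontal timing table.
horizontal_us = {
    "front porch": 0.94,
    "sync pulse": 3.77,
    "back porch": 1.89,
    "active video": 25.17,
}

# MHz multiplied by microseconds gives a (fractional) number of clock cycles.
clocks = {name: t * PIXEL_CLOCK_MHZ for name, t in horizontal_us.items()}
total_us = sum(horizontal_us.values())
active_pixels = int(clocks["active video"])

for name, n in clocks.items():
    print(f"{name:13s} {n:7.2f} clocks")
print(f"line total    {total_us:.2f} us, {active_pixels} whole active pixels")
```

The four intervals add up to the 31.77 µs line time, and the active interval holds 633 whole clock periods, which is the shortfall from 640 that the note above describes.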

6.2 Generation of the horizontal synchronization signal - HSYNC

The HSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated on each rising edge of the external clock signal (CLK).

The state of the HSYNC signal depends on the value stored in the horizontal counter and is driven low when:

HCOUNT >= (PixTotal + BlankingLeft)

and remains low while:

HCOUNT < (PixTotal + BlankingLeft + TLSync)

Vertical (Frame) Period

6.3 Generation of the vertical synchronization signal - VSYNC

The VSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated after every line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

The state of the VSYNC signal depends on the value stored in the vertical counter and is driven low when:

VCOUNT >= (LinTotal + BlankingLeft)

and remains low while:

VCOUNT < (LinTotal + BlankingLeft + TFSync)
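
Both sync conditions have the same shape: the signal is low inside a window that starts after the viewable area plus a blanking offset, and is high everywhere else. A minimal Python sketch of that window test follows. The function name is hypothetical, and the example values (640 viewable pixels, a blanking offset of 16, and a sync length of 96 clocks) are assumptions chosen for illustration, not values stated in the text.

```python
def sync_level(count, total, blanking, sync_len):
    """Active-low sync: 0 while total+blanking <= count < total+blanking+sync_len."""
    start = total + blanking
    return 0 if start <= count < start + sync_len else 1


# Illustrative horizontal values (assumed): PixTotal=640, BlankingLeft=16, TLSync=96.
hsync_trace = [sync_level(h, 640, 16, 96) for h in range(800)]
low_positions = [h for h, level in enumerate(hsync_trace) if level == 0]
print(low_positions[0], low_positions[-1], len(low_positions))
```

The same function models VSYNC when it is called with VCOUNT, LinTotal, the vertical blanking offset, and TFSync.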

Address Counter (MEMADDR)

The address counter is used to store the position of the next consecutive pixel in the frame. Its value is passed to the Pixel Unit on the internal bus signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain the next pixel from picture memory.

The counter is reset to zero when the VGA Controller receives an external reset signal (RST). The size of the counter depends on the values chosen for DISPSIZE_H and DISPSIZE_V, as the range is simply:

0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199
• 800x600: 0 to 479999

The counter has 19-bit resolution.
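
The counter ranges and the 19-bit width can be checked with a couple of lines of Python. The helper name is hypothetical; it simply evaluates (PixTotal x LinTotal) - 1 and the number of bits needed to hold that value.

```python
def counter_range(pix_total, lin_total):
    """Return (last address, bits needed) for a pix_total x lin_total frame."""
    last = pix_total * lin_total - 1
    return last, last.bit_length()


vga = counter_range(640, 480)
svga = counter_range(800, 600)
print(vga, svga)
```

Both resolutions need 19 bits, since 2^18 = 262144 is too small for either range and 2^19 = 524288 covers both.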

While the value in the horizontal counter (HCOUNT) is less than the total number of viewable pixels in a line (PixTotal, the integer value of DISPSIZE_H), the counter is incremented on the rising edge of the external clock signal (CLK). Pixel addressing within the frame is consecutive. When the counter reaches the last pixel in a line, its incremented value is the first pixel in the next line down.

The address counter will continue to be incremented until the value in the vertical counter (VCOUNT) is greater than or equal to the total number of viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this point, it will be rolled over to zero.

7 VGA Controller

The VGA Controller provides a simple 8-bit interface between a host microcontroller and any VGA-compatible monitor. This variant of the Controller provides six modes of display, depending on the resolution chosen (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black & White, 16 Colors, or 64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum 60 Hz refresh rate:

1 sec = 60 frames
1 frame = max 1/60 sec
1 pixel = (1/60 sec) / 307200 ≈ 54 ns (counting only the 640 x 480 visible pixels)
Max time for 1 pixel = 40 ns (the blanking intervals around each line and frame also consume clock cycles, so the budget per pixel is shorter)

2) To send information for each and every pixel of a frame by means of the RGB signals.

3) After one row of information has been completely displayed, the column must be reset and the row incremented; this is done by the Hsync signal.

4) After one frame of information has been completed, a new frame has to be started; for this, the row and column are reset, which is done by the Vsync signal.

=> To give the information for 1 pixel we use 1 clock (for controlling all operations):

1 clock = 1 pixel
1 clock cycle period = 40 ns
Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display one pixel of information on each and every clock, we can attain the minimum 60 Hz refresh rate. Do not use a clock frequency lower than 25 MHz.
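
The per-pixel time budget can be worked through numerically. The Python sketch below is a rough check under one stated assumption: that a full frame, including blanking, is 800 clocks by 525 lines, which is the commonly quoted raster for 640x480 at 60 Hz and is not a figure given in this section. The names are illustrative.

```python
REFRESH_HZ = 60
VISIBLE_PIXELS = 640 * 480
TOTAL_PIXELS = 800 * 525   # assumption: standard raster size including blanking

# Time available per pixel if only visible pixels had to be sent.
visible_only_ns = 1e9 / (REFRESH_HZ * VISIBLE_PIXELS)

# Time available per pixel once blanking clocks are counted as well.
with_blanking_ns = 1e9 / (REFRESH_HZ * TOTAL_PIXELS)


def refresh_hz(clock_hz, total_pixels=TOTAL_PIXELS):
    """Frames per second produced by a given pixel clock."""
    return clock_hz / total_pixels


for mhz in (15, 25, 30):
    print(f"{mhz} MHz -> {refresh_hz(mhz * 1e6):.1f} frames/sec")
print(f"{visible_only_ns:.2f} ns visible only, {with_blanking_ns:.2f} ns with blanking")
```

Under this assumption the budget per pixel is a little under 40 ns, a 25 MHz clock gives a refresh rate just below 60 Hz, and a 15 MHz clock falls far short of it.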

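Ex: comparing 15 MHz, 25 MHz, and 30 MHz clocks

• 15 MHz: the time period is longer, so the number of frames/sec decreases (it is unable to display 60 frames/sec).
• 25 MHz: the effectively utilized frequency.
• 30 MHz: the time period is shorter, so the number of frames/sec increases.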

Signals used for image display:

R (Red)
G (Green)
B (Blue)
HSYNC (Horizontal Synchronous signal)
VSYNC (Vertical Synchronous signal)

All the above signals are used to control the image data so that it is correctly displayed on the VGA display monitor.

7.2 Timings for front porch, sync pulse, and back porch

              Horizontal sync   Vertical sync
Front porch   0.94 µs           0.35 ms
Sync pulse    3.77 µs           0.063 ms
Back porch    1.89 µs           1.02 ms

8 FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below.

Table: VGA Pin description

Name | Type | Polarity | Bus size | Description

Global Control Signals

CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows. 640x480: CLK = 25 MHz gives Refresh = 60 Hz; CLK = 30 MHz gives Refresh = 72 Hz. 800x600: CLK = 40 MHz gives Refresh = 60 Hz; CLK = 50 MHz gives Refresh = 72 Hz.

RST | I | High | - | Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD | I | - | 2 | This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal en, generated by the Synchronization Unit of the Controller.

DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows. Black & White mode: 8 1-bit pixels. 16 Colors mode: 2 4-bit pixels. 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R1, R0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G1, G0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B1, B0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.


9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode: one byte of data in RAM contains 8 1-bit pixels

• in 16 Colors mode: one byte of data in RAM contains 2 4-bit pixels

• in 64 Colors mode: one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
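As a worked example of this formula, the short Python sketch below evaluates it for a 640x480 picture in each of the three color modes. The function and dictionary names are hypothetical, and rounding up a partly filled last byte is an assumption.

```python
# Pixels packed into one byte for each CMOD palette, as listed in the text.
PIXELS_PER_BYTE = {"bw": 8, "16_colors": 2, "64_colors": 1}


def picture_bytes(pix_total: int, lin_total: int, mode: str) -> int:
    """Bytes needed for one frame; rounds up if the last byte is partly used."""
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-(pix_total * lin_total) // per_byte)  # ceiling division


if __name__ == "__main__":
    for mode in PIXELS_PER_BYTE:
        print(mode, picture_bytes(640, 480, mode), "bytes")
```

For 640x480 this gives 38,400 bytes in Black & White mode, 153,600 bytes in 16 Colors mode and 307,200 bytes in 64 Colors mode.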

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt, which is equivalent to right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt, which is equivalent to right shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
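The three addressing rules amount to dropping the low-order bits of the pixel count that select a pixel within a byte. A minimal Python sketch of that idea follows; it treats hvcnt as a 19-bit unsigned pixel index, and the function name addr_pixel and the mode keys are hypothetical.

```python
# Right-shift applied to hvcnt in each color mode (3, 1 or 0 bits).
ADDR_SHIFT = {"bw": 3, "16_colors": 1, "64_colors": 0}


def addr_pixel(hvcnt: int, mode: str) -> int:
    """Byte address of the pixel counted by hvcnt (19-bit counter assumed)."""
    return (hvcnt & 0x7FFFF) >> ADDR_SHIFT[mode]


if __name__ == "__main__":
    # Pixel number 17 lives in byte 2 (B&W), byte 8 (16 colors), byte 17 (64 colors).
    for mode in ADDR_SHIFT:
        print(mode, addr_pixel(17, mode))
```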

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
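The enex condition can be transcribed directly into a small Python function to see which counter positions disable memory reads. This is a sketch only: the parameter names mirror the pseudocode, and the numeric values used in the example call (640, 480, 800 and 525) are illustrative assumptions, not values taken from the Controller.

```python
def enex(hcount: int, vcount: int, pix_total: int, lin_total: int,
         line_period: int, frame_period: int) -> int:
    """Literal transcription of the enex definition given in the text."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


if __name__ == "__main__":
    timing = dict(pix_total=640, lin_total=480, line_period=800, frame_period=525)
    print(enex(100, 100, **timing))  # inside the visible area
    print(enex(700, 100, **timing))  # past the end of the visible line
```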

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors, or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a "0".

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
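The load-and-shift behaviour of PIXREG in the three modes can be modelled in a few lines of Python. The sketch below is a behavioural illustration under the assumption that the first pixel displayed is the one in the lowest bit positions, as the text describes; the function name pixels_from_byte is hypothetical.

```python
def pixels_from_byte(byte: int, mode: str) -> list:
    """Pixels in display order, taken from the low end of the register."""
    width, count = {"bw": (1, 8), "16_colors": (4, 2), "64_colors": (6, 1)}[mode]
    mask = (1 << width) - 1
    pixels = []
    reg = byte & 0xFF          # PIXREG after the byte has been loaded
    for _ in range(count):
        pixels.append(reg & mask)  # active pixel sits in the low bits
        reg >>= width              # shift the next pixel into position
    return pixels


if __name__ == "__main__":
    print(pixels_from_byte(0b10110001, "bw"))
    print(pixels_from_byte(0xA7, "16_colors"))
    print(pixels_from_byte(0xFF, "64_colors"))
```

In 64 Colors mode the mask also discards bits 7 and 6, which the text says are not used.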

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
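The count of 64 colors follows from three channels with four levels each. The Python sketch below enumerates the combinations and shows one possible set of analog levels. Evenly spaced steps between 0 V and 0.7 V are an assumption made for illustration; the actual levels depend on the DAC circuitry on the board.

```python
from itertools import product

V_MAX = 0.7  # volts at maximum brightness, per the text


def dac_level(code: int) -> float:
    """Analog level for a 2-bit code, assuming four evenly spaced steps."""
    if not 0 <= code <= 3:
        raise ValueError("code must be a 2-bit value")
    return V_MAX * code / 3


# Every (R, G, B) triple of 2-bit codes the Controller can output.
colors = list(product(range(4), repeat=3))

if __name__ == "__main__":
    print([round(dac_level(c), 3) for c in range(4)])
    print(len(colors), "distinct colors")
```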

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
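To relate these figures to a frame buffer, the Python sketch below estimates how many block RAMs a picture needs when only the 16K data bits of each block are used. The 640x480, 1 bit per pixel picture is an illustrative assumption, and the function name blocks_needed is hypothetical.

```python
BLOCK_DATA_BITS = 16 * 1024  # data bits per block RAM (parity bits excluded)
BLOCK_TOTAL_BITS = 18432     # data plus parity bits per block RAM


def blocks_needed(picture_bits: int) -> int:
    """Block RAMs required to hold picture_bits using data bits only."""
    return -(-picture_bits // BLOCK_DATA_BITS)  # ceiling division


if __name__ == "__main__":
    frame_bits = 640 * 480 * 1  # assumed: 640x480 at 1 bit per pixel
    print(blocks_needed(frame_bits), "block RAMs")
    print(20 * BLOCK_TOTAL_BITS, "total bits in a 20-block device")
```

Under these assumptions the picture needs 19 blocks, so it would fit within the 20 blocks that Table 2 lists for the XC3S500E.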


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

• Interoperability

• Technology independence

• Design reuse

• Several levels of abstraction

• Readability

• Standard language

• Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

• Design specification language

• Design entry language

• Design simulation language

• Design documentation language

• An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own, non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE. IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware. It defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

• A black box with an interface definition

• Defines the inputs/outputs of a component (defines pins)

• A way to represent modularity in VHDL

• Similar to a symbol in a schematic

• An entity declaration describes an entity

E.g.

entity Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
end Comparator;

Ports: Provide channels of communication between the component and its environment.

• Each port must have a name, a direction and a type

• An entity may have NO port declaration

Port directions:

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures: Every entity has at least one architecture.

• One entity can have several architectures

• Architectures can describe a design using behavior, structure or dataflow

• Architectures can describe a design on many levels: gate level, RTL (Register Transfer Level) or behavioral level

• A configuration declaration links an architecture to an entity

E.g.

architecture Comparator1 of Comparator is
begin
    EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations: Link the entity declaration and architecture body together.

• The concept of a default configuration is a bit messy in VHDL '87: the last architecture analyzed links to the entity

• Can be used to change simulation behavior without re-analyzing the VHDL source

• Complex configuration declarations are ignored in synthesis

• Some entities can have, e.g., a gate level architecture and a behavioral architecture

• Are always optional

Packages: Packages contain information common to many design units.

1. Package declaration

- constant declarations

- type and subtype declarations

- function and procedure declarations

- global signal declarations

- file declarations

- component declarations

2. Package body

- is not necessarily needed

- function bodies

- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units. A package consists of a declaration part and an optional body part.

A package declaration can contain:

- type and subtype declarations

- subprograms

- constants

- alias declarations

- global signal declarations

- file declarations

- component declarations

A package body consists of:

- subprogram declarations and bodies

- type and subtype declarations

- deferred constants

- file declarations

Libraries: Collection of VHDL design units (database).

1. Packages: package declaration, package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

A library is usually a directory in a UNIX file system, but it can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware. It is possible to describe a circuit in a number of ways:

Structural -> Dataflow -> Behavioral (towards a higher level of abstraction)

Structural VHDL description

• The circuit is described in terms of its components

• Ranges from a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)

• For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

• The circuit is described in terms of how data moves through the system

• In the dataflow style you describe how information flows between registers in the system

• The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely

• The behavior of the system over time is defined by registers

• There are no built-in registers in the VHDL language: either a lower level description or a behavioral description of sequential elements is needed

• The lower level register descriptions must be created or obtained

• If there are no third-party models for registers => you must write the behavioral description of the registers

• The behavioral description can be provided in the form of subprograms (functions or procedures)

10.4 Behavioral VHDL Description

• The circuit is described in terms of its operation over time

• The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions

• The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)

• If no actual delay is used, the order of sequential operations is defined

• In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications

• The actual timing results depend on the implementation technology and the efficiency of the synthesis tool

• There are a few tools for behavioral synthesis

Concurrent vs. Sequential

Processes

• Basic simulation concept in VHDL

• A VHDL description can always be broken up into interconnected processes

• Quite similar to a Unix process

Process keyword in VHDL

• A process statement is a concurrent statement

• Statements inside process statements are sequential statements

• A process must contain either a sensitivity list or wait statement(s), but NOT both

• The sensitivity list or wait statement(s) contain the signals which wake the process up

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

-- Top level: divides the 50 MHz input clock by two and connects the
-- synchronization unit, the pixel generator and the picture memory.
entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;                      -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
    port (
        addr  : IN  std_logic_VECTOR(17 downto 0);
        clk   : IN  std_logic;
        dout  : OUT std_logic_VECTOR(0 downto 0);
        en    : IN  std_logic;
        sinit : IN  std_logic);
END component;

component sync_unit port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;   -- horizontal sync pulse (row)
    vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;     -- mode changed from inout to match the entity
    c_hs       : out integer);    -- mode changed from inout to match the entity
end component;

component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- Clock divider: toggling clk on every rising edge of clk50m halves the frequency.
process(clk50m, rstn)
begin
    if (rstn = '0') then
        clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
        clk <= not clk;
    end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
);

pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;   -- horizontal sync pulse (row)
    vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs       <= c_hs_s;
line_count <= line_count_s;
h_en       <= h_en_s;
v_en       <= v_en_s;

-- hsync counter: counts the clocks within one line, 0 to 793
process(clk, rstn)
begin
    if (rstn = '0') then
        c_hs_s <= 0;
    elsif (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
            c_hs_s <= 0;
        else
            c_hs_s <= c_hs_s + 1;
        end if;
    end if;
end process;

-- vsync counter (vertical line counter): advances on each rising edge of
-- h_en_s and wraps to 0 after line 515
process(h_en_s, rstn)
begin
    if (rstn = '0') then
        line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515) then
            line_count_s <= 0;
        end if;
    end if;
end process;

-- hsync pulse: low for counts 0 to 95
process(clk, rstn)
begin
    if (rstn = '0') then
        hsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (c_hs_s <= 95) then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
end process;

-- Horizontal enable (row information is sent in this period)
process(clk, rstn)
begin
    if (rstn = '0') then
        h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
end process;

-- Vertical enable (data is valid in this region)
process(clk, rstn)
begin
    if (rstn = '0') then
        v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((line_count_s >= 35 and line_count_s <= 515)) then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
end process;

-- vsync pulse: low for lines 0 and 1 (vertical sync, front porch, back porch)
process(clk, rstn)
begin
    if (rstn = '0') then
        vsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (line_count_s < 2) then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
end process;

-- Registered copy of v_en_s, delayed by one clock
process(clk)
begin
    if (clk = '1' and clk'event) then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;
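The horizontal comparisons in the synchronization block can be checked without a simulator by counting, over one line, how many clocks each signal is active. The following Python model is a sketch only: it copies the constants 793, 95, 143 and 778 from the listing, ignores the one-clock register delay on each output, and uses hypothetical function names.

```python
H_TOTAL = 794  # c_hs_s counts 0..793 before wrapping


def hsync(c_hs: int) -> int:
    """Combinational view of the hsync comparison (register delay ignored)."""
    return 0 if c_hs <= 95 else 1


def h_en(c_hs: int) -> int:
    """Combinational view of the horizontal enable comparison."""
    return 1 if 143 <= c_hs <= 778 else 0


sync_clocks = sum(1 for c in range(H_TOTAL) if hsync(c) == 0)
enable_clocks = sum(h_en(c) for c in range(H_TOTAL))

if __name__ == "__main__":
    print("hsync low for", sync_clocks, "clocks =", sync_clocks * 40, "ns")
    print("h_en high for", enable_clocks, "clocks")
```

With a 40 ns clock period, the 96 clocks of hsync correspond to 3.84 µs, close to the sync pulse time listed in the timing table.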


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
    if (rstn = '0') then
        rd     <= '0';
        red    <= "0";
        green  <= "0";
        blue   <= "0";
        addr_s <= (others => '0');
    elsif (clk = '1' and clk'event) then
        -- Rising edge of v_en (start of a frame): restart at address 0
        if ((v_en = '1') and (v_en_reg = '0')) then
            addr_s <= (others => '0');
        end if;
        if (v_en = '1' and h_en = '1') then
            if (line_count <= 514) then
                if (c_hs <= 655) then
                    -- Picture area: read memory and show the 1-bit pixel
                    rd     <= '1';
                    red    <= data_in;
                    green  <= data_in;
                    blue   <= data_in;
                    addr_s <= addr_s + 1;
                else
                    -- Right of the picture area: green
                    rd    <= '0';
                    red   <= "0";
                    green <= "1";
                    blue  <= "0";
                end if;
            else
                -- Below the picture area: white
                rd    <= '0';
                red   <= "1";
                green <= "1";
                blue  <= "1";
            end if;
            -- Wrap the address at the end of the stored picture
            if (addr_s = "111011111111111111") then
                addr_s <= (others => '0');
            end if;
        else
            -- Blanking: black
            red   <= "0";
            green <= "0";
            blue  <= "0";
        end if;
    end if;
end process;

end behv;
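The binary constant at which the pixel generator wraps its address is easier to interpret in decimal. The Python sketch below only decodes that constant. Reading the result as a picture of 512 pixels by 480 lines is an inference from the arithmetic, not something the listing states.

```python
# Address at which pixelgen resets addr_s, copied from the listing.
WRAP_ADDR = int("111011111111111111", 2)

# Addresses 0..WRAP_ADDR are visited before the wrap.
LOCATIONS = WRAP_ADDR + 1

if __name__ == "__main__":
    print("wrap address:", WRAP_ADDR)
    print("locations   :", LOCATIONS)
    print("512 x 480   :", 512 * 480)
```

The number of locations equals 512 x 480, which suggests, but does not prove, that the memory holds a 1 bit per pixel picture of that size.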


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History: The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field programmable gate array in 1985, the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and in the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic (see Soft processors below).

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

• 1987: 9,000 gates, Xilinx

• 1992: 600,000, Naval Surface Warfare Department

• Early 2000s: millions

Market size

• 1985: first commercial FPGA technology invented by Xilinx

• 1987: $14 million

• ~1993: >$385 million

• 2005: $1.9 billion

• 2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

• IC costs are rising aggressively

• ASIC complexity has increased development time and costs

• R&D resources and headcount are decreasing

• Revenue losses for slow time-to-market are increasing

• Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, which the company cites as the reason for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
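The behaviour of such a logic block can be sketched in a few lines of Python. This is only an illustrative software model, not vendor code: the class name `LogicBlock`, the input ordering (`a` as the least significant index bit) and the `registered` flag are assumptions made for the example.

```python
class LogicBlock:
    """Illustrative model: a 4-input LUT followed by a D flip-flop."""

    def __init__(self, truth_table, registered):
        # truth_table holds 16 output bits, indexed by the four inputs
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 entries")
        self.truth_table = list(truth_table)
        self.registered = registered  # True: output is taken from the flip-flop
        self.ff = 0                   # flip-flop state, cleared at start

    def lut(self, a, b, c, d):
        # Assumed ordering: a is bit 0 of the index, d is bit 3
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Rising clock edge: capture the current LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        # The single output is either the registered or the unregistered value
        return self.ff if self.registered else self.lut(a, b, c, d)


# Configure the LUT as a 4-input AND gate: only entry 15 is 1.
and4 = LogicBlock([0] * 15 + [1], registered=False)
and4_reg = LogicBlock([0] * 15 + [1], registered=True)
```

With `registered=True` the output only changes after `clock()` is called, which mirrors the registered/unregistered choice described above.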

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.
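The planar rule can be expressed as a small function. The sketch below is a hypothetical model: the side names and the function `planar_connections` are invented for illustration, and tracks are numbered from 1 to match the text.

```python
SIDES = ("north", "east", "south", "west")


def planar_connections(side, track, channel_width):
    """Wires reachable from the wire entering on `side` at `track`.

    In a planar (domain-based) switch box the track number never changes:
    the wire may connect to the same track on each of the other three sides.
    """
    if side not in SIDES:
        raise ValueError("unknown side: %r" % (side,))
    if not 1 <= track <= channel_width:
        raise ValueError("track must be between 1 and the channel width")
    return [(other, track) for other in SIDES if other != side]
```

For example, a wire entering from the west on track 2 can reach track 2 on the north, east and south sides, which gives the three programmable switches per wire mentioned above.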

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

• SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
• Antifuse - One-time programmable. CMOS.
• EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
• EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
• Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
• Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Fig. Gate level: VGA Controller

Fig. Gate level: Sync Unit

Fig. Technology schematic

14 APPLICATIONS

• Motion tracking systems
• Image processing systems
• Display systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA and taking a processor-generated picture (pixilated) from memory space, the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

5 Hardware Description

5.1 Block Diagram

5.1 VGA Synchronization Unit

The Synchronization Unit provides the horizontal and vertical synchronization signals, HSYNC and VSYNC, that are required to correctly display a picture frame within the confines of a monitor's display area.

These synchronization signals are used as control inputs by the monitor's horizontal and vertical deflection circuits. These circuits deflect the electrons emitted by the three primary color electron guns (Red, Green, Blue) left to right and from top to bottom, respectively. HSYNC provides the start and stop times for the horizontal deflection circuit, so that a line of pixels is correctly drawn across the screen display. VSYNC provides the start and stop times for the vertical deflection circuit, so that the lines of a frame are correctly drawn from the top to the bottom of the screen display.

The resolution for the display is defined by the level on the RESOLUTION input. If High, the 640x480 resolution (VGA) is used. If Low, the 800x600 resolution (SVGA) is used.

Although the resolution determines the area of a monitor's screen within which an image can be displayed, the full extents of the chosen resolution do not have to be used. The actual extents of the image display area on the screen can be controlled by the use of the DISPSIZE_H and DISPSIZE_V inputs. These inputs determine the total number of pixels to be used in a line and the total number of lines in a frame, respectively.

5.2 Blank pixel generation

The total number of viewable pixels in a line and viewable lines in a frame is determined by the display resolution chosen through the RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the chosen display resolution is used or not, the areas of the monitor screen to the top, bottom, left, and right of the viewable frame area are blanked by putting black pixels at the required line-pixel positions. This has the effect of centering the image on the screen.

The color generated for a pixel in the Pixel Unit depends on whether the particular pixel needs to be blanked or not. The Synchronization Unit provides a signal to the Pixel Unit for this very reason. This is the line display enable signal, en. The signal is checked on each rising edge of the external clock signal (CLK) and is set as follows:

If (HCOUNT >= PixTotal) or (VCOUNT >= LinTotal) then

    en = 0 (pixel needs to be blanked: set color to black)

Else

    en = 1 (pixel is a viewable pixel: generate RGB color accordingly)
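The blanking rule above can be sketched as a plain function. This is a software illustration only; the function name `display_enable` and its parameter names are assumptions that mirror HCOUNT, VCOUNT, PixTotal and LinTotal from the text.

```python
def display_enable(hcount, vcount, pix_total, lin_total):
    """Return the line display enable signal `en` for one pixel position.

    en = 0: the position lies outside the viewable area and must be black.
    en = 1: the position is a viewable pixel.
    """
    if hcount >= pix_total or vcount >= lin_total:
        return 0
    return 1
```

In the hardware this decision is re-evaluated on each rising edge of CLK; in the model it would simply be called once per pixel position.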

6 SIGNALS IN VGA CONTROLLER

For the most common VGA mode, 640x480, 60 Hz, non-interlaced, the horizontal timings are:

Parameter                  Value      Unit
Clock frequency            25.175     MHz
Horizontal pixels          640
Horizontal sync polarity   Negative
Total time for each line   31.77      µs
Front porch (A)            0.94       µs
Sync pulse length (B)      3.77       µs
Back porch (C)             1.89       µs
Active video (D)           25.17      µs

(Total horizontal sync time: 6.60 µs)

6.1 The vertical timings are

Parameter                  Value      Unit
Vertical lines             480
Vertical sync polarity     Negative
Vertical frequency         59.94      Hz
Front porch (E)            0.35       ms
Sync pulse length (F)      0.06       ms
Back porch (G)             1.02       ms
Active video (H)           15.25      ms

(Total vertical sync time: 1.43 ms)

640 x 400, 70 Hz is the video mode used for booting most x86 personal computers.

640 x 480, 60 Hz is the default MS-Windows graphics mode with 16 colors.

It should be noted that the actual timings vary slightly. For example, for 640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of 25.174 MHz gives 633 pixels rather than the expected 640 pixels.
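The arithmetic behind these figures can be checked with a short script. It is a sketch using the rounded values quoted in the timing tables (restored here with their decimal points); the variable names are chosen for the example.

```python
# Horizontal timings in seconds (front porch, sync pulse, back porch, active)
H_TIMES = {"front": 0.94e-6, "sync": 3.77e-6, "back": 1.89e-6, "active": 25.17e-6}
# Vertical timings in seconds
V_TIMES = {"front": 0.35e-3, "sync": 0.06e-3, "back": 1.02e-3, "active": 15.25e-3}

line_time = sum(H_TIMES.values())     # total time for one line
frame_time = sum(V_TIMES.values())    # total time for one frame
refresh_hz = 1.0 / frame_time         # vertical frequency from rounded values

# Whole pixels that fit in the active video time at 25.174 MHz
active_pixels = int(H_TIMES["active"] * 25.174e6)

print("line time  : %.2f us" % (line_time * 1e6))
print("frame time : %.2f ms" % (frame_time * 1e3))
print("refresh    : %.2f Hz" % refresh_hz)
print("pixels     : %d" % active_pixels)
```

Because the table entries are rounded, the derived refresh rate differs from the quoted 59.94 Hz in the second decimal place, which is the same rounding effect that yields 633 rather than 640 pixels.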

6.2 Generation of the horizontal synchronization signal - HSYNC

The HSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated on each rising edge of the external clock signal (CLK).

The state of the HSYNC signal depends on the value stored in the horizontal counter and is driven low when:

HCOUNT >= (PixTotal + BlankingLeft)

and remains low while:

HCOUNT < (PixTotal + BlankingLeft + TLSync)

Vertical (Frame) Period

6.3 Generation of the vertical synchronization signal - VSYNC

The VSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated after every line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

The state of the VSYNC signal depends on the value stored in the vertical counter and is driven low when:

VCOUNT >= (LinTotal + BlankingLeft)

and remains low while:

VCOUNT < (LinTotal + BlankingLeft + TFSync)
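Both sync conditions have the same shape: the signal is low inside a window of counter values and high elsewhere. The sketch below models that window in Python. The helper `sync_level` and the example numbers (a 16-count blanking offset and a 96-count pulse) are assumptions chosen for illustration, not values taken from the Controller.

```python
def sync_level(count, total, blanking, sync_len):
    """Return 0 (active low) while count is inside the sync window, else 1."""
    start = total + blanking
    return 0 if start <= count < start + sync_len else 1


def hsync(hcount, pix_total, blanking_left, tl_sync):
    # Low while PixTotal + BlankingLeft <= HCOUNT < PixTotal + BlankingLeft + TLSync
    return sync_level(hcount, pix_total, blanking_left, tl_sync)


def vsync(vcount, lin_total, blanking, tf_sync):
    # Same window test, driven by the vertical counter
    return sync_level(vcount, lin_total, blanking, tf_sync)


# Hypothetical example values: 640 viewable pixels, 16 blanking, 96-count pulse
edges = [h for h in range(800) if hsync(h, 640, 16, 96) == 0]
```

With these example values `edges` covers counts 656 to 751 inclusive, a pulse 96 counts wide.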

Address Counter (MEMADDR)

The address counter is used to store the position of the next consecutive pixel in the frame. Its value is passed to the Pixel Unit on the internal bus signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain the next pixel from picture memory.

The counter is reset to zero when the VGA Controller receives an external reset signal (RST). The size of the counter depends on the values chosen for DISPSIZE_H and DISPSIZE_V, as the range is simply:

0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199
• 800x600: 0 to 479999

The counter has 19-bit resolution.
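The 19-bit figure follows directly from the largest count that must be represented. A minimal check, with the helper name `address_bits` chosen for this example:

```python
def address_bits(pix_total, lin_total):
    """Bits needed to count from 0 to (pix_total * lin_total) - 1."""
    max_count = pix_total * lin_total - 1
    return max_count.bit_length()


bits_vga = address_bits(640, 480)    # largest count is 307199
bits_svga = address_bits(800, 600)   # largest count is 479999
```

Both resolutions need 19 bits, since 2^18 = 262144 is too small for either range and 2^19 = 524288 covers both.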

While the value in the horizontal counter (HCOUNT) is less than the total number of viewable pixels in a line (PixTotal, the integer value of DISPSIZE_H), the counter is incremented on the rising edge of the external clock signal (CLK). Pixel addressing within the frame is consecutive. When the counter reaches the last pixel in a line, its incremented value is the first pixel in the next line down.

The address counter will continue to be incremented until the value in the vertical counter (VCOUNT) is greater than or equal to the total number of viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this point, it will be rolled over to zero.

7 VGA Controller

The VGA Controller provides a simple 8-bit interface between a host microcontroller and any VGA-compatible monitor. This variant of the Controller provides six modes of display, depending on the resolution chosen (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black & White, 16 Colors, or 64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum 60 Hz refresh rate:

1 sec = 60 frames
1 frame = max 1/60 sec
1 pixel = (1/60 sec) / 307200 ≈ 54 ns (640 x 480 = 307200 viewable pixels per frame)
Allowing for the blanking intervals in each line and frame, max time for 1 pixel = 40 ns

2) To send information for each and every pixel of a frame by means of the RGB signals.

3) After completely displaying one row of information, the column must be reset and the row incremented; this is done by the HSYNC signal.

4) After completing one frame of information, a new frame has to be started; for this, the row and column are reset, which is done by the VSYNC signal.

=> For giving the information for 1 pixel we use 1 clock (for controlling all operations):

1 clock = 1 pixel
1 clock cycle period = 40 ns
Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display one pixel of information on each and every clock, we can attain the minimum 60 Hz refresh rate. Do not use a clock frequency of less than 25 MHz.

Ex: compared with the 25 MHz clock (the more effectively utilized frequency):

Clock     Time period   No. of frames/sec
15 MHz    Increases     Decreases (it is unable to display 60 frames/sec)
30 MHz    Decreases     Increases
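The clock budget can be reproduced numerically. The sketch below makes one assumption that is not stated in the text: a 640x480 frame occupies 800 x 525 clock periods once the blanking intervals are included, which is the usual total for this mode. The function names are chosen for the example.

```python
CLOCKS_PER_FRAME = 800 * 525  # assumed total, including blanking intervals


def max_pixel_time_ns(pixels_per_frame, refresh_hz):
    """Longest time available per pixel for the given refresh rate."""
    return 1e9 / (refresh_hz * pixels_per_frame)


def clock_mhz(period_ns):
    """Clock frequency in MHz for a clock period given in ns."""
    return 1e3 / period_ns


def refresh_hz(clock_hz, clocks_per_frame=CLOCKS_PER_FRAME):
    """Frames per second when one pixel position is output per clock."""
    return clock_hz / clocks_per_frame


visible_only_ns = max_pixel_time_ns(640 * 480, 60)         # viewable pixels only
with_blanking_ns = max_pixel_time_ns(CLOCKS_PER_FRAME, 60)  # blanking included
```

Under this assumption a 15 MHz clock gives roughly 36 frames per second, 25 MHz gives just under 60, and 30 MHz gives about 71, which matches the direction of the comparison above.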

Signals used for image display:

• R (Red)
• G (Green)
• B (Blue)
• HSYNC (horizontal synchronization signal)
• VSYNC (vertical synchronization signal)

All the above signals are used to control the image data so that it is correctly displayed on the VGA display monitor.

7.2 Timings for front porch, sync pulse, and back porch

               Horizontal sync   Vertical sync
Front porch    0.94 µs           0.35 ms
Sync pulse     3.77 µs           0.063 ms
Back porch     1.89 µs           1.02 ms

8 FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below. Each entry gives the name, type (I = input, O = output), polarity, and bus size where applicable, followed by the description.

Table: VGA pin description

Global Control Signals

CLK (Type: I, Polarity: Rising)
Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows:
640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz
800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz

RST (Type: I, Polarity: High)
Global system reset.

VGA Input Settings Signals

RESOLUTION (Type: I, Polarity: High/Low)
This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD (Type: I, Bus size: 2)
This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H (Type: I, Bus size: 10)
This input determines the number of viewable pixels to be displayed in each line of a frame and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V (Type: I, Bus size: 10)
This input determines the number of lines to be displayed in a frame and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD (Type: O, Polarity: High)
This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal en generated by the Synchronization Unit of the Controller.

DATA (Type: I, Bus size: 8)
Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows:
Black & White mode: 8 1-bit pixels
16 Colors mode: 2 4-bit pixels
64 Colors mode: 1 6-bit pixel (DATA[5..0])
Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL (Type: O, Bus size: 19)
Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC (Type: O, Polarity: Falling)
Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC (Type: O, Polarity: Falling)
Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R0, R1 (Type: O, Polarity: High/Low)
Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G0, G1 (Type: O, Polarity: High/Low)
Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B0, B1 (Type: O, Polarity: High/Low)
Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixilated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode, one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode, one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode, one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
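As a worked example of this formula, the sketch below computes the picture memory for each palette. The mode keys (`"bw"`, `"16"`, `"64"`) and the function name are labels invented for the example; rounding up to a whole byte is an added assumption for sizes that do not divide evenly.

```python
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}  # pixels packed into one byte


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes of picture memory for a pix_total x lin_total image."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    # Ceiling division, so a partly filled last byte is still counted
    return -(-pixels // per_byte)


sizes = {mode: picture_memory_bytes(640, 480, mode) for mode in PIXELS_PER_BYTE}
```

For a full 640x480 frame this gives 38400 bytes in Black & White mode, 153600 bytes in 16 Colors mode and 307200 bytes in 64 Colors mode.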

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt and right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt and right shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
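The three addressing rules amount to a right shift of the pixel counter by 3, 1 or 0 bits. A minimal Python sketch, in which the mode keys and the function name `addr_pixel` are assumptions made for the example:

```python
ADDRESS_SHIFT = {"bw": 3, "16": 1, "64": 0}  # right shift applied to hvcnt


def addr_pixel(hvcnt, mode):
    """Byte address in picture memory for the pixel numbered hvcnt."""
    hvcnt &= 0x7FFFF  # hvcnt is a 19-bit value
    return hvcnt >> ADDRESS_SHIFT[mode]
```

For instance, pixel number 17 lives in byte 2 in Black & White mode (pixels 16 to 23 share that byte), in byte 8 in 16 Colors mode, and in byte 17 in 64 Colors mode.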

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then

    enex = 0

Else

    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors, or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a 0.

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
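The load-and-shift behaviour of the Pixel register can be illustrated in software. The sketch below unpacks one byte into the sequence of active pixels for each mode; the mode keys and the function name `pixels_from_byte` are hypothetical, and each loop iteration stands in for one rising edge of CLK.

```python
# (bits per pixel, pixels per byte) for each color palette
PIXEL_FORMAT = {"bw": (1, 8), "16": (4, 2), "64": (6, 1)}


def pixels_from_byte(byte, mode):
    """Return the pixels of one data byte in the order they become active."""
    width, count = PIXEL_FORMAT[mode]
    mask = (1 << width) - 1
    reg = byte & 0xFF          # value loaded into the Pixel register
    pixels = []
    for _ in range(count):
        pixels.append(reg & mask)  # active pixel sits in the lowest bits
        reg >>= width              # shift the next pixel into position
    return pixels
```

In 64 Colors mode the mask also discards bits 7 and 6 of the byte, which the text describes as unused.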

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green, and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0, and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
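The count of 64 follows from three channels with four levels each (4 x 4 x 4). The sketch below enumerates them and maps each 2-bit code to a voltage. The evenly spaced levels are an assumption made for illustration; the actual levels depend on the resistor network of the DAC, which the text does not specify.

```python
from itertools import product


def dac_voltage(code, full_scale=0.7):
    """Voltage for a 2-bit code, assuming four evenly spaced levels."""
    if not 0 <= code <= 3:
        raise ValueError("a 2-bit code must be between 0 and 3")
    return full_scale * code / 3


# Every (R, G, B) combination of 2-bit codes
all_colors = list(product(range(4), repeat=3))
levels = [dac_voltage(code) for code in range(4)]
```

Under the even-spacing assumption the four levels are 0 V, about 0.23 V, about 0.47 V and 0.7 V.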

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan-3 Generation FPGAs provide plentiful, efficient SelectRAM memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator system or via VHDL or Verilog instantiation.


Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: the two ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM can also be used as single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
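The single-port, fully synchronous behaviour described above can be sketched as a generic VHDL memory. This is an illustrative template rather than a Xilinx primitive instantiation: the entity name ram_1k_x16 and its port names are hypothetical, the 1024 x 16 organization is simply one way to use the 16K data bits of a block, and whether a synthesis tool maps it onto block RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- 1024 words x 16 bits = 16,384 data bits, the data capacity of one block.
entity ram_1k_x16 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(9 downto 0);   -- 1024 locations
    din  : in  std_logic_vector(15 downto 0);
    dout : out std_logic_vector(15 downto 0)
  );
end entity ram_1k_x16;

architecture rtl of ram_1k_x16 is
  type ram_t is array (0 to 1023) of std_logic_vector(15 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: the old contents appear on a simultaneous write.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

Both the write and the read happen only on a clock edge with en high, matching the statement that read operations also require a clock edge and clock enable.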


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE in response.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.
Defines the inputs/outputs of a component (defines pins).
A way to represent modularity in VHDL.
Similar to a symbol in a schematic.
The entity declaration describes the entity.

E.g.

entity Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.
Each port must have a name, a direction and a type.
An entity may have NO port declaration.

Port directions

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture.
One entity can have several architectures.
Architectures can describe a design using:
– Behavior
– Structure
– Dataflow
Architectures can describe a design on many levels:
– Gate level
– RTL (Register Transfer Level)
– Behavioral level
A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
    EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and architecture body together.
The concept of a default configuration is a bit messy in VHDL '87:
– The last architecture analyzed links to the entity.
Can be used to change simulation behavior without re-analyzing the VHDL source.
Complex configuration declarations are ignored in synthesis.
Some entities can have, e.g., a gate-level architecture and a behavioral architecture.
Are always optional.

Packages

Packages contain information common to many design units.

1. Package declaration
– constant declarations
– type and subtype declarations
– function and procedure declarations
– global signal declarations
– file declarations
– component declarations

2. Package body
– is not necessarily needed
– function bodies
– procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain:
– type and subtype declarations
– subprograms
– constants
– alias declarations
– global signal declarations
– file declarations
– component declarations

A package body consists of:
– subprogram declarations and bodies
– type and subtype declarations
– deferred constants
– file declarations
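A small package showing the declaration/body split might look like the sketch below. The names vga_types, pixel_t, BLACK and blank are invented for illustration and do not come from the project code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: what other design units can see.
package vga_types is
  subtype pixel_t is std_logic_vector(5 downto 0);
  constant BLACK : pixel_t;   -- deferred constant, value given in the body
  function blank (pix : pixel_t; en : std_logic) return pixel_t;
end package vga_types;

-- Package body: the deferred constant value and the function body.
package body vga_types is
  constant BLACK : pixel_t := "000000";

  function blank (pix : pixel_t; en : std_logic) return pixel_t is
  begin
    if en = '1' then
      return pix;      -- display enabled: pass the pixel through
    else
      return BLACK;    -- blanking: force black
    end if;
  end function blank;
end package body vga_types;
```

A design unit compiled into the same library would make these names visible with "use work.vga_types.all;".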

Libraries

Collection of VHDL design units (database):

1. Packages
– package declaration
– package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in a UNIX file system.
Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here in order of increasing level of abstraction:

Structural
Dataflow
Behavioral

Structural VHDL Description

Circuit is described in terms of its components.
Ranges from a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).
For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system.
In the dataflow style you describe how information flows between registers in the system.
The combinational logic is described at a relatively high level, while the placement and operation of registers is specified quite precisely.
The behavior of the system over time is defined by registers.
There are no built-in registers in the VHDL language:
– either a lower-level description
– or a behavioral description of sequential elements is needed.
The lower-level register descriptions must be created or obtained.
If there are no third-party models for registers => you must write the behavioral description of registers.
The behavioral description can be provided in the form of subprograms (functions or procedures).
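Since the language has no built-in register, a behavioral description such as the following is typically written once and reused. It is a generic sketch: the entity name dreg, the WIDTH generic and the active-high asynchronous reset are assumptions, not part of the project code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Behavioral description of a WIDTH-bit register.
entity dreg is
  generic (WIDTH : positive := 8);
  port (
    clk : in  std_logic;
    rst : in  std_logic;                              -- assumed active high
    d   : in  std_logic_vector(WIDTH - 1 downto 0);
    q   : out std_logic_vector(WIDTH - 1 downto 0)
  );
end entity dreg;

architecture behavioral of dreg is
begin
  process (clk, rst)
  begin
    if rst = '1' then
      q <= (others => '0');       -- asynchronous clear
    elsif rising_edge(clk) then
      q <= d;                     -- capture input on the rising clock edge
    end if;
  end process;
end architecture behavioral;
```

A dataflow description can then instantiate dreg wherever a register is needed and describe the combinational logic between instances with concurrent assignments.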

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time.
Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.
The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).
If no actual delay is used, the order of sequential operations is defined.
In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.
There are a few tools for behavioral synthesis.

Concurrent Vs Sequential

Processes

Basic simulation concept in VHDL.
A VHDL description can always be broken up into interconnected processes.
Quite similar to a Unix process.

Process keyword in VHDL

A process statement is a concurrent statement.
Statements inside a process statement are sequential statements.
A process must contain either a sensitivity list or wait statement(s), but NOT both.
The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
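The two alternatives in the general format can be compared side by side. In the sketch below (entity and label names are made up for illustration), both processes describe the same AND gate: one uses a sensitivity list, the other an explicit wait statement at the end of the process.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_gate_two_ways is
  port (
    a, b : in  std_logic;
    y1   : out std_logic;
    y2   : out std_logic
  );
end entity and_gate_two_ways;

architecture demo of and_gate_two_ways is
begin
  -- Form 1: sensitivity list, no wait statement allowed inside.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list; the final wait suspends the process
  -- until a or b changes, which gives the same behavior as form 1.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process with_wait;
end architecture demo;
```

Both processes run once at the start of simulation and then again each time a or b changes, so y1 and y2 always carry the same value.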


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

-- Image memory: 18-bit address, 1-bit data
component memory
    port (
        addr  : IN  std_logic_VECTOR(17 downto 0);
        clk   : IN  std_logic;
        dout  : OUT std_logic_VECTOR(0 downto 0);
        en    : IN  std_logic;
        sinit : IN  std_logic);
END component;

-- Port modes match the sync_unit entity declaration
component sync_unit port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end component;

component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr, 18));

-- Divide the clk50m input by two to produce the internal clock clk
process(clk50m, rstn)
begin
    if (rstn = '0') then
        clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
        clk <= not clk;
    end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
);

pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs       <= c_hs_s;
line_count <= line_count_s;
h_en       <= h_en_s;
v_en       <= v_en_s;

-- Horizontal counter: counts clocks within a line, 0 to 793
process(clk, rstn)
begin
    if (rstn = '0') then
        c_hs_s <= 0;
    elsif (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
            c_hs_s <= 0;
        else
            c_hs_s <= c_hs_s + 1;
        end if;
    end if;
end process;

-- Vertical line counter: advances on each rising edge of h_en_s, 0 to 515
process(h_en_s, rstn)
begin
    if (rstn = '0') then
        line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515) then
            line_count_s <= 0;
        end if;
    end if;
end process;

-- hsync pulse: low while the horizontal counter is 0 to 95
process(clk, rstn)
begin
    if (rstn = '0') then
        hsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (c_hs_s <= 95) then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
end process;

-- Horizontal enable (row information is sent in this period)
process(clk, rstn)
begin
    if (rstn = '0') then
        h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
end process;

-- Vertical enable (data is valid in this region)
process(clk, rstn)
begin
    if (rstn = '0') then
        v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((line_count_s >= 35) and (line_count_s <= 515)) then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
end process;

-- vsync pulse: low during lines 0 and 1
process(clk, rstn)
begin
    if (rstn = '0') then
        vsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (line_count_s < 2) then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
end process;

-- One-clock delayed copy of v_en, used to detect the start of a frame
process(clk)
begin
    if (clk = '1' and clk'event) then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;
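The counter limits in the synchronization block fix the scan rates. The short calculation below works them out, assuming that the clk50m input really is a 50 MHz clock (suggested by its name but not stated in the code) so that the divided clock runs at 25 MHz.

```latex
\begin{align*}
f_{\text{clk}}   &= \frac{50\ \text{MHz}}{2} = 25\ \text{MHz} \\
T_{\text{line}}  &= \frac{794}{25\ \text{MHz}} = 31.76\ \mu\text{s}
                    \quad\Rightarrow\quad f_{\text{line}} \approx 31.49\ \text{kHz} \\
T_{\text{hsync}} &= \frac{96}{25\ \text{MHz}} = 3.84\ \mu\text{s} \\
N_{\text{h\_en}} &= 778 - 143 + 1 = 636\ \text{clocks per line} \\
T_{\text{frame}} &= 516 \times 31.76\ \mu\text{s} \approx 16.39\ \text{ms}
                    \quad\Rightarrow\quad f_{\text{frame}} \approx 61.0\ \text{Hz}
\end{align*}
```

Here 794 and 516 are the number of states of the horizontal counter (0 to 793) and the line counter (0 to 515), and 96 is the number of clocks for which hsync is held low (counter values 0 to 95).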


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
    if (rstn = '0') then
        rd     <= '0';
        red    <= "0";
        green  <= "0";
        blue   <= "0";
        addr_s <= (others => '0');
    elsif (clk = '1' and clk'event) then
        -- Rising edge of v_en: restart the memory address for a new frame
        if ((v_en = '1') and (v_en_reg = '0')) then
            addr_s <= (others => '0');
        end if;
        if (v_en = '1' and h_en = '1') then
            if (line_count <= 514) then
                if (c_hs <= 655) then
                    -- Image area: show the memory bit on all three channels
                    rd     <= '1';
                    red    <= data_in;
                    green  <= data_in;
                    blue   <= data_in;
                    addr_s <= addr_s + "000000000000000001";
                else
                    -- Right of the image area: green
                    rd    <= '0';
                    red   <= "0";
                    green <= "1";
                    blue  <= "0";
                end if;
            else
                -- Below the image area: white
                rd    <= '0';
                red   <= "1";
                green <= "1";
                blue  <= "1";
            end if;
            -- Wrap the address after location 245759
            if (addr_s = "111011111111111111") then
                addr_s <= (others => '0');
            end if;
        else
            -- Blanking: black
            red   <= "0";
            green <= "0";
            blue  <= "0";
        end if;
    end if;
end process;

end behv;


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be wired together, somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field-programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software, and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA considered here is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

Figure 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
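The logic block just described (a 4-input LUT, a flip-flop, and a choice between the registered and unregistered output) can be modelled behaviorally in a few lines of VHDL. This is a teaching sketch, not a vendor primitive: the entity name logic_block and the generics INIT and REGISTERED are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity logic_block is
  generic (
    -- 16-entry truth table; bit n is the output for input value n.
    -- x"8000" sets only bit 15, i.e. a 4-input AND.
    INIT       : std_logic_vector(15 downto 0) := x"8000";
    REGISTERED : boolean := true   -- select flip-flop or direct LUT output
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);
    o   : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_q    : std_logic := '0';
begin
  -- The LUT is simply a 16-to-1 lookup indexed by the four inputs.
  lut_out <= INIT(to_integer(unsigned(i)));

  -- The flip-flop samples the LUT output on every rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_q <= lut_out;
    end if;
  end process;

  -- Single block output: registered or unregistered LUT value.
  o <= ff_q when REGISTERED else lut_out;
end architecture rtl;
```

Changing INIT changes the logic function without changing the structure, which is the essence of how a real device is configured: for example, x"FFFE" would give a 4-input OR.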

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description where simulation is repeated to confirm the synthesis proceeded without errors. Finally the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
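As an illustration of the first simulation stage, here is a minimal self-checking test bench for an 8-bit equality comparator. The comparator is included so the example is self-contained; the test bench name comparator_tb, the stimulus values and the 10 ns settling time are arbitrary choices made for this sketch.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Design under test: 8-bit equality comparator.
entity comparator is
  port (a, b : in  std_logic_vector(7 downto 0);
        eq   : out std_logic);
end entity comparator;

architecture rtl of comparator is
begin
  eq <= '1' when a = b else '0';
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: no ports, drives the inputs and checks the output.
entity comparator_tb is
end entity comparator_tb;

architecture sim of comparator_tb is
  signal a, b : std_logic_vector(7 downto 0) := (others => '0');
  signal eq   : std_logic;
begin
  dut : entity work.comparator
    port map (a => a, b => b, eq => eq);

  stimulus : process
  begin
    a <= x"3C"; b <= x"3C";
    wait for 10 ns;
    assert eq = '1'
      report "equal inputs should give eq = '1'" severity error;

    a <= x"3C"; b <= x"3D";
    wait for 10 ns;
    assert eq = '0'
      report "different inputs should give eq = '0'" severity error;

    report "comparator_tb finished";
    wait;  -- suspend forever so the simulation can end
  end process stimulus;
end architecture sim;
```

The same test bench can be rerun unchanged against the gate-level and back-annotated netlists, which is what makes repeating the simulation at each stage practical.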

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Hardware verification alongside early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This compares with designers relying only on software simulations to verify that their hardware design is sound; simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be uncovered.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Fig: Gate level, VGA Controller

Fig: Gate level, Sync Unit

Fig: Technology schematic

14 APPLICATIONS

• Motion tracking systems

• Image processing systems

• Display systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated picture (pixelated) is taken from memory space and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



The Synchronization Unit provides the horizontal and vertical synchronization signals – HSYNC and VSYNC – that are required to correctly display a picture frame within the confines of a monitor's display area.

These synchronization signals are used as control inputs by the monitor's horizontal and vertical deflection circuits. These circuits deflect the electrons emitted by the three primary color electron guns (Red, Green, Blue) left to right and from top to bottom, respectively. HSYNC provides the start and stop times for the horizontal deflection circuit, so that a line of pixels is correctly drawn across the screen display. VSYNC provides the start and stop times for the vertical deflection circuit, so that the lines of a frame are correctly drawn from the top to the bottom of the screen display.

The resolution for the display is defined by the level on the RESOLUTION input. If High, the 640x480 resolution is used (VGA). If Low, the 800x600 resolution (SVGA) is used.

Although the resolution determines the area of a monitor's screen within which an image can be displayed, the full extents of the chosen resolution do not have to be used. The actual extents of the image display area on the screen can be controlled by the use of the DISPSIZE_H and DISPSIZE_V inputs. These inputs determine the total number of pixels to be used in a line and the total number of lines in a frame, respectively.

5.2 Blank pixel generation

The total number of viewable pixels in a line and viewable lines in a frame is determined by the display resolution chosen through the RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the chosen display resolution is used or not, the areas of the monitor screen to the top, bottom, left and right of the viewable frame area are blanked, by putting black pixels at the required line-pixel positions. This has the effect of centering the image on the screen.

The color generated for a pixel in the Pixel Unit depends on whether the particular pixel needs to be blanked or not. The Synchronization Unit provides a signal to the Pixel Unit for this very reason. This is the line display enable signal, en. The signal is checked on each rising edge of the external clock signal (CLK) and is set as follows:

If (HCOUNT ≥ PixTotal) or (VCOUNT ≥ LinTotal) then

    en = 0 (pixel needs to be blanked – set color to be black)

Else

    en = 1 (pixel is a viewable pixel – generate RGB color accordingly)
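The blanking rule above can be sketched in VHDL as follows. This is a minimal sketch rather than the Controller's actual source: the entity name blank_enable, the port names and the 11-bit counter widths are assumptions made for illustration. Only HCOUNT, VCOUNT, PixTotal, LinTotal and en come from the description.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: registers the line display enable signal 'en'.
entity blank_enable is
  port (
    clk       : in  std_logic;
    rst       : in  std_logic;              -- assumed active high
    hcount    : in  unsigned(10 downto 0);  -- HCOUNT (width is an assumption)
    vcount    : in  unsigned(10 downto 0);  -- VCOUNT (width is an assumption)
    pix_total : in  unsigned(9 downto 0);   -- PixTotal, from DISPSIZE_H
    lin_total : in  unsigned(9 downto 0);   -- LinTotal, from DISPSIZE_V
    en        : out std_logic);
end entity blank_enable;

architecture rtl of blank_enable is
begin
  process (clk, rst)
  begin
    if rst = '1' then
      en <= '0';
    elsif rising_edge(clk) then
      -- Outside the viewable area the pixel is blanked (black).
      if (hcount >= pix_total) or (vcount >= lin_total) then
        en <= '0';
      else
        en <= '1';
      end if;
    end if;
  end process;
end architecture rtl;
```

Because en is registered, it follows the counters with one clock of delay; a real design would need to account for that when aligning pixel data.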

6 SIGNALS IN VGA CONTROLLER

For the most common VGA mode, 640x480, 60 Hz, non-interlaced, the horizontal timings are:

Parameter                    Value       Unit
Clock frequency              25.175      MHz
Horizontal pixels            640
Horizontal sync polarity     Negative
Total time for each line     31.77       µs
Front porch (A)              0.94        µs
Sync pulse length (B)        3.77        µs
Back porch (C)               1.89        µs
Active video (D)             25.17       µs

(Total horizontal sync time: 6.60 µs)

6.1 The vertical timings are:

Parameter                    Value       Unit
Vertical lines               480
Vertical sync polarity       Negative
Vertical frequency           59.94       Hz
Front porch (E)              0.35        ms
Sync pulse length (F)        0.06        ms
Back porch (G)               1.02        ms
Active video (H)             15.25       ms

(Total vertical sync time: 1.43 ms)

640 x 400 @ 70 Hz is the video mode used for booting most x86 personal computers.

640 x 480 @ 60 Hz is the default MS-Windows graphics mode with 16 colors.

It should be noted that the actual timings vary slightly. For example, for 640x480 @ 60 fps, a 25.17 µs active video time with a pixel frequency of 25.174 MHz gives 633 pixels rather than the expected 640 pixels.
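To use the tabulated times in a counter-based design they have to be expressed in pixel clocks and lines. The package below is a rough sketch of that conversion, assuming the 25.175 MHz clock and 31.77 µs line time from the tables. The package name and constant names are hypothetical, and ieee.math_real is used only to compute constants at elaboration time, which most but not necessarily all synthesis tools accept.

```vhdl
library ieee;
use ieee.math_real.all;

-- Hypothetical package: tabulated timings converted to clock and line counts.
package vga_timing_sketch is
  constant PIXEL_CLK_MHZ : real := 25.175;  -- from the horizontal table
  constant LINE_TIME_US  : real := 31.77;   -- total time for each line

  -- clocks = time in microseconds * clock frequency in MHz
  constant H_FRONT_PORCH : natural := natural(round(0.94  * PIXEL_CLK_MHZ)); -- about 24
  constant H_SYNC_PULSE  : natural := natural(round(3.77  * PIXEL_CLK_MHZ)); -- about 95
  constant H_BACK_PORCH  : natural := natural(round(1.89  * PIXEL_CLK_MHZ)); -- about 48
  constant H_ACTIVE      : natural := natural(round(25.17 * PIXEL_CLK_MHZ)); -- about 634
  constant H_LINE_TOTAL  : natural := natural(round(LINE_TIME_US * PIXEL_CLK_MHZ)); -- about 800

  -- lines = time in microseconds / line time in microseconds
  constant V_FRONT_PORCH : natural := natural(round(350.0   / LINE_TIME_US)); -- about 11
  constant V_SYNC_PULSE  : natural := natural(round(60.0    / LINE_TIME_US)); -- about 2
  constant V_BACK_PORCH  : natural := natural(round(1020.0  / LINE_TIME_US)); -- about 32
  constant V_ACTIVE      : natural := natural(round(15250.0 / LINE_TIME_US)); -- about 480
end package vga_timing_sketch;
```

Rounding each horizontal entry separately does not reproduce exactly 640 active pixels, which is consistent with the remark that the tabulated timings are approximate.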

6.2 Generation of the horizontal synchronization signal – HSYNC

The HSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated on each rising edge of the external clock signal (CLK).

The state of the HSYNC signal depends on the value stored in the horizontal counter and is driven low when:

HCOUNT ≥ (PixTotal + BlankingLeft)

and remains low while:

HCOUNT < (PixTotal + BlankingLeft + TLSync)

Fig: Vertical (Frame) Period

6.3 Generation of the vertical synchronization signal – VSYNC

The VSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated after every line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

The state of the VSYNC signal depends on the value stored in the vertical counter and is driven low when:

VCOUNT ≥ (LinTotal + BlankingLeft)

and remains low while:

VCOUNT < (LinTotal + BlankingLeft + TFSync)
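The HSYNC rule can be sketched as a counter plus a window comparison. The entity below is an illustrative assumption, not the Controller's source: the name hsync_gen, the generic names and their default values (chosen to be roughly consistent with the 640x480 timings) are hypothetical. VSYNC would follow the same pattern with VCOUNT, LinTotal and TFSync, advanced once per line.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: horizontal counter and active-low HSYNC pulse.
entity hsync_gen is
  generic (
    PIX_TOTAL     : natural := 640;   -- viewable pixels per line
    BLANKING_LEFT : natural := 24;    -- illustrative value
    TL_SYNC       : natural := 95;    -- illustrative sync pulse length in clocks
    LINE_PERIOD   : natural := 800);  -- illustrative total clocks per line
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;            -- assumed active high
    hsync : out std_logic);
end entity hsync_gen;

architecture rtl of hsync_gen is
  signal hcount : natural range 0 to LINE_PERIOD - 1 := 0;
begin
  process (clk, rst)
  begin
    if rst = '1' then
      hcount <= 0;
      hsync  <= '1';                  -- High (inactive) after reset
    elsif rising_edge(clk) then
      -- Free-running horizontal counter.
      if hcount = LINE_PERIOD - 1 then
        hcount <= 0;
      else
        hcount <= hcount + 1;
      end if;
      -- Low only inside the sync window described in the text.
      if hcount >= PIX_TOTAL + BLANKING_LEFT and
         hcount <  PIX_TOTAL + BLANKING_LEFT + TL_SYNC then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;
end architecture rtl;
```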

Address Counter (MEMADDR)

The address counter is used to store the position of the next consecutive pixel in the frame. Its value is passed to the Pixel Unit on the internal bus signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain the next pixel from picture memory.

The counter is reset to zero when the VGA Controller receives an external reset signal (RST). The size of the counter depends on the values chosen for DISPSIZE_H and DISPSIZE_V, as the range is simply:

0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199

• 800x600: 0 to 479999

The counter has 19-bit resolution.

While the value in the horizontal counter (HCOUNT) is less than the total number of viewable pixels in a line (PixTotal, the integer value of DISPSIZE_H), the counter is incremented on the rising edge of the external clock signal (CLK). Pixel addressing within the frame is consecutive. When the counter reaches the last pixel in a line, its incremented value is the first pixel in the next line down.

The address counter will continue to be incremented until the value in the vertical counter (VCOUNT) is greater than or equal to the total number of viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this point, it will be rolled over to zero.
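A minimal sketch of the address counter behavior described above is given here. The entity name, port names and the widths of the HCOUNT and VCOUNT inputs are assumptions; the 19-bit counter width and the increment and roll-over conditions are taken from the text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: 19-bit pixel address counter (MEMADDR).
entity addr_counter is
  port (
    clk       : in  std_logic;
    rst       : in  std_logic;               -- assumed active high
    hcount    : in  unsigned(10 downto 0);   -- HCOUNT (width is an assumption)
    vcount    : in  unsigned(10 downto 0);   -- VCOUNT (width is an assumption)
    pix_total : in  unsigned(9 downto 0);    -- PixTotal
    lin_total : in  unsigned(9 downto 0);    -- LinTotal
    memaddr   : out unsigned(18 downto 0));  -- drives hvcnt
end entity addr_counter;

architecture rtl of addr_counter is
  signal count : unsigned(18 downto 0) := (others => '0');
begin
  memaddr <= count;

  process (clk, rst)
  begin
    if rst = '1' then
      count <= (others => '0');
    elsif rising_edge(clk) then
      if vcount >= lin_total then
        count <= (others => '0');  -- past the last viewable line: roll over
      elsif hcount < pix_total then
        count <= count + 1;        -- viewable pixel: next consecutive address
      end if;                      -- horizontal blanking: hold the value
    end if;
  end process;
end architecture rtl;
```

Nineteen bits are enough because 2^19 = 524288 exceeds the largest address, 479999.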


7 VGA Controller

The VGA Controller provides a simple 8-bit interface between a host microcontroller and any VGA-compatible monitor. This variant of the Controller provides six modes of display, depending on the resolution chosen (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black & White, 16 Colors or 64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum 60 Hz refresh rate:

For 1 sec = 60 frames

1 frame = max 1/60 sec

1 pixel = (1/60 sec) / 307200 ≈ 54 ns

Since the blanking intervals of each line and frame also take up clock periods, the time available per pixel is shorter:

Max time for 1 pixel = 40 ns

2) To send information for each and every pixel of a frame by means of RGB signals.

3) After completely displaying one row of information, the column must be reset and the row incremented; this is done by the Hsync signal.

4) After completing one frame of information, a new frame has to be started; for this the row and column are reset, which is done by the Vsync signal.

=> To give the information for 1 pixel we use 1 clock (for controlling all operations):

1 clock = 1 pixel

1 clock cycle period = 40 ns

Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display the information for 1 pixel on each and every clock, we can attain the minimum 60 Hz refresh rate. Do not use a clock frequency of less than 25 MHz.

Ex:

Clock frequency    Time period    No. of frames/sec
15 MHz             More           Less (it is unable to display 60 frames/sec)
25 MHz             (effectively utilized frequency)
30 MHz             Less           More

As the clock frequency increases, the time period decreases and the number of frames/sec increases.

Signals used for image display:

• R (Red)

• G (Green)

• B (Blue)

• HSYNC (Horizontal Synchronization signal)

• VSYNC (Vertical Synchronization signal)

All the above signals are used to control the image data so that it is correctly displayed on the VGA display monitor.

7.2 Timings for front porch, sync pulse, back porch

               Horizontal sync    Vertical sync
Front porch    0.94 µs            0.35 ms
Sync pulse     3.77 µs            0.063 ms
Back porch     1.89 µs            1.02 ms

8 FUNCTIONAL DESCRIPTION

8.1 Symbol

Fig: VGA Controller Symbol

Pin description

The pin description is shown in the table below.

Table: VGA Pin description

Name | Type | Polarity | Bus size | Description

Global Control Signals

CLK | I | Rising | | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows. 640x480: CLK = 25 MHz gives Refresh = 60 Hz; CLK = 30 MHz gives Refresh = 72 Hz. 800x600: CLK = 40 MHz gives Refresh = 60 Hz; CLK = 50 MHz gives Refresh = 72 Hz.

RST | I | High | | Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | | This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD | I | | 2 | This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H | I | | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V | I | | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD | O | High | | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.

DATA | I | | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows. Black & White mode: 8 1-bit pixels. 16 Colors mode: 2 4-bit pixels. 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL | O | | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive – once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC | O | Falling | | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC | O | Falling | | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R0, R1 | O | High/Low | | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G0, G1 | O | High/Low | | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B0, B1 | O | High/Low | | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written, by the host microcontroller, into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode – one byte of data in RAM contains 8 1-bit pixels

• in 16 Colors mode – one byte of data in RAM contains 2 4-bit pixels

• in 64 Colors mode – one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V) and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
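As a worked illustration of the formula, the hypothetical package below computes the picture size in bytes, rounding up so that a partly filled last byte is still counted. The package and function names are assumptions made for this sketch.

```vhdl
-- Hypothetical package: picture memory size from the formula above.
package picture_mem_sketch is
  -- Bytes needed for a picture, rounded up to a whole byte.
  function picture_bytes (pix_total, lin_total, pixels_per_byte : positive)
    return natural;
end package picture_mem_sketch;

package body picture_mem_sketch is
  function picture_bytes (pix_total, lin_total, pixels_per_byte : positive)
    return natural is
  begin
    return (pix_total * lin_total + pixels_per_byte - 1) / pixels_per_byte;
  end function picture_bytes;
end package body picture_mem_sketch;
```

For a full 640x480 frame this gives 38400 bytes in Black & White mode (8 pixels per byte), 153600 bytes in 16 Colors mode (2 pixels per byte) and 307200 bytes in 64 Colors mode (1 pixel per byte).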

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit – hvcnt – which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address – the byte of data containing the next 8 pixels – is determined by using bits 18..3 of hvcnt and right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address – the byte of data containing the next 2 pixels – is determined by using bits 18..1 of hvcnt and right shifting the contents by one:

ADDR_PIXEL = '0' & hvcnt[18..1]

64 Colors mode

The picture memory address – the byte of data containing the next pixel – is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
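The three addressing cases can be sketched as one combinational selection on CMOD. This is an illustration under assumptions: the entity name and port names are hypothetical, the result is zero-extended to the full 19-bit ADDR_PIXEL width by using shift_right, and the unused CMOD value "11" is simply treated like 64 Colors mode.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: byte address of the next pixel for each palette.
entity pixel_addr is
  port (
    cmod       : in  std_logic_vector(1 downto 0);  -- color palette select
    hvcnt      : in  unsigned(18 downto 0);         -- pixel position in frame
    addr_pixel : out unsigned(18 downto 0));        -- byte address in memory
end entity pixel_addr;

architecture rtl of pixel_addr is
begin
  with cmod select
    addr_pixel <= shift_right(hvcnt, 3) when "00",    -- B&W: 8 pixels per byte
                  shift_right(hvcnt, 1) when "01",    -- 16 Colors: 2 pixels per byte
                  hvcnt                 when others;  -- 64 Colors: 1 pixel per byte
end architecture rtl;
```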

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register, PIXREG, is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal+1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal+1)) and (VCOUNT < Frame Period)) then

    enex = 0

Else

    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen – Black & White, 16 Colors, or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal – received from the Synchronization Unit – are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit, on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal – received from the Synchronization Unit – is a 0.

For the currently loaded byte, the active pixel is always in the low order nibble of the Pixel register. Remember that in this mode, each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits, on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
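The load and shift behavior of the Pixel register in the three palette modes can be sketched as below. The entity name pixreg_sketch, the port names and the right-aligned 6-bit pixel output are assumptions made for illustration; the RD gating through enex is left out to keep the sketch short.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: load/shift control of the Pixel register (PIXREG).
entity pixreg_sketch is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;                      -- assumed active high
    cmod  : in  std_logic_vector(1 downto 0);   -- color palette select
    hvcnt : in  unsigned(18 downto 0);          -- from the Synchronization Unit
    data  : in  std_logic_vector(7 downto 0);   -- byte read from picture memory
    pixel : out std_logic_vector(5 downto 0));  -- active pixel, right-aligned
end entity pixreg_sketch;

architecture rtl of pixreg_sketch is
  signal pixreg : std_logic_vector(7 downto 0) := (others => '0');
begin
  process (clk, rst)
  begin
    if rst = '1' then
      pixreg <= (others => '0');
    elsif rising_edge(clk) then
      case cmod is
        when "00" =>                              -- Black & White
          if hvcnt(2 downto 0) = 0 then
            pixreg <= data;                       -- load the next 8 pixels
          else
            pixreg <= '0' & pixreg(7 downto 1);   -- next pixel into bit 0
          end if;
        when "01" =>                              -- 16 Colors
          if hvcnt(0) = '0' then
            pixreg <= data;                       -- load the next 2 pixels
          else
            pixreg <= "0000" & pixreg(7 downto 4);  -- second pixel into low nibble
          end if;
        when others =>                            -- 64 Colors
          pixreg <= data;                         -- one pixel per byte
      end case;
    end if;
  end process;

  -- The active pixel sits in the low-order bits of the register.
  with cmod select
    pixel <= "00000" & pixreg(0)          when "00",
             "00"    & pixreg(3 downto 0) when "01",
             pixreg(5 downto 0)           when others;
end architecture rtl;
```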

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register, RGBREG, is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital to analog converter circuitry, located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large, on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
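Besides the CORE Generator system or direct instantiation, a synchronous memory can also be described behaviorally. The sketch below is a generic single-port RAM with a clocked read, written in a style that synthesis tools commonly map to block RAM; whether a given tool actually does so is an assumption that should be checked in the synthesis report. The entity name picture_ram and its generics are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: single-port synchronous RAM, 2048 x 8 by default
-- (16K data bits, the data capacity of one block RAM).
entity picture_ram is
  generic (
    ADDR_WIDTH : positive := 11;
    DATA_WIDTH : positive := 8);
  port (
    clk  : in  std_logic;
    en   : in  std_logic;   -- clock enable
    we   : in  std_logic;   -- write enable
    addr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    din  : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
    dout : out std_logic_vector(DATA_WIDTH - 1 downto 0));
end entity picture_ram;

architecture rtl of picture_ram is
  type ram_t is array (0 to 2**ADDR_WIDTH - 1)
    of std_logic_vector(DATA_WIDTH - 1 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: needs a clock edge and the clock enable.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```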


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

• Interoperability

• Technology independence

• Design reuse

• Several levels of abstraction

• Readability

• Standard language

• Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

• Design specification language

• Design entry language

• Design simulation language

• Design documentation language

• An alternative to schematics

10.1 Brief History

• VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

• Funded by the US Department of Defence.

• The first publicly available version was released in 1985.

• In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

• In 1987, standardization => IEEE 1076-1987.

• An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

• IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

• Soon after IEEE 1076-1987 was released, simulator companies began using their own, non-standard types => VHDL was becoming non-standard.

• The IEEE 1164 standard was developed by the IEEE.

• IEEE 1164 contains definitions for a nine-valued data type, std_logic.

• IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

– Defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

• A black box with an interface definition

• Defines the inputs/outputs of a component (defines pins)

• A way to represent modularity in VHDL

• Similar to a symbol in a schematic

• An entity declaration describes an entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

• Provide channels of communication between the component and its environment

• Each port must have a name, a direction and a type

• An entity may have NO port declaration

Port directions

• In: A value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

• Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

• In out: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

• Buffer: An out port with read capability. May have at most one assignment. (Buffer ports are not recommended.)

Architectures

• Every entity has at least one architecture

• One entity can have several architectures

• Architectures can describe a design using:

– Behavior

– Structure

– Dataflow

• Architectures can describe a design on many levels:

– Gate level

– RTL (Register Transfer Level)

– Behavioral level

• A configuration declaration links an architecture to an entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

• Link the entity declaration and architecture body together

• The concept of default configuration is a bit messy in VHDL '87

– The last architecture analyzed links to the entity

• Can be used to change simulation behavior without re-analyzing the VHDL source

• Complex configuration declarations are ignored in synthesis

• Some entities can have, e.g., a gate level architecture and a behavioral architecture

• Are always optional

Packages

Packages contain information common to many design units.

1. Package declaration

– constant declarations

– type and subtype declarations

– function and procedure declarations

– global signal declarations

– file declarations

– component declarations

2. Package body

– is not necessarily needed

– function bodies

– procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units. A package consists of a declaration part and an optional body part.

A package declaration can contain:

– type and subtype declarations

– subprograms

– constants

– alias declarations

– global signal declarations

– file declarations

– component declarations

A package body consists of:

– subprogram declarations and bodies

– type and subtype declarations

– deferred constants

– file declarations

Libraries

A collection of VHDL design units (database):

1. Packages

– package declaration

– package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

• Usually a directory in a UNIX file system

• Can also be any other kind of database

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the HW. It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

• Structural

• Dataflow

• Behavioral

Structural VHDL description

• The circuit is described in terms of its components.

• From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).

• For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

• The circuit is described in terms of how data moves through the system.

• In the dataflow style you describe how information flows between registers in the system.

• The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

• The behavior of the system over time is defined by registers.

• There are no built-in registers in the VHDL language:

– either a lower level description

– or a behavioral description of sequential elements is needed.

• The lower level register descriptions must be created or obtained.

• If there are no 3rd party models for registers => you must write the behavioral description of registers.

• The behavioral description can be provided in the form of subprograms (functions or procedures).

10.4 Behavioral VHDL Description

• The circuit is described in terms of its operation over time.

• The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

• The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

• If no actual delay is used, the order of sequential operations is defined.

• In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

• The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

• There are a few tools for behavioral synthesis.

Concurrent Vs Sequential

Processes

• Basic simulation concept in VHDL

• A VHDL description can always be broken up into interconnected processes

• Quite similar to a Unix process

• Process keyword in VHDL

• A process statement is a concurrent statement

• Statements inside process statements are sequential statements

• A process must contain either a sensitivity list or wait statement(s), but NOT both

• The sensitivity list or wait statement(s) contain the signals which wake the process up

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
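As a concrete instance of the general format, the sketch below describes a register behaviorally with a single process and a sensitivity list, which is also how the missing built-in registers mentioned earlier are usually written. The entity name dff and its port names are chosen for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative D flip-flop with asynchronous, active-high reset.
entity dff is
  port (
    clk : in  std_logic;
    rst : in  std_logic;
    d   : in  std_logic;
    q   : out std_logic);
end entity dff;

architecture rtl of dff is
begin
  -- The process wakes up whenever clk or rst changes (sensitivity list),
  -- so it contains no wait statement.
  process (clk, rst)
  begin
    if rst = '1' then
      q <= '0';
    elsif rising_edge(clk) then
      q <= d;  -- statements inside the process execute sequentially
    end if;
  end process;
end architecture rtl;
```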

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;  -- board clock, divided by two to form the pixel clock
    rstn   : in  std_logic;  -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  -- Picture memory (1 bit per pixel)
  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;

  -- Synchronization unit: generates hsync/vsync and the counters
  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;
      c_hs       : out integer);
  end component;

  -- Pixel unit: reads picture memory and drives the color outputs
  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the 50 MHz input clock by two to obtain the pixel clock.
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  -- sinit is driven directly by the active-low rstn here; check that this
  -- matches the polarity the memory core expects.
  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer
  );
end sync_unit;

architecture arch_sync_unit of sync_unit is

  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;

begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks 0..793 within a line
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s, 0..515
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: low for the first 96 clocks of each line
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse
  -- Vertical Enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical sync, front porch, back porch
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- One-clock delayed copy of v_en_s, used to detect the start of a frame
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
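The counter limits in the synchronization block fix the line and frame timing. As a rough cross-check, the following Python sketch recomputes those timings from the constants in the listing. It assumes the pixel clock is 25 MHz (the 50 MHz input divided by two in the top-level unit); the function name `sync_timing` is only illustrative and is not part of the design.

```python
# Counter limits copied from the sync_unit listing.
H_MAX = 793                        # c_hs_s counts 0..793
HSYNC_LAST = 95                    # hsync is low while c_hs_s <= 95
H_EN_FIRST, H_EN_LAST = 143, 778   # window in which h_en_s is set
V_MAX = 515                        # line_count_s counts 0..515

PIXEL_CLOCK_HZ = 25_000_000        # assumption: 50 MHz board clock divided by two


def sync_timing(pixel_clock_hz=PIXEL_CLOCK_HZ):
    """Return line and frame timing implied by the counter limits."""
    clocks_per_line = H_MAX + 1
    lines_per_frame = V_MAX + 1
    t_clk = 1.0 / pixel_clock_hz
    return {
        "clocks_per_line": clocks_per_line,
        "lines_per_frame": lines_per_frame,
        "line_time_us": clocks_per_line * t_clk * 1e6,
        "hsync_width_us": (HSYNC_LAST + 1) * t_clk * 1e6,
        "h_enable_clocks": H_EN_LAST - H_EN_FIRST + 1,
        "frame_rate_hz": pixel_clock_hz / (clocks_per_line * lines_per_frame),
    }


if __name__ == "__main__":
    for name, value in sync_timing().items():
        print(name, value)
```

Under these assumptions a line lasts 794 clocks (about 31.76 µs) and a frame 516 lines, which gives a refresh rate of roughly 61 Hz, close to the nominal 60 Hz VGA mode.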


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is

  signal addr_s : std_logic_vector(17 downto 0);

begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: start of a new frame, restart the address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Picture area: read the next pixel from memory
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Right of the picture area: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the picture area: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last pixel of the stored picture
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;
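The address wrap value in the pixel generator can be related to the size of the stored picture. The short Python sketch below does that arithmetic from the constants in the two listings. It assumes that h_en, being registered, is seen high for c_hs = 144..779, and it ignores the one-clock register delays at line boundaries; the helper names are illustrative only.

```python
# Constants taken from the sync_unit and pixelgen listings.
H_EN_FIRST_SEEN = 144      # assumption: registered h_en is high for c_hs = 144..779
LAST_READ_COLUMN = 655     # pixelgen reads memory while c_hs <= 655
FIRST_ACTIVE_LINE = 35     # v_en is set from line_count = 35
LAST_READ_LINE = 514       # pixelgen reads memory while line_count <= 514
ADDR_WRAP = int("111011111111111111", 2)   # address compared against in pixelgen


def pixels_per_line():
    """Number of clocks per line in which a pixel is fetched."""
    return LAST_READ_COLUMN - H_EN_FIRST_SEEN + 1


def lines_per_frame():
    """Number of lines per frame in which pixels are fetched."""
    return LAST_READ_LINE - FIRST_ACTIVE_LINE + 1


def last_address():
    """Highest memory address touched in one frame."""
    return pixels_per_line() * lines_per_frame() - 1


if __name__ == "__main__":
    print(pixels_per_line(), lines_per_frame(), last_address(), ADDR_WRAP)
```

With these assumptions the picture is 512 x 480 pixels, and its last address, 245759, equals the binary wrap constant used in the listing, so the 18-bit address bus is sufficient.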


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates like AND and XOR. In most FPGAs the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable interconnects
between gates, the beginnings of a new technology and market. The XC2064
boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup
tables (LUTs). More than 20 years later, Freeman was entered into the
National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share.
By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication
and in the volume of production. In the early 1990s, FPGAs were primarily
used in telecommunications and networking. By the end of the decade, FPGAs
had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a
step further by combining the logic blocks and interconnects of traditional
FPGAs with embedded microprocessors and related peripherals to form a
complete "system on a programmable chip". This work mirrors the architecture
by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which
combined a reconfigurable CPU architecture on a single chip called the SB24.
That work was done in 1982. Examples of such hybrid technologies can be
found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or
more PowerPC processors embedded within the FPGA's logic fabric. The Atmel
FPSLIC is another such device, which uses an AVR processor in combination
with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of
"soft" processor cores that are implemented within the FPGA logic. (See
"Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at run time, but instead adapts
itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A
combination of volume, fabrication improvements, research and development,
and the I/O capabilities of new supercomputers has largely closed the
performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can
also take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing number
of higher-volume applications than they have historically been used for,
which the company credits for the growing number of FPGA design starts (see
History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD
has a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result of this is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the state
where some are now marketed as full systems on chips (SoC). Particularly
with the introduction of dedicated multipliers into FPGA architectures in
the late 1990s, applications which had traditionally been the sole reserve
of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4-8
hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware cost per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
routing channels have the same width (number of wires). Multiple I/O pads
may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or
the width of one column, as shown below. All the routing channels have the
same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
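The behaviour of such a logic block can be sketched in a few lines of Python. This is only an illustrative software model, not vendor code: the class name `LogicBlock`, the input ordering (input `a` is the least significant index bit), and the flip-flop reset value of 0 are all assumptions made for the example.

```python
class LogicBlock:
    """Toy model of a classic logic block: a 4-input LUT plus one flip-flop."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 entries")
        self.truth_table = list(truth_table)
        self.registered = registered   # selects registered or unregistered output
        self.ff = 0                    # assumed reset value of the flip-flop

    def lut(self, a, b, c, d):
        """Combinational lookup; input a is the least significant index bit."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Model a rising clock edge: capture the LUT output in the flip-flop."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """The single block output: flip-flop value or direct LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


# Configure the LUT as a 4-input XOR (odd parity of the index bits).
XOR4_TABLE = [bin(i).count("1") & 1 for i in range(16)]

if __name__ == "__main__":
    block = LogicBlock(XOR4_TABLE)
    print(block.output(1, 0, 0, 0), block.output(1, 1, 0, 0))
```

Changing only the 16-entry table turns the same block into any other function of four inputs, which is what makes the LUT the basic configurable element.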

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number one connects only to wires in
track number one in adjacent channel segments, wires in track number 2
connect only to other wires in track number 2, and so on. The figure below
illustrates the connections in a switch box.
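The planar connection rule can be written down as a small Python function. This is a sketch of the rule as described in the text only; the side names and the function name `planar_switch_targets` are assumptions chosen for illustration.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_targets(side, track, channel_width):
    """Wires reachable from (side, track) in a planar switch box.

    A wire entering on one side can connect, through three programmable
    switches, to the wire with the same track number on each other side.
    """
    if side not in SIDES:
        raise ValueError("unknown side: %r" % (side,))
    if not 0 <= track < channel_width:
        raise ValueError("track number outside the channel")
    return [(other, track) for other in SIDES if other != side]


if __name__ == "__main__":
    print(planar_switch_targets("west", 1, 4))
```

Because the track number never changes inside the box, a signal that starts on a given track stays in that track "domain" along its whole route, which is why this topology is also called domain-based.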

Switch box topology

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to just specify
them numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device like
an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally the design is laid out in the FPGA, at
which point propagation delays can be added and the simulation run again with
these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices can
be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased, even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Hardware verification together with early software and firmware co-design
has become mainstream, and prototyping SoC and ASIC designs on an FPGA has
become a good method to do this.


Reasons why prototyping is important:

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct. This is compared to designers relying only on
software simulations to verify that their hardware design is sound, where
simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows for
much more time for software development and testing at the software-hardware
integration stage. This helps expose many unforeseen software bugs that
appear due to today's array of operating systems, applications, and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies
on the system work well together, off the simulation stage and in actual
form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.


13 SIMULATION RESULTS

[Figure: gate-level view of the VGA controller]

[Figure: gate-level view of the sync unit]

[Figure: technology schematic]

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated
(pixelated) picture is taken from memory space and the Controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.


REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. Eduardo Sanchez, "A VGA display controller"

3. Hinner, "VGA Interface and video signal documents"


Page 10: vga

The color generated for a pixel in the Pixel Unit depends on whether the
particular pixel is required to be blanked or not. The Synchronization Unit
provides a signal to the Pixel Unit for this very reason. This is the line
display enable signal, en. The signal is checked on each rising edge of the
external clock signal (CLK) and is set as follows:

If (HCOUNT ≥ PixTotal) or (VCOUNT ≥ LinTotal) then

en = 0 (pixel requires to be blanked - set color to be black)

Else

en = 1 (pixel is a viewable pixel - generate RGB color accordingly)
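The rule above can be expressed as a small Python function. This is a behavioural sketch only; the function name `display_enable` is an assumption, while the argument names follow the symbols used in the text.

```python
def display_enable(hcount, vcount, pix_total, lin_total):
    """Return the line display enable signal en for one pixel position.

    en = 0 means the pixel must be blanked (black);
    en = 1 means the pixel is viewable and an RGB color is generated.
    """
    if hcount >= pix_total or vcount >= lin_total:
        return 0
    return 1


if __name__ == "__main__":
    # Example for a 640x480 visible area.
    print(display_enable(0, 0, 640, 480))      # first visible pixel
    print(display_enable(640, 0, 640, 480))    # just past the end of a line
```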

6 SIGNALS IN VGA CONTROLLER

For the most common VGA mode, 640x480 at 60 Hz non-interlaced, the horizontal
timings are:

Parameter                  Value      Unit
Clock frequency            25.175     MHz
Horizontal pixels          640
Horizontal sync polarity   Negative
Total time for each line   31.77      µs
Front porch (A)            0.94       µs
Sync pulse length (B)      3.77       µs
Back porch (C)             1.89       µs
Active video (D)           25.17      µs

(Total horizontal sync time: 6.60 µs)

6.1 The vertical timings are:

Parameter                  Value      Unit
Vertical lines             480
Vertical sync polarity     Negative
Vertical frequency         59.94      Hz
Front porch (E)            0.35       ms
Sync pulse length (F)      0.06       ms
Back porch (G)             1.02       ms
Active video (H)           15.25      ms

(Total vertical sync time: 1.43 ms)

640 x 400 at 70 Hz is the video mode used for booting most x86 personal
computers.

640 x 480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

It should be noted that the actual timings vary slightly. For example, for
640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of
25.174 MHz gives 633 pixels rather than the expected 640 pixels.
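The conversion between a timing interval and a number of pixel clocks is a single multiplication, shown in the Python sketch below. The function name `clocks_in` is an assumption; the numeric inputs are the values from the timing tables and the remark above.

```python
def clocks_in(duration_us, pixel_clock_mhz):
    """Number of pixel clocks that fit into an interval.

    Microseconds multiplied by megahertz gives a plain count of clock periods.
    """
    return duration_us * pixel_clock_mhz


if __name__ == "__main__":
    # Active video time from the remark above, truncated to whole pixels.
    print(int(clocks_in(25.17, 25.174)))
    # Whole line and sync pulse from the horizontal timing table, rounded.
    print(round(clocks_in(31.77, 25.175)))
    print(round(clocks_in(3.77, 25.175)))
```

The whole line comes out at about 800 clocks and the sync pulse at about 95, so the quoted active time falling a few pixels short of 640 is a rounding effect of the tabulated microsecond values.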

6.2 Generation of the horizontal synchronization signal - HSYNC

The HSYNC signal is High (inactive) after an external reset signal (RST) is
received by the VGA Controller. The signal is updated on each rising edge of
the external clock signal (CLK).

The state of the HSYNC signal depends on the value stored in the horizontal
counter and is driven low when:

HCOUNT ≥ (PixTotal + BlankingLeft)

and remains low while:

HCOUNT < (PixTotal + BlankingLeft + TLSync)

Vertical (Frame) Period

6.3 Generation of the vertical synchronization signal - VSYNC

The VSYNC signal is High (inactive) after an external reset signal (RST) is
received by the VGA Controller. The signal is updated after every line of
pixels is completed (i.e. on each rising edge of the HSYNC signal).

The state of the VSYNC signal depends on the value stored in the vertical
counter and is driven low when:

VCOUNT ≥ (LinTotal + BlankingLeft)

and remains low while:

VCOUNT < (LinTotal + BlankingLeft + TFSync)
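Both sync signals follow the same window rule, so one Python function can model either of them. This is a sketch, not the Controller's implementation: the function name `sync_level` is an assumption, and the numbers in the usage example (16 clocks of blanking, a 96-clock pulse) are purely illustrative values, not taken from this document.

```python
def sync_level(count, total, blanking, sync_length):
    """Level of an active-low sync signal for a given counter value.

    The signal is low while
        total + blanking <= count < total + blanking + sync_length
    and high (inactive) otherwise. For HSYNC pass HCOUNT, PixTotal,
    BlankingLeft and TLSync; for VSYNC pass VCOUNT, LinTotal,
    BlankingLeft and TFSync.
    """
    start = total + blanking
    return 0 if start <= count < start + sync_length else 1


if __name__ == "__main__":
    # Illustrative values only: 640 visible pixels, 16 blanking, 96 sync.
    for hcount in (0, 655, 656, 751, 752):
        print(hcount, sync_level(hcount, 640, 16, 96))
```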

Address Counter (MEMADDR)

The address counter is used to store the position of the next consecutive
pixel in the frame. Its value is passed to the Pixel Unit on the internal bus
signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain
the next pixel from picture memory.

The counter is reset to zero when the VGA Controller receives an external
reset signal (RST). The size of the counter depends on the values chosen for
DISPSIZE_H and DISPSIZE_V, as the range is simply:

0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each
of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199
• 800x600: 0 to 479999

The counter has 19-bit resolution.

While the value in the horizontal counter (HCOUNT) is less than the total
number of viewable pixels in a line (PixTotal, the integer value of
DISPSIZE_H), the counter is incremented on the rising edge of the external
clock signal (CLK). Pixel addressing within the frame is consecutive. When
the counter reaches the last pixel in a line, its incremented value is the
first pixel in the next line down.

The address counter will continue to be incremented until the value in the
vertical counter (VCOUNT) is greater than or equal to the total number of
viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At
this point, it will be rolled over to zero.
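The counter range and the 19-bit width quoted above follow directly from the display size, as the Python sketch below shows. The function name `address_range` is an assumption made for the example.

```python
def address_range(pix_total, lin_total):
    """Return (last_address, bits_needed) for a pix_total x lin_total frame.

    Addresses run from 0 to (pix_total * lin_total) - 1.
    """
    last_address = pix_total * lin_total - 1
    return last_address, last_address.bit_length()


if __name__ == "__main__":
    print(address_range(640, 480))
    print(address_range(800, 600))
```

Both supported resolutions need more than 18 bits (2^18 = 262144 addresses) and fit within 19 bits (2^19 = 524288 addresses).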


7 VGA Controller

The VGA Controller provides a simple 8-bit interface between a host
microcontroller and any VGA-compatible monitor. This variant of the
Controller provides six modes of display, depending on the resolution chosen
(640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black &
White, 16 Colors, or 64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum 60 Hz refresh rate:

1 sec = 60 frames
1 frame = max 1/60 sec
1 pixel = (1/60 sec) / 307200 ≈ 54 ns (about 50 ns)
Max time for 1 pixel = 40 ns (taking 40 ns leaves margin for the blanking
intervals)

2) To send information for each and every pixel of a frame by means of RGB
signals.

3) After completely displaying 1 row of information, the column must be
refreshed and the row has to be incremented, i.e. done by the Hsync signal.

4) After completing 1 frame of information, a new frame has to be started;
for this, row and column are to be refreshed, i.e. done by the Vsync signal.

=> To give the information for 1 pixel we use 1 clock (for controlling all
operations):

1 clock = 1 pixel
1 clock cycle period = 40 ns
Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display the information for 1 pixel on each
and every clock, then we can attain a minimum 60 Hz refresh rate. Do not use
a clock frequency of less than 25 MHz.

Ex:

Clock frequency   Time period   No. of frames/sec
15 MHz            more          less (it is unable to display 60 frames/sec)
25 MHz            40 ns         effectively utilized frequency
30 MHz            less          more

As the clock frequency decreases, the time period increases and the number of
frames per second decreases.
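The arithmetic behind the 25 MHz figure can be checked with a few lines of Python. The function names are assumptions, and the totals of 800 clocks per line and 525 lines per frame used in the last call are the standard 640x480 totals including blanking, which this section does not state explicitly.

```python
def max_pixel_time_ns(width, height, refresh_hz):
    """Longest time per pixel if only visible pixels had to be sent."""
    return 1e9 / (refresh_hz * width * height)


def clock_mhz(period_ns):
    """Clock frequency in MHz for a given clock period in ns."""
    return 1e3 / period_ns


def refresh_rate_hz(clock_hz, clocks_per_line, lines_per_frame):
    """Frame rate when every line and frame also includes blanking time."""
    return clock_hz / (clocks_per_line * lines_per_frame)


if __name__ == "__main__":
    print(max_pixel_time_ns(640, 480, 60))        # upper bound per pixel
    print(clock_mhz(40))                          # chosen 40 ns period
    # Assumption: 800 x 525 total clocks and lines, including blanking.
    print(refresh_rate_hz(25e6, 800, 525))
    print(refresh_rate_hz(15e6, 800, 525))
```

Under these assumptions a 25 MHz clock gives a refresh rate just under 60 Hz, while a 15 MHz clock falls to about 36 Hz, which is the case marked as unable to display 60 frames/sec.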

Signals used for image display:

R (Red)
G (Green)
B (Blue)
HSYNC (Horizontal synchronization signal)
VSYNC (Vertical synchronization signal)

All the above signals are used to control the image data so that it is
correctly displayed on the VGA display monitor.

7.2 Timings for front porch, sync pulse, back porch

               Horizontal sync   Vertical sync
Front porch    0.94 µs           0.35 ms
Sync pulse     3.77 µs           0.063 ms
Back porch     1.89 µs           1.02 ms


8 FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below.

Table: VGA Pin description

Each entry lists Name, Type, Polarity, and Bus size, followed by the
Description.

Global Control Signals

CLK    Type: I    Polarity: Rising
    Global system clock. This clock determines the maximum rate at which
    pixels can be sent to the monitor. The frequency of the clock signal
    determines the refresh rate as follows:
    640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz
    800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz

RST    Type: I    Polarity: High
    Global system reset.

VGA Input Settings Signals

RESOLUTION    Type: I    Polarity: High/Low
    This input selects the screen resolution to be used:
    1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD    Type: I    Bus size: 2
    This input selects the color palette to be used:
    00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H    Type: I    Bus size: 10
    This input determines the number of viewable pixels to be displayed in
    each line of a frame, and is therefore used to control the horizontal
    extents of the visible display area.

DISPSIZE_V    Type: I    Bus size: 10
    This input determines the number of lines to be displayed in a frame, and
    is therefore used to control the vertical extents of the visible display
    area.

Data Memory Control Signals

RD    Type: O    Polarity: High
    This is the enable signal when data is required to be read from the
    memory space. This signal is controlled by, and follows, the internal
    line enable signal, en, generated by the Synchronization Unit of the
    Controller.

DATA    Type: I    Bus size: 8
    Data input from picture memory space. Data is stored in memory in bytes,
    the content of which depends on the chosen color palette, as follows:
    Black & White mode: 8 1-bit pixels
    16 Colors mode: 2 4-bit pixels
    64 Colors mode: 1 6-bit pixel (DATA[5..0])
    Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL    Type: O    Bus size: 19
    Specifies the address of the next pixel in picture memory. Addresses are
    consecutive: once the end of the current line has been reached, the next
    address is that of the pixel at the start of the next line down in the
    frame.

VGA Monitor Control Signals

HSYNC    Type: O    Polarity: Falling
    Horizontal synchronization signal. This signal is used to control the
    horizontal deflection circuit in the VGA monitor, so that the start and
    end of a line of pixels is correctly displayed across the visible display
    area of the screen. The horizontal size of the display area is controlled
    by the DISPSIZE_H input to the Controller.

VSYNC    Type: O    Polarity: Falling
    Vertical synchronization signal. This signal is used to control the
    vertical deflection circuit in the VGA monitor, so that the start and end
    of a frame (of lines) is correctly displayed between the top and bottom
    edges of the visible display area of the screen. The vertical size of the
    display area is controlled by the DISPSIZE_V input to the Controller.

R1    Type: O    Polarity: High/Low
    Provides the 2-bit digital signal for the intensity of red used in
    composing a pixel's displayed color. These two signals are inputs to a
    simple 2-bit DAC (external to the Controller) that provides the analog
    signal required by the VGA monitor.

G1    Type: O    Polarity: High/Low
    Provides the 2-bit digital signal for the intensity of green used in
    composing a pixel's displayed color. These two signals are inputs to a
    simple 2-bit DAC (external to the Controller) that provides the analog
    signal required by the VGA monitor.

S

B1 O High Low Provides the 2-bit digital signal for the intensity of blue used in composing a pixels displayed color These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor

20

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode: one byte of data in RAM contains 8 1-bit pixels

• in 16 Colors mode: one byte of data in RAM contains 2 4-bit pixels

• in 64 Colors mode: one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V) and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
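As a worked example of this formula, the following minimal Python sketch computes the picture memory for a full 640x480 frame in each palette. The mode labels ("bw", "16", "64") and the function name are illustrative assumptions; the pixels-per-byte values come from the palette description above.

```python
# Minimal sketch of: memory = (PixTotal x LinTotal) / pixels per byte.
# The mode labels are illustrative; the pixels-per-byte values are those
# given for the Black & White, 16 Colors and 64 Colors palettes.

PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes needed to store one frame, rounded up to a whole byte."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return (pixels + per_byte - 1) // per_byte


if __name__ == "__main__":
    for mode in ("bw", "16", "64"):
        print(mode, picture_memory_bytes(640, 480, mode), "bytes")
```

For 640x480 this gives 38,400 bytes in Black & White mode, 153,600 bytes in 16 Colors mode and 307,200 bytes in 64 Colors mode.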

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt, which is equivalent to shifting its contents right by three:

ADDR_PIXEL = "000" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt, which is equivalent to shifting its contents right by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
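The three addressing rules amount to a right shift of the 19-bit pixel counter by 3, 1 or 0 bits. The following minimal Python sketch illustrates this; the mode labels and the function name pixel_address are hypothetical, and hvcnt is modeled as a plain integer.

```python
# Minimal sketch: byte address of the pixel selected by the counter hvcnt.
# Mode labels are illustrative; hvcnt is treated as a 19-bit unsigned value.

HVCNT_MASK = (1 << 19) - 1           # hvcnt[18..0]
ADDRESS_SHIFT = {"bw": 3, "16": 1, "64": 0}


def pixel_address(hvcnt, mode):
    """Return ADDR_PIXEL for the given pixel counter and color mode."""
    return (hvcnt & HVCNT_MASK) >> ADDRESS_SHIFT[mode]


if __name__ == "__main__":
    # Pixels 16..23 share one byte in Black & White mode.
    print([pixel_address(n, "bw") for n in range(16, 24)])
```

Dropping the low-order bits is what makes consecutive pixels share a byte: eight pixels per address in Black & White mode, two in 16 Colors mode, one in 64 Colors mode.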

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then

    enex = 0

Else

    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
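The enex condition can be sketched directly in Python to see which counter values disable memory reads. This is an illustrative model only: the function signature is hypothetical, and the numeric values used in the example (640, 480, 800, 525) are assumed figures for a 640x480 frame rather than values taken from the Controller.

```python
# Minimal sketch of the enex definition above.
# The example values for PixTotal, LinTotal, Line Period and Frame Period
# are assumptions chosen for illustration.


def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Return 0 while the counters are in a blanking region, else 1."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


if __name__ == "__main__":
    args = dict(pix_total=640, lin_total=480, line_period=800, frame_period=525)
    print(enex(100, 100, **args))   # inside the visible area
    print(enex(700, 100, **args))   # horizontal blanking
    print(enex(100, 500, **args))   # vertical blanking
```

RD follows this value, so reads stop as soon as either counter moves past the visible area.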

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal (received from the Synchronization Unit) are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal (received from the Synchronization Unit) is a "0".

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
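The shift behaviour described for the three modes fixes the order in which pixels leave a byte: lowest bit (or nibble) first. The following minimal Python sketch unpacks one byte accordingly; the mode labels and the function name unpack_pixels are hypothetical.

```python
# Minimal sketch: order in which pixels are taken from one PIXREG byte.
# The active pixel sits in the low-order bits and the register shifts
# right, so the least significant pixel is displayed first.


def unpack_pixels(byte, mode):
    """Return the pixel values held in one byte, in display order."""
    if mode == "bw":                       # 8 pixels of 1 bit each
        return [(byte >> i) & 0x1 for i in range(8)]
    if mode == "16":                       # 2 pixels of 4 bits each
        return [byte & 0xF, (byte >> 4) & 0xF]
    if mode == "64":                       # 1 pixel of 6 bits, bits 7..6 unused
        return [byte & 0x3F]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(unpack_pixels(0b10110001, "bw"))
    print(unpack_pixels(0xA5, "16"))
    print(unpack_pixels(0xFF, "64"))
```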

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
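The figure of 64 colors follows from three channels with four levels each. The minimal Python sketch below enumerates the combinations and lists the four analog levels; it assumes the DAC levels are evenly spaced between 0 V and 0.7 V, which the text does not state, and the function name dac_level_mv is hypothetical.

```python
# Minimal sketch: analog levels of a 2-bit DAC and the resulting color count.
# Assumption: the four levels are evenly spaced between 0 V and 0.7 V.

from itertools import product

FULL_SCALE_MV = 700   # 0.7 V, maximum brightness


def dac_level_mv(code):
    """Return the output level in millivolts for a 2-bit code (0..3)."""
    if not 0 <= code <= 3:
        raise ValueError("code must be a 2-bit value")
    return round(FULL_SCALE_MV * code / 3)


def all_colors():
    """Return every (R, G, B) combination of 2-bit codes."""
    return list(product(range(4), repeat=3))


if __name__ == "__main__":
    print([dac_level_mv(code) for code in range(4)])
    print(len(all_colors()), "color permutations")
```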

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
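These figures make it possible to estimate which devices can hold a frame buffer on chip. The following minimal Python sketch counts the block RAMs needed for a 1-bit-per-pixel 640x480 picture. It assumes only the 16K data bits of each block are used (no parity bits) and that no other part of the design needs block RAM; the function names are hypothetical.

```python
# Minimal sketch: block RAMs needed for a frame buffer, per device.
# Assumptions: only the 16K data bits of each block are used, and the
# frame buffer is the only consumer of block RAM.

DATA_BITS_PER_BLOCK = 16 * 1024

# Total RAM blocks per device, from Table 2.
BLOCKS_AVAILABLE = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def blocks_needed(frame_bits):
    """Return the number of block RAMs needed, rounded up."""
    return -(-frame_bits // DATA_BITS_PER_BLOCK)


def devices_that_fit(frame_bits):
    """Return the devices with enough block RAMs for the frame buffer."""
    need = blocks_needed(frame_bits)
    return [dev for dev, blocks in BLOCKS_AVAILABLE.items() if blocks >= need]


if __name__ == "__main__":
    bits = 640 * 480          # one bit per pixel
    print(blocks_needed(bits), devices_that_fit(bits))
```

Under these assumptions a full 640x480 one-bit frame needs 19 blocks, so the XC3S500E and larger devices can hold it.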


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the U.S. Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

It defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.

Defines the inputs/outputs of a component (defines pins).

A way to represent modularity in VHDL.

Similar to a symbol in a schematic.

The entity declaration describes the entity.

E.g.

entity Comparator is
    port (A, B : in std_logic_vector(7 downto 0);
          EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.

Each port must have a name, a direction and a type.

An entity may have NO port declaration.

Port directions

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using:
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
    EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and the architecture body together.

The concept of a default configuration is a bit messy in VHDL '87:
- The last architecture analyzed links to the entity.

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate-level architecture and a behavioral architecture.

Configurations are always optional.

Packages

Packages contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data that can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

A collection of VHDL design units (a database):

1. Packages: package declaration, package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in a UNIX file system.

Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

Structural

Dataflow

Behavioral

Structural VHDL description

The circuit is described in terms of its components.

This ranges from a low-level description (e.g. a transistor-level description) to a high-level description (e.g. a block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers => you must write the behavioral description of the registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

The process keyword in VHDL

A process statement is a concurrent statement.

Statements inside process statements are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
    clk50m : in std_logic;
    rstn   : in std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

-- Picture memory: 18-bit address, 1 bit per pixel
component memory
    port (
        addr  : IN std_logic_VECTOR(17 downto 0);
        clk   : IN std_logic;
        dout  : OUT std_logic_VECTOR(0 downto 0);
        en    : IN std_logic;
        sinit : IN std_logic);
END component;

component sync_unit port(
    clk        : in std_logic;
    rstn       : in std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;    -- mode matches the sync_unit entity
    c_hs       : out integer);
end component;

component pixelgen port(
    clk        : in std_logic;
    rstn       : in std_logic;
    hsync      : in std_logic;
    vsync      : in std_logic;
    h_en       : in std_logic;
    v_en       : in std_logic;
    v_en_reg   : in std_logic;
    data_in    : in std_logic_vector(0 downto 0);
    line_count : in integer;
    c_hs       : in integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- Divide the 50 MHz board clock by two to obtain the 25 MHz pixel clock
process(clk50m, rstn)
begin
    if (rstn = '0') then
        clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
        clk <= not clk;
    end if;
    --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
);

pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in std_logic;
    rstn       : in std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs <= c_hs_s;
line_count <= line_count_s;
h_en <= h_en_s;
v_en <= v_en_s;

-- hsync counter (counts 0 to 793, i.e. 794 clocks per line)
process(clk, rstn)
begin
    if (rstn = '0') then
        c_hs_s <= 0;
    else
        if (clk = '1' and clk'event) then
            if (c_hs_s = 793) then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s + 1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter (counts 0 to 515, i.e. 516 lines per frame)
process(h_en_s, rstn)
begin
    if (rstn = '0') then
        line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515) then
            line_count_s <= 0;
        end if;
    end if;
    --end if;
end process;

--hsync pulse
process(clk, rstn)
begin
    if (rstn = '0') then
        hsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (c_hs_s <= 95) then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) ----Horizontal Enable (Row Information Will be Sent in this period)
begin
    if (rstn = '0') then
        h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
    if (rstn = '0') then
        v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((line_count_s >= 35 and line_count_s <= 515)) then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) --Vertical Sync, Front Porch, Back Porch
begin
    if (rstn = '0') then
        vsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (line_count_s < 2) then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

-- One-clock delayed copy of v_en, used to detect the start of a frame
process(clk)
begin
    if (clk = '1' and clk'event) then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;
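The horizontal decode in the listing above can be checked with a small software model. The following minimal Python sketch steps the counter through one line and counts how many clocks fall in the sync pulse and in the enabled region. It models only the comparisons on c_hs_s and ignores the one-clock register delay on the outputs, which shifts the signals but does not change their widths; the function names are hypothetical.

```python
# Minimal sketch: widths of the horizontal regions decoded by sync_unit.
# The constants are the comparison limits from the VHDL listing; the
# one-clock output register delay is ignored.

LINE_CLOCKS = 794          # c_hs_s counts 0..793
CLOCK_PERIOD_NS = 40       # 25 MHz pixel clock


def hsync_level(c_hs):
    """hsync is driven low while the counter is 0..95."""
    return 0 if c_hs <= 95 else 1


def h_enable(c_hs):
    """h_en is high while the counter is 143..778."""
    return 1 if 143 <= c_hs <= 778 else 0


def count_line_regions():
    """Return the number of clocks per line, in sync and with h_en high."""
    sync = sum(1 for c in range(LINE_CLOCKS) if hsync_level(c) == 0)
    active = sum(h_enable(c) for c in range(LINE_CLOCKS))
    return {"total": LINE_CLOCKS, "sync": sync, "active": active}


if __name__ == "__main__":
    regions = count_line_regions()
    for name, clocks in regions.items():
        print(f"{name}: {clocks} clocks = {clocks * CLOCK_PERIOD_NS} ns")
```

With these limits the sync pulse lasts 96 clocks (3.84 µs at 25 MHz) and h_en is high for 636 clocks per line.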


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in std_logic;
    rstn       : in std_logic;
    hsync      : in std_logic;
    vsync      : in std_logic;
    h_en       : in std_logic;
    v_en       : in std_logic;
    v_en_reg   : in std_logic;
    data_in    : in std_logic_vector(0 downto 0);
    line_count : in integer;
    c_hs       : in integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
    if (rstn = '0') then
        rd <= '0';
        red <= "0";
        green <= "0";
        blue <= "0";
        addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
        -- Rising edge of v_en: restart the picture address at the top of the frame
        if ((v_en = '1') and (v_en_reg = '0')) then
            addr_s <= (others => '0');
        end if;
        if (v_en = '1' and h_en = '1') then
            if (line_count <= 514) then
                if (c_hs <= 655) then
                    -- Inside the picture: read the next pixel and drive all three colors with it
                    rd <= '1';
                    red <= data_in;
                    green <= data_in;
                    blue <= data_in;
                    addr_s <= addr_s + "000000000000000001";
                else
                    rd <= '0';
                    red <= "0";
                    green <= "1";
                    blue <= "0";
                end if;
            else
                rd <= '0';
                red <= "1";
                green <= "1";
                blue <= "1";
            end if;
            -- Wrap the address after the last picture location
            if (addr_s = "111011111111111111") then
                addr_s <= (others => '0');
            end if;
        else
            -- Blanking: output black
            red <= "0";
            green <= "0";
            blue <= "0";
        end if;
    end if;
    -- end if;
    --end if;
end process;

end behv;
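The wrap-around constant in the pixel generator determines how many picture locations are actually used. The following minimal Python sketch decodes that constant and compares it with the 18-bit address space and with the 16K-bit data capacity of a block RAM; the variable and function names are hypothetical, and mapping the memory core onto 16K x 1 block RAMs is an assumption.

```python
# Minimal sketch: how many 1-bit picture locations the address counter uses.
# WRAP_VALUE is the comparison constant from the pixelgen listing.
# Assumption: the memory core is built from block RAMs used as 16K x 1.

WRAP_VALUE = int("111011111111111111", 2)
ADDRESS_BITS = 18
DATA_BITS_PER_BLOCK = 16 * 1024


def frame_locations():
    """Addresses run from 0 to WRAP_VALUE inclusive before wrapping."""
    return WRAP_VALUE + 1


if __name__ == "__main__":
    used = frame_locations()
    print("locations used:", used)
    print("address space:", 2 ** ADDRESS_BITS)
    print("16K-bit blocks:", used / DATA_BITS_PER_BLOCK)
```

The counter covers 245,760 one-bit locations, which is slightly less than the 262,144 locations an 18-bit address can reach and exactly fifteen blocks of 16K bits.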


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively.

ASIC complexity has increased development time and costs.

R&D resources and headcount are decreasing.

Revenue losses for slow time-to-market are increasing.

Financial constraints in a poor economy are driving low-cost technologies.

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

123 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities and speed
increased, they began to take over larger and larger functions, to the point where
some are now marketed as full systems on chips (SoC). Particularly with the
introduction of dedicated multipliers into FPGA architectures in the late 1990s,
applications which had traditionally been the sole reserve of DSPs began to
incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is code
breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4-8 hours
can be necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware cost per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and
performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably, even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and I/Os
can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.
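Rent's rule is commonly written as T = t * g^p, where T is the number of external terminals of a block of logic, g the number of gates in it, t the average number of terminals per gate and p the Rent exponent. The sketch below only illustrates the shape of that estimate; the default values of t and p are illustrative assumptions, not figures from this report.

```python
def rent_terminals(gates, t=4.0, p=0.6):
    """Estimate external terminals with Rent's rule T = t * gates**p.

    t and p are illustrative assumptions; real values are fitted per design.
    """
    return t * gates ** p


# With p < 1, doubling the logic less than doubles the estimated terminals.
small = rent_terminals(1000)
large = rent_terminals(2000)
print(round(large / small, 3))
```

A routing-hungry design such as a crossbar would correspond to a larger exponent p than a regular systolic array with the same gate count.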

The example FPGA is an array, or island-style, FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or the
width of one column, as shown below. All the routing channels have the same
width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered
LUT output. The logic block has four inputs for the LUT and a clock input.
Since clock signals (and often other high-fanout signals) are normally routed
via special-purpose dedicated routing networks in commercial FPGAs, they
and other signals are separately managed.
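The logic block described above can be modelled behaviourally: the four inputs form an address into 16 configuration bits, and the single output is either that LUT value or the value last captured by the flip-flop. This is a minimal sketch; the class and method names are hypothetical and the input-to-address bit order is an assumption.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs 16 configuration bits")
        self.truth_table = list(truth_table)
        self.registered = registered
        self.ff = 0  # flip-flop state after reset

    def lut(self, a, b, c, d):
        # The four inputs form the address into the 16-entry table
        # (bit order a = LSB is an assumption made for this sketch).
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Rising clock edge: the flip-flop captures the current LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Block output: registered (flip-flop) or unregistered (LUT) value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


# Configure the LUT as a 4-input AND gate: only address 15 holds a 1.
and_gate = LogicBlock([0] * 15 + [1])
print(and_gate.output(1, 1, 1, 1), and_gate.output(1, 0, 1, 1))
```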

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin
can connect to routing wires in both the channel to the right and the channel
below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In
this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in adjacent
channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

Switch box topology
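The planar (domain-based) rule described above is simple enough to state as code: a wire entering on one side can reach the wire with the same track number on each of the three other sides, and nothing else. The side names and the function name below are hypothetical, chosen only for this sketch.

```python
SIDES = ("left", "right", "top", "bottom")


def planar_connections(side, track, channel_width):
    """Return the (side, track) wires reachable from one wire in a planar switch box."""
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= channel_width:
        raise ValueError("track number outside the channel")
    # Three programmable switches: the same track number on the other three sides.
    return [(other, track) for other in SIDES if other != side]


print(planar_connections("left", 2, channel_width=4))
```

Because a signal can never change track number inside such a switch box, the router has to pick one track for the whole path, which is part of why channel width matters for routability.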

Modern FPGA families expand upon the above capabilities to include higher-level
functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high-speed I/O
logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation and firmware development. This allows chip
companies to validate their design before the chip is produced in the factory,
reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures because it is possible to just specify them
numerically rather than having to draw every piece by hand. On the other hand,
schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
such as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction level
through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or a similar license) and
other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.

Basic Process Technology Types

• SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
• Antifuse - One-time programmable. CMOS.
• EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
• EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
• Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
• Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld
focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping an SoC or ASIC design on an FPGA for
hardware verification and early software development.

Verification of hardware designs, together with early software and
firmware co-design, has become mainstream. Prototyping SoC and ASIC
designs on an FPGA has become a good method of doing this.

Reasons why prototyping is important:

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is
functionally correct. The alternative is for designers to rely only on software
simulations to verify that their hardware design is sound, where simulation speed and
modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from
the foundry to start on software tests. FPGA prototyping allows much more
time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear due
to today's array of operating systems, applications and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the
system work well together, off the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients,
bringing in interest early. This speeds up the overall development cycle and
allows for more enhancement or improvement of the chip features than would
otherwise have been possible.

13 SIMULATION RESULTS

Fig: VGA CONTROLLER (gate level)

Fig: SYNC UNIT (gate level)

Fig: TECHNOLOGY SCHEMATIC

14 APPLICATIONS

• Motion tracking systems
• Image processing systems
• Display systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated
(pixelated) picture is taken from memory space and the Controller provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in order
to correctly display the picture on a connected monitor.



6.1 The vertical timings are:

Parameter                 Value      Unit
Vertical lines            480
Vertical sync polarity    Negative
Vertical frequency        59.94      Hz
Front porch (E)           0.35       ms
Sync pulse length (F)     0.06       ms
Back porch (G)            1.02       ms
Active video (H)          15.25      ms

(Total vertical sync time: 1.43 ms)

640 x 400 at 70 Hz is the video mode used for booting most x86 personal
computers.

640 x 480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

It should be noted that the actual timings vary slightly. For example, for
640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of
25.174 MHz gives 633 pixels rather than the expected 640 pixels.
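The 633-pixel figure follows directly from multiplying the active video time by the pixel frequency. A short check, using only the two values quoted above:

```python
active_video_us = 25.17   # active video time per line, in microseconds
pixel_clock_mhz = 25.174  # pixel frequency, in MHz

# microseconds x MHz = number of pixel clock periods that fit in the interval
pixels = active_video_us * pixel_clock_mhz
whole_pixels = int(pixels)
print(whole_pixels)  # 633
```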

6.2 Generation of the horizontal synchronization signal – HSYNC

The HSYNC signal is High (inactive) after an external reset signal
(RST) is received by the VGA Controller. The signal is updated on each rising
edge of the external clock signal (CLK).

The state of the HSYNC signal depends on the value stored in the horizontal
counter and is driven low when:

HCOUNT ≥ (PixTotal + BlankingLeft)

and remains low while:

HCOUNT < (PixTotal + BlankingLeft + TLSync)
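The two comparisons above define a window of counter values during which HSYNC is low. A minimal behavioural sketch follows; the numeric values for PixTotal, BlankingLeft, TLSync and the line length are hypothetical examples, not values taken from this report.

```python
def hsync_level(hcount, pix_total, blanking_left, tl_sync):
    """Return the HSYNC level for a counter value (0 = active low, 1 = inactive)."""
    start = pix_total + blanking_left   # first count at which HSYNC goes low
    end = start + tl_sync               # first count at which it returns high
    return 0 if start <= hcount < end else 1


# Hypothetical example values, chosen only to exercise the function.
PIX_TOTAL, BLANKING_LEFT, TL_SYNC, LINE_PERIOD = 640, 16, 96, 800

low_counts = [h for h in range(LINE_PERIOD)
              if hsync_level(h, PIX_TOTAL, BLANKING_LEFT, TL_SYNC) == 0]
print(low_counts[0], low_counts[-1], len(low_counts))
```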

Vertical (Frame) Period

6.3 Generation of the vertical synchronization signal – VSYNC

The VSYNC signal is High (inactive) after an external reset signal
(RST) is received by the VGA Controller. The signal is updated after every
line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

The state of the VSYNC signal depends on the value stored in the vertical
counter and is driven low when:

VCOUNT ≥ (LinTotal + BlankingLeft)

and remains low while:

VCOUNT < (LinTotal + BlankingLeft + TFSync)

Address Counter (MEMADDR)

The address counter is used to store the position of the next consecutive
pixel in the frame. Its value is passed to the Pixel Unit on the internal bus
signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain
the next pixel from picture memory.

The counter is reset to zero when the VGA Controller receives an external
reset signal (RST). The size of the counter depends on the values chosen for
DISPSIZE_H and DISPSIZE_V, as the range is simply:

0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each
of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199
• 800x600: 0 to 479999

The counter has 19-bit resolution.
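The 19-bit width can be checked from the two ranges above: both maximum values lie between 2^18 and 2^19. A small sketch (the function name is hypothetical):

```python
def counter_bits(pix_total, lin_total):
    """Bits needed to count from 0 to (pix_total * lin_total) - 1."""
    max_value = pix_total * lin_total - 1
    return max(1, max_value.bit_length())


for width, height in ((640, 480), (800, 600)):
    print(width, height, width * height - 1, counter_bits(width, height))
```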

While the value in the horizontal counter (HCOUNT) is less than the
total number of viewable pixels in a line (PixTotal, the integer value of
DISPSIZE_H), the counter is incremented on the rising edge of the external
clock signal (CLK). Pixel addressing within the frame is consecutive. When
the counter reaches the last pixel in a line, its incremented value is the first
pixel in the next line down.

The address counter will continue to be incremented until the value in the
vertical counter (VCOUNT) is greater than or equal to the total number of
viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this
point, it will be rolled over to zero.
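The increment and roll-over behaviour can be traced with a simplified software model. This is only a sketch under assumptions: the frame dimensions are tiny hypothetical values, and the exact clock-edge alignment of the real counter is not modelled.

```python
def address_sequence(pix_total, lin_total, line_period, frame_period):
    """Yield (hcount, vcount, memaddr) for one frame of a simplified counter model."""
    memaddr = 0
    for vcount in range(frame_period):
        for hcount in range(line_period):
            if vcount >= lin_total:
                memaddr = 0            # past the last viewable line: roll over
            yield hcount, vcount, memaddr
            if vcount < lin_total and hcount < pix_total:
                memaddr += 1           # one address per viewable pixel


# Hypothetical 4 x 3 viewable area inside a 6 x 5 total frame.
frame = list(address_sequence(4, 3, 6, 5))
visible = [addr for h, v, addr in frame if h < 4 and v < 3]
print(visible)
```

The addresses seen during the viewable part of each line run consecutively from one line into the next, which is the property the text describes.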


7 VGA Controller

The VGA Controller provides a simple 8-bit interface between a host
microcontroller and any VGA-compatible monitor. This variant of the
Controller provides six modes of display, depending on the resolution chosen
(640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black &
White, 16 Colors or 64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum 60 Hz refresh rate:

   1 sec = 60 frames
   1 frame = max 1/60 sec
   1 pixel = (1/60 sec) / 307200 ≈ 54 ns

   Allowing for the blanking intervals in each line and frame,
   max time for 1 pixel = 40 ns

2) To send information for each and every pixel of a frame by means of
RGB signals.

3) After completely displaying one row of information, the column must be
reset and the row incremented; this is done by the Hsync signal.

4) After completing one frame of information, a new frame has to be started;
for this, the row and column are reset, which is done by the Vsync signal.

=> To give the information for 1 pixel we use 1 clock (for controlling all
operations):

1 clock = 1 pixel

1 clock cycle period = 40 ns

Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display one pixel of information on each and every
clock, we can attain a minimum 60 Hz refresh rate.

Do not use a clock frequency of less than 25 MHz.
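The arithmetic above can be reproduced in a few lines. The total of 800 x 525 pixel clocks per frame (viewable area plus blanking) is an assumption taken from the commonly used 640x480 timing; it is not stated in this report, but it accounts for the step from roughly 54 ns down to 40 ns.

```python
def max_pixel_period_ns(pixels_per_frame, refresh_hz=60):
    """Longest time available per pixel clock for a given refresh rate, in ns."""
    return 1e9 / (refresh_hz * pixels_per_frame)


def refresh_rate_hz(clock_hz, pixels_per_frame):
    """Frames per second when one pixel is sent on every clock."""
    return clock_hz / pixels_per_frame


VISIBLE = 640 * 480   # 307200 viewable pixels
TOTAL = 800 * 525     # assumed pixel clocks per frame, including blanking

print(round(max_pixel_period_ns(VISIBLE), 2))    # viewable pixels only
print(round(max_pixel_period_ns(TOTAL), 2))      # including blanking
print(round(refresh_rate_hz(25e6, TOTAL), 2))    # 25 MHz clock
print(round(refresh_rate_hz(15e6, TOTAL), 2))    # 15 MHz clock is too slow
```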

Ex:

Clock frequency   Time period   No. of frames/sec
15 MHz            More          Less (it is unable to display 60 frames/sec)
25 MHz            -             Effectively utilized frequency
30 MHz            Less          More

Signals used for image display:

• R (Red)
• G (Green)
• B (Blue)
• HSYNC (Horizontal Synchronization signal)
• VSYNC (Vertical Synchronization signal)

All the above signals are used to control the image data so that it is correctly
displayed on the VGA display monitor.

7.2 Timings for front porch, sync pulse and back porch

                Horizontal sync    Vertical sync
Front porch     0.94 µs            0.35 ms
Sync pulse      3.77 µs            0.063 ms
Back porch      1.89 µs            1.02 ms

8 FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below.

Table: VGA pin description

Each entry gives Name (Type; Polarity; Bus size) followed by the Description.

Global Control Signals

CLK (I; Rising)
    Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows:
    640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz
    800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz

RST (I; High)
    Global system reset.

VGA Input Settings Signals

RESOLUTION (I; High/Low)
    This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD (I; bus size 2)
    This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H (I; bus size 10)
    This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V (I; bus size 10)
    This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD (O; High)
    This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.

DATA (I; bus size 8)
    Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows:
    Black & White mode: 8 1-bit pixels
    16 Colors mode: 2 4-bit pixels
    64 Colors mode: 1 6-bit pixel (DATA[5..0])
    Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL (O; bus size 19)
    Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC (O; Falling)
    Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC (O; Falling)
    Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R1, R0 (O; High/Low)
    Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G1, G0 (O; High/Low)
    Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B1, B0 (O; High/Low)
    Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of
pixels and a frame of lines. The actual time values differ according to the
resolution selected (640x480 or 800x600), the processor-defined values for
DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal
(CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external
picture memory, reading in data a byte at a time and formatting each pixel to
be displayed. For each pixel, the 6-bit RGB value required for the control of
the monitor's three primary color electron guns is generated, so that the pixel
is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host
microcontroller into external memory space (RAM). This memory space can
be located anywhere (e.g. using a block of RAM within the FPGA design, or
using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a
byte depends on which color palette is being used for the image (selected by
the CMOD input):

• in Black & White mode – one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode – one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode – one byte of data in RAM contains 1 6-bit pixel
(DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total
number of viewable pixels in a line (determined by DISPSIZE_H), the total
number of viewable lines in a frame (determined by DISPSIZE_V) and the
number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
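Applying the formula above to the three color palettes gives the frame buffer size in bytes. In this sketch the mode keys ("bw", "16colors", "64colors") are hypothetical labels for the CMOD settings.

```python
# Pixels packed into each memory byte, per color palette.
PIXELS_PER_BYTE = {"bw": 8, "16colors": 2, "64colors": 1}


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes of picture memory: (PixTotal x LinTotal) / pixels per byte."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-pixels // per_byte)  # ceiling division, in case of a partial byte


for mode in PIXELS_PER_BYTE:
    print(mode, picture_memory_bytes(640, 480, mode))
```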

The address in RAM where the next pixel is stored is determined using
an internal signal provided by the Synchronization Unit – hvcnt – which
reflects the current contents of the MEMADDR register. The exact addressing
is described below.

9.2 Black & White mode

The picture memory address – the byte of data containing the next 8 pixels
– is determined by using bits 18..3 of hvcnt and right shifting the contents by
three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address – the byte of data containing the next 2 pixels
– is determined by using bits 18..1 of hvcnt and right shifting the contents by
one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address – the byte of data containing the next pixel – is
determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
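The three addressing rules amount to right-shifting the pixel count by 3, 1 or 0 bits. A minimal sketch of that mapping follows; the mode keys are hypothetical labels for the CMOD settings, and the 19-bit mask reflects the stated counter width.

```python
# Right shift applied to hvcnt for each color palette.
ADDRESS_SHIFT = {"bw": 3, "16colors": 1, "64colors": 0}


def addr_pixel(hvcnt, mode):
    """Byte address in picture memory for the pixel numbered hvcnt."""
    return (hvcnt & 0x7FFFF) >> ADDRESS_SHIFT[mode]   # hvcnt is 19 bits wide


# Pixel number 17 lives in byte 2 (B&W), byte 8 (16 Colors) or byte 17 (64 Colors).
print(addr_pixel(17, "bw"), addr_pixel(17, "16colors"), addr_pixel(17, "64colors"))
```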

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the
current address in memory space. The register PIXREG is reset to zero when
the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be
read from the memory space as long as the RD signal is active (High). The RD
signal is itself controlled by the external line display enable signal, enex. This
internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT >
(LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
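The condition for enex translates directly into a small function. The numeric values used to exercise it (800 clocks per line, 525 lines per frame) are hypothetical examples rather than values given in this report.

```python
def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Line display enable: 0 while blanking, 1 while pixel data may be read."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


# Hypothetical 640x480 viewable area in an 800 x 525 frame.
print(enex(100, 100, 640, 480, 800, 525))   # inside the viewable area
print(enex(700, 100, 640, 480, 800, 525))   # horizontal blanking
print(enex(100, 500, 640, 480, 800, 525))   # vertical blanking
```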

The point at which data is loaded from memory into PIXREG depends on the
particular color palette that is chosen – Black & White, 16 Colors or 64
Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the
lowest three bits of the hvcnt signal – received from the Synchronization Unit –
are "000".

For the currently loaded byte, the active pixel is always in the lowest bit
position of the Pixel register. Each pixel in the data byte is moved into this
active pixel position by shifting the contents of the register right by one bit on
each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the
lowest bit of the hvcnt signal – received from the Synchronization Unit – is a
0.

For the currently loaded byte, the active pixel is always in the low-order
nibble of the Pixel register. Remember that in this mode, each byte of data
contains two pixels. The second pixel is moved into this active pixel position
by shifting the contents of the register right by four bits on the rising edge of
CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge
of the external system clock signal (CLK). In this mode, the read of pixel data
does not depend on the status of the hvcnt signal received from the
Synchronization Unit.
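The shift behaviour of PIXREG in the three modes determines the order in which pixels leave a byte: lowest bit (or nibble) first. The following sketch models only that ordering, not the clocked register itself; the mode keys are hypothetical labels for the CMOD settings.

```python
def pixels_in_byte(data, mode):
    """Return the pixel values held in one memory byte, in display order."""
    data &= 0xFF
    if mode == "bw":
        # Active pixel is bit 0; the register shifts right by one bit per clock.
        return [(data >> i) & 0x1 for i in range(8)]
    if mode == "16colors":
        # Active pixel is the low nibble; one right shift by four bits follows.
        return [data & 0xF, (data >> 4) & 0xF]
    if mode == "64colors":
        # One pixel per byte; bits 7 and 6 are not used.
        return [data & 0x3F]
    raise ValueError(f"unknown mode: {mode}")


print(pixels_in_byte(0b10110001, "bw"))
print(pixels_in_byte(0xA5, "16colors"))
```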

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for
driving the red, green and blue color guns of the target monitor. When the
chosen color palette is either Black & White or 16 Colors, these six bits are
obtained by mapping the value of the active pixel to a predefined RGB code.
When the chosen color palette is 64 Colors, the actual pixel value is used
directly.

The register RGBREG is reset to zero (000000) when the VGA
Controller receives an external reset signal (RST). This RGB code represents
black.

The register is updated on each rising edge of the CLK signal, and the
value loaded is dependent on the state of the line display enable signal, en.
When en is Low, blanking is required and RGBREG is loaded with the code
for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)
and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA
Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0
and B1).

The monitor itself expects analog signals as inputs to its electron gun control
circuits. This is achieved by using 2-bit digital-to-analog converter circuitry
located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be
converted into 4 distinct analog levels. These levels specify the intensity of
each of the three primary colors to use when displaying the pixel on the
monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum
brightness).

With each analog input being one of four possible levels, the monitor
can display each pixel on the screen with one of 64 different color
permutations.
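The count of 64 colors is 4 x 4 x 4, one of four levels for each primary. The sketch below splits a 6-bit code into its three 2-bit fields and maps a level to a voltage. Two things here are assumptions made for illustration: the bit order within the 6-bit code (red in the top two bits) and the evenly spaced DAC steps between 0 V and 0.7 V.

```python
def split_rgb(code):
    """Split a 6-bit color code into 2-bit (r, g, b) fields.

    The bit order (R in bits 5..4, G in 3..2, B in 1..0) is an assumption.
    """
    code &= 0x3F
    return (code >> 4) & 0x3, (code >> 2) & 0x3, code & 0x3


def dac_voltage(level, full_scale=0.7):
    """Map a 2-bit level to a voltage, assuming four evenly spaced steps."""
    return full_scale * level / 3


# Every 6-bit code yields a distinct (r, g, b) triple.
colors = {split_rgb(code) for code in range(64)}
print(len(colors))  # 64
```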

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3
Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks.
Using various configuration options, SelectRAM blocks create RAM, ROM,
FIFOs, large look-up tables, data width converters, circular buffers and shift
registers, each supporting various data widths and depths. This application note
describes the features and capabilities of block SelectRAM and illustrates how
to specify the various options using the Xilinx CORE Generator™ system or
via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of
which are allocated to data storage and, in some memory configurations, an
additional 2K bits allocated to parity or additional "plus" data bits. Physically,
the block RAM memory has two completely independent access ports, labeled
Port A and Port B. The structure is fully symmetrical, and both ports are
interchangeable and support data read and write operations. Each
memory port is synchronous, with its own clock, clock enable and write
enable. Read operations are also synchronous and require a clock edge and
clock enable. Though physically a dual-port memory, block RAM simulates
single-port memory in an application, as shown in Figure 1. Furthermore, each
block memory supports multiple configurations or aspect ratios. Table 3
summarizes the essential SelectRAM features. Cascade multiple block RAMs
to create deeper and wider memory organizations, with a minimal timing
penalty incurred through specialized routing resources.
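The per-device totals for Spartan-3E block RAM follow from the 18,432-bit block size (16K data bits plus 2K parity bits) multiplied by the number of blocks. A short check, with a hypothetical function name:

```python
BITS_PER_BLOCK = 16 * 1024 + 2 * 1024   # 16K data bits + 2K parity bits


def total_block_ram_bits(columns, blocks_per_column):
    """Total block RAM bits for a device with the given block arrangement."""
    return columns * blocks_per_column * BITS_PER_BLOCK


# One column of 4 blocks, and two columns of 18 blocks.
print(BITS_PER_BLOCK)
print(total_block_ram_bits(1, 4), total_block_ram_bits(1, 4) // 1024)
print(total_block_ram_bits(2, 18), total_block_ram_bits(2, 18) // 1024)
```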


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

• Interoperability
• Technology independence
• Design reuse
• Several levels of abstraction
• Readability
• Standard language
• Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

• Design specification language
• Design entry language
• Design simulation language
• Design documentation language
• An alternative to schematics

10.1 Brief History

• VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.
• Funded by the US Department of Defense.
• The first publicly available version was released in 1985.
• In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.
• In 1987, standardization => IEEE 1076-1987.
• An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

• IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.
• Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.
• The IEEE 1164 standard was developed by the IEEE.
• IEEE 1164 contains definitions for a nine-valued data type, std_logic.
• IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware. It defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.

Defines the inputs/outputs of a component (defines pins).

A way to represent modularity in VHDL.

Similar to a symbol in a schematic.

The entity declaration describes the entity.

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports provide channels of communication between the component and its environment.

Each port must have a name, a direction and a type.

An entity may have NO port declaration.

Port directions

In: The value of the port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to the port, but data from the port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional. Assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. It may have at most one assignment. (Buffer ports are not recommended.)

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using
– Behavior
– Structure
– Dataflow

Architectures can describe a design on many levels
– Gate level
– RTL (Register Transfer Level)
– Behavioral level

A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and the architecture body together.

The concept of a default configuration is a bit messy in VHDL '87
– The last architecture analyzed links to the entity.

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate-level architecture and a behavioral architecture.

Configurations are always optional.
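A minimal sketch of how a configuration declaration selects one of several architectures, reusing the Comparator entity from the example above. The second architecture (Comparator2) and the configuration name Comparator_cfg are hypothetical additions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

-- First architecture: dataflow style.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

-- Second architecture: behavioral style, same function.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture Comparator2;

-- Configuration declaration that binds Comparator2 to the entity,
-- instead of relying on the default (last analyzed) architecture.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end configuration Comparator_cfg;
```

Simulating Comparator_cfg rather than Comparator makes the choice of architecture explicit.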

Packages

Packages contain information common to many design units.

1. Package declaration
– constant declarations
– type and subtype declarations
– function and procedure declarations
– global signal declarations
– file declarations
– component declarations

2. Package body
– is not always needed
– function bodies
– procedure bodies

Packages are meant for encapsulating data that can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain
– type and subtype declarations
– subprograms
– constants
– alias declarations
– global signal declarations
– file declarations
– component declarations

A package body consists of
– subprogram declarations and bodies
– type and subtype declarations
– deferred constants
– file declarations
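The split between package declaration and package body can be sketched as follows. The package name vga_types, the constant H_TOTAL, the subtype pixel_t and the function is_black are all hypothetical; the value 794 assumes the horizontal counter in the sync_unit listing of this report, which counts from 0 to 793.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the public interface shared by several design units.
package vga_types is
  constant H_TOTAL : integer := 794;               -- clocks per line (assumed)
  subtype pixel_t is std_logic_vector(0 downto 0); -- one bit per colour channel
  function is_black (p : pixel_t) return boolean;
end package vga_types;

-- Package body: only needed here because a function body must be supplied.
package body vga_types is
  function is_black (p : pixel_t) return boolean is
  begin
    return p = "0";
  end function is_black;
end package body vga_types;
```

A design unit would make these names visible with `use work.vga_types.all;` placed before its entity declaration.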

Libraries

Collection of VHDL design units (database)

1. Packages
– package declaration
– package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in a UNIX file system.

Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

Structural
Dataflow
Behavioral

Structural VHDL description

The circuit is described in terms of its components.

The description can range from a low-level description (e.g. a transistor-level description) to a high-level description (e.g. a block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language
– either a lower-level description
– or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers, you must write the behavioral description of the registers yourself.

The behavioral description can be provided in the form of subprograms (functions or procedures).
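Because the language has no built-in register, a behavioral description such as the following is typically written once and reused. This is a minimal sketch: the entity name reg8, the 8-bit width and the active-high asynchronous reset are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity reg8 is
  port (clk : in  std_logic;
        rst : in  std_logic;                      -- asynchronous, active high (assumed)
        d   : in  std_logic_vector(7 downto 0);
        q   : out std_logic_vector(7 downto 0));
end entity reg8;

architecture behavioral of reg8 is
begin
  process (clk, rst)
  begin
    if rst = '1' then
      q <= (others => '0');     -- clear the register on reset
    elsif rising_edge(clk) then
      q <= d;                   -- store the input on each rising clock edge
    end if;
  end process;
end architecture behavioral;
```

Synthesis tools generally recognize this clocked-process pattern and map it to flip-flops.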

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

The process keyword in VHDL

A process statement is a concurrent statement.

Statements inside process statements are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
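The two permitted forms of a process can be compared side by side. In this sketch the entity and_gate and its port names are hypothetical; both processes describe the same AND function, one with a sensitivity list and one with an explicit wait statement.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_gate is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end entity and_gate;

architecture rtl of and_gate is
begin
  -- Form 1: sensitivity list; the process wakes up when a or b changes.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list; a wait statement at the end does the same job.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process with_wait;
end architecture rtl;
```

A sensitivity list behaves like a `wait on` statement placed at the end of the process, which is why the two forms cannot be combined.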

11. VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;                       -- 50 MHz input clock
    rstn   : in  std_logic;                       -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  -- Picture memory: one bit per pixel, 18-bit address
  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic);
  END component;

  component sync_unit
    port(
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;   -- horizontal sync pulse (row)
      vsync      : out   std_logic;   -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer);
  end component;

  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr,18));

  -- Divide the 50 MHz input clock by two to obtain the pixel clock.
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;   -- horizontal sync pulse (row)
    vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s       : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks 0 to 793 within one line
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    elsif (clk = '1' and clk'event) then
      if (c_hs_s = 793) then
        c_hs_s <= 0;
      else
        c_hs_s <= c_hs_s + 1;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s, 0 to 515
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: low for counts 0 to 95
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical Enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical sync pulse (front porch, back porch): low for lines 0 and 1
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en delayed by one clock, used by pixelgen to detect the start of a frame
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
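As a rough check of the counter limits in the sync_unit listing, the resulting scan rates can be estimated. This is a worked estimate, not a measured result: it assumes that the clock reaching sync_unit is the 50 MHz input divided by two (as in the top-level listing) and that the counters wrap at 793 and 515 as written, giving 794 clocks per line and 516 lines per frame.

```latex
\begin{align*}
f_{\text{pixel}} &= \frac{50\ \text{MHz}}{2} = 25\ \text{MHz} \\
f_{\text{line}}  &= \frac{f_{\text{pixel}}}{794} \approx 31.49\ \text{kHz} \\
f_{\text{frame}} &= \frac{f_{\text{line}}}{516} \approx 61.0\ \text{Hz}
\end{align*}
```

For comparison, the commonly quoted 640x480 timing uses a 25.175 MHz pixel clock with 800 clocks per line and 525 lines per frame, which gives about 31.47 kHz and 59.94 Hz.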

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: restart the memory address at the top of the frame.
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: read memory and show the pixel on all channels.
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last picture location.
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking interval: drive black.
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;

12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, the programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985, the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and in the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has bolstered development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the state where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

Fig 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
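The logic block described above can be modelled in a few lines of VHDL. This is only an illustrative sketch, not a vendor primitive: the entity name logic_block and the generics INIT (the 16-entry truth table) and REGISTERED (output selection) are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity logic_block is
  generic (
    INIT       : std_logic_vector(15 downto 0) := x"8000"; -- truth table; x"8000" is a 4-input AND
    REGISTERED : boolean := false                          -- true selects the flip-flop output
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    o   : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16-entry memory addressed by the four inputs.
  lut_out <= INIT(to_integer(unsigned(i)));

  -- The flip-flop stores the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Single output: either the registered or the unregistered LUT output.
  o <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

Changing INIT changes the logic function without changing the structure, which is the essence of a lookup-table-based logic block.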

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description where simulation is repeated to confirm the synthesis proceeded without errors. Finally the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
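The first simulation stage, an RTL test bench, might look like the following sketch. It reuses the Comparator entity from the VHDL chapter of this report; the test bench name Comparator_tb, the stimulus values and the 10 ns settling time are arbitrary choices made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

library ieee;
use ieee.std_logic_1164.all;

-- A test bench has no ports: it only drives and observes the design under test.
entity Comparator_tb is
end entity Comparator_tb;

architecture sim of Comparator_tb is
  signal a_s, b_s : std_logic_vector(7 downto 0) := (others => '0');
  signal eq_s     : std_logic;
begin
  dut : entity work.Comparator(Comparator1)
    port map (A => a_s, B => b_s, EQ => eq_s);

  stimulus : process
  begin
    a_s <= x"3C"; b_s <= x"3C";
    wait for 10 ns;
    assert eq_s = '1' report "equal inputs should give EQ = 1" severity failure;
    b_s <= x"3D";
    wait for 10 ns;
    assert eq_s = '0' report "different inputs should give EQ = 0" severity failure;
    wait;  -- end of test
  end process stimulus;
end architecture sim;
```

The same test bench can usually be rerun against the gate-level and back-annotated netlists, so that results from the three stages are directly comparable.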

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows for much more time in the area of software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13. SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14. APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated picture (pixelated) from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



Page 12: vga

The HSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated on each rising edge of the external clock signal (CLK).

The state of the HSYNC signal depends on the value stored in the horizontal counter. It is driven low when:

HCOUNT >= (PixTotal + BlankingLeft)

and remains low while:

HCOUNT < (PixTotal + BlankingLeft + TLSync)

Vertical (Frame) Period

6.3 Generation of the vertical synchronization signal - VSYNC

The VSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated after every line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

The state of the VSYNC signal depends on the value stored in the vertical counter. It is driven low when:

VCOUNT >= (LinTotal + BlankingLeft)

and remains low while:

VCOUNT < (LinTotal + BlankingLeft + TFSync)
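The two sync conditions above share one shape: the pulse is low inside a window that starts after the visible area plus a blanking interval. A minimal Python sketch of that rule follows. The function name and the numeric values are hypothetical, chosen only for illustration, and are not taken from the Controller itself.

```python
def sync_level(count: int, visible: int, blanking: int, sync_width: int) -> int:
    """Return 0 while the sync pulse is active (low), otherwise 1."""
    start = visible + blanking          # e.g. PixTotal + BlankingLeft
    return 0 if start <= count < start + sync_width else 1


# Hypothetical horizontal values, for illustration only.
PIX_TOTAL, BLANKING_LEFT, TL_SYNC = 640, 16, 96

# HSYNC level for every horizontal counter value of an assumed 800-count line.
line = [sync_level(h, PIX_TOTAL, BLANKING_LEFT, TL_SYNC) for h in range(800)]
print(line.count(0), "counts with HSYNC low")
```

The same function models VSYNC when it is called with VCOUNT, LinTotal, the vertical blanking value and TFSync.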

Address Counter (MEMADDR)

The address counter is used to store the position of the next consecutive pixel in the frame. Its value is passed to the Pixel Unit on the internal bus signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain the next pixel from picture memory.

The counter is reset to zero when the VGA Controller receives an external reset signal (RST). The size of the counter depends on the values chosen for DISPSIZE_H and DISPSIZE_V, as the range is simply:

0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199

• 800x600: 0 to 479999

The counter has 19-bit resolution.
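The 19-bit figure can be checked from the ranges above. The short Python sketch below is only a worked calculation; the helper name counter_bits is an assumption made for this example.

```python
def counter_bits(pix_total: int, lin_total: int) -> int:
    """Bits needed to count from 0 to (pix_total * lin_total) - 1."""
    return (pix_total * lin_total - 1).bit_length()


for width, height in ((640, 480), (800, 600)):
    last = width * height - 1
    print(f"{width}x{height}: 0 to {last}, {counter_bits(width, height)} bits")
```

Both resolutions need more than 18 bits (2^18 = 262144) and fit within 19 bits (2^19 = 524288).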

While the value in the horizontal counter (HCOUNT) is less than the total number of viewable pixels in a line (PixTotal, the integer value of DISPSIZE_H), the counter is incremented on the rising edge of the external clock signal (CLK). Pixel addressing within the frame is consecutive. When the counter reaches the last pixel in a line, its incremented value is the first pixel in the next line down.

The address counter will continue to be incremented until the value in the vertical counter (VCOUNT) is greater than or equal to the total number of viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this point it will be rolled over to zero.

7 VGA CONTROLLER

The VGA Controller provides a simple 8-bit interface between a host microcontroller and any VGA-compatible monitor. This variant of the Controller provides six modes of display, depending on the resolution chosen (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black & White, 16 Colors or 64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum 60 Hz refresh rate:

1 sec = 60 frames

1 frame = max 1/60 sec

1 pixel = (1/60 sec) / 307200 = approx. 54 ns (viewable pixels only)

Allowing for the horizontal and vertical blanking intervals in each frame, the maximum time for 1 pixel is taken as 40 ns.

2) To send information for each and every pixel of a frame by means of the RGB signals.

3) After one row has been completely displayed, the column must be reset and the row incremented. This is done by the Hsync signal.

4) After one frame has been completed, a new frame has to be started. For this, both the row and the column are reset. This is done by the Vsync signal.

=> To give the information for 1 pixel we use 1 clock (for controlling all operations):

1 clock = 1 pixel

1 clock cycle period = 40 ns

Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display the information for 1 pixel on each and every clock, we can attain a minimum 60 Hz refresh rate.

Do not use a clock frequency of less than 25 MHz.
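The arithmetic behind the 40 ns and 25 MHz figures can be sketched in Python. This is a worked calculation under one stated assumption: the frame is taken as 800 x 525 pixel clocks in total, which is the conventional 640x480 at 60 Hz timing including blanking and is not a value given in this report.

```python
def pixel_time_ns(refresh_hz: float, clocks_per_frame: int) -> float:
    """Time available per pixel clock for a given refresh rate."""
    return 1e9 / (refresh_hz * clocks_per_frame)


def refresh_hz(clock_hz: float, clocks_per_frame: int) -> float:
    """Refresh rate obtained from a given pixel clock."""
    return clock_hz / clocks_per_frame


VISIBLE = 640 * 480   # 307200 viewable pixels
TOTAL = 800 * 525     # ASSUMPTION: conventional totals including blanking

print(round(pixel_time_ns(60, VISIBLE), 1), "ns per pixel, viewable pixels only")
print(round(pixel_time_ns(60, TOTAL), 1), "ns per pixel clock, with blanking")
for mhz in (15, 25, 30):
    print(mhz, "MHz ->", round(refresh_hz(mhz * 1e6, TOTAL), 1), "frames/sec")
```

Under that assumption a 15 MHz clock cannot reach 60 frames/sec, a 25 MHz clock gives very nearly 60, and a 30 MHz clock gives more.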

Ex:

Clock     Time period   No. of frames/sec
15 MHz    More          Less (decreases; unable to display 60 frames/sec)
25 MHz    40 ns         60 (effectively utilized frequency)
30 MHz    Less          More (increases)

Signals used for image display:

R (Red)

G (Green)

B (Blue)

HSYNC (Horizontal synchronization signal)

VSYNC (Vertical synchronization signal)

All the above signals are used to control the image data so that it is correctly displayed on the VGA display monitor.

7.2 Timings for front porch, sync pulse and back porch

              Horizontal sync   Vertical sync
Front porch   0.94 µs           0.35 ms
Sync pulse    3.77 µs           0.063 ms
Back porch    1.89 µs           1.02 ms
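To relate these timings to counter values, each duration can be divided by the 40 ns clock period from section 7.1. The Python sketch below reads the horizontal values above as 0.94 µs, 3.77 µs and 1.89 µs. The helper name to_clocks is hypothetical.

```python
CLK_PERIOD_NS = 40  # 25 MHz pixel clock, from section 7.1

horizontal_us = {"front porch": 0.94, "sync pulse": 3.77, "back porch": 1.89}


def to_clocks(duration_us: float, period_ns: float = CLK_PERIOD_NS) -> float:
    """Convert a duration in microseconds to a number of clock periods."""
    return duration_us * 1000 / period_ns


clocks = {name: to_clocks(t) for name, t in horizontal_us.items()}
for name, n in clocks.items():
    print(f"{name}: {n:.2f} clocks")
```

The results are not whole numbers, so an implementation has to round them to integer counter limits.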

8 FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below.

Table: VGA Pin description

Each entry gives the pin name, followed by its type, polarity and bus size where stated, and then its description.

Global Control Signals

CLK (Type: I, Polarity: Rising)
Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows:
640x480: CLK = 25MHz, Refresh = 60Hz; CLK = 30MHz, Refresh = 72Hz
800x600: CLK = 40MHz, Refresh = 60Hz; CLK = 50MHz, Refresh = 72Hz

RST (Type: I, Polarity: High)
Global system reset.

VGA Input Settings Signals

RESOLUTION (Type: I, Polarity: High/Low)
This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD (Type: I, Bus size: 2)
This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H (Type: I, Bus size: 10)
This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V (Type: I, Bus size: 10)
This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD (Type: O, Polarity: High)
This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal en, generated by the Synchronization Unit of the Controller.

DATA (Type: I, Bus size: 8)
Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows:
Black & White mode: 8 1-bit pixels
16 Colors mode: 2 4-bit pixels
64 Colors mode: 1 6-bit pixel (DATA[5..0])
Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL (Type: O, Bus size: 19)
Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC (Type: O, Polarity: Falling)
Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC (Type: O, Polarity: Falling)
Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R0, R1 (Type: O, Polarity: High/Low)
Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G0, G1 (Type: O, Polarity: High/Low)
Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B0, B1 (Type: O, Polarity: High/Low)
Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode, one byte of data in RAM contains 8 1-bit pixels

• in 16 Colors mode, one byte of data in RAM contains 2 4-bit pixels

• in 64 Colors mode, one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V) and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.
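As a worked example of the memory formula, the Python sketch below evaluates it for a full 640x480 picture in each palette. The dictionary and function names are assumptions made for this example.

```python
PIXELS_PER_BYTE = {"Black & White": 8, "16 Colors": 2, "64 Colors": 1}


def picture_bytes(pix_total: int, lin_total: int, mode: str) -> int:
    """Bytes of picture memory: (PixTotal x LinTotal) / pixels per byte."""
    per_byte = PIXELS_PER_BYTE[mode]
    # Ceiling division, so a partly filled final byte is still counted.
    return -(-(pix_total * lin_total) // per_byte)


for mode in PIXELS_PER_BYTE:
    print(mode, picture_bytes(640, 480, mode), "bytes")
```

For 640x480 this gives 38400, 153600 and 307200 bytes respectively.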

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt, which right shifts the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt, which right shifts the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
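The three addressing rules amount to dividing the pixel index by the number of pixels per byte. A minimal Python sketch, assuming hvcnt is treated as a 19-bit unsigned integer (the names SHIFT and addr_pixel are hypothetical):

```python
# Right shift applied to hvcnt in each palette mode.
SHIFT = {"Black & White": 3, "16 Colors": 1, "64 Colors": 0}


def addr_pixel(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory that holds pixel number hvcnt."""
    return (hvcnt & 0x7FFFF) >> SHIFT[mode]   # keep 19 bits, then shift


print(addr_pixel(8, "Black & White"))   # ninth pixel lies in the second byte
print(addr_pixel(3, "16 Colors"))       # fourth pixel lies in the second byte
print(addr_pixel(307199, "64 Colors"))  # one byte per pixel
```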

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
enex = 0
Else
enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
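The enex condition can be restated as a small Python function. This is an illustrative sketch only: the line and frame period values in the example call are hypothetical and are not taken from the Controller.

```python
def enex(hcount: int, vcount: int, pix_total: int, lin_total: int,
         line_period: int, frame_period: int) -> int:
    """Return 0 in the blanked region described above, otherwise 1."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


# Hypothetical periods: 800 counts per line, 525 lines per frame.
print(enex(100, 100, 640, 480, 800, 525))  # inside the picture
print(enex(700, 100, 640, 480, 800, 525))  # past the end of the line
```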

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a '0'.

For the currently loaded byte, the active pixel is always in the low order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
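The shifting described for the three modes determines the order in which pixels leave a byte: the lowest bits are displayed first. A minimal Python sketch of that ordering (the function name unpack_pixels is an assumption for this example):

```python
def unpack_pixels(byte: int, mode: str) -> list:
    """Pixels held in one byte, in the order they reach the active position."""
    if mode == "Black & White":
        # Active pixel is bit 0; the register shifts right by one each clock.
        return [(byte >> i) & 0x1 for i in range(8)]
    if mode == "16 Colors":
        # Active pixel is the low nibble; one shift right by four follows.
        return [byte & 0xF, (byte >> 4) & 0xF]
    if mode == "64 Colors":
        # One pixel per byte; bits 7 and 6 are not used.
        return [byte & 0x3F]
    raise ValueError("unknown mode: " + mode)


print(unpack_pixels(0b10110001, "Black & White"))
print(unpack_pixels(0xA5, "16 Colors"))
print(unpack_pixels(0xFF, "64 Colors"))
```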

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital to analog converter circuitry, located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0V (total darkness) to 0.7V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
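The count of 64 colors follows from four levels on each of three channels. The Python sketch below also lists the analog levels, assuming they are evenly spaced between 0V and 0.7V. The text gives only the two end points, so the even spacing is an assumption.

```python
from itertools import product

V_MAX = 0.7  # maximum brightness level, in volts


def dac_level(code: int) -> float:
    """Analog level for a 2-bit code (0..3), ASSUMING evenly spaced steps."""
    return V_MAX * code / 3


levels = [round(dac_level(code), 3) for code in range(4)]
colors = list(product(range(4), repeat=3))   # every (R, G, B) code

print(levels)
print(len(colors), "color permutations")
```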

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical; both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
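The per-block figures can be combined with the block counts of the Spartan-3E devices to estimate whether a picture fits in on-chip memory. The Python sketch below is a rough capacity check only: it counts the 16K data bits per block and ignores how blocks would actually be cascaded. The helper names are assumptions made for this example.

```python
BLOCK_BITS = 18432        # total bits per block RAM
DATA_BITS = 16 * 1024     # bits per block allocated to data storage

# Total RAM blocks per Spartan-3E device.
BLOCKS = {"XC3S100E": 4, "XC3S250E": 12, "XC3S500E": 20,
          "XC3S1200E": 28, "XC3S1600E": 36}


def total_bits(device: str) -> int:
    """Total block RAM bits in a device, including parity bits."""
    return BLOCKS[device] * BLOCK_BITS


def fits(device: str, picture_bits: int) -> bool:
    """True if the picture fits in the data bits of the device's block RAM."""
    return BLOCKS[device] * DATA_BITS >= picture_bits


frame = 640 * 480 * 1     # one bit per pixel (Black & White)
for device in BLOCKS:
    print(device, total_bits(device), fits(device, frame))
```

On these figures a 1-bit 640x480 picture needs at least the XC3S500E.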


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987: standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE. IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware. It defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.

Defines the inputs/outputs of a component (defines pins).

A way to represent modularity in VHDL.

Similar to a symbol in a schematic.

The entity declaration describes the entity.

E.g.:

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.

Each port must have a name, a direction and a type.

An entity may have NO port declaration.

Port directions

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using: Behavior, Structure, Dataflow.

Architectures can describe a design on many levels: gate level, RTL (Register Transfer Level), behavioral level.

A configuration declaration links an architecture to an entity.

E.g.:

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and the architecture body together.

The concept of a default configuration is a bit messy in VHDL '87: the last architecture analyzed links to the entity.

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate-level architecture and a behavioral architecture.

Configurations are always optional.

Packages

Packages contain information common to many design units.

1. Package declaration

- constant declarations

- type and subtype declarations

- function and procedure declarations

- global signal declarations

- file declarations

- component declarations

2. Package body

- is not necessarily needed

- function bodies

- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units. A package consists of a declaration part and an optional body part.

A package declaration can contain:

- type and subtype declarations

- subprograms

- constants

- alias declarations

- global signal declarations

- file declarations

- component declarations

A package body consists of:

- subprogram declarations and bodies

- type and subtype declarations

- deferred constants

- file declarations

Libraries

Collection of VHDL design units (database):

1. Packages (package declaration, package body)

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in a UNIX file system. Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here in order of increasing level of abstraction:

Structural

Dataflow

Behavioral

Structural VHDL description

The circuit is described in terms of its components.

This ranges from a low-level description (e.g. a transistor-level description) to a high-level description (e.g. a block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level, while the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language: either a lower-level description or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers => you must write the behavioral description of the registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

Process keyword in VHDL

A process statement is a concurrent statement.

Statements inside process statements are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  -- Picture memory: 1-bit pixels, 18-bit address
  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic);
  END component;

  component sync_unit
    port(
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  ---horizontal synch pulse(row)
      vsync      : out   std_logic;  --vertical sync pulse(frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer);
  end component;

  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr,18));

  -- Divide the input clock (clk50m) by two to obtain the pixel clock (clk)
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;

begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks 0 to 793 within each line
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  --vsync counter
  --vertical line counter: counts lines 0 to 515, clocked by the rising edge of h_en_s
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  --hsync pulse
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  ----Horizontal Enable (Row Information Will be Sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  ---vsync pulse
  -- Vertical Enable (Data is Valid in this Region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  --Vertical Sync, Front Porch, Back Porch
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- One-clock delayed copy of v_en, used to detect the start of the frame
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
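The counter limits in the synchronization block fix the line and frame rates. The Python sketch below works these out, assuming the clk50m input is 50 MHz so that the divided clock is 25 MHz. The input frequency is inferred from the signal name and is not stated in the listing.

```python
CLK_HZ = 25_000_000        # ASSUMPTION: clk50m is 50 MHz, divided by two
CLOCKS_PER_LINE = 794      # c_hs_s counts 0 to 793
LINES_PER_FRAME = 516      # line_count_s counts 0 to 515
HSYNC_LOW_CLOCKS = 96      # hsync is low while c_hs_s <= 95
H_EN_CLOCKS = 778 - 143 + 1  # h_en_s is high for c_hs_s in 143..778

line_rate_hz = CLK_HZ / CLOCKS_PER_LINE
frame_rate_hz = line_rate_hz / LINES_PER_FRAME
hsync_low_us = HSYNC_LOW_CLOCKS * 1e6 / CLK_HZ

print(round(line_rate_hz), "lines/sec")
print(round(frame_rate_hz, 2), "frames/sec")
print(round(hsync_low_us, 2), "us HSYNC pulse,", H_EN_CLOCKS, "clocks of h_en")
```

Under that assumption the frame rate comes out slightly above 60 frames/sec.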

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is

  signal addr_s : std_logic_vector(17 downto 0);

begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: restart the address at the top of the frame
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: read the next pixel from memory
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Right of the picture: fill with green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the picture: fill with white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last picture location
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: output black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;
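The binary constant at which the pixel generator wraps its address can be decoded with a few lines of Python. The interpretation in the comments is an inference from the counter limits in the listings (a 512-pixel read window per line over 480 lines), not something the report states.

```python
WRAP = int("111011111111111111", 2)   # last address before the wrap to zero
LOCATIONS = WRAP + 1                  # number of picture memory locations used

# Inferred read window: 512 pixels per line over 480 lines.
PIXELS_PER_LINE = 512
LINES = 480

print(WRAP, LOCATIONS, PIXELS_PER_LINE * LINES)
print(LOCATIONS <= 2 ** 18)           # fits the 18-bit address bus
```

With one pixel per memory location, the wrap value matches a picture of 512 x 480 pixels.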


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete system on a programmable chip. This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of soft processor cores that are implemented within the FPGA logic. (See Soft processors below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

• IC costs are rising aggressively
• ASIC complexity has increased development time and costs
• R&D resources and headcount are decreasing
• Revenue losses for slow time-to-market are increasing
• Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware cost per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
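The behaviour of such a logic block can be sketched in software. The following Python model is only an illustration under stated assumptions: the class name LogicBlock, the 16-bit truth_table encoding (bit i holds the LUT output for input index i, with input a as the least significant index bit), and the clock() method are all hypothetical and do not correspond to any vendor primitive.

```python
class LogicBlock:
    """Toy model of a 4-input LUT followed by an optional flip-flop.

    truth_table is a 16-bit integer (assumed encoding): bit i is the LUT
    output when the inputs (d, c, b, a) form the binary index i.
    """

    def __init__(self, truth_table: int, registered: bool):
        self.truth_table = truth_table & 0xFFFF
        self.registered = registered
        self.ff = 0  # flip-flop state, cleared at start

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Combinational LUT output for the four inputs."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Model a rising clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """Single block output: registered or unregistered LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


# 0x8000 sets only bit 15, i.e. a 4-input AND; 0x6666 is XOR of inputs a and b.
and4 = LogicBlock(0x8000, registered=False)
xor_ab = LogicBlock(0x6666, registered=False)
print(and4.output(1, 1, 1, 1), xor_ab.output(1, 0, 0, 0))
```

Changing only the truth table changes the function, which is the sense in which the block is configurable; the registered flag selects which of the two values reaches the single output.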

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.
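The planar rule described above can be written down in a few lines. This Python sketch is illustrative only; the side names and the function planar_connections are hypothetical, and tracks are numbered from 0 here for convenience.

```python
SIDES = ("left", "right", "top", "bottom")


def planar_connections(side: str, track: int, channel_width: int) -> list[tuple[str, int]]:
    """Return the wires reachable from (side, track) in a planar switch box.

    In the planar (domain-based) topology a wire can only be switched onto
    the same track number on each of the other three sides.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 0 <= track < channel_width:
        raise ValueError("track outside channel width")
    return [(other, track) for other in SIDES if other != side]


# A wire entering on track 1 from the left has exactly three switch options.
print(planar_connections("left", 1, channel_width=4))
```

Because the track number never changes inside a switch box, a signal that starts on a given track stays in that track's domain for the whole route, which is what the name domain-based refers to.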

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

• SRAM: based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
• Antifuse: one-time programmable. CMOS.
• EPROM: Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
• EEPROM: Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
• Flash: Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
• Fuse: one-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.

Reasons why prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. The alternative is for designers to rely only on software simulations to verify that their hardware design is sound, where simulation speed and modeling accuracy limitations hinder development.
2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This helps expose many unforeseen software bugs that appear due to today's array of operating systems, applications, and hardware.
3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.
4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL
Fig: VGA CONTROLLER

GATE LEVEL
Fig: SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, a processor-generated (pixelated) picture is taken from memory space and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.


Page 13: vga

0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199
• 800x600: 0 to 479999

The counter has 19-bit resolution.
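The ranges and the 19-bit width can be checked with a short calculation. The Python sketch below is only a cross-check of the arithmetic; the function name address_counter_range is hypothetical.

```python
def address_counter_range(pix_total: int, lin_total: int) -> tuple[int, int]:
    """Return (maximum counter value, bits needed) for a pix_total x lin_total frame."""
    max_count = pix_total * lin_total - 1
    return max_count, max_count.bit_length()


for width, height in [(640, 480), (800, 600)]:
    max_count, bits = address_counter_range(width, height)
    print(f"{width}x{height}: 0 to {max_count}, needs {bits} bits")
```

Both resolutions need 19 bits, since 2^18 = 262144 is too small for either frame while 2^19 = 524288 covers both.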

While the value in the horizontal counter (HCOUNT) is less than the total number of viewable pixels in a line (PixTotal, the integer value of DISPSIZE_H), the counter is incremented on the rising edge of the external clock signal (CLK). Pixel addressing within the frame is consecutive. When the counter reaches the last pixel in a line, its incremented value is the first pixel in the next line down.

The address counter will continue to be incremented until the value in the vertical counter (VCOUNT) is greater than or equal to the total number of viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this point, it will be rolled over to zero.

7 VGA Controller

The VGA Controller provides a simple 8-bit interface between a host microcontroller and any VGA-compatible monitor. This variant of the Controller provides six modes of display, depending on the resolution chosen (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black & White, 16 Colors, or 64 Colors).

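7.1 VGA Controller Tasks

1) To display the frames at a minimum refresh rate of 60 Hz:

1 sec = 60 frames
1 frame = max 1/60 sec
1 pixel = (1/60 sec) / 307200 ≈ 54 ns (roughly 50 ns), counting the viewable pixels only
Max time for 1 pixel = 40 ns, which leaves room for the blanking intervals around each line and frame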

2) To send information for each and every pixel of a frame by means of the RGB signals.

3) After one row of information has been completely displayed, the column must be reset and the row incremented; this is done by the Hsync signal.

4) After one frame of information has been completed, a new frame has to be started; for this, both row and column are reset, which is done by the Vsync signal.

=> To give the information for 1 pixel we use 1 clock (which controls all operations):

1 clock = 1 pixel
1 clock cycle period = 40 ns
Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display the information for 1 pixel on each and every clock, we can attain the minimum 60 Hz refresh rate. Do not use a clock frequency of less than 25 MHz.
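The pixel time budget can be reproduced numerically. The sketch below is a rough check, not part of the design: the function names are hypothetical, and the figure of 800 x 525 clock periods per frame (viewable area plus blanking) is an assumption taken from commonly quoted 640x480 at 60 Hz timing, not from this report.

```python
def pixel_period_ns(pixels_per_frame: int, refresh_hz: float) -> float:
    """Time available per pixel slot, in nanoseconds, at the given refresh rate."""
    return 1e9 / (refresh_hz * pixels_per_frame)


def refresh_rate_hz(clock_hz: float, pixels_per_frame: int) -> float:
    """Frames per second when one pixel slot is consumed per clock cycle."""
    return clock_hz / pixels_per_frame


VIEWABLE = 640 * 480              # 307200 viewable pixels
WITH_BLANKING = 800 * 525         # assumed total pixel slots per frame

print(round(pixel_period_ns(VIEWABLE, 60), 2))        # viewable pixels only
print(round(pixel_period_ns(WITH_BLANKING, 60), 2))   # including blanking
for clock_mhz in (15, 25, 30):
    fps = refresh_rate_hz(clock_mhz * 1e6, WITH_BLANKING)
    print(f"{clock_mhz} MHz -> {fps:.1f} frames/sec")
```

Under that assumption the budget per pixel comes out a little under 40 ns, which is consistent with choosing a 25 MHz clock, and a 15 MHz clock falls well short of 60 frames per second.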

Example:

Clock     Time period    No. of frames/sec
15 MHz    more           less (unable to display 60 frames/sec)
25 MHz    (effectively utilized frequency)
30 MHz    less           more

As the clock frequency increases, the time period decreases and the number of frames per second increases.

Signals used for image display:

• R (Red)
• G (Green)
• B (Blue)
• HSYNC (Horizontal Synchronization signal)
• VSYNC (Vertical Synchronization signal)

All the above signals are used to control the image data so that it is correctly displayed on the VGA display monitor.

7.2 Timings for front porch, sync pulse, back porch

              Horizontal sync    Vertical sync
Front porch   0.94 µs            0.35 ms
Sync pulse    3.77 µs            0.063 ms
Back porch    1.89 µs            1.02 ms
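Because the controller counts clock cycles rather than time, these intervals are usually converted into clock counts. The following Python sketch shows that conversion for a 25 MHz clock; the dictionary and function names are hypothetical, and a real design would round each value to a whole number of clocks (or lines, for the vertical intervals).

```python
CLK_HZ = 25e6  # pixel clock for 640x480 at 60 Hz, as derived above

# Durations in seconds, taken from the timing table.
HORIZONTAL = {"front porch": 0.94e-6, "sync pulse": 3.77e-6, "back porch": 1.89e-6}
VERTICAL = {"front porch": 0.35e-3, "sync pulse": 0.063e-3, "back porch": 1.02e-3}


def to_clocks(duration_s: float, clk_hz: float = CLK_HZ) -> float:
    """Convert a duration in seconds into a (fractional) number of clock cycles."""
    return duration_s * clk_hz


for name, seconds in HORIZONTAL.items():
    print(f"horizontal {name}: {to_clocks(seconds):.2f} clocks")
for name, seconds in VERTICAL.items():
    print(f"vertical {name}: {to_clocks(seconds):.0f} clocks")
```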

8 FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below.

Table: VGA Pin description

Name | Type | Polarity | Bus size | Description

Global Control Signals

CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows. 640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz. 800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz.
RST | I | High | - | Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used. 1 = 640x480 (VGA), 0 = 800x600 (SVGA).
CMOD | I | - | 2 | This input selects the color palette to be used. 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.
DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.
DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.
DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows. Black & White mode: 8 1-bit pixels. 16 Colors mode: 2 4-bit pixels. 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: in 64 Colors mode, bits 7 and 6 of each byte are not used.
ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.
VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.
R0, R1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.
G0, G1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.
B0, B1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode: one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode: one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode: one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
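As a worked instance of this formula, the Python sketch below computes the picture memory for each palette. The mode labels and the function name picture_memory_bytes are hypothetical; the pixels-per-byte figures are those listed above.

```python
# Pixels packed into one byte for each color palette.
PIXELS_PER_BYTE = {"black_white": 8, "16_colors": 2, "64_colors": 1}


def picture_memory_bytes(pix_total: int, lin_total: int, pixels_per_byte: int) -> int:
    """Bytes of picture memory needed, rounded up to a whole byte."""
    total_pixels = pix_total * lin_total
    return -(-total_pixels // pixels_per_byte)  # ceiling division


for mode, per_byte in PIXELS_PER_BYTE.items():
    print(mode, picture_memory_bytes(640, 480, per_byte), "bytes at 640x480")
```

A full 640x480 frame therefore ranges from 38400 bytes in Black & White mode to 307200 bytes in 64 Colors mode.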

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt and right shifting the contents by three:

ADDR_PIXEL = "000" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt and right shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
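The three address calculations amount to right shifts of the 19-bit pixel counter. The sketch below restates them in Python as an illustration only; the mode labels and the function name addr_pixel are hypothetical.

```python
# Right shift applied to hvcnt in each mode (log2 of pixels per byte).
ADDRESS_SHIFT = {"black_white": 3, "16_colors": 1, "64_colors": 0}


def addr_pixel(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory for the pixel counted by hvcnt."""
    hvcnt &= 0x7FFFF  # keep the counter within 19 bits
    return hvcnt >> ADDRESS_SHIFT[mode]


# Pixel number 17 lives in byte 2 (Black & White), byte 8 (16 Colors)
# or byte 17 (64 Colors).
for mode in ADDRESS_SHIFT:
    print(mode, addr_pixel(17, mode))
```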

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
enex = 0
Else
enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
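The enable condition can be exercised with a small model. This Python sketch simply transcribes the condition above; the function name and the example values of 800 clocks per line and 525 lines per frame are assumptions chosen for illustration and are not taken from this report.

```python
def enex(hcount: int, vcount: int, pix_total: int, lin_total: int,
         line_period: int, frame_period: int) -> int:
    """Line display enable: 0 during horizontal or vertical blanking, else 1."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


# Assumed example periods: 800 clocks per line, 525 lines per frame.
print(enex(100, 100, 640, 480, 800, 525))  # inside the viewable area
print(enex(700, 100, 640, 480, 800, 525))  # horizontal blanking
print(enex(100, 500, 640, 480, 800, 525))  # vertical blanking
```

Since RD follows this signal, memory is only read while the beam is inside the viewable area.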

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors, or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a 0.

For the currently loaded byte, the active pixel is always in the low order nibble of the Pixel register. Remember that in this mode, each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
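The shifting behaviour of PIXREG determines the order in which the pixels of a byte reach the screen. The Python sketch below lists that order for one byte in each mode; it models only the data ordering, not the clocked register, and the mode labels and function name are hypothetical.

```python
def pixels_from_byte(data: int, mode: str) -> list[int]:
    """Pixels of one memory byte in the order they become the active pixel."""
    data &= 0xFF
    if mode == "black_white":
        # Active pixel is bit 0; the register shifts right by one each clock.
        return [(data >> shift) & 0x1 for shift in range(8)]
    if mode == "16_colors":
        # Low nibble first, then the high nibble after a 4-bit right shift.
        return [data & 0xF, (data >> 4) & 0xF]
    if mode == "64_colors":
        # One 6-bit pixel per byte; bits 7 and 6 are unused.
        return [data & 0x3F]
    raise ValueError(f"unknown mode: {mode}")


print(pixels_from_byte(0b10110001, "black_white"))
print(pixels_from_byte(0xA5, "16_colors"))
print(pixels_from_byte(0xFF, "64_colors"))
```

A consequence worth noting for whoever writes the picture memory: in the packed modes the leftmost pixel on screen sits in the least significant bits of the byte.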

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green, and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0, and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
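The relationship between the 6-bit code, the three 2-bit channels, and the analog levels can be sketched as follows. Two things here are assumptions made for illustration and are not stated in this report: the bit ordering (bits 5..4 red, 3..2 green, 1..0 blue) and a DAC whose four levels are evenly spaced between 0 V and 0.7 V. The function names are hypothetical.

```python
def split_rgb(code: int) -> tuple[int, int, int]:
    """Split a 6-bit RGB code into 2-bit (red, green, blue) values.

    Assumed layout: bits 5..4 = red, bits 3..2 = green, bits 1..0 = blue.
    """
    code &= 0x3F
    return (code >> 4) & 0x3, (code >> 2) & 0x3, code & 0x3


def dac_level_volts(value: int, full_scale: float = 0.7) -> float:
    """Analog level for a 2-bit value, assuming four evenly spaced steps."""
    return full_scale * (value & 0x3) / 3


red, green, blue = split_rgb(0b110110)
print(red, green, blue)
print([round(dac_level_volts(v), 3) for v in range(4)])
print(len({split_rgb(code) for code in range(64)}))  # distinct colors
```

Four levels on each of three channels give 4 x 4 x 4 = 64 combinations, matching the 64 Colors palette.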

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73728            72
XC3S250E    2             6                       12                 221184           216
XC3S500E    2             10                      20                 368640           360
XC3S1200E   2             14                      28                 516096           504
XC3S1600E   2             18                      36                 663552           648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
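A synchronous single-port memory of the kind described above can be written in plain VHDL. The following is a minimal sketch, not the application note's own code: the entity name, port names and the 16K x 1 default organization are assumptions, and whether a given synthesis tool maps this description onto block RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-port synchronous RAM (names and sizes are assumed).
entity single_port_ram_sketch is
  generic (
    ADDR_WIDTH : positive := 14;  -- 2**14 = 16K locations
    DATA_WIDTH : positive := 1    -- 16K x 1 uses the 16K data bits of one block
  );
  port (
    clk  : in  std_logic;
    en   : in  std_logic;  -- clock enable
    we   : in  std_logic;  -- write enable
    addr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    din  : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
    dout : out std_logic_vector(DATA_WIDTH - 1 downto 0)
  );
end entity single_port_ram_sketch;

architecture rtl of single_port_ram_sketch is
  type ram_t is array (0 to 2**ADDR_WIDTH - 1)
    of std_logic_vector(DATA_WIDTH - 1 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: needs a clock edge and the clock enable.
        -- On a simultaneous write, the previously stored word is read.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```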


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

- Interoperability
- Technology independence
- Design reuse
- Several levels of abstraction
- Readability
- Standard language
- Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)

- Design specification language
- Design entry language
- Design simulation language
- Design documentation language
- An alternative to schematics

10.1 Brief History

- VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers
- Funded by the US Department of Defense
- The first publicly available version was released in 1985
- In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL
- In 1987, standardization => IEEE 1076-1987
- An improved version of the language was released in 1994 => IEEE standard 1076-1993

10.2 Related Standards

- IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance
- Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard
- The IEEE 1164 standard was developed by the IEEE
- IEEE 1164 contains definitions for a nine-valued data type, std_logic
- IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware
- It defines, e.g., two numeric types: signed and unsigned
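To make the difference between the two numeric types concrete, here is a small simulation-only sketch (the entity name is hypothetical). It uses ieee.numeric_std, the package defined by IEEE 1076.3, to interpret the same bit pattern as unsigned and as signed.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical simulation-only example: one bit pattern, two interpretations.
entity numeric_types_sketch is
end entity numeric_types_sketch;

architecture sim of numeric_types_sketch is
  constant BITS : std_logic_vector(3 downto 0) := "1111";
begin
  process
  begin
    -- As an unsigned number, "1111" is 15.
    assert to_integer(unsigned(BITS)) = 15
      report "unsigned interpretation should be 15" severity failure;
    -- As a two's-complement signed number, "1111" is -1.
    assert to_integer(signed(BITS)) = -1
      report "signed interpretation should be -1" severity failure;
    wait;  -- stop the process after the checks
  end process;
end architecture sim;
```

The element type of both vectors is std_logic, whose nine values are 'U', 'X', '0', '1', 'Z', 'W', 'L', 'H' and '-'.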

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

- A black box with interface definition
- Defines the inputs/outputs of a component (defines pins)
- A way to represent modularity in VHDL
- Similar to a symbol in a schematic
- An entity declaration describes an entity

E.g.

  entity Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
  end Comparator;

Ports

- Provide channels of communication between the component and its environment
- Each port must have a name, a direction and a type
- An entity may have NO port declaration

Port directions

- In: A value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.
- Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.
- Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
- Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

- Every entity has at least one architecture
- One entity can have several architectures
- Architectures can describe a design using:
  - Behavior
  - Structure
  - Dataflow
- Architectures can describe a design on many levels:
  - Gate level
  - RTL (Register Transfer Level)
  - Behavioral level
- A configuration declaration links an architecture to an entity

E.g.

  architecture Comparator1 of Comparator is
  begin
    EQ <= '1' when (A = B) else '0';
  end Comparator1;

Configurations

- Link the entity declaration and the architecture body together
- The concept of a default configuration is a bit messy in VHDL '87
  - The last architecture analyzed links to the entity
- Can be used to change simulation behavior without re-analyzing the VHDL source
- Complex configuration declarations are ignored in synthesis
- Some entities can have, e.g., a gate-level architecture and a behavioral architecture
- Configurations are always optional
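As an illustration of the points above, the following sketch gives the Comparator entity two architectures and uses a configuration declaration to select one of them. The second architecture (Comparator2) and the configuration name (Comparator_cfg) are hypothetical additions, not part of the original design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- Dataflow-style architecture, as in the example above.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Hypothetical second architecture with the same function, written as a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end Comparator2;

-- Configuration declaration: binds the entity to one chosen architecture.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end Comparator_cfg;
```

Pointing the configuration at Comparator1 instead changes which body is simulated without touching the entity or either architecture.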

Packages

Packages contain information common to many design units.

1. Package declaration
  - constant declarations
  - type and subtype declarations
  - function and procedure declarations
  - global signal declarations
  - file declarations
  - component declarations

2. Package body
  - is not necessarily needed
  - function bodies
  - procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units. A package consists of a declaration part and an optional body part.

A package declaration can contain:
  - type and subtype declarations
  - subprograms
  - constants
  - alias declarations
  - global signal declarations
  - file declarations
  - component declarations

A package body consists of:
  - subprogram declarations and bodies
  - type and subtype declarations
  - deferred constants
  - file declarations
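A short sketch of a package with a declaration part and a body may help here. The package name, constant, subtype and function are all hypothetical and chosen only to show where each kind of item goes.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: what other design units can see.
package util_pkg_sketch is
  constant BYTE_WIDTH : positive := 8;
  subtype byte_t is std_logic_vector(BYTE_WIDTH - 1 downto 0);

  -- Only the function interface is declared here.
  function parity (v : byte_t) return std_logic;
end package util_pkg_sketch;

-- Package body: needed here because the function requires an implementation.
package body util_pkg_sketch is
  function parity (v : byte_t) return std_logic is
    variable p : std_logic := '0';
  begin
    -- XOR all bits together: '1' when the number of '1' bits is odd.
    for i in v'range loop
      p := p xor v(i);
    end loop;
    return p;
  end function parity;
end package body util_pkg_sketch;
```

A design unit compiled into the same library would make these names visible with "use work.util_pkg_sketch.all;". A package that holds only constants and types would not need a body at all.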

Libraries

Collection of VHDL design units (database):

1. Packages
  - package declaration
  - package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

A library is usually a directory in the UNIX file system, but it can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware. It is possible to describe a circuit in a number of ways, listed here from the lowest to the highest level of abstraction:

- Structural
- Dataflow
- Behavioral

Structural VHDL description

- The circuit is described in terms of its components
- Ranges from a low-level description (e.g. a transistor-level description) to a high-level description (e.g. a block diagram)
- For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

- The circuit is described in terms of how data moves through the system
- In the dataflow style you describe how information flows between registers in the system
- The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely
- The behavior of the system over time is defined by registers
- There are no built-in registers in the VHDL language
  - Either a lower-level description
  - or a behavioral description of sequential elements is needed
- The lower-level register descriptions must be created or obtained
- If there are no third-party models for registers => you must write the behavioral description of the registers
- The behavioral description can be provided in the form of subprograms (functions or procedures)
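Because the language has no built-in register, a behavioral description has to be written. A minimal sketch of a one-bit register with an active-low asynchronous reset is shown below; the entity and port names are assumptions, with rstn chosen to match the naming used in the controller code later in this document.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical behavioral description of a D-type register.
entity dff_sketch is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;  -- active-low asynchronous reset
    d    : in  std_logic;
    q    : out std_logic
  );
end entity dff_sketch;

architecture behavioral of dff_sketch is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= '0';            -- reset takes priority over the clock
    elsif rising_edge(clk) then
      q <= d;              -- capture the input on the rising clock edge
    end if;
  end process;
end architecture behavioral;
```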

10.4 Behavioral VHDL Description

- The circuit is described in terms of its operation over time
- The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions
- The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)
- If no actual delay is used, the order of sequential operations is defined
- In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications
- The actual timing results depend on the implementation technology and the efficiency of the synthesis tool
- There are a few tools for behavioral synthesis

Concurrent vs. Sequential

Processes

- Basic simulation concept in VHDL
- A VHDL description can always be broken up into interconnected processes
- Quite similar to a Unix process
- Introduced by the process keyword in VHDL
- The process statement is a concurrent statement
- Statements inside a process statement are sequential statements
- A process must contain either a sensitivity list or wait statement(s), but NOT both
- The sensitivity list or wait statement(s) contain the signals which wake the process up

General format:

  process [(sensitivity_list)]
    process_declarative_part
  begin
    process_statements
    [wait_statement]
  end process;
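The two ways of waking a process can be compared side by side. In this sketch (entity, port and label names are hypothetical) both processes describe the same AND gate: the first uses a sensitivity list, the second an explicit wait statement at the end of the process, which is the equivalent form.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: the same logic written with both process forms.
entity process_forms_sketch is
  port (
    a, b   : in  std_logic;
    y1, y2 : out std_logic
  );
end entity process_forms_sketch;

architecture rtl of process_forms_sketch is
begin
  -- Form 1: sensitivity list. The process runs whenever a or b changes.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list; a wait statement suspends the process
  -- until a or b changes. Mixing both forms in one process is not allowed.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process with_wait;
end architecture rtl;
```

The two processes themselves are concurrent statements, while the assignments inside each one execute sequentially.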


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;

  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      -- Declared "out" here to match the sync_unit entity (the original
      -- component declaration used inout for these two ports).
      line_count : out integer;
      c_hs       : out integer);
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the input clock by two to obtain the pixel clock.
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
    --end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks 0 to 793 within one line
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s (0 to 515)
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
    --end if;
  end process;

  -- hsync pulse: low for counts 0 to 95
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  -- vsync pulse
  -- Vertical Enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  -- Vertical sync, front porch, back porch
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  -- One-clock delayed copy of v_en, used to detect the start of a frame
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
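The counter limits in the synchronization block imply the line and frame rates. The following package is a rough sketch of that arithmetic, assuming that clk50m is a 50 MHz clock so that the divided pixel clock runs at 25 MHz. The package and constant names are hypothetical; only the counter limits come from the listing.

```vhdl
-- Hypothetical package: rates implied by the counter limits in sync_unit.
package vga_timing_sketch is
  -- Assumption: 50 MHz input clock divided by two in vga_controller.
  constant PIXEL_CLK_HZ  : natural := 25_000_000;

  constant H_TOTAL       : natural := 794;  -- c_hs counts 0 to 793
  constant V_TOTAL       : natural := 516;  -- line_count counts 0 to 515
  constant HSYNC_CLOCKS  : natural := 96;   -- hsync is low for counts 0 to 95

  -- Integer division truncates the fractional part.
  constant LINE_RATE_HZ  : natural := PIXEL_CLK_HZ / H_TOTAL;   -- 31486
  constant FRAME_RATE_HZ : natural := LINE_RATE_HZ / V_TOTAL;   -- 61
end package vga_timing_sketch;
```

Under that clock assumption the design produces roughly 31.5 kHz lines and about 61 frames per second, close to the rates usually quoted for 640x480 VGA.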


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process (clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en (start of a frame): restart the memory address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Image area: read the next 1-bit pixel and drive all three colors
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Right of the image: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the image: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last pixel (245759)
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
    -- end if;
    --end if;
  end process;

end behv;


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

- 1987: 9,000 gates, Xilinx
- 1992: 600,000, Naval Surface Warfare Department
- Early 2000s: Millions

Market size

- 1985: First commercial FPGA technology invented by Xilinx
- 1987: $14 million
- ~1993: >$385 million
- 2005: $1.9 billion
- 2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

- IC costs are rising aggressively
- ASIC complexity has bolstered development time and costs
- R&D resources and headcount are decreasing
- Revenue losses for slow time-to-market are increasing
- Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the state where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software, and by the extremely long turn-around times of current design tools, where a 4 to 8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

Figure 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block


There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
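The logic block just described can be modeled in a few lines of VHDL. This is a behavioral sketch for illustration, not vendor code: the entity name, the INIT truth-table generic and the REGISTERED generic are assumptions. The 16-bit INIT value holds one output bit for each of the 16 input combinations.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: 4-input LUT plus flip-flop.
entity logic_block_sketch is
  generic (
    -- Truth table: bit n is the output for input value n.
    -- x"8000" sets only bit 15, i.e. a 4-input AND.
    INIT       : std_logic_vector(15 downto 0) := x"8000";
    REGISTERED : boolean := false  -- selects which output is used
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);
    o   : out std_logic
  );
end entity logic_block_sketch;

architecture rtl of logic_block_sketch is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16-entry table indexed by the four inputs.
  lut_out <= INIT(to_integer(unsigned(i)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Single output: either the registered or the unregistered LUT output.
  o <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

Changing INIT changes the logic function without changing the structure, which is what configuring an FPGA logic block amounts to.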

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
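The first step of this flow, simulating the RTL description with a test bench, can be sketched as follows. The Comparator entity from the VHDL chapter is repeated so that the example stands on its own; the test bench name, signal names and stimulus values are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Design under test: the 8-bit comparator used earlier in this document.
entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical self-checking test bench: it has no ports of its own.
entity comparator_tb is
end entity comparator_tb;

architecture sim of comparator_tb is
  signal a_s, b_s : std_logic_vector(7 downto 0) := (others => '0');
  signal eq_s     : std_logic;
begin
  dut : entity work.Comparator(Comparator1)
    port map (A => a_s, B => b_s, EQ => eq_s);

  stimulus : process
  begin
    -- Equal inputs: EQ is expected to be '1'.
    a_s <= x"3C";
    b_s <= x"3C";
    wait for 10 ns;
    assert eq_s = '1' report "equal inputs should give EQ = 1" severity error;

    -- Different inputs: EQ is expected to be '0'.
    b_s <= x"3D";
    wait for 10 ns;
    assert eq_s = '0' report "different inputs should give EQ = 0" severity error;

    wait;  -- end of stimulus
  end process stimulus;
end architecture sim;
```

The same test bench could in principle be rerun against the gate-level and back-annotated netlists, which is the repeated simulation the paragraph above refers to.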

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping an SoC or ASIC design on an FPGA for hardware verification and early software development.

Verifying the hardware design alongside early software and firmware co-design has become mainstream practice, and prototyping SoC and ASIC designs on FPGAs has become a good method of doing this.

Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

• Motion tracking systems
• Image processing systems
• Display systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, a processor-generated (pixelated) picture is taken from memory space and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.


Page 14: vga

7 VGA Controller

The VGA Controller provides a simple 8-bit interface between a host microcontroller and any VGA-compatible monitor. This variant of the Controller provides six modes of display, depending on the resolution chosen (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black & White, 16 Colors or 64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum 60 Hz refresh rate:

   1 sec = 60 frames
   1 frame = max 1/60 sec
   1 pixel = (1/60 sec) / 307200 ≈ 54 ns, counting only the 640 x 480 = 307200 visible pixels
   Max time for 1 pixel = 40 ns, once the blanking intervals around each line and frame are allowed for

2) To send information for each and every pixel of a frame by means of RGB signals.

3) After one row of information has been completely displayed, the column must be reset and the row incremented; this is done by the Hsync signal.

4) After one frame of information has been completed, a new frame has to be started; for this, both row and column are reset, which is done by the Vsync signal.

=> To give the information for 1 pixel we use 1 clock (which controls all operations):

   1 clock = 1 pixel
   1 clock cycle period = 40 ns
   Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display the information for 1 pixel on each and every clock, we can attain the minimum 60 Hz refresh rate. Do not use a clock frequency of less than 25 MHz.

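Ex:

Clock frequency    Time period    No. of frames/sec
15 MHz             More           Less (it is unable to display 60 frames/sec)
25 MHz             40 ns          60 (effectively utilized frequency)
30 MHz             Less           More

As the clock frequency increases, the time period decreases and the number of frames per second increases.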
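The clock calculation above can be checked numerically. The following is a minimal Python sketch and not part of the original design; the function names are made up for illustration, and the 800 x 525 total frame size (visible area plus blanking, a commonly quoted figure for 640x480 at 60 Hz) is an assumption that the text does not state.

```python
def max_pixel_time_ns(pixels_per_frame, refresh_hz=60):
    """Longest time one pixel may take if a whole frame must fit in 1/refresh_hz seconds."""
    return 1e9 / (refresh_hz * pixels_per_frame)


def refresh_rate_hz(clock_hz, clocks_per_frame):
    """Frames per second when exactly one pixel slot is sent per clock."""
    return clock_hz / clocks_per_frame


VISIBLE = 640 * 480   # 307200 visible pixels, as in the text
TOTAL = 800 * 525     # ASSUMED clocks per frame including blanking

visible_only_ns = max_pixel_time_ns(VISIBLE)   # roughly 54 ns
with_blanking_ns = max_pixel_time_ns(TOTAL)    # roughly 40 ns
rate_at_25mhz = refresh_rate_hz(25e6, TOTAL)   # close to 60 frames/sec
rate_at_15mhz = refresh_rate_hz(15e6, TOTAL)   # far below 60 frames/sec
```

Under these assumptions the 40 ns budget corresponds to a 25 MHz clock, while a 15 MHz clock cannot reach 60 frames per second.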

Signals used for image display:

• R (Red)
• G (Green)
• B (Blue)
• HSYNC (Horizontal synchronization signal)
• VSYNC (Vertical synchronization signal)

All the above signals are used to control the image data so that it is correctly displayed on the VGA display monitor.

7.2 Timings for front porch, sync pulse and back porch

               Horizontal sync    Vertical sync
Front porch    0.94 µs            0.35 ms
Sync pulse     3.77 µs            0.063 ms
Back porch     1.89 µs            1.02 ms
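To relate these timings to counter values, the durations can be converted into clock cycles and line counts. This is a minimal Python sketch with hypothetical helper names; it assumes the 25 MHz clock from section 7.1 and a line length of 794 clocks, which is taken from the horizontal counter range (0 to 793) in the synchronization block code in section 11.

```python
CLK_HZ = 25_000_000      # 25 MHz pixel clock (section 7.1)
CLOCKS_PER_LINE = 794    # ASSUMED from the sync unit counter range 0..793


def us_to_clocks(duration_us, clk_hz=CLK_HZ):
    """Convert a duration in microseconds to the nearest whole clock count."""
    return round(duration_us * clk_hz / 1_000_000)


def ms_to_lines(duration_ms, clocks_per_line=CLOCKS_PER_LINE, clk_hz=CLK_HZ):
    """Convert a duration in milliseconds to the nearest whole number of lines."""
    line_time_ms = clocks_per_line / clk_hz * 1000
    return round(duration_ms / line_time_ms)


h_sync_clocks = us_to_clocks(3.77)   # horizontal sync pulse
h_back_clocks = us_to_clocks(1.89)   # horizontal back porch
v_sync_lines = ms_to_lines(0.063)    # vertical sync pulse
v_back_lines = ms_to_lines(1.02)     # vertical back porch
```

These counts come out close to the constants used in the synchronization block code, where HSYNC is held low for 96 clocks and VSYNC for 2 lines.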


8 FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below. Each entry gives the signal name, then its type (I = input, O = output), polarity and bus size, followed by its description.

Table: VGA Pin description

Global Control Signals

CLK (Type: I, Polarity: Rising, Bus size: -)
    Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows:
    640x480: CLK = 25 MHz gives Refresh = 60 Hz; CLK = 30 MHz gives Refresh = 72 Hz
    800x600: CLK = 40 MHz gives Refresh = 60 Hz; CLK = 50 MHz gives Refresh = 72 Hz

RST (Type: I, Polarity: High, Bus size: -)
    Global system reset.

VGA Input Settings Signals

RESOLUTION (Type: I, Polarity: High/Low, Bus size: -)
    This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD (Type: I, Polarity: -, Bus size: 2)
    This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H (Type: I, Polarity: -, Bus size: 10)
    This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V (Type: I, Polarity: -, Bus size: 10)
    This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD (Type: O, Polarity: High, Bus size: -)
    This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.

DATA (Type: I, Polarity: -, Bus size: 8)
    Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette, as follows:
    Black & White mode: 8 1-bit pixels
    16 Colors mode: 2 4-bit pixels
    64 Colors mode: 1 6-bit pixel (DATA[5..0])
    Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL (Type: O, Polarity: -, Bus size: 19)
    Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC (Type: O, Polarity: Falling, Bus size: -)
    Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC (Type: O, Polarity: Falling, Bus size: -)
    Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R0, R1 (Type: O, Polarity: High/Low, Bus size: -)
    Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G0, G1 (Type: O, Polarity: High/Low, Bus size: -)
    Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B0, B1 (Type: O, Polarity: High/Low, Bus size: -)
    Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode, one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode, one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode, one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)
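The packing of pixels into a byte can be sketched as follows. This is an illustrative Python model rather than the Controller's actual logic; the function name is hypothetical, and the lowest-order-first pixel ordering is an assumption based on the Pixel register description, where the active pixel sits in the lowest bit position and the register is shifted right.

```python
def unpack_pixels(data_byte, cmod):
    """Split one byte of picture memory into its pixel values.

    cmod follows the CMOD encoding: 0b00 = Black & White,
    0b01 = 16 Colors, 0b10 = 64 Colors.
    Pixels are returned lowest-order first (ASSUMED ordering).
    """
    if not 0 <= data_byte <= 0xFF:
        raise ValueError("data_byte must fit in 8 bits")
    if cmod == 0b00:
        # eight 1-bit pixels
        return [(data_byte >> i) & 0x1 for i in range(8)]
    if cmod == 0b01:
        # two 4-bit pixels: low nibble first, then high nibble
        return [data_byte & 0xF, (data_byte >> 4) & 0xF]
    if cmod == 0b10:
        # one 6-bit pixel; bits 7 and 6 are ignored
        return [data_byte & 0x3F]
    raise ValueError("unsupported CMOD value")
```

For example, the byte 0xB1 yields eight 1-bit pixels in Black & White mode, the two nibbles 0x1 and 0xB in 16 Colors mode, and the single value 0x31 in 64 Colors mode.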


The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V) and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
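As a worked example of this formula, the sketch below computes the picture memory for a full 640 x 480 display in each color mode. The function and dictionary names are hypothetical, and rounding up to a whole byte is an assumption for display sizes that do not divide evenly.

```python
# Pixels stored per byte in each color mode (from the list above)
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes needed to store pix_total x lin_total pixels in the given mode."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    # Round up so that a partly filled final byte is still counted
    return (pixels + per_byte - 1) // per_byte


bw_bytes = picture_memory_bytes(640, 480, "bw")    # Black & White
c16_bytes = picture_memory_bytes(640, 480, "16")   # 16 Colors
c64_bytes = picture_memory_bytes(640, 480, "64")   # 64 Colors
```

The 64 Colors mode therefore needs eight times as much memory as Black & White mode for the same display size.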

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt and right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt and right shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
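The three addressing rules amount to dividing the pixel counter by the number of pixels per byte. A minimal Python sketch, with a hypothetical function name and hvcnt treated as a 19-bit unsigned integer:

```python
def addr_pixel(hvcnt, cmod):
    """Byte address in picture memory for the pixel counted by hvcnt.

    cmod: 0b00 = Black & White, 0b01 = 16 Colors, 0b10 = 64 Colors.
    """
    hvcnt &= 0x7FFFF              # keep the 19 counter bits
    if cmod == 0b00:
        return hvcnt >> 3         # 8 pixels per byte
    if cmod == 0b01:
        return hvcnt >> 1         # 2 pixels per byte
    if cmod == 0b10:
        return hvcnt              # 1 pixel per byte
    raise ValueError("unsupported CMOD value")
```

For pixel number 17, this gives byte 2 in Black & White mode, byte 8 in 16 Colors mode and byte 17 in 64 Colors mode.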

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
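The enex condition can be expressed directly as a function of the two counters. The following Python sketch mirrors the definition above; the function name is hypothetical, and the period values in the usage note (800 clocks per line, 525 lines per frame) are assumed figures for 640x480, not values given in the text.

```python
def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Return 1 while memory reads are allowed, 0 during blanking."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


def rd(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """RD follows enex: reads are disabled whenever enex is Low."""
    return enex(hcount, vcount, pix_total, lin_total, line_period, frame_period)
```

With PixTotal = 640, LinTotal = 480 and the assumed periods, a position such as (100, 100) inside the visible area gives enex = 1, while (700, 100) or (100, 500) falls in a blanking interval and gives enex = 0.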

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal (received from the Synchronization Unit) are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal (received from the Synchronization Unit) is a 0.

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.
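The update rule for RGBREG can be sketched as a small function. This is an illustrative Python model only: the function name is hypothetical, and because the mapping table is not reproduced in this transcript, the palette is passed in as a parameter. The example Black & White palette below (0 maps to 000000, 1 maps to 111111) is a placeholder assumption, not the Controller's documented mapping.

```python
BLACK = 0b000000


def next_rgbreg(rst, en, cmod, active_pixel, palette):
    """Value RGBREG takes next, given reset, line enable and color mode.

    palette maps an active pixel value to a 6-bit RGB code and is only
    consulted in Black & White (0b00) and 16 Colors (0b01) modes.
    """
    if rst or not en:
        return BLACK                     # reset or blanking: load black
    if cmod == 0b10:
        return active_pixel & 0x3F       # 64 Colors: pixel value used directly
    return palette[active_pixel]         # other modes: predefined RGB code


# PLACEHOLDER palette for illustration only (not taken from the source table)
EXAMPLE_BW_PALETTE = {0: 0b000000, 1: 0b111111}
```

The sketch keeps the two cases described in the text separate: a direct pass-through in 64 Colors mode and a table lookup in the other two modes, with black forced whenever en is Low.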


The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry, located on the NanoBoard itself, as shown in Figure 9.1.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
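The relationship between the 2-bit codes, the analog levels and the 64 colors can be illustrated numerically. In this Python sketch the function name is hypothetical, and the even spacing of the four levels between 0 V and 0.7 V is an assumption; the actual levels depend on the resistor values of the DAC circuitry.

```python
V_MAX = 0.7   # volts, maximum brightness


def dac_level_volts(code, bits=2, v_max=V_MAX):
    """Ideal output voltage for a DAC code, ASSUMING evenly spaced levels."""
    steps = (1 << bits) - 1
    if not 0 <= code <= steps:
        raise ValueError("code out of range for the DAC width")
    return v_max * code / steps


# Four levels per color channel, from total darkness to maximum brightness
levels = [dac_level_volts(code) for code in range(4)]

# Three independent channels with four levels each
color_count = len(levels) ** 3
```

Three channels of four levels each give 4 x 4 x 4 = 64 combinations, which matches the 64 Colors palette.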

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
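These figures make it possible to estimate whether a picture fits in on-chip block RAM. The Python sketch below is a rough sizing aid with hypothetical names; it uses the per-device block counts from the Spartan-3E block RAM table and counts only the 16K data bits of each block, ignoring the parity bits and any blocks needed by the rest of the design.

```python
DATA_BITS_PER_BLOCK = 16 * 1024   # data bits per block RAM (parity excluded)

# Total RAM blocks per device, from the Spartan-3E block RAM table
TOTAL_BLOCKS = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def blocks_needed(pixels, bits_per_pixel):
    """Block RAMs required to hold a picture, rounded up to whole blocks."""
    bits = pixels * bits_per_pixel
    return (bits + DATA_BITS_PER_BLOCK - 1) // DATA_BITS_PER_BLOCK


def devices_that_fit(pixels, bits_per_pixel):
    """Devices whose total block count covers the picture memory."""
    needed = blocks_needed(pixels, bits_per_pixel)
    return [name for name, blocks in TOTAL_BLOCKS.items() if blocks >= needed]


bw_blocks = blocks_needed(640 * 480, 1)      # 1-bit pixels
bw_devices = devices_that_fit(640 * 480, 1)
```

A 640 x 480 picture with 1-bit pixels needs 19 blocks by this estimate, so it would fit only in the XC3S500E and larger devices.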


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

• Interoperability
• Technology independence
• Design reuse
• Several levels of abstraction
• Readability
• Standard language
• Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

• Design specification language
• Design entry language
• Design simulation language
• Design documentation language
• An alternative to schematics

10.1 Brief History

• VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.
• Funded by the US Department of Defense.
• The first publicly available version was released in 1985.
• In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.
• In 1987, standardization => IEEE 1076-1987.
• An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

• IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.
• Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.
• The IEEE 1164 standard was developed by the IEEE.
• IEEE 1164 contains definitions for a nine-valued data type, std_logic.
• IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware. It defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

• A black box with an interface definition
• Defines the inputs/outputs of a component (defines pins)
• A way to represent modularity in VHDL
• Similar to a symbol in a schematic
• The entity declaration describes the entity

E.g.:

    entity Comparator is
        port (A, B : in  std_logic_vector(7 downto 0);
              EQ   : out std_logic);
    end Comparator;

Ports

• Provide channels of communication between the component and its environment
• Each port must have a name, a direction and a type
• An entity may have NO port declaration

Port directions

• In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.
• Out: Assignments can be made to a port, but data from the port cannot be read. Multiple assignments are allowed.
• In out: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
• Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).

Architectures

• Every entity has at least one architecture
• One entity can have several architectures
• Architectures can describe a design using:
  - Behavior
  - Structure
  - Dataflow
• Architectures can describe a design on many levels:
  - Gate level
  - RTL (Register Transfer Level)
  - Behavioral level
• A configuration declaration links an architecture to an entity

E.g.:

    architecture Comparator1 of Comparator is
    begin
        EQ <= '1' when (A = B) else '0';
    end Comparator1;

Configurations

• Link the entity declaration and the architecture body together
• The concept of default configuration is a bit messy in VHDL '87:
  - The last architecture analyzed links to the entity
• Can be used to change simulation behavior without re-analyzing the VHDL source
• Complex configuration declarations are ignored in synthesis
• Some entities can have, e.g., both a gate-level architecture and a behavioral architecture
• Configurations are always optional

Packages

Packages contain information common to many design units.

1. Package declaration
  - constant declarations
  - type and subtype declarations
  - function and procedure declarations
  - global signal declarations
  - file declarations
  - component declarations

2. Package body
  - is not necessarily needed
  - function bodies
  - procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units. A package consists of a declaration part and an optional body part.

A package declaration can contain:
  - type and subtype declarations
  - subprograms
  - constants
  - alias declarations
  - global signal declarations
  - file declarations
  - component declarations

A package body consists of:
  - subprogram declarations and bodies
  - type and subtype declarations
  - deferred constants
  - file declarations

Libraries

A library is a collection of VHDL design units (a database):

1. Packages
  - package declaration
  - package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

A library is usually a directory in a UNIX file system, but can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

• Structural
• Dataflow
• Behavioral

Structural VHDL description

• The circuit is described in terms of its components
• Ranges from a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)
• For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

• The circuit is described in terms of how data moves through the system
• In the dataflow style, you describe how information flows between registers in the system
• The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely
• The behavior of the system over time is defined by registers
• There are no built-in registers in the VHDL language:
  - either a lower-level description
  - or a behavioral description of sequential elements is needed
• The lower-level register descriptions must be created or obtained
• If there are no third-party models for registers => you must write the behavioral description of registers
• The behavioral description can be provided in the form of subprograms (functions or procedures)

10.4 Behavioral VHDL Description

• The circuit is described in terms of its operation over time
• The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions
• The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)
• If no actual delay is used, only the order of sequential operations is defined
• In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications
• The actual timing results depend on the implementation technology and the efficiency of the synthesis tool
• There are a few tools for behavioral synthesis

Concurrent vs. Sequential

Processes

• The basic simulation concept in VHDL
• A VHDL description can always be broken up into interconnected processes
• Quite similar to a Unix process
• "process" is a keyword in VHDL
• A process statement is a concurrent statement
• Statements inside process statements are sequential statements
• A process must contain either a sensitivity list or wait statement(s), but NOT both
• The sensitivity list or wait statement(s) contain the signals which wake the process up

General format:

    process [(sensitivity_list)]
        process_declarative_part
    begin
        process_statements
        [wait_statement]
    end process;

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
    port (
        clk50m : in  std_logic;   -- 50 MHz board clock
        rstn   : in  std_logic;   -- active-low reset
        hsync  : out std_logic;
        vsync  : out std_logic;
        red    : out std_logic_vector(0 downto 0);
        green  : out std_logic_vector(0 downto 0);
        blue   : out std_logic_vector(0 downto 0)
    );
end vga_controller;

architecture Behavioral of vga_controller is

    -- picture memory: 1-bit pixels, 18-bit address
    component memory
        port (
            addr  : IN  std_logic_VECTOR(17 downto 0);
            clk   : IN  std_logic;
            dout  : OUT std_logic_VECTOR(0 downto 0);
            en    : IN  std_logic;
            sinit : IN  std_logic
        );
    END component;

    component sync_unit
        port (
            clk        : in    std_logic;
            rstn       : in    std_logic;
            hsync      : out   std_logic;   -- horizontal sync pulse (row)
            vsync      : out   std_logic;   -- vertical sync pulse (frame) 111110100000000000
            v_en_reg   : out   std_logic;
            h_en       : out   std_logic;
            v_en       : out   std_logic;
            line_count : inout integer;
            c_hs       : inout integer
        );
    end component;

    component pixelgen
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : in  std_logic;
            vsync      : in  std_logic;
            h_en       : in  std_logic;
            v_en       : in  std_logic;
            v_en_reg   : in  std_logic;
            data_in    : in  std_logic_vector(0 downto 0);
            line_count : in  integer;
            c_hs       : in  integer;
            red        : out std_logic_vector(0 downto 0);
            green      : out std_logic_vector(0 downto 0);
            blue       : out std_logic_vector(0 downto 0);
            addr       : out std_logic_vector(17 downto 0);
            rd         : out std_logic
        );
    end component;

    signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
    signal line_count_s, c_hs_s, addr : integer;
    signal data_s : std_logic_vector(0 downto 0);
    signal addr_s : std_logic_vector(17 downto 0);

begin

    --addr_s <= std_logic_vector(to_unsigned(addr, 18));

    -- divide the 50 MHz input clock by two to obtain the 25 MHz pixel clock
    process (clk50m, rstn)
    begin
        if (rstn = '0') then
            clk <= '0';
        elsif (clk50m = '1' and clk50m'event) then
            clk <= not clk;
        end if;
    end process;

    hsync <= hsync_s;
    vsync <= vsync_s;

    sync : sync_unit port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
    );

    pixel : pixelgen port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
    );

    mem12 : memory port map (
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
    );

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
    port (
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : out std_logic;   -- horizontal sync pulse (row)
        vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
        v_en_reg   : out std_logic;
        h_en       : out std_logic;
        v_en       : out std_logic;
        line_count : out integer;
        c_hs       : out integer
    );
end sync_unit;

architecture arch_sync_unit of sync_unit is

    signal h_en_s, v_en_s : std_logic;
    signal line_count_s, c_hs_s : integer;

begin

    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- hsync counter: counts the clocks within one line (0 to 793)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        else
            if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                    c_hs_s <= 0;
                else
                    c_hs_s <= c_hs_s + 1;
                end if;
            end if;
        end if;
    end process;

    -- vsync counter
    -- vertical line counter: advances once per line, on the rising edge of h_en_s
    process (h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            if (line_count_s = 515) then
                line_count_s <= 0;
            end if;
        end if;
    end process;

    -- hsync pulse
    process (clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
    end process;

    -- Horizontal Enable (row information will be sent in this period)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
    end process;

    -- Vertical Enable (data is valid in this region)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35 and line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
    end process;

    -- vsync pulse (vertical sync, front porch, back porch)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
    end process;

    -- v_en delayed by one clock, used by the pixel generator to detect the start of a frame
    process (clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;
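The counter limits in the synchronization block determine the line time and the frame rate. The following Python sketch derives them from the constants in the code above; it is a back-of-the-envelope check, not a simulation of the VHDL, and it assumes a 25 MHz pixel clock (the 50 MHz input divided by two in the top-level entity).

```python
CLK_HZ = 25_000_000               # ASSUMED pixel clock: clk50m divided by two

H_COUNT_MAX = 793                 # c_hs_s wraps after this value
LINE_COUNT_MAX = 515              # line_count_s wraps after this value
HSYNC_LOW_LAST = 95               # hsync is driven low while c_hs_s <= 95
H_EN_FIRST, H_EN_LAST = 143, 778  # h_en_s is set for c_hs_s in this range

clocks_per_line = H_COUNT_MAX + 1
lines_per_frame = LINE_COUNT_MAX + 1

line_time_us = clocks_per_line / CLK_HZ * 1e6
hsync_low_us = (HSYNC_LOW_LAST + 1) / CLK_HZ * 1e6
h_en_clocks = H_EN_LAST - H_EN_FIRST + 1
frame_rate_hz = CLK_HZ / (clocks_per_line * lines_per_frame)
```

With these constants a line lasts about 31.76 µs, the horizontal sync pulse about 3.84 µs, and the frame rate comes out slightly above 60 Hz. The horizontal enable window is 636 clocks wide, a little less than the 640 visible pixels of the nominal resolution.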


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
    port (
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : in  std_logic;
        vsync      : in  std_logic;
        h_en       : in  std_logic;
        v_en       : in  std_logic;
        v_en_reg   : in  std_logic;
        data_in    : in  std_logic_vector(0 downto 0);
        line_count : in  integer;
        c_hs       : in  integer;
        red        : out std_logic_vector(0 downto 0);
        green      : out std_logic_vector(0 downto 0);
        blue       : out std_logic_vector(0 downto 0);
        addr       : out std_logic_vector(17 downto 0);
        rd         : out std_logic
    );
end pixelgen;

architecture behv of pixelgen is

    signal addr_s : std_logic_vector(17 downto 0);

begin

    addr <= addr_s;

    process (clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- rising edge of v_en: restart reading from the first memory address
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- picture area: output the 1-bit pixel on all three colors
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        -- right of the picture area: green
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    -- below the picture area: white
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- wrap the address after the last picture location
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                -- blanking: black
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
    end process;

end behv;
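The wrap-around constant in the pixel generator is easier to read in decimal. The short Python check below decodes it; the interpretation of the result as a 512 x 480 picture is an inference from the comparison limits in the code (line_count up to 514, c_hs up to 655) and is not stated in the original text.

```python
WRAP_ADDRESS = int("111011111111111111", 2)   # last address before addr_s resets
ADDRESS_BITS = 18                             # width of addr_s

locations_used = WRAP_ADDRESS + 1             # addresses 0 .. WRAP_ADDRESS
locations_available = 1 << ADDRESS_BITS       # size of an 18-bit address space
```

The constant decodes to 245759, so 245760 one-bit locations are read per frame, which equals 512 x 480 and fits within the 262144 locations of an 18-bit address space.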


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array, the XC2064, in 1985. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was inducted into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up and eroded a significant share of its market. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternative to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

As previously mentioned, many modern FPGAs can be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

• 1987: 9,000 gates (Xilinx)
• 1992: 600,000 gates (Naval Surface Warfare Department)
• Early 2000s: millions

Market size

• 1985: first commercial FPGA technology invented by Xilinx
• 1987: $14 million
• ~1993: >$385 million
• 2005: $1.9 billion
• 2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

• IC costs are rising aggressively
• ASIC complexity has increased development time and costs
• R&D resources and headcount are decreasing
• Revenue losses for slow time-to-market are increasing
• Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing number of higher-volume applications than they have historically been used for, which the company credits for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical to implement within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as the ability of logic blocks to implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA considered here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they are managed separately from other signals.
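The logic block described above can be modelled in a few lines of Python. This is only a behavioural sketch under stated assumptions: the class name `LogicBlock`, the 16-bit `truth_table` encoding and the input ordering (input `a` as the least significant index bit) are illustrative choices, not taken from any vendor's documentation.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table: int, registered: bool = False):
        # 16 configuration bits: bit i holds the output for input index i.
        self.truth_table = truth_table & 0xFFFF
        self.registered = registered  # selects the registered output
        self.ff = 0                   # flip-flop state, cleared at start

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Combinational lookup (assumed ordering: a is the LSB of the index)."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Model a rising clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """The single block output: registered or unregistered LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


and4 = LogicBlock(0x8000)                   # only index 15 (all ones) gives 1
xor4 = LogicBlock(0x6996)                   # odd parity of the four inputs
reg_and4 = LogicBlock(0x8000, registered=True)
```

With `registered=True`, the output only changes after `clock()` is called, which mirrors the choice between the registered and the unregistered LUT output.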

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology
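The planar (domain-based) rule can be written down as a tiny function. The sketch below assumes a switch box with four sides named `left`, `right`, `top` and `bottom` and zero-based track numbers; both the names and the numbering are illustrative assumptions rather than part of the architecture description.

```python
SIDES = ("left", "right", "top", "bottom")  # hypothetical side names


def switch_box_targets(side: str, track: int, channel_width: int):
    """Return the wires reachable from (side, track) in a planar switch box.

    A wire entering on one side can be switched only onto the SAME track
    number on each of the other three sides.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 0 <= track < channel_width:
        raise ValueError("track outside the channel")
    return [(other, track) for other in SIDES if other != side]


# A wire on track 1 entering from the left has exactly three switches.
print(switch_box_targets("left", 1, channel_width=4))
```

Because the track number never changes, a signal that starts on one track stays in that routing "domain" for its whole path, which is what keeps this topology simple.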

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, produces a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog, although, in an attempt to reduce the complexity of designing in HDLs (which have been compared to the equivalent of assembly languages), there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

• SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
• Antifuse - One-time programmable. CMOS.
• EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
• EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
• Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
• Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; its Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying a hardware design alongside early software and firmware co-design has become mainstream practice, and prototyping SoC and ASIC designs on an FPGA has become a good method of doing this.

Reasons why prototyping is important:

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this kind of development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies in the system work well together, off the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Fig: VGA Controller (gate level)

Fig: Sync Unit (gate level)

Fig: Technology schematic

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated picture (pixelated) from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



=> To send the information for 1 pixel we use 1 clock (the clock controls all operations).

1 clock = 1 pixel

1 clock cycle period = 40 ns

Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display the information for 1 pixel on each and every clock, we can attain a refresh rate of about 60 Hz.

Do not use a clock frequency lower than 25 MHz.

Example:

• 15 MHz: the time period is longer, so the number of frames/sec decreases (it is unable to display 60 frames/sec)
• 25 MHz: the effectively utilized frequency
• 30 MHz: the time period is shorter, so the number of frames/sec increases
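The relationship between clock frequency, clock period and refresh rate can be checked with a short calculation. The sketch below assumes the usual 640x480 frame totals of 800 clocks per line and 525 lines per frame (visible area plus blanking); those two totals are an assumption and are not stated in the text above.

```python
# Assumed frame totals for 640x480 (visible pixels plus blanking).
CLOCKS_PER_LINE = 800
LINES_PER_FRAME = 525


def clock_period_ns(clock_hz: float) -> float:
    """Period of one pixel clock in nanoseconds."""
    return 1e9 / clock_hz


def refresh_rate_hz(clock_hz: float) -> float:
    """Frames per second when one pixel is sent on every clock."""
    return clock_hz / (CLOCKS_PER_LINE * LINES_PER_FRAME)


for mhz in (15, 25, 30):
    f = mhz * 1e6
    print(f"{mhz} MHz: period {clock_period_ns(f):.1f} ns, "
          f"refresh {refresh_rate_hz(f):.1f} Hz")
```

Under these assumptions a 25 MHz clock lands just under 60 Hz, a 15 MHz clock falls well short of it, and a 30 MHz clock gives roughly 71 Hz.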

Signals used for image display:

• R (Red)
• G (Green)
• B (Blue)
• HSYNC (horizontal synchronization signal)
• VSYNC (vertical synchronization signal)

All of the above signals are used to control the image data so that it is displayed correctly on the VGA monitor.

7.2 Timings for front porch, sync pulse and back porch

                Horizontal sync    Vertical sync
Front porch     0.94 µs            0.35 ms
Sync pulse      3.77 µs            0.063 ms
Back porch      1.89 µs            1.02 ms

8 FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below.

Table: VGA Pin description

Name | Type | Polarity | Bus size | Description

Global Control Signals

CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows:
    640x480: CLK = 25 MHz gives Refresh = 60 Hz; CLK = 30 MHz gives Refresh = 72 Hz
    800x600: CLK = 40 MHz gives Refresh = 60 Hz; CLK = 50 MHz gives Refresh = 72 Hz

RST | I | High | - | Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD | I | - | 2 | This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.

DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows:
    Black & White mode: 8 1-bit pixels
    16 Colors mode: 2 4-bit pixels
    64 Colors mode: 1 6-bit pixel (DATA[5..0])
    Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R0, R1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G0, G1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B0, B1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode, one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode, one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode, one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V) and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
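As a worked example of the memory formula, the sketch below evaluates it for a full 640x480 picture in each color mode. The function name `picture_memory_bytes` and the mode keys are illustrative names chosen here; rounding up to a whole byte is also an assumption for picture sizes that do not divide evenly.

```python
# Pixels packed into one byte for each color palette (from the text above).
PIXELS_PER_BYTE = {"bw": 8, "16_colors": 2, "64_colors": 1}


def picture_memory_bytes(pix_total: int, lin_total: int, mode: str) -> int:
    """Bytes needed for a pix_total x lin_total picture in the given mode."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-pixels // per_byte)  # ceiling division: round up to whole bytes


for mode in PIXELS_PER_BYTE:
    print(mode, picture_memory_bytes(640, 480, mode), "bytes")
```

For 640x480 this gives 38,400 bytes in Black & White mode, 153,600 bytes in 16 Colors mode and 307,200 bytes in 64 Colors mode.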

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt and right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt and right shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
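The three addressing rules amount to dividing the pixel counter by the number of pixels per byte. A minimal Python sketch, assuming hvcnt is a 19-bit unsigned count and using the hypothetical mode keys `bw`, `16_colors` and `64_colors`:

```python
# Right shift applied to hvcnt in each mode: log2(pixels per byte).
ADDRESS_SHIFT = {"bw": 3, "16_colors": 1, "64_colors": 0}


def addr_pixel(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory for pixel counter value hvcnt."""
    hvcnt &= (1 << 19) - 1              # keep hvcnt[18..0] only
    return hvcnt >> ADDRESS_SHIFT[mode]


# Pixels 0..7 share byte 0 in Black & White mode; pixel 8 starts byte 1.
print([addr_pixel(n, "bw") for n in range(10)])
```

Shifting right and filling the top with zeros is the same operation as the concatenations with "00" and "0" shown above.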

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
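The enable condition above translates directly into a small predicate. In this sketch the line period and frame period are passed in as clock and line counts; the example values of 800 and 525 are assumed 640x480 totals and do not come from the text.

```python
def enex(hcount: int, vcount: int, pix_total: int, lin_total: int,
         line_period: int, frame_period: int) -> int:
    """Return 1 while pixel data should be read, 0 during blanking."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


# Assumed totals for a 640x480 frame: 800 clocks per line, 525 lines.
print(enex(100, 100, 640, 480, 800, 525))  # inside the visible area
print(enex(700, 100, 640, 480, 800, 525))  # horizontal blanking
print(enex(100, 500, 640, 480, 800, 525))  # vertical blanking
```

Since RD follows this signal, memory reads stop automatically whenever either counter is in its blanking interval.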

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a "0".

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
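The load-and-shift behaviour of PIXREG in the three modes can be illustrated with a software model. This is a sketch, not the Controller's implementation: the function name and mode keys are hypothetical, and each loop iteration stands in for one rising edge of CLK.

```python
# (pixel width in bits, pixels per byte) for each color palette.
PIXEL_FORMAT = {"bw": (1, 8), "16_colors": (4, 2), "64_colors": (6, 1)}


def pixels_from_byte(data: int, mode: str) -> list:
    """Return the pixels of one memory byte in the order they are displayed."""
    width, count = PIXEL_FORMAT[mode]
    mask = (1 << width) - 1
    reg = data & 0xFF                 # models PIXREG just after a load
    pixels = []
    for _ in range(count):
        pixels.append(reg & mask)     # the active pixel is in the low bits
        reg >>= width                 # shift the next pixel into position
    return pixels


print(pixels_from_byte(0b10110001, "bw"))
print(pixels_from_byte(0xA5, "16_colors"))
print(pixels_from_byte(0xFF, "64_colors"))
```

In 64 Colors mode the mask discards bits 7 and 6, matching the note that DATA[7..6] are not used.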

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded depends on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
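The count of 64 colors follows from three channels with four levels each. The sketch below also maps each 2-bit code to a voltage, assuming the four levels are evenly spaced between 0 V and 0.7 V; the even spacing is an assumption, since the text gives only the two end points.

```python
from itertools import product

V_MAX = 0.7   # volts: maximum brightness, per the text
LEVELS = 4    # a 2-bit code selects one of four levels


def dac_level_volts(code: int) -> float:
    """Voltage for a 2-bit code, ASSUMING evenly spaced steps up to V_MAX."""
    if not 0 <= code < LEVELS:
        raise ValueError("code must be in 0..3")
    return V_MAX * code / (LEVELS - 1)


# Every (R, G, B) combination of 2-bit codes: 4 * 4 * 4 colors.
ALL_COLORS = list(product(range(LEVELS), repeat=3))

print([round(dac_level_volts(c), 3) for c in range(LEVELS)])
print(len(ALL_COLORS))
```

The actual step voltages depend on the resistor values of the external DAC, so the printed levels should be read as illustrative only.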

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large, on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data-width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
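The synchronous, single-port behaviour described above can be sketched in VHDL. This is only an illustrative model: the entity name bram_16kx1, its port names, and the 16K x 1 organization are assumptions chosen to match the 16K data bits of one block, and whether a synthesis tool actually maps it onto a block RAM depends on the tool and its settings.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical single-port memory, 16K x 1 bit
-- (the data capacity of one block RAM, parity bits not used).
entity bram_16kx1 is
    port (
        clk  : in  std_logic;
        en   : in  std_logic;                      -- clock enable
        we   : in  std_logic;                      -- write enable
        addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16384 locations
        din  : in  std_logic;
        dout : out std_logic
    );
end bram_16kx1;

architecture rtl of bram_16kx1 is
    type ram_t is array (0 to 2**14 - 1) of std_logic;
    signal ram : ram_t := (others => '0');
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if en = '1' then
                if we = '1' then
                    ram(to_integer(unsigned(addr))) <= din;
                end if;
                -- Synchronous read: dout changes only on an enabled clock edge.
                -- During a write, dout shows the previously stored value.
                dout <= ram(to_integer(unsigned(addr)));
            end if;
        end if;
    end process;
end rtl;
```

Both the read and the write sit inside the clocked, enabled branch, which mirrors the statement that every port operation needs a clock edge and a clock enable.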


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V) HDL

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

The built-in types of IEEE 1076 (such as bit) don't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE in response. IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.
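As a small illustration of these standards, the sketch below uses std_logic from IEEE 1164 and the signed and unsigned types from IEEE 1076.3 (package numeric_std). The entity name types_demo and the signal names are made up for the example, and the asserts are simulation-only checks, not part of any design in this report.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;   -- std_logic, std_logic_vector (IEEE 1164)
use IEEE.NUMERIC_STD.ALL;      -- signed, unsigned (IEEE 1076.3)

entity types_demo is
end types_demo;

architecture sim of types_demo is
    -- std_logic can hold 'U', 'X', '0', '1', 'Z', 'W', 'L', 'H' and '-'
    signal bus_line : std_logic := 'Z';                       -- high impedance
    signal count    : unsigned(7 downto 0) := to_unsigned(250, 8);
    signal offset   : signed(7 downto 0)   := to_signed(-3, 8);
begin
    process
    begin
        wait for 10 ns;
        assert bus_line = 'Z'
            report "undriven bus should read Z" severity error;
        -- unsigned arithmetic wraps modulo 2**8: 250 + 10 = 260 -> 4
        assert count + 10 = to_unsigned(4, 8)
            report "unexpected wrap-around result" severity error;
        -- signed values are two's complement
        assert to_integer(offset) = -3
            report "unexpected signed value" severity error;
        wait;
    end process;
end sim;
```

The type bit has only the values '0' and '1', so the high-impedance signal in this sketch could not be written with the built-in types alone.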

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition
Defines the inputs/outputs of a component (defines the pins)
A way to represent modularity in VHDL
Similar to a symbol in a schematic
The entity declaration describes the entity

E.g.

entity Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.
Each port must have a name, a direction, and a type.
An entity may have NO port declaration.

Port directions

In: The value of the port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to the port, but data from the port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).

Architectures

Every entity has at least one architecture.
One entity can have several architectures.
Architectures can describe a design using
– Behavior
– Structure
– Dataflow
Architectures can describe a design on many levels
– Gate level
– RTL (Register Transfer Level)
– Behavioral level
A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
    EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link an entity declaration and an architecture body together.
The concept of default configuration is a bit messy in VHDL '87
– The last architecture analyzed links to the entity
Can be used to change simulation behavior without re-analyzing the VHDL source.
Complex configuration declarations are ignored in synthesis.
Some entities can have, e.g., a gate-level architecture and a behavioral architecture.
Are always optional.
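The relationship between entity, architectures, and configuration can be sketched with the Comparator example. The second architecture (Comparator2), the testbench Comparator_tb, and the configuration name Comparator_tb_cfg are hypothetical additions for illustration; only the entity and Comparator1 come from the text.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
end Comparator;

-- Dataflow view: one concurrent conditional assignment.
architecture Comparator1 of Comparator is
begin
    EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Behavioral view of the same entity (hypothetical second architecture).
architecture Comparator2 of Comparator is
begin
    process (A, B)
    begin
        if A = B then
            EQ <= '1';
        else
            EQ <= '0';
        end if;
    end process;
end Comparator2;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity Comparator_tb is
end Comparator_tb;

architecture test of Comparator_tb is
    component Comparator
        port (A, B : in  std_logic_vector(7 downto 0);
              EQ   : out std_logic);
    end component;
    signal a_s, b_s : std_logic_vector(7 downto 0) := (others => '0');
    signal eq_s     : std_logic;
begin
    dut : Comparator port map (A => a_s, B => b_s, EQ => eq_s);

    process
    begin
        a_s <= x"3C"; b_s <= x"3C";
        wait for 10 ns;
        assert eq_s = '1' report "equal inputs must give EQ = 1" severity error;
        b_s <= x"3D";
        wait for 10 ns;
        assert eq_s = '0' report "different inputs must give EQ = 0" severity error;
        wait;
    end process;
end test;

-- The configuration selects which architecture the dut instance uses.
configuration Comparator_tb_cfg of Comparator_tb is
    for test
        for dut : Comparator
            use entity work.Comparator(Comparator2);
        end for;
    end for;
end Comparator_tb_cfg;
```

Changing Comparator2 to Comparator1 in the configuration switches the simulated architecture without touching the testbench source, which is the use case the list above describes.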

Packages

Packages contain information common to many design units.

1 Package declaration
– constant declarations
– type and subtype declarations
– function and procedure declarations
– global signal declarations
– file declarations
– component declarations

2 Package body
– is not necessarily needed
– function bodies
– procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.
A package consists of a declaration part and an optional body part.

Package declaration can contain
– type and subtype declarations
– subprograms
– constants
– alias declarations
– global signal declarations
– file declarations
– component declarations

Package body consists of
– subprogram declarations and bodies
– type and subtype declarations
– deferred constants
– file declarations
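A minimal sketch of a package declaration with a matching body is shown here. The package name video_pkg and everything in it are hypothetical; the widths are merely chosen to resemble the 1-bit colour vectors and 18-bit address used in the controller code later in this report.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

package video_pkg is
    subtype pixel_t is std_logic_vector(0 downto 0);   -- subtype declaration
    constant ADDR_WIDTH : natural := 18;                -- ordinary constant
    constant BLACK      : pixel_t;                      -- deferred constant
    function invert (p : pixel_t) return pixel_t;      -- function declaration
end video_pkg;

package body video_pkg is
    -- The deferred constant receives its value in the body.
    constant BLACK : pixel_t := "0";

    function invert (p : pixel_t) return pixel_t is
    begin
        return not p;
    end invert;
end video_pkg;
```

A design unit would make these names visible with "use work.video_pkg.all;" placed before its entity or architecture. Because BLACK is deferred, its value can be changed by re-analyzing only the package body.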

Libraries

Collection of VHDL design units (database)

1 Packages
– package declaration
– package body
2 Entities (entity declaration)
3 Architectures (architecture body)
4 Configurations (configuration declarations)

Usually a directory in a UNIX file system.
Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

Structural
Dataflow
Behavioral

Structural VHDL description

The circuit is described in terms of its components.
Ranges from a low-level description (e.g., a transistor-level description) to a high-level description (e.g., a block diagram).
For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.
In the dataflow style you describe how information flows between registers in the system.
The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.
The behavior of the system over time is defined by registers.
There are no built-in registers in the VHDL language
– Either a lower-level description
– or a behavioral description of sequential elements is needed
The lower-level register descriptions must be created or obtained.
If there are no third-party models for registers => you must write the behavioral description of the registers.
The behavioral description can be provided in the form of subprograms (functions or procedures).
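Since the language has no built-in register, a behavioral description has to be written. One common way to write it is sketched below; the entity name reg8, the 8-bit width, and the active-low asynchronous reset are assumptions made for the example (the reset style follows the rstn convention used in the controller code later in this report).

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity reg8 is
    port (clk  : in  std_logic;
          rstn : in  std_logic;                       -- active-low asynchronous reset
          d    : in  std_logic_vector(7 downto 0);
          q    : out std_logic_vector(7 downto 0));
end reg8;

architecture behv of reg8 is
begin
    process (clk, rstn)
    begin
        if rstn = '0' then
            q <= (others => '0');      -- reset overrides the clock
        elsif rising_edge(clk) then
            q <= d;                    -- capture d on the rising clock edge
        end if;
    end process;
end behv;
```

A dataflow description can then instantiate such registers and describe only the combinational logic between them.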

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.
Representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions.
The concept of time may be expressed precisely using delays (e.g., A <= B after 10 ns).
If no actual delay is used, only the order of sequential operations is defined.
In the lower levels of abstraction (e.g., RTL), synthesis tools ignore detailed timing specifications.
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.
There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL.
A VHDL description can always be broken up into interconnected processes.
Quite similar to a Unix process.
The process keyword in VHDL
A process statement is a concurrent statement.
Statements inside a process statement are sequential statements.
A process must contain either a sensitivity list or wait statement(s), but NOT both.
The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
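The two allowed forms of a process can be put side by side in a short sketch. The entity and_demo and its ports are hypothetical; both processes describe the same AND gate, once with a sensitivity list and once with an explicit wait statement.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity and_demo is
    port (a, b   : in  std_logic;
          y1, y2 : out std_logic);
end and_demo;

architecture behv of and_demo is
begin
    -- Form 1: sensitivity list, no wait statement.
    with_list : process (a, b)
    begin
        y1 <= a and b;
    end process with_list;

    -- Form 2: no sensitivity list, explicit wait at the end.
    with_wait : process
    begin
        y2 <= a and b;
        wait on a, b;
    end process with_wait;
end behv;
```

Each process runs once at the start of simulation and then suspends until a or b changes. The two process statements are concurrent with each other, while the statements inside each one execute in sequence.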


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
    port (clk50m : in  std_logic;
          rstn   : in  std_logic;
          hsync  : out std_logic;
          vsync  : out std_logic;
          red    : out std_logic_vector(0 downto 0);
          green  : out std_logic_vector(0 downto 0);
          blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

    -- Picture memory
    component memory
        port (addr  : in  std_logic_vector(17 downto 0);
              clk   : in  std_logic;
              dout  : out std_logic_vector(0 downto 0);
              en    : in  std_logic;
              sinit : in  std_logic);
    end component;

    -- Port modes match the sync_unit entity declaration
    component sync_unit
        port (clk        : in  std_logic;
              rstn       : in  std_logic;
              hsync      : out std_logic;  -- horizontal sync pulse (row)
              vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
              v_en_reg   : out std_logic;
              h_en       : out std_logic;
              v_en       : out std_logic;
              line_count : out integer;
              c_hs       : out integer);
    end component;

    component pixelgen
        port (clk        : in  std_logic;
              rstn       : in  std_logic;
              hsync      : in  std_logic;
              vsync      : in  std_logic;
              h_en       : in  std_logic;
              v_en       : in  std_logic;
              v_en_reg   : in  std_logic;
              data_in    : in  std_logic_vector(0 downto 0);
              line_count : in  integer;
              c_hs       : in  integer;
              red        : out std_logic_vector(0 downto 0);
              green      : out std_logic_vector(0 downto 0);
              blue       : out std_logic_vector(0 downto 0);
              addr       : out std_logic_vector(17 downto 0);
              rd         : out std_logic);
    end component;

    signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
    signal line_count_s, c_hs_s, addr : integer;
    signal data_s : std_logic_vector(0 downto 0);
    signal addr_s : std_logic_vector(17 downto 0);

begin

    --addr_s <= std_logic_vector(to_unsigned(addr, 18));

    -- Divide the board clock (clk50m) by two to derive the pixel clock
    process (clk50m, rstn)
    begin
        if (rstn = '0') then
            clk <= '0';
        elsif (clk50m = '1' and clk50m'event) then
            clk <= not clk;
        end if;
    end process;

    hsync <= hsync_s;
    vsync <= vsync_s;

    sync : sync_unit port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
    );

    pixel : pixelgen port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
    );

    mem12 : memory port map (
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
    );

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
    port (clk        : in  std_logic;
          rstn       : in  std_logic;
          hsync      : out std_logic;  -- horizontal sync pulse (row)
          vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
          v_en_reg   : out std_logic;
          h_en       : out std_logic;
          v_en       : out std_logic;
          line_count : out integer;
          c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

    signal h_en_s, v_en_s       : std_logic;
    signal line_count_s, c_hs_s : integer;

begin

    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- Horizontal counter: counts pixel clocks 0 to 793 within a line
    process (clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        elsif (clk = '1' and clk'event) then
            if (c_hs_s = 793) then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s + 1;
            end if;
        end if;
    end process;

    -- Vertical line counter: advances on each rising edge of h_en_s
    process (h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            if (line_count_s = 515) then
                line_count_s <= 0;
            end if;
        end if;
    end process;

    -- hsync pulse: low while the horizontal count is 0 to 95
    process (clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
    end process;

    -- Horizontal enable (row information is sent in this period)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
    end process;

    -- Vertical enable (data is valid in this region)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35) and (line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
    end process;

    -- vsync pulse (vertical sync, front porch, back porch)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
    end process;

    -- One-clock delayed copy of v_en, used to detect its rising edge
    process (clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;
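The numeric literals in the synchronization block are easier to follow when they carry names. The package below is a hypothetical refactoring aid, not part of the original design: the package name sync_timing_pkg and the constant names are invented, and only the values are taken from the sync_unit code.

```vhdl
package sync_timing_pkg is
    -- Horizontal timing, in pixel-clock cycles (values copied from sync_unit)
    constant H_COUNT_MAX   : integer := 793;  -- counter wraps after 793: 794 clocks per line
    constant H_SYNC_END    : integer := 95;   -- hsync is driven low for counts 0 to 95
    constant H_ACTIVE_FROM : integer := 143;  -- first count for which h_en is requested
    constant H_ACTIVE_TO   : integer := 778;  -- last count for which h_en is requested

    -- Vertical timing, in lines (values copied from sync_unit)
    constant V_COUNT_MAX   : integer := 515;  -- line counter wraps after 515: 516 lines per frame
    constant V_SYNC_LINES  : integer := 2;    -- vsync is driven low while line_count < 2
    constant V_ACTIVE_FROM : integer := 35;   -- first line for which v_en is requested
    constant V_ACTIVE_TO   : integer := 515;  -- last line for which v_en is requested
end sync_timing_pkg;
```

With "use work.sync_timing_pkg.all;" added to the sync unit, a comparison such as c_hs_s = 793 could be written as c_hs_s = H_COUNT_MAX, which would leave the behaviour unchanged while documenting the intent.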


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
    port (clk        : in  std_logic;
          rstn       : in  std_logic;
          hsync      : in  std_logic;
          vsync      : in  std_logic;
          h_en       : in  std_logic;
          v_en       : in  std_logic;
          v_en_reg   : in  std_logic;
          data_in    : in  std_logic_vector(0 downto 0);
          line_count : in  integer;
          c_hs       : in  integer;
          red        : out std_logic_vector(0 downto 0);
          green      : out std_logic_vector(0 downto 0);
          blue       : out std_logic_vector(0 downto 0);
          addr       : out std_logic_vector(17 downto 0);
          rd         : out std_logic);
end pixelgen;

architecture behv of pixelgen is

    signal addr_s : std_logic_vector(17 downto 0);

begin

    addr <= addr_s;

    process (clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- Rising edge of v_en: restart the memory address for a new frame
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- Inside the picture: show the memory bit on all three colours
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        -- Right of the picture: green border
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    -- Below the picture: white
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- Wrap the address after the last picture location
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                -- Blanking: all colours off
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
    end process;

end behv;


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985, the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and in the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

Many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration that lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the state where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

Fig 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
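The classic logic block described above can be modelled in a few lines of VHDL. This is a behavioural sketch for illustration, not a vendor primitive: the entity name logic_block and the generics INIT and REGISTERED are hypothetical stand-ins for the configuration bits that a real device would load.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity logic_block is
    generic (
        -- 16-entry truth table; x"8000" sets only entry 15, i.e. a 4-input AND
        INIT       : std_logic_vector(15 downto 0) := x"8000";
        -- true selects the flip-flop output, false the raw LUT output
        REGISTERED : boolean := false
    );
    port (
        clk    : in  std_logic;
        lut_in : in  std_logic_vector(3 downto 0);
        q      : out std_logic
    );
end logic_block;

architecture rtl of logic_block is
    signal lut_out : std_logic;
    signal ff_out  : std_logic := '0';
begin
    -- The LUT is a table lookup: the four inputs form the index.
    lut_out <= INIT(to_integer(unsigned(lut_in)));

    -- The flip-flop samples the LUT output on each rising clock edge.
    process (clk)
    begin
        if rising_edge(clk) then
            ff_out <= lut_out;
        end if;
    end process;

    -- Single output: registered or unregistered LUT output.
    q <= ff_out when REGISTERED else lut_out;
end rtl;
```

Changing INIT changes the logic function without changing the structure; for example, x"6996" would give the parity (XOR) of the four inputs.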

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

125 FPGA prototype

FPGA prototyping sometimes also referred to as ASIC prototyping or SoC

prototyping is the method to prototype SoC and ASIC design on FPGA for

hardware verification and early software development

Main stream verification methods for hardware design and early software and

firmware co-design has become mainstream Prototyping SoC and ASIC

design on FPGA has become a good method to do this

64

Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This exposes many unforeseen software bugs that appear due to today's array of operating systems, applications and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies in the system work well together, off the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

[Figure: Gate level - VGA controller]

[Figure: Gate level - sync unit]

[Figure: Technology schematic]

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated (pixelated) picture can be taken from memory space, with the Controller providing digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

G (Green)

B (Blue)

HSYNC (Horizontal Synchronous signal)

VSYNC (Vertical Synchronous signal)

All of the above signals are used so that the image data is displayed correctly on the VGA monitor.

7.2 Timings for front porch, sync pulse and back porch

              Horizontal sync   Vertical sync
Front porch   0.94 µs           0.35 ms
Sync pulse    3.77 µs           0.063 ms
Back porch    1.89 µs           1.02 ms
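As a rough cross-check, the horizontal timing values can be converted into pixel-clock counts. The sketch below is illustrative only: it assumes a 25 MHz pixel clock (the 640x480, 60 Hz setting), and the names `H_TIMING_US` and `to_clocks` are made up for this example.

```python
# Convert horizontal timing intervals (microseconds) into pixel-clock counts.
# Assumption: a 25 MHz pixel clock; all names here are illustrative.
PIXEL_CLOCK_HZ = 25_000_000

H_TIMING_US = {"front_porch": 0.94, "sync_pulse": 3.77, "back_porch": 1.89}


def to_clocks(duration_us, clock_hz=PIXEL_CLOCK_HZ):
    """Return the nearest whole number of clock periods in duration_us."""
    return round(duration_us * clock_hz / 1e6)


h_clocks = {name: to_clocks(us) for name, us in H_TIMING_US.items()}
```

Under this assumption the 3.77 µs sync pulse comes out at about 94 clocks, which is close to the 96-clock pulse (counter values 0 to 95) used in the synchronization block code.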

8 FUNCTIONAL DESCRIPTION

8.1 Symbol

[Figure: VGA Controller symbol]

Pin description

The pin description is shown in the table below.

Table: VGA pin description

Name | Type | Polarity | Bus size | Description

Global Control Signals

CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows:
    640x480: CLK = 25 MHz gives Refresh = 60 Hz; CLK = 30 MHz gives Refresh = 72 Hz
    800x600: CLK = 40 MHz gives Refresh = 60 Hz; CLK = 50 MHz gives Refresh = 72 Hz

RST | I | High | - | Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD | I | - | 2 | This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.

DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows:
    Black & White mode: 8 1-bit pixels
    16 Colors mode: 2 4-bit pixels
    64 Colors mode: 1 6-bit pixel (DATA[5..0])
    Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R1, R0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G1, G0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B1, B0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).
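To make the dependence on the clock concrete, the following sketch computes the line time and refresh rate from a clock frequency and the total counts per line and per frame. It is a simplified model rather than the Controller's implementation; the function names are illustrative, and the example counts (794 clocks per line, 516 lines per frame) are taken from the counter limits in the synchronization block code of this report.

```python
def line_time_us(clk_hz, clocks_per_line):
    """Duration of one scan line, in microseconds."""
    return clocks_per_line / clk_hz * 1e6


def refresh_rate_hz(clk_hz, clocks_per_line, lines_per_frame):
    """Frames per second when one clock is spent per line position."""
    return clk_hz / (clocks_per_line * lines_per_frame)


# Horizontal counter runs 0..793 and line counter runs 0..515,
# clocked at 25 MHz (the 50 MHz input divided by two).
rate = refresh_rate_hz(25_000_000, 794, 516)
line_us = line_time_us(25_000_000, 794)
```

With these counts the refresh rate works out to roughly 61 Hz, in line with the nominal 60 Hz figure for 640x480.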

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required to control the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode: one byte of data in RAM contains 8 1-bit pixels

• in 16 Colors mode: one byte of data in RAM contains 2 4-bit pixels

• in 64 Colors mode: one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
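The memory formula can be checked with a few lines of Python. This is a minimal sketch: the mode names and the `picture_bytes` helper are invented for illustration, and the pixels-per-byte values follow the three color palettes described in this section.

```python
# Pixels packed into each byte of picture memory, per color palette.
PIXELS_PER_BYTE = {"bw": 8, "16_colors": 2, "64_colors": 1}


def picture_bytes(pix_total, lin_total, mode):
    """Bytes needed for pix_total x lin_total viewable pixels."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-pixels // per_byte)  # ceiling division, no floats


sizes_640x480 = {mode: picture_bytes(640, 480, mode) for mode in PIXELS_PER_BYTE}
```

For a full 640x480 picture this gives 38,400 bytes in Black & White mode, 153,600 bytes in 16 Colors mode and 307,200 bytes in 64 Colors mode.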

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt, that is, by right-shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt, that is, by right-shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
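The three addressing rules amount to a right shift of the pixel counter by 3, 1 or 0 bits. The sketch below models that in Python; it treats hvcnt as a 19-bit unsigned value, and the function and mode names are illustrative rather than part of the Controller.

```python
# Right-shift applied to hvcnt for each palette (8, 2 or 1 pixels per byte).
ADDR_SHIFT = {"bw": 3, "16_colors": 1, "64_colors": 0}


def addr_pixel(hvcnt, mode):
    """Byte address in picture memory for pixel counter value hvcnt."""
    hvcnt &= 0x7FFFF  # keep bits 18..0 only
    return hvcnt >> ADDR_SHIFT[mode]
```

For example, pixel counter values 0 to 7 all map to byte address 0 in Black & White mode, and value 8 is the first to map to address 1.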

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then

    enex = 0

Else

    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
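The enex condition can be written as a small predicate. The following is only a sketch of the rule as stated: the parameter names mirror the terms used in the definition, and the example totals used to exercise it (640 pixels, 480 lines, 800 clocks per line, 525 lines per frame) are assumptions chosen for illustration, not values given in this section.

```python
def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Return 1 while picture data may be read, 0 during blanking."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


def rd(*args):
    """RD follows enex: reads are disabled whenever enex is 0."""
    return enex(*args)
```

The design choice worth noting is that either counter alone can force blanking, so reads stop both at the end of each line and for the whole vertical blanking interval.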

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a "0".

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
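The load-and-shift behaviour of the Pixel register can be modelled in software as follows. This is a behavioural sketch, not the hardware description: it returns the pixels of one byte in the order they would reach the active (low-order) position, and the function and mode names are illustrative.

```python
# (bits per pixel, pixels per byte) for each palette.
PIXEL_FORMAT = {"bw": (1, 8), "16_colors": (4, 2), "64_colors": (6, 1)}


def pixels_from_byte(data, mode):
    """Return the pixel values of one data byte, low-order pixel first."""
    bits, count = PIXEL_FORMAT[mode]
    mask = (1 << bits) - 1
    pixreg = data & 0xFF          # byte loaded into PIXREG
    pixels = []
    for _ in range(count):
        pixels.append(pixreg & mask)  # active pixel sits in the low bits
        pixreg >>= bits               # shift the next pixel into place
    return pixels
```

In 64 Colors mode the mask also discards bits 7 and 6 of the byte, matching the note that those bits are not used.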

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero ("000000") when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded depends on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black ("000000").

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
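The count of 64 colors follows from three channels with four levels each. The sketch below splits a 6-bit RGB code into its 2-bit channels and maps each to a voltage. Two things here are assumptions made for illustration and are not stated in the text: the bit order within the code (red in bits 5..4, green in bits 3..2, blue in bits 1..0) and a DAC with evenly spaced levels between 0 V and 0.7 V.

```python
FULL_SCALE_V = 0.7  # maximum brightness level given in the text


def split_rgb(code6):
    """Split a 6-bit code into (R, G, B) 2-bit values (assumed bit order)."""
    r = (code6 >> 4) & 0b11
    g = (code6 >> 2) & 0b11
    b = code6 & 0b11
    return r, g, b


def dac_volts(level2):
    """Map a 2-bit level to a voltage, assuming evenly spaced steps."""
    return FULL_SCALE_V * level2 / 3


distinct_colors = len({split_rgb(code) for code in range(64)})
```

Whatever the real bit order and DAC steps are, the counting argument is the same: 4 x 4 x 4 = 64 distinct combinations.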

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM can simulate single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
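A quick sizing estimate shows how the block RAM figures relate to picture storage. The sketch below counts how many blocks a frame buffer needs, using the 16K data bits per block stated above and the per-device block totals from the Spartan-3E table; the function names are illustrative, and the estimate ignores parity bits and any packing inefficiency.

```python
DATA_BITS_PER_BLOCK = 16 * 1024  # 16K data bits per block RAM

# Total RAM blocks per device, from the Spartan-3E table.
BLOCKS_PER_DEVICE = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def blocks_needed(pixels, bits_per_pixel):
    """Block RAMs needed to hold the picture (ceiling division)."""
    total_bits = pixels * bits_per_pixel
    return -(-total_bits // DATA_BITS_PER_BLOCK)


def devices_that_fit(pixels, bits_per_pixel):
    """Devices with enough block RAM for the picture."""
    need = blocks_needed(pixels, bits_per_pixel)
    return [name for name, blocks in BLOCKS_PER_DEVICE.items() if blocks >= need]
```

A 640x480 picture at 1 bit per pixel needs 19 blocks, so on these figures it fits in the XC3S500E and larger devices, while a 6-bit-per-pixel picture of the same size exceeds the on-chip block RAM of every device in the table.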

10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

• Interoperability

• Technology independence

• Design reuse

• Several levels of abstraction

• Readability

• Standard language

• Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

• Design specification language

• Design entry language

• Design simulation language

• Design documentation language

• An alternative to schematics

10.1 Brief History

• VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers

• Funded by the US Department of Defense

• The first publicly available version was released in 1985

• In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL

• In 1987: standardization => IEEE 1076-1987

• An improved version of the language was released in 1994 => IEEE standard 1076-1993

10.2 Related Standards

• IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance

• Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard

• The IEEE 1164 standard was developed by the IEEE in response

• IEEE 1164 contains definitions for a nine-valued data type, std_logic

• IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware

• It defines, e.g., two numeric types: signed and unsigned

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

• A black box with interface definition

• Defines the inputs/outputs of a component (define pins)

• A way to represent modularity in VHDL

• Similar to symbol in schematic

• Entity declaration describes entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

• Provide channels of communication between the component and its environment

• Each port must have a name, direction and a type

• An entity may have NO port declaration

Port directions

• In: A value of a port can be read inside the component, but cannot be assigned. Multiple reads of port are allowed.

• Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

• In out: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

• Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

• Every entity has at least one architecture

• One entity can have several architectures

• Architectures can describe design using: Behavior, Structure, Dataflow

• Architectures can describe design on many levels: Gate level, RTL (Register Transfer Level), Behavioral level

• Configuration declaration links architecture to entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

• Links entity declaration and architecture body together

• Concept of default configuration is a bit messy in VHDL '87: the last architecture analyzed links to entity

• Can be used to change simulation behavior without re-analyzing the VHDL source

• Complex configuration declarations are ignored in synthesis

• Some entities can have, e.g., a gate level architecture and a behavioral architecture

• Are always optional

Packages

Packages contain information common to many design units.

1. Package declaration

   - constant declarations

   - type and subtype declarations

   - function and procedure declarations

   - global signal declarations

   - file declarations

   - component declarations

2. Package body

   - is not necessarily needed

   - function bodies

   - procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units. A package consists of a declaration part and an optional body part.

Package declaration can contain:

   - type and subtype declarations

   - subprograms

   - constants

   - alias declarations

   - global signal declarations

   - file declarations

   - component declarations

Package body consists of:

   - subprogram declarations and bodies

   - type and subtype declarations

   - deferred constants

   - file declarations

Libraries

Collection of VHDL design units (database):

1. Packages

   - package declaration

   - package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in a UNIX file system, but can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware. It is possible to describe a circuit in a number of ways:

Structural -> Dataflow -> Behavioral (increasing level of abstraction)

Structural VHDL description

• Circuit is described in terms of its components

• From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)

• For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

• Circuit is described in terms of how data moves through the system

• In the dataflow style you describe how information flows between registers in the system

• The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely

• The behavior of the system over time is defined by registers

• There are no built-in registers in the VHDL language: either a lower level description or a behavioral description of sequential elements is needed

• The lower level register descriptions must be created or obtained

• If there are no 3rd party models for registers => you must write the behavioral description of registers

• The behavioral description can be provided in the form of subprograms (functions or procedures)

10.4 Behavioral VHDL Description

• Circuit is described in terms of its operation over time

• Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions

• The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)

• If no actual delay is used, the order of sequential operations is defined

• In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications

• The actual timing results depend on implementation technology and efficiency of synthesis tool

• There are a few tools for behavioral synthesis

Concurrent Vs Sequential

Processes

• Basic simulation concept in VHDL

• A VHDL description can always be broken up into interconnected processes

• Quite similar to a Unix process

• Process keyword in VHDL

• Process statement is a concurrent statement

• Statements inside process statements are sequential statements

• Process must contain either a sensitivity list or wait statement(s), but NOT both

• Sensitivity list or wait statement(s) contain the signals which wake the process up

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- Only std_logic types are used at this level, so the arithmetic
-- packages are not needed here.

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;                       -- board clock input
    rstn   : in  std_logic;                       -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  -- Picture memory: 1-bit-wide memory with an 18-bit address.
  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;

  -- Port modes match the sync_unit entity declaration.
  component sync_unit
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame)
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;
      c_hs       : out integer);
  end component;

  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  -- Divide the input clock by two to obtain the pixel clock.
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  -- Note: sinit is driven directly by rstn, as in the original listing;
  -- check that this polarity matches how the memory core was generated.
  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;   -- active-low reset
    hsync      : out std_logic;   -- horizontal sync pulse (row)
    vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;   -- v_en delayed by one clock
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- Horizontal counter: counts 0..793 on each clock.
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    elsif (clk = '1' and clk'event) then
      if (c_hs_s = 793) then
        c_hs_s <= 0;
      else
        c_hs_s <= c_hs_s + 1;
      end if;
    end if;
  end process;

  -- Vertical line counter: advances on each rising edge of h_en_s
  -- and wraps after 515.
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: low while the horizontal counter is 0..95.
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information is sent in this period).
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region).
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35) and (line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse: low for the first two lines of the frame.
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- Registered copy of v_en, used to detect the start of a frame.
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
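The horizontal constants in the synchronization block can be sanity-checked with a small software model. The Python sketch below is not part of the design: it simply walks the horizontal counter through one line and counts how many clocks fall inside the hsync and h_en windows, using the comparison limits from the listing. The constant and function names are illustrative.

```python
H_MAX = 793                        # horizontal counter wraps after this value
HSYNC_LAST = 95                    # hsync is low while the counter is 0..95
H_EN_FIRST, H_EN_LAST = 143, 778   # h_en is high inside this window


def line_profile():
    """Count clocks per line and clocks inside each window."""
    counts = range(H_MAX + 1)
    hsync_low = sum(1 for c in counts if c <= HSYNC_LAST)
    h_en_high = sum(1 for c in counts if H_EN_FIRST <= c <= H_EN_LAST)
    return {
        "clocks_per_line": H_MAX + 1,
        "hsync_low": hsync_low,
        "h_en_high": h_en_high,
    }


profile = line_profile()
```

With these limits each line is 794 clocks long, the sync pulse lasts 96 clocks and the horizontal enable is high for 636 clocks. At a 25 MHz pixel clock, 96 clocks correspond to 3.84 µs.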

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;  -- provides "+" on std_logic_vector

entity pixelgen is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;   -- active-low reset
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= (others => '0');
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: restart the address at the top of the frame.
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: show the memory bit on all three channels.
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + 1;
          else
            -- Right of the picture: green fill.
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the picture: white fill.
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address at the end of picture memory.
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: drive black.
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;
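The wrap constant in the pixel generator is easier to interpret in decimal. The short Python check below decodes it and relates it to picture and block RAM sizes; the 640-pixel row width and the 16K-bit block size are used here only as reference points for the arithmetic, not as statements about what the design intends.

```python
# Last address before the pixel generator wraps back to zero.
WRAP_ADDR = int("111011111111111111", 2)

locations = WRAP_ADDR + 1             # number of 1-bit memory locations
rows_of_640 = divmod(locations, 640)  # (full rows, leftover pixels)
blocks_16k = divmod(locations, 16 * 1024)
```

The address range covers 245,760 one-bit locations, which is exactly 384 rows of 640 pixels and exactly fifteen 16K-bit blocks.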

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.
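One common way a logic block realizes an arbitrary combinational function is a lookup table (LUT), where the inputs select one bit of a stored truth table. The Python sketch below models that idea for a 3-input LUT; it is a conceptual illustration with made-up names, not a description of any particular device's logic block.

```python
def make_lut(truth_table_bits, n_inputs):
    """Build an n-input function from a truth table packed into an integer.

    Bit i of truth_table_bits is the output for the input combination
    whose binary value is i (first input is the least significant bit).
    """
    def lut(*inputs):
        if len(inputs) != n_inputs:
            raise ValueError("wrong number of inputs")
        index = 0
        for position, bit in enumerate(inputs):
            index |= (bit & 1) << position
        return (truth_table_bits >> index) & 1
    return lut


# "Configuring" the same structure as two different gates.
and3 = make_lut(0b10000000, 3)  # output 1 only when all inputs are 1
xor3 = make_lut(0b10010110, 3)  # output 1 when an odd number of inputs are 1
```

Reprogramming the block amounts to loading a different truth table; the structure that evaluates it stays the same.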

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

121 Modern developments

A recent trend has been to take the coarse-grained architectural

approach a step further by combining the logic blocks and interconnects of

traditional FPGAs with embedded microprocessors and related peripherals to

form a complete system on a programmable chip This work mirrors the

architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems

Group which combined a reconfigurable CPU architecture on a single chip

called the SB24 That work was done in 1982 Examples of such hybrid

technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices

which include one or more PowerPC processors embedded within the FPGAs

54

logic fabric The Atmel FPSLIC is another such device which uses an AVR

processor in combination with Atmels programmable logic architecture

An alternate approach to using hard-macro processors is to make use of soft

processor cores that are implemented within the FPGA logic (See Soft

processors below)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However, it
does not support dynamic reconfiguration at run time, but instead adapts
itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

• 1987: 9,000 gates, Xilinx
• 1992: 600,000, Naval Surface Warfare Department
• Early 2000s: Millions

Market size

• 1985: First commercial FPGA technology invented by Xilinx
• 1987: $14 million
• ~1993: >$385 million
• 2005: $1.9 billion
• 2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers have largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs, but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

• IC costs are rising aggressively
• ASIC complexity has bolstered development time and costs
• R&D resources and headcount are decreasing
• Revenue losses for slow time-to-market are increasing
• Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have historically been used for, to
which the company attributes the growing number of FPGA design starts (see
History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has
a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result of this is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. The FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and the
extremely long turn-around times of current design tools, where a 4-8 hour
wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or
the width of one column, as shown below. All the routing channels have the
same width (number of wires).

Figure 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high performance parts, claiming increased
performance.

Typical logic block

There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
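The logic block described above can be sketched in VHDL as follows. This is only an illustrative model, not a vendor primitive: the entity name `logic_block` and the generics `INIT` (the LUT truth table) and `REGISTERED` (the output-select configuration bit) are assumptions introduced for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of one logic block: a 4-input LUT followed by a
-- flip-flop. INIT and REGISTERED stand in for configuration bits.
entity logic_block is
  generic (
    INIT       : std_logic_vector(15 downto 0) := x"8000"; -- truth table (here: 4-input AND)
    REGISTERED : boolean := true                           -- true selects the flip-flop output
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    o   : out std_logic                      -- the single block output
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16 x 1 memory addressed by the four inputs.
  lut_out <= INIT(to_integer(unsigned(i)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output multiplexer: registered or unregistered LUT output.
  o <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

With the default `INIT` of x"8000", only address 15 (all four inputs high) holds a '1', so the block behaves as a 4-input AND gate; any other 16-bit pattern gives a different 4-input function.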

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can be
constructed. For higher speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number one connects only to wires in
track number one in adjacent channel segments, wires in track number 2
connect only to other wires in track number 2, and so on. The figure below
illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to just specify
them numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device like
an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar
license), and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally, the design is laid out in the FPGA, at
which point propagation delays can be added and the simulation run again with
these values back-annotated onto the netlist.

Basic Process Technology Types

• SRAM - based on static memory technology. In-system programmable and
  re-programmable. Requires external boot devices. CMOS.
• Antifuse - One-time programmable. CMOS.
• EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
  programmable in production because of plastic packaging. Windowed devices
  can be erased with ultraviolet (UV) light. CMOS.
• EEPROM - Electrically Erasable Programmable Read-Only Memory technology.
  Can be erased, even in plastic packages. Some, but not all, EEPROM devices
  can be in-system programmed. CMOS.
• Flash - Flash-erase EPROM technology. Can be erased, even in plastic
  packages. Some, but not all, flash devices can be in-system programmed.
  Usually, a flash cell is smaller than an equivalent EEPROM cell and is
  therefore less expensive to manufacture. CMOS.
• Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has
become mainstream, and prototyping SoC and ASIC designs on FPGAs has become a
good method to do this.

Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure
   that it is functionally correct. This is compared to designers relying
   only on software simulations to verify that their hardware design is
   sound. Simulation speed and modeling accuracy limitations hinder this
   development.
2. Due to time constraints, many projects cannot wait until the silicon is
   back from the foundry to start on software tests. FPGA prototyping allows
   for much more time for software development and testing at the
   software-hardware integration stage. This helps expose many unforeseen
   software bugs that appear due to today's array of operating systems,
   applications and hardware.
3. Prototyping also allows the developer to ensure that all IP technologies
   on the system work well together, off the simulation stage and in actual
   form.
4. Prototyping has the added advantage of serving as a demo platform for SoC
   clients, bringing in interest early. This speeds up the overall
   development cycle and allows for more enhancement or improvement of the
   chip features than would otherwise have been possible.

13 SIMULATION RESULTS

[Figure: Gate level, VGA CONTROLLER]

[Figure: Gate level, SYNC UNIT]

[Figure: TECHNOLOGY SCHEMATIC]

14 APPLICATIONS

• Motion tracking systems
• Image processing systems
• Display systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the VGA Controller takes a
processor-generated (pixelated) picture from memory space and provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.



8 FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below.

Table VGA Pin description

Name | Type | Polarity | Bus size
    Description

Global Control Signals

CLK | I | Rising | -
    Global system clock. This clock determines the maximum rate at which
    pixels can be sent to the monitor. The frequency of the clock signal
    determines the refresh rate as follows:
      640x480: CLK = 25MHz, Refresh = 60Hz; CLK = 30MHz, Refresh = 72Hz
      800x600: CLK = 40MHz, Refresh = 60Hz; CLK = 50MHz, Refresh = 72Hz

RST | I | High | -
    Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | -
    This input selects the screen resolution to be used:
      1 = 640x480 (VGA)
      0 = 800x600 (SVGA)

CMOD | I | - | 2
    This input selects the color palette to be used:
      00 = Black & White
      01 = 16 Colors
      10 = 64 Colors

DISPSIZE_H | I | - | 10
    This input determines the number of viewable pixels to be displayed in
    each line of a frame, and is therefore used to control the horizontal
    extents of the visible display area.

DISPSIZE_V | I | - | 10
    This input determines the number of lines to be displayed in a frame,
    and is therefore used to control the vertical extents of the visible
    display area.

Data Memory Control Signals

RD | O | High | -
    This is the enable signal when data is required to be read from the
    memory space. This signal is controlled by, and follows, the internal
    line enable signal, en, generated by the Synchronization Unit of the
    Controller.

DATA | I | - | 8
    Data input from picture memory space. Data is stored in memory in bytes,
    the content of which depends on the chosen color palette as follows:
      Black & White mode: 8 1-bit pixels
      16 Colors mode: 2 4-bit pixels
      64 Colors mode: 1 6-bit pixel (DATA[5..0])
    Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL | O | - | 19
    Specifies the address of the next pixel in picture memory. Addresses are
    consecutive: once the end of the current line has been reached, the next
    address is that of the pixel at the start of the next line down in the
    frame.

VGA Monitor Control Signals

HSYNC | O | Falling | -
    Horizontal synchronization signal. This signal is used to control the
    horizontal deflection circuit in the VGA monitor, so that the start and
    end of a line of pixels is correctly displayed across the visible display
    area of the screen. The horizontal size of the display area is controlled
    by the DISPSIZE_H input to the Controller.

VSYNC | O | Falling | -
    Vertical synchronization signal. This signal is used to control the
    vertical deflection circuit in the VGA monitor, so that the start and end
    of a frame (of lines) is correctly displayed between the top and bottom
    edges of the visible display area of the screen. The vertical size of the
    display area is controlled by the DISPSIZE_V input to the Controller.

R1, R0 | O | High/Low | -
    Provides the 2-bit digital signal for the intensity of red used in
    composing a pixel's displayed color. These two signals are inputs to a
    simple 2-bit DAC (external to the Controller) that provides the analog
    signal required by the VGA monitor.

G1, G0 | O | High/Low | -
    Provides the 2-bit digital signal for the intensity of green used in
    composing a pixel's displayed color. These two signals are inputs to a
    simple 2-bit DAC (external to the Controller) that provides the analog
    signal required by the VGA monitor.

B1, B0 | O | High/Low | -
    Provides the 2-bit digital signal for the intensity of blue used in
    composing a pixel's displayed color. These two signals are inputs to a
    simple 2-bit DAC (external to the Controller) that provides the analog
    signal required by the VGA monitor.
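As a rough sketch, the pin description above could be written as a VHDL entity declaration like the one below. The entity name `vga_controller` is an assumption made for illustration, and the single-bit ports are assumed to be `std_logic`; only the port names and bus sizes come from the table.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity declaration matching the pin description table.
entity vga_controller is
  port (
    -- Global control signals
    CLK        : in  std_logic;                      -- global system clock (rising edge)
    RST        : in  std_logic;                      -- global system reset (active high)
    -- VGA input settings signals
    RESOLUTION : in  std_logic;                      -- 1 = 640x480, 0 = 800x600
    CMOD       : in  std_logic_vector(1 downto 0);   -- color palette select
    DISPSIZE_H : in  std_logic_vector(9 downto 0);   -- viewable pixels per line
    DISPSIZE_V : in  std_logic_vector(9 downto 0);   -- viewable lines per frame
    -- Data memory control signals
    RD         : out std_logic;                      -- memory read enable (active high)
    DATA       : in  std_logic_vector(7 downto 0);   -- byte from picture memory
    ADDR_PIXEL : out std_logic_vector(18 downto 0);  -- address of the next pixel byte
    -- VGA monitor control signals
    HSYNC      : out std_logic;                      -- horizontal synchronization
    VSYNC      : out std_logic;                      -- vertical synchronization
    R1, R0     : out std_logic;                      -- 2-bit red intensity
    G1, G0     : out std_logic;                      -- 2-bit green intensity
    B1, B0     : out std_logic                       -- 2-bit blue intensity
  );
end entity vga_controller;
```

Grouping the ports in the same order as the table keeps the declaration easy to check against the pin description.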


9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels
and a frame of lines. The actual time values differ according to the
resolution selected (640x480 or 800x600), the processor-defined values for
DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal
(CLK).
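To make the line and frame timing concrete, the following is a minimal sketch of a fixed-resolution sync generator. It is not the Controller's Synchronization Unit: the entity name and the porch and sync-pulse constants are assumptions, using the commonly published 640x480 at 60 Hz figures (800 pixel clocks per line, 525 lines per frame) rather than values taken from this document, and it ignores DISPSIZE_H and DISPSIZE_V.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity vga_sync_640x480 is
  port (
    clk   : in  std_logic;  -- pixel clock, nominally 25 MHz
    rst   : in  std_logic;  -- synchronous reset, active high
    hsync : out std_logic;  -- active low pulse
    vsync : out std_logic;  -- active low pulse
    en    : out std_logic   -- high inside the visible area
  );
end entity vga_sync_640x480;

architecture rtl of vga_sync_640x480 is
  -- Horizontal timing in pixel clocks (assumed standard values).
  constant H_VISIBLE : natural := 640;
  constant H_FRONT   : natural := 16;
  constant H_SYNC    : natural := 96;
  constant H_TOTAL   : natural := 800;  -- 640 + 16 + 96 + 48 (back porch)
  -- Vertical timing in lines (assumed standard values).
  constant V_VISIBLE : natural := 480;
  constant V_FRONT   : natural := 10;
  constant V_SYNC    : natural := 2;
  constant V_TOTAL   : natural := 525;  -- 480 + 10 + 2 + 33 (back porch)

  signal hcount : unsigned(9 downto 0) := (others => '0');
  signal vcount : unsigned(9 downto 0) := (others => '0');
begin
  -- Pixel counter wraps at the end of each line; line counter wraps at
  -- the end of each frame.
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        hcount <= (others => '0');
        vcount <= (others => '0');
      elsif hcount = H_TOTAL - 1 then
        hcount <= (others => '0');
        if vcount = V_TOTAL - 1 then
          vcount <= (others => '0');
        else
          vcount <= vcount + 1;
        end if;
      else
        hcount <= hcount + 1;
      end if;
    end if;
  end process;

  -- Sync pulses follow the front porch; both are active low.
  hsync <= '0' when (hcount >= H_VISIBLE + H_FRONT) and
                    (hcount <  H_VISIBLE + H_FRONT + H_SYNC) else '1';
  vsync <= '0' when (vcount >= V_VISIBLE + V_FRONT) and
                    (vcount <  V_VISIBLE + V_FRONT + V_SYNC) else '1';

  -- Pixels are only displayed while both counters are in the visible range.
  en <= '1' when (hcount < H_VISIBLE) and (vcount < V_VISIBLE) else '0';
end architecture rtl;
```

With these constants a frame takes 800 x 525 = 420,000 clocks, so a 25 MHz clock gives a refresh rate of about 59.5 Hz, in line with the 60 Hz figure quoted for 640x480.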

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external
picture memory, reading in data a byte at a time and formatting each pixel to
be displayed. For each pixel, the 6-bit RGB value required for the control of
the monitor's three primary color electron guns is generated, so that the
pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written, by the host
microcontroller, into external memory space (RAM). This memory space can be
located anywhere (e.g. using a block of RAM within the FPGA design, or using
the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a
byte depends on which color palette is being used for the image (selected by
the CMOD input):

• in Black & White mode, one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode, one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode, one byte of data in RAM contains 1 6-bit pixel
  (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total
number of viewable pixels in a line (determined by DISPSIZE_H), the total
number of viewable lines in a frame (determined by DISPSIZE_V), and the
number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
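The formula above can be captured in a small helper package, sketched here for illustration. The package name `vga_mem_pkg` and both function names are hypothetical; the pixels-per-byte values follow the CMOD encoding given in the pin description.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package vga_mem_pkg is
  -- Number of pixels packed into one byte for a given CMOD value.
  function pixels_per_byte (cmod : std_logic_vector(1 downto 0)) return positive;
  -- Bytes of picture memory needed for a PixTotal x LinTotal image.
  function picture_bytes (pix_total, lin_total : positive;
                          cmod : std_logic_vector(1 downto 0)) return natural;
end package vga_mem_pkg;

package body vga_mem_pkg is
  function pixels_per_byte (cmod : std_logic_vector(1 downto 0)) return positive is
  begin
    case cmod is
      when "00"   => return 8;  -- Black & White: 8 1-bit pixels
      when "01"   => return 2;  -- 16 Colors: 2 4-bit pixels
      when others => return 1;  -- 64 Colors: 1 6-bit pixel
    end case;
  end function pixels_per_byte;

  function picture_bytes (pix_total, lin_total : positive;
                          cmod : std_logic_vector(1 downto 0)) return natural is
  begin
    -- Integer division; exact whenever the pixel count is a multiple of
    -- the number of pixels per byte.
    return (pix_total * lin_total) / pixels_per_byte(cmod);
  end function picture_bytes;
end package body vga_mem_pkg;
```

For a full 640 x 480 picture (307,200 pixels) this gives 38,400 bytes in Black & White mode, 153,600 bytes in 16 Colors mode and 307,200 bytes in 64 Colors mode.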

The address in RAM where the next pixel is stored is determined using an
internal signal provided by the Synchronization Unit, hvcnt, which reflects
the current contents of the MEMADDR register. The exact addressing is
described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels)
is determined by using bits 18..3 of hvcnt, i.e. right shifting the contents
by three:

ADDR_PIXEL = "000" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels)
is determined by using bits 18..1 of hvcnt, i.e. right shifting the contents
by one:

ADDR_PIXEL = '0' & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is
determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
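The three addressing rules can be combined into one selected signal assignment, sketched below. The entity name `pixel_addr` is an assumption; the sketch also assumes hvcnt is 19 bits wide like ADDR_PIXEL, so the shifted values are padded with leading zeros to keep the address at 19 bits.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical address generator for the three color modes.
entity pixel_addr is
  port (
    cmod       : in  std_logic_vector(1 downto 0);   -- color palette select
    hvcnt      : in  std_logic_vector(18 downto 0);  -- pixel counter from the sync unit
    addr_pixel : out std_logic_vector(18 downto 0)   -- byte address in picture memory
  );
end entity pixel_addr;

architecture dataflow of pixel_addr is
begin
  with cmod select
    addr_pixel <= "000" & hvcnt(18 downto 3) when "00",    -- 8 pixels per byte
                  '0'   & hvcnt(18 downto 1) when "01",    -- 2 pixels per byte
                  hvcnt                      when others;  -- 1 pixel per byte
end architecture dataflow;
```

Dropping the low bits of hvcnt is the same as dividing the pixel index by the number of pixels per byte, which is why eight consecutive pixels share one address in Black & White mode.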

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the
current address in memory space. The register PIXREG is reset to zero when
the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be
read from the memory space as long as the RD signal is active (High). The RD
signal is itself controlled by the external line display enable signal, enex.
This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or
   ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
  enex = 0
Else
  enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
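The enex condition translates almost directly into a concurrent VHDL assignment. The sketch below is illustrative only: the entity name, the 11-bit counter width, and the default values of the `LINE_PERIOD` and `FRAME_PERIOD` generics are assumptions, not values from the Controller's design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical read-enable logic following the enex definition.
entity read_enable is
  generic (
    LINE_PERIOD  : natural := 800;  -- assumed pixel clocks per line
    FRAME_PERIOD : natural := 525   -- assumed lines per frame
  );
  port (
    hcount    : in  unsigned(10 downto 0);  -- HCOUNT
    vcount    : in  unsigned(10 downto 0);  -- VCOUNT
    pix_total : in  unsigned(9 downto 0);   -- PixTotal (from DISPSIZE_H)
    lin_total : in  unsigned(9 downto 0);   -- LinTotal (from DISPSIZE_V)
    rd        : out std_logic               -- memory read enable
  );
end entity read_enable;

architecture rtl of read_enable is
  signal enex : std_logic;
begin
  -- resize() widens the 10-bit totals so that adding 1 cannot wrap around.
  enex <= '0' when ((hcount > resize(pix_total, hcount'length) + 1) and
                    (hcount < LINE_PERIOD))
                or ((vcount > resize(lin_total, vcount'length) + 1) and
                    (vcount < FRAME_PERIOD))
          else '1';

  -- RD simply follows enex.
  rd <= enex;
end architecture rtl;
```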

The point at which data is loaded from memory into PIXREG depends on the
particular color palette that is chosen: Black & White, 16 Colors, or 64
Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest
three bits of the hvcnt signal, received from the Synchronization Unit, are
"000".

For the currently loaded byte, the active pixel is always in the lowest bit
position of the Pixel register. Each pixel in the data byte is moved into
this active pixel position by shifting the contents of the register right by
one bit, on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest
bit of the hvcnt signal, received from the Synchronization Unit, is a 0.

For the currently loaded byte, the active pixel is always in the low order
nibble of the Pixel register. Remember that in this mode, each byte of data
contains two pixels. The second pixel is moved into this active pixel
position by shifting the contents of the register right by four bits, on the
rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of
the external system clock signal (CLK). In this mode, the read of pixel data
does not depend on the status of the hvcnt signal received from the
Synchronization Unit.
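The load and shift behavior of PIXREG in the three modes could be sketched as a single clocked process, as below. This is an interpretation of the description rather than the Controller's actual code: the entity name is hypothetical, the reset is assumed to be asynchronous, and loads are assumed to be gated by RD.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical model of the Pixel register (PIXREG).
entity pixel_reg is
  port (
    clk    : in  std_logic;
    rst    : in  std_logic;                      -- assumed asynchronous, active high
    rd     : in  std_logic;                      -- memory read enable
    cmod   : in  std_logic_vector(1 downto 0);   -- color palette select
    hvcnt  : in  std_logic_vector(18 downto 0);  -- pixel counter from the sync unit
    data   : in  std_logic_vector(7 downto 0);   -- byte from picture memory
    pixreg : out std_logic_vector(7 downto 0)    -- active pixel in the low bits
  );
end entity pixel_reg;

architecture rtl of pixel_reg is
  signal reg : std_logic_vector(7 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      reg <= (others => '0');
    elsif rising_edge(clk) then
      case cmod is
        when "00" =>                        -- Black & White: 8 pixels per byte
          if hvcnt(2 downto 0) = "000" then
            if rd = '1' then
              reg <= data;                  -- load the next byte
            end if;
          else
            reg <= '0' & reg(7 downto 1);   -- shift right by one bit
          end if;
        when "01" =>                        -- 16 Colors: 2 pixels per byte
          if hvcnt(0) = '0' then
            if rd = '1' then
              reg <= data;                  -- load the next byte
            end if;
          else
            reg <= "0000" & reg(7 downto 4);  -- shift right by four bits
          end if;
        when others =>                      -- 64 Colors: 1 pixel per byte
          if rd = '1' then
            reg <= data;                    -- load on every clock
          end if;
      end case;
    end if;
  end process;

  pixreg <= reg;
end architecture rtl;
```

In this sketch the active pixel is bit 0 in Black & White mode, bits 3..0 in 16 Colors mode and bits 5..0 in 64 Colors mode.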

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving
the red, green and blue color guns of the target monitor. When the chosen
color palette is either Black & White or 16 Colors, these six bits are
obtained by mapping the value of the active pixel to a predefined RGB code.
When the chosen color palette is 64 Colors, the actual pixel value is used
directly.

The register RGBREG is reset to zero (000000) when the VGA Controller
receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value
loaded is dependent on the state of the line display enable signal, en. When
en is Low, blanking is required and RGBREG is loaded with the code for black
(000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a
4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA
Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0
and B1).

The monitor itself expects analog signals as inputs to its electron gun
control circuits. This is achieved by using 2-bit digital to analog converter
circuitry, located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be
converted into 4 distinct analog levels. These levels specify the intensity
of each of the three primary colors to use when displaying the pixel on the
monitor's screen. The levels range from 0 V (total darkness) to 0.7 V
(maximum brightness).

With each analog input being one of four possible levels, the monitor can
display each pixel on the screen with one of 4 x 4 x 4 = 64 different color
permutations.

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large, on-chip memories, Spartan-3 Generation
FPGAs provide plentiful, efficient SelectRAM memory blocks. Using various
configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up
tables, data width converters, circular buffers, and shift registers, each
supporting various data widths and depths. This application note describes
the features and capabilities of block SelectRAM and illustrates how to
specify the various options using the Xilinx CORE Generator system or via
VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM      RAM Blocks   Total RAM   Total RAM   Total RAM
            Columns  per Column   Blocks      Bits        Kbits
XC3S100E    1        4            4           73,728      72
XC3S250E    2        6            12          221,184     216
XC3S500E    2        10           20          368,640     360
XC3S1200E   2        14           28          516,096     504
XC3S1600E   2        18           36          663,552     648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is
allocated to data storage and, in some memory configurations, an additional
2K bits allocated to parity or additional "plus" data bits. Physically, the
block RAM memory has two completely independent access ports, labeled Port A
and Port B. The structure is fully symmetrical, and both ports are
interchangeable and both ports support data read and write operations. Each
memory port is synchronous, with its own clock, clock enable, and write
enable. Read operations are also synchronous and require a clock edge and
clock enable. Though physically a dual-port memory, block RAM simulates
single-port memory in an application, as shown in Figure 1. Furthermore, each
block memory supports multiple configurations or aspect ratios. Table 3
summarizes the essential SelectRAM features. Cascade multiple block RAMs to
create deeper and wider memory organizations, with a minimal timing penalty
incurred through specialized routing resources.
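As an illustration of the synchronous, single-port use described above, the following sketch describes a 2K x 8 memory (16K data bits, the data capacity of one block) in plain behavioral VHDL. The entity name `spram_2kx8` is hypothetical, and whether a synthesis tool maps this description onto a block RAM rather than distributed logic depends on the tool and its settings; it is not a vendor primitive instantiation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-port synchronous RAM, 2048 words of 8 bits.
entity spram_2kx8 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(10 downto 0);  -- 2**11 = 2048 locations
    din  : in  std_logic_vector(7 downto 0);
    dout : out std_logic_vector(7 downto 0)
  );
end entity spram_2kx8;

architecture rtl of spram_2kx8 is
  type ram_t is array (0 to 2047) of std_logic_vector(7 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: the old contents of the addressed location
        -- appear on dout after the clock edge (read-before-write).
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

Both reads and writes need a clock edge with the enable asserted, which matches the description of each port having its own clock, clock enable and write enable.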


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that
involved large circuits and multiple teams of engineers
Funded by the US Department of Defense
The first publicly available version was released in 1985
In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.)
was presented with a proposal to standardize VHDL
In 1987, standardization => IEEE 1076-1987
An improved version of the language was released in 1994 => IEEE standard
1076-1993

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and
high-impedance
Soon after IEEE 1076-1987 was released, simulator companies began using their
own non-standard types => VHDL was becoming non-standard
The IEEE 1164 standard was developed by the IEEE in response
IEEE 1164 contains definitions for a nine-valued data type, std_logic
IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate
to actual hardware
Defines, e.g., two numeric types: signed and unsigned
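A short sketch may help show what these standards add in practice. The entity `types_demo` and its ports are invented for the example; the packages `ieee.std_logic_1164` and `ieee.numeric_std` are the standard packages associated with IEEE 1164 and IEEE 1076.3.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- std_logic, std_logic_vector (IEEE 1164)
use ieee.numeric_std.all;     -- signed, unsigned (IEEE 1076.3)

entity types_demo is
  port (
    a, b  : in  std_logic_vector(7 downto 0);
    oe    : in  std_logic;                     -- output enable
    lt_u  : out std_logic;                     -- a < b, both read as unsigned
    lt_s  : out std_logic;                     -- a < b, both read as signed
    y_bus : out std_logic_vector(7 downto 0)   -- tri-stated output
  );
end entity types_demo;

architecture rtl of types_demo is
begin
  -- The same bit patterns compare differently depending on the numeric type.
  lt_u <= '1' when unsigned(a) < unsigned(b) else '0';
  lt_s <= '1' when signed(a)   < signed(b)   else '0';

  -- 'Z' (high impedance) is one of the nine std_logic values.
  y_bus <= a when oe = '1' else (others => 'Z');
end architecture rtl;
```

For example, with a = x"FF" and b = x"01", the unsigned comparison reads 255 < 1 and gives '0', while the signed comparison reads -1 < 1 and gives '1'.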

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library

Entities

A black box with an interface definition
Defines the inputs/outputs of a component (defines pins)
A way to represent modularity in VHDL
Similar to a symbol in a schematic
An entity declaration describes an entity

E.g.

    library ieee;
    use ieee.std_logic_1164.all;  -- provides std_logic and std_logic_vector

    entity Comparator is
      port (A, B : in  std_logic_vector(7 downto 0);
            EQ   : out std_logic);
    end Comparator;

Ports

Provide channels of communication between the component and its environment
Each port must have a name, a direction and a type
An entity may have NO port declaration

Port directions

In: The value of a port can be read inside the component, but cannot be
assigned. Multiple reads of the port are allowed.
Out: Assignments can be made to a port, but data from a port cannot be read.
Multiple assignments are allowed.
Inout: Bi-directional; assignments can be made and data can be read. Multiple
assignments are allowed.
Buffer: An out port with read capability. May have at most one assignment
(not recommended).

Architectures

Every entity has at least one architecture
One entity can have several architectures
Architectures can describe a design using
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
A configuration declaration links an architecture to an entity

E.g.

    architecture Comparator1 of Comparator is
    begin
      EQ <= '1' when (A = B) else '0';
    end Comparator1;

Configurations

Link the entity declaration and architecture body together
The concept of a default configuration is a bit messy in VHDL '87
- The last architecture analyzed links to the entity
Can be used to change simulation behavior without re-analyzing the VHDL
source
Complex configuration declarations are ignored in synthesis
Some entities can have, e.g., a gate-level architecture and a behavioral
architecture
Are always optional
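To show how a configuration picks between architectures, here is a minimal sketch built on the Comparator example. The second architecture `Comparator2` and the configuration name `Comparator_cfg` are hypothetical additions made only for this illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- First architecture: a single conditional assignment.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Second, hypothetical architecture: compares bit by bit in a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
    variable same : std_logic;
  begin
    same := '1';
    for i in A'range loop
      if A(i) /= B(i) then
        same := '0';
      end if;
    end loop;
    EQ <= same;
  end process;
end Comparator2;

-- The configuration declaration binds the entity to one architecture.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end Comparator_cfg;
```

Changing `for Comparator2` to `for Comparator1` selects the other description without touching the entity or either architecture body.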

Packages

Packages contain information common to many design units.

1. Package declaration
– constant declarations
– type and subtype declarations
– function and procedure declarations
– global signal declarations
– file declarations
– component declarations

2. Package body
– is not necessarily needed
– function bodies
– procedure bodies

Packages are meant for encapsulating data that can be shared globally among
several design units.
A package consists of a declaration part and an optional body part.

A package declaration can contain:
– type and subtype declarations
– subprograms
– constants
– alias declarations
– global signal declarations
– file declarations
– component declarations

A package body consists of:
– subprogram declarations and bodies
– type and subtype declarations
– deferred constants
– file declarations
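The split between package declaration and package body can be sketched as follows. The package name vga_types_pkg and everything inside it are hypothetical, chosen only to show a constant, a deferred constant, a subtype and a function.

```vhdl
-- Hypothetical package; names and values are illustrative only.
package vga_types_pkg is
  constant H_VISIBLE : natural := 640;  -- ordinary constant
  constant V_VISIBLE : natural;         -- deferred constant, value given in the body
  subtype pixel_count_t is natural range 0 to 1023;

  -- Subprogram declaration; the body lives in the package body.
  function is_visible (x : pixel_count_t; y : pixel_count_t) return boolean;
end package vga_types_pkg;

package body vga_types_pkg is
  constant V_VISIBLE : natural := 480;

  function is_visible (x : pixel_count_t; y : pixel_count_t) return boolean is
  begin
    return (x < H_VISIBLE) and (y < V_VISIBLE);
  end function is_visible;
end package body vga_types_pkg;
```

A design unit would make these names visible with "use work.vga_types_pkg.all;", assuming the package is compiled into the work library.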

Libraries

A collection of VHDL design units (a database):

1. Packages
– package declaration
– package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in the UNIX file system.
Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ
primarily in how closely they relate to the hardware.
It is possible to describe a circuit in a number of ways, listed here from
lower to higher level of abstraction:
– Structural
– Dataflow
– Behavioral

Structural VHDL description

Circuit is described in terms of its components.
From a low-level description (e.g. transistor-level description) to a
high-level description (e.g. block diagram).
For large circuits, a low-level description quickly becomes impractical.
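A small structural description might look like the sketch below: a half adder built by wiring together two gate components. The entities and2, xor2 and half_adder are hypothetical examples, not part of the VGA project.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end and2;

architecture rtl of and2 is
begin
  y <= a and b;
end rtl;

library ieee;
use ieee.std_logic_1164.all;

entity xor2 is
  port (a, b : in std_logic; y : out std_logic);
end xor2;

architecture rtl of xor2 is
begin
  y <= a xor b;
end rtl;

library ieee;
use ieee.std_logic_1164.all;

entity half_adder is
  port (a, b : in std_logic; sum, carry : out std_logic);
end half_adder;

-- Structural architecture: only component instances and wiring,
-- no description of behavior.
architecture structural of half_adder is
  component and2
    port (a, b : in std_logic; y : out std_logic);
  end component;
  component xor2
    port (a, b : in std_logic; y : out std_logic);
  end component;
begin
  u_sum   : xor2 port map (a => a, b => b, y => sum);
  u_carry : and2 port map (a => a, b => b, y => carry);
end structural;
```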

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system.
In the dataflow style, you describe how information flows between registers
in the system.
The combinational logic is described at a relatively high level; the
placement and operation of registers is specified quite precisely.
The behavior of the system over time is defined by registers.
There are no built-in registers in the VHDL language:
– either a lower-level description
– or a behavioral description of sequential elements is needed.
The lower-level register descriptions must be created or obtained.
If there are no third-party models for registers, you must write the
behavioral description of the registers yourself.
The behavioral description can be provided in the form of subprograms
(functions or procedures).
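Since the language has no built-in register, a behavioral description has to be written. One common way, shown here as a sketch, is a clocked process rather than a subprogram. The entity reg8 and its port names are assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register with asynchronous active-low reset.
entity reg8 is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end reg8;

architecture behavioral of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');    -- reset clears the register
    elsif rising_edge(clk) then
      q <= d;                  -- capture the input on each rising clock edge
    end if;
  end process;
end behavioral;
```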

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time.
Representation might include, e.g., state diagrams, timing diagrams and
algorithmic descriptions.
The concept of time may be expressed precisely using delays (e.g. A <= B
after 10 ns).
If no actual delay is used, only the order of sequential operations is defined.
In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed
timing specifications.
The actual timing results depend on the implementation technology and the
efficiency of the synthesis tool.
There are a few tools for behavioral synthesis.
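The delay notation mentioned above can be tried in simulation with a sketch like the following. The entity delay_demo and its signals are hypothetical. The "after" clause is meaningful only in simulation, and synthesis tools generally ignore it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical simulation-only example; it has no ports.
entity delay_demo is
end delay_demo;

architecture behavioral of delay_demo is
  signal a, b : std_logic := '0';
begin
  -- Explicit delay: a follows b 10 ns later.
  a <= b after 10 ns;

  stimulus : process
  begin
    b <= '1';
    wait for 25 ns;  -- well past the 10 ns delay
    assert a = '1'
      report "a should have followed b after 10 ns" severity error;
    wait;            -- suspend forever
  end process;
end behavioral;
```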

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL.
A VHDL description can always be broken up into interconnected processes.
Quite similar to a Unix process.

The process keyword in VHDL

A process statement is a concurrent statement.
Statements inside a process statement are sequential statements.
A process must contain either a sensitivity list or wait statement(s), but NOT
both.
The sensitivity list or wait statement(s) contain the signals that wake the
process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
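The two ways of waking a process can be compared side by side. In this sketch (entity and signal names are hypothetical), both processes describe the same AND gate, one with a sensitivity list and one with an explicit wait statement at the end.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity process_demo is
  port (a, b : in std_logic; y1, y2 : out std_logic);
end process_demo;

architecture behavioral of process_demo is
begin
  -- Form 1: sensitivity list; the process runs whenever a or b changes.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process;

  -- Form 2: no sensitivity list; the wait statement at the end suspends
  -- the process until a or b changes.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process;
end behavioral;
```

Placing the wait at the end makes the second form execute once at the start of simulation and then on every change, which matches the behavior of the sensitivity-list form.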

11. VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.numeric_std.ALL;
-- STD_LOGIC_ARITH and STD_LOGIC_UNSIGNED are not needed in this unit and
-- clash with numeric_std, so they are left out here.

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;                     -- 50 MHz board clock
    rstn   : in  std_logic;                     -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0)
  );
end vga_controller;

architecture Behavioral of vga_controller is

  -- Picture memory: 2**18 locations of 1 bit each
  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic
    );
  end component;

  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;    -- mode matches the sync_unit entity
      c_hs       : out integer
    );
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic
    );
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the 50 MHz input clock by two to obtain the 25 MHz pixel clock.
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  -- Note: rstn is active low; check which polarity the memory core's
  -- sinit input expects.
  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- The counters are integers, so no arithmetic package is required.

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer
  );
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s       : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks 0 .. 793 within a line
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances once per line, on the rising edge of
  -- h_en_s, and counts 0 .. 515
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: low for counts 0 .. 95
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information is sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35) and (line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse (vertical sync, front porch, back porch): low for lines 0 and 1
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- One-clock delayed copy of v_en, used to detect the start of a frame
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
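The literal values in the synchronization block are easier to follow when collected as named constants. The package below is only a sketch. The package name and constant names are hypothetical, and the values are read off the counters and comparisons in the listing above, assuming the 50 MHz input clock is divided by two as in the top-level controller.

```vhdl
-- Hypothetical helper package; the names are not part of the project code.
package sync_timing_pkg is
  constant PIXEL_CLK_HZ : natural := 25_000_000;  -- 50 MHz board clock divided by 2

  constant H_TOTAL      : natural := 794;  -- c_hs_s counts 0 .. 793
  constant H_SYNC_WIDTH : natural := 96;   -- hsync is low while c_hs_s <= 95
  constant H_EN_FIRST   : natural := 143;  -- h_en_s is set for 143 <= c_hs_s <= 778
  constant H_EN_LAST    : natural := 778;

  constant V_TOTAL      : natural := 516;  -- line_count_s counts 0 .. 515
  constant V_SYNC_WIDTH : natural := 2;    -- vsync is low while line_count_s < 2
  constant V_EN_FIRST   : natural := 35;   -- v_en_s is set for 35 <= line_count_s <= 515

  -- Integer division, so both results are rounded down.
  constant LINE_RATE_HZ  : natural := PIXEL_CLK_HZ / H_TOTAL;  -- 31486
  constant FRAME_RATE_HZ : natural := LINE_RATE_HZ / V_TOTAL;  -- 61
end package sync_timing_pkg;
```

With these counter limits the line rate works out to roughly 31.5 kHz and the frame rate to roughly 61 Hz, close to the nominal 640x480 timing at 60 Hz.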

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.numeric_std.ALL;  -- used for the address increment

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process (clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= (others => '0');
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: start of a new frame, restart at address 0.
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: show the memory bit on all three colors.
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= std_logic_vector(unsigned(addr_s) + 1);
          else
            -- Right of the picture: green.
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the picture: white.
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap after the last picture address; this assignment overrides
        -- the increment above.
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: all colors off.
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;
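The wrap address in the pixel generator can be cross-checked against the picture size. The simulation-only sketch below rests on one reading of the listings, stated here as an assumption: h_en is a registered signal, so it is seen high for c_hs = 144 to 779, which together with the test c_hs <= 655 gives 512 pixels per line. In the same way, v_en combined with line_count <= 514 gives lines 35 to 514, that is 480 lines. The entity and constant names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical self-checking unit; it has no ports and only runs assertions.
entity pixelgen_addr_check is
end pixelgen_addr_check;

architecture sim of pixelgen_addr_check is
  constant IMG_WIDTH  : natural := 655 - 144 + 1;  -- 512 pixels per line (assumed reading)
  constant IMG_HEIGHT : natural := 514 - 35 + 1;   -- 480 lines (assumed reading)
  -- Wrap address copied from the pixel generator listing.
  constant LAST_ADDR  : std_logic_vector(17 downto 0) := "111011111111111111";
begin
  check : process
  begin
    -- 512 * 480 = 245760 one-bit pixels, so the last address is 245759.
    assert to_integer(unsigned(LAST_ADDR)) = IMG_WIDTH * IMG_HEIGHT - 1
      report "wrap address does not match a 512 x 480 picture"
      severity failure;
    -- The picture must fit in the 18-bit address space (262144 locations).
    assert IMG_WIDTH * IMG_HEIGHT <= 2 ** LAST_ADDR'length
      report "picture does not fit in the 18-bit address space"
      severity failure;
    wait;
  end process;
end sim;
```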

12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable interconnects
between gates, the beginnings of a new technology and market. The XC2064
boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup
tables (LUTs). More than 20 years later, Freeman was entered into the
National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share.
By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in
the volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs had found
their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a
step further by combining the logic blocks and interconnects of traditional
FPGAs with embedded microprocessors and related peripherals to form a
complete "system on a programmable chip". This work mirrors the architecture
by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which
combined a reconfigurable CPU architecture on a single chip called the SB24.
That work was done in 1982. Examples of such hybrid technologies can be found
in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more
PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC
is another such device, which uses an AVR processor in combination with
Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic.

Many modern FPGAs have the ability to be reprogrammed at run time, and this
is leading to the idea of reconfigurable computing or reconfigurable systems:
CPUs that reconfigure themselves to suit the task at hand. The Mitrion
Virtual Processor from Mitrionics is an example of a reconfigurable soft
processor implemented on FPGAs. However, it does not support dynamic
reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: first commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can
also take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:
– IC costs are rising aggressively
– ASIC complexity has bolstered development time and costs
– R&D resources and headcount are decreasing
– Revenue losses for slow time-to-market are increasing
– Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have historically been used for,
which the company credits for the growing number of FPGA design starts (see
History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD
has a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result of this is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers) and
embedded memories, as well as logic blocks that can implement decoders or
mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low clock rates (in the MHz
range). The flexibility of the FPGA allows for even higher performance by
trading off precision and range in the number format for an increased number
of parallel arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4-8
hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.

The example FPGA considered here is an array (island-style) FPGA. It consists
of an array of logic blocks and routing channels. Two I/O pads fit into the
height of one row or the width of one column, as shown below. All the routing
channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

Typical logic block

There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
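The classic logic block described above can be modeled behaviorally in a few lines. This is a sketch of the concept, not any vendor's primitive. The entity name, the generic names LUT_INIT and REGISTERED, and the port names are all assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a logic block: 4-input LUT plus optional flip-flop.
entity logic_block is
  generic (
    -- Truth table: bit n is the output for input value n.
    -- x"8000" sets only bit 15, which gives a 4-input AND.
    LUT_INIT   : std_logic_vector(15 downto 0) := x"8000";
    -- Selects the registered or the unregistered LUT output.
    REGISTERED : boolean := false
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    o   : out std_logic                      -- the single output
  );
end logic_block;

architecture behavioral of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16-entry table indexed by the input vector.
  lut_out <= LUT_INIT(to_integer(unsigned(i)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output multiplexer, fixed at configuration time.
  o <= ff_out when REGISTERED else lut_out;
end behavioral;
```

Changing LUT_INIT changes the logic function without changing the structure, which is the sense in which the block is programmable.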

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect only
to other wires in track number 2, and so on. The figure below illustrates the
connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to just specify
them numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device like
an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally, the design is laid out in the FPGA, at
which point propagation delays can be added and the simulation run again with
these values back-annotated onto the netlist.
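The first simulation stage, an RTL test bench, can be sketched as follows. The device under test is a hypothetical divide-by-two clock divider (clk_div2), and the test bench name, signal names and timing values are likewise assumptions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical device under test: divide-by-two clock divider.
entity clk_div2 is
  port (clk_in, rstn : in std_logic; clk_out : out std_logic);
end clk_div2;

architecture rtl of clk_div2 is
  signal q : std_logic := '0';
begin
  process (clk_in, rstn)
  begin
    if rstn = '0' then
      q <= '0';
    elsif rising_edge(clk_in) then
      q <= not q;
    end if;
  end process;
  clk_out <= q;
end rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: no ports; it generates stimulus and checks the response.
entity clk_div2_tb is
end clk_div2_tb;

architecture sim of clk_div2_tb is
  signal clk_in  : std_logic := '0';
  signal rstn    : std_logic := '0';
  signal clk_out : std_logic;
  signal done    : boolean := false;
begin
  dut : entity work.clk_div2
    port map (clk_in => clk_in, rstn => rstn, clk_out => clk_out);

  -- 50 MHz clock (20 ns period) until the stimulus process finishes.
  clock_gen : process
  begin
    while not done loop
      clk_in <= '0'; wait for 10 ns;
      clk_in <= '1'; wait for 10 ns;
    end loop;
    wait;
  end process;

  stimulus : process
  begin
    wait for 25 ns;
    rstn <= '1';                     -- release reset between clock edges
    wait until rising_edge(clk_in);  -- first edge after reset
    wait for 1 ns;
    assert clk_out = '1' report "expected clk_out high" severity error;
    wait until rising_edge(clk_in);  -- second edge
    wait for 1 ns;
    assert clk_out = '0' report "expected clk_out low" severity error;
    done <= true;
    wait;
  end process;
end sim;
```

The same test bench can in principle be reused for the gate-level and back-annotated simulations by binding dut to the corresponding netlist model.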

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices can
be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased, even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has
become mainstream. Prototyping SoC and ASIC designs on FPGAs has become a
good method to do this.

Reasons why prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct. This is in contrast to designers relying only on
software simulations to verify that their hardware design is sound;
simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows
much more time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear
due to today's array of operating systems, applications and hardware, to be
found early.

3. Prototyping also allows the developer to ensure that all IP technologies
on the system work well together, off the simulation stage and in actual
form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.

13. SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14. APPLICATIONS

Motion tracking systems
Image processing systems
Display systems

14.1 CONCLUSION

Implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in
order to correctly display the picture on a connected monitor.



640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz
800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz

RST  I  High  Global system reset.

VGA Input Settings Signals

RESOLUTION  I  High/Low  This input selects the screen resolution to be used:
1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD  I  2  This input selects the color palette to be used:
00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H  I  10  This input determines the number of viewable pixels to be
displayed in each line of a frame, and is therefore used to control the
horizontal extents of the visible display area.

DISPSIZE_V  I  10  This input determines the number of lines to be displayed
in a frame, and is therefore used to control the vertical extents of the
visible display area.

Data Memory Control Signals

RD  O  High  This is the enable signal when data is required to be read from
the memory space. This signal is controlled by, and follows, the internal
line enable signal, en, generated by the Synchronization Unit of the
Controller.

DATA  I  8  Data input from picture memory space. Data is stored in memory in
bytes, the content of which depends on the chosen color palette, as follows:
Black & White mode: 8 1-bit pixels
16 Colors mode: 2 4-bit pixels
64 Colors mode: 1 6-bit pixel (DATA[5..0])
Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL  O  19  Specifies the address of the next pixel in picture memory.
Addresses are consecutive: once the end of the current line has been reached,
the next address is that of the pixel at the start of the next line down in
the frame.

VGA Monitor Control Signals

HSYNC  O  Falling  Horizontal synchronization signal. This signal is used to
control the horizontal deflection circuit in the VGA monitor, so that the
start and end of a line of pixels is correctly displayed across the visible
display area of the screen. The horizontal size of the display area is
controlled by the DISPSIZE_H input to the Controller.

VSYNC  O  Falling  Vertical synchronization signal. This signal is used to
control the vertical deflection circuit in the VGA monitor, so that the start
and end of a frame (of lines) is correctly displayed between the top and
bottom edges of the visible display area of the screen. The vertical size of
the display area is controlled by the DISPSIZE_V input to the Controller.

R1  O  High/Low  Provides the 2-bit digital signal for the intensity of red
used in composing a pixel's displayed color. These two signals are inputs to
a simple 2-bit DAC (external to the Controller) that provides the analog
signal required by the VGA monitor.

G1  O  High/Low  Provides the 2-bit digital signal for the intensity of green
used in composing a pixel's displayed color. These two signals are inputs to
a simple 2-bit DAC (external to the Controller) that provides the analog
signal required by the VGA monitor.

B1  O  High/Low  Provides the 2-bit digital signal for the intensity of blue
used in composing a pixel's displayed color. These two signals are inputs to
a simple 2-bit DAC (external to the Controller) that provides the analog
signal required by the VGA monitor.

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).
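To make the timing budget concrete, the sketch below collects the widely used industry-standard figures for 640x480 at 60 Hz into a VHDL package. The package name and constant names are hypothetical, and the values are the generic standard ones, not figures taken from this report, whose Controller derives its visible extents from DISPSIZE_H and DISPSIZE_V.

```vhdl
-- Hypothetical package: industry-standard 640x480 @ 60 Hz timing.
-- Horizontal values are in pixel clocks, vertical values in lines.
package vga_timing_640x480 is
  -- One line = visible pixels + front porch + sync pulse + back porch
  constant H_VISIBLE     : natural := 640;
  constant H_FRONT_PORCH : natural := 16;
  constant H_SYNC_PULSE  : natural := 96;
  constant H_BACK_PORCH  : natural := 48;
  constant H_TOTAL       : natural :=
    H_VISIBLE + H_FRONT_PORCH + H_SYNC_PULSE + H_BACK_PORCH;  -- 800 clocks

  -- One frame = visible lines + front porch + sync pulse + back porch
  constant V_VISIBLE     : natural := 480;
  constant V_FRONT_PORCH : natural := 10;
  constant V_SYNC_PULSE  : natural := 2;
  constant V_BACK_PORCH  : natural := 33;
  constant V_TOTAL       : natural :=
    V_VISIBLE + V_FRONT_PORCH + V_SYNC_PULSE + V_BACK_PORCH;  -- 525 lines
end package vga_timing_640x480;
```

With a 25 MHz pixel clock, one frame takes 800 x 525 = 420,000 clocks, which gives a refresh rate of roughly 59.5 Hz; the nominal standard clock of 25.175 MHz gives roughly 59.94 Hz.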


9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode - one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode - one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode - one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

Memory required for picture (bytes) = (PixTotal x LinTotal) / number of pixels per byte
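As a worked instance of this formula, the following sketch wraps it in a VHDL function. The package name, function name, and parameter names are hypothetical; rounding up to a whole byte is an assumption added here for picture sizes that do not divide evenly.

```vhdl
-- Hypothetical helper package: picture memory size in bytes.
package picture_memory_pkg is
  -- pix_total       : viewable pixels per line  (DISPSIZE_H)
  -- lin_total       : viewable lines per frame  (DISPSIZE_V)
  -- pixels_per_byte : 8 (Black & White), 2 (16 Colors) or 1 (64 Colors)
  function picture_bytes (pix_total, lin_total, pixels_per_byte : positive)
    return natural;
end package picture_memory_pkg;

package body picture_memory_pkg is
  function picture_bytes (pix_total, lin_total, pixels_per_byte : positive)
    return natural is
  begin
    -- Integer division, rounded up so a partly filled last byte is counted
    return (pix_total * lin_total + pixels_per_byte - 1) / pixels_per_byte;
  end function picture_bytes;
end package body picture_memory_pkg;
```

For a full 640x480 picture this gives 38,400 bytes in Black & White mode, 153,600 bytes in 16 Colors mode, and 307,200 bytes in 64 Colors mode. The 19-bit ADDR_PIXEL bus can address 524,288 bytes, which also covers a full 800x600 picture at one pixel per byte (480,000 bytes).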

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt, i.e. right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt, i.e. right shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
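The three addressing rules can be sketched as a single combinational selector. This is an illustration only, not the Controller's actual source: the entity name is hypothetical, hvcnt is modelled as a 19-bit unsigned value, and the shifted result is assumed to be zero-filled up to the full 19-bit width of ADDR_PIXEL.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: selects the picture memory byte address by palette.
entity pixel_addr is
  port (
    cmod       : in  std_logic_vector(1 downto 0);  -- 00 = B&W, 01 = 16 Colors, 10 = 64 Colors
    hvcnt      : in  unsigned(18 downto 0);         -- pixel count from the Synchronization Unit
    addr_pixel : out unsigned(18 downto 0)          -- byte address into picture memory
  );
end entity pixel_addr;

architecture rtl of pixel_addr is
begin
  with cmod select
    addr_pixel <= shift_right(hvcnt, 3) when "00",    -- 8 pixels per byte
                  shift_right(hvcnt, 1) when "01",    -- 2 pixels per byte
                  hvcnt                 when others;  -- 1 pixel per byte
end architecture rtl;
```

Dividing the pixel count by the number of pixels per byte is a plain right shift because both 8 and 2 are powers of two, so no divider logic is needed.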

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or
   ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
  enex = 0
Else
  enex = 1

When enex is Low, read access from memory is disabled (RD = 0).

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors, or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal (received from the Synchronization Unit) are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal (received from the Synchronization Unit) is a "0".

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode, each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
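The load-and-shift behaviour of PIXREG in the three modes can be sketched as follows. This is a minimal model under stated assumptions rather than the Controller's own code: the entity and port names are hypothetical, the reset is modelled as asynchronous, and gating by the RD signal is left out for brevity.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity modelling the Pixel register (PIXREG).
entity pixreg_unit is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;                     -- active High, like RST
    cmod  : in  std_logic_vector(1 downto 0);  -- 00 = B&W, 01 = 16 Colors, 10 = 64 Colors
    hvcnt : in  std_logic_vector(2 downto 0);  -- lowest three bits of hvcnt
    data  : in  std_logic_vector(7 downto 0);  -- byte read from picture memory
    pixel : out std_logic_vector(5 downto 0)   -- active pixel, right-aligned
  );
end entity pixreg_unit;

architecture rtl of pixreg_unit is
  signal pixreg : std_logic_vector(7 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      pixreg <= (others => '0');
    elsif rising_edge(clk) then
      case cmod is
        when "00" =>                              -- Black & White: 8 pixels per byte
          if hvcnt = "000" then
            pixreg <= data;                       -- load the next byte
          else
            pixreg <= '0' & pixreg(7 downto 1);   -- shift right by one bit
          end if;
        when "01" =>                              -- 16 Colors: 2 pixels per byte
          if hvcnt(0) = '0' then
            pixreg <= data;                       -- load the next byte
          else
            pixreg <= "0000" & pixreg(7 downto 4);  -- shift right by four bits
          end if;
        when others =>                            -- 64 Colors: 1 pixel per byte
          pixreg <= data;                         -- load on every clock
      end case;
    end if;
  end process;

  -- The active pixel always sits in the low-order bits of PIXREG.
  with cmod select
    pixel <= "00000" & pixreg(0)          when "00",
             "00"    & pixreg(3 downto 0) when "01",
             pixreg(5 downto 0)           when others;
end architecture rtl;
```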

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green, and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0, and B1).
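A minimal sketch of the RGB register with blanking and the split into six output bits is given below. The entity name and the rgb_code input are hypothetical, the reset is modelled as asynchronous, and the bit order (red in bits 5..4, green in bits 3..2, blue in bits 1..0) is an assumption, since the text does not state how the six bits are arranged.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity modelling the RGB register (RGBREG) and its outputs.
entity rgbreg_unit is
  port (
    clk      : in  std_logic;
    rst      : in  std_logic;                     -- active High, like RST
    en       : in  std_logic;                     -- line display enable
    rgb_code : in  std_logic_vector(5 downto 0);  -- mapped or direct 6-bit color
    r1, r0   : out std_logic;
    g1, g0   : out std_logic;
    b1, b0   : out std_logic
  );
end entity rgbreg_unit;

architecture rtl of rgbreg_unit is
  signal rgbreg : std_logic_vector(5 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      rgbreg <= (others => '0');      -- 000000 = black
    elsif rising_edge(clk) then
      if en = '1' then
        rgbreg <= rgb_code;           -- visible area: load the pixel color
      else
        rgbreg <= (others => '0');    -- blanking: force black
      end if;
    end if;
  end process;

  -- Assumed bit order: RRGGBB
  r1 <= rgbreg(5);  r0 <= rgbreg(4);
  g1 <= rgbreg(3);  g0 <= rgbreg(2);
  b1 <= rgbreg(1);  b0 <= rgbreg(0);
end architecture rtl;
```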

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry, located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1: Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations (4 x 4 x 4).

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan-3 Generation FPGAs provide plentiful, efficient SelectRAM memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator system or via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
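The synchronous, clock-enabled behaviour of a single port can be sketched in plain VHDL. This is a generic single-port memory description in the commonly used inference style, assuming a 16K x 1 organization; the entity name is hypothetical, and whether a given synthesis tool actually maps it onto block RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 16K x 1 single-port synchronous RAM.
entity bram_16kx1 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                      -- clock enable for the port
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16,384 locations
    din  : in  std_logic;
    dout : out std_logic
  );
end entity bram_16kx1;

architecture rtl of bram_16kx1 is
  type ram_t is array (0 to 2**14 - 1) of std_logic;
  signal ram : ram_t := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then                 -- nothing happens without clock enable
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Registered read: data appears one clock edge after the address
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

Because the read is registered, logic that consumes dout has to allow for one clock cycle of latency between presenting an address and receiving its data.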


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

• Interoperability
• Technology independence
• Design reuse
• Several levels of abstraction
• Readability
• Standard language
• Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

• Design specification language
• Design entry language
• Design simulation language
• Design documentation language
• An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.
It was funded by the US Department of Defense.
The first publicly available version was released in 1985.
In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.
In 1987, standardization => IEEE 1076-1987.
An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.
Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.
The IEEE 1164 standard was developed by the IEEE. IEEE 1164 contains definitions for a nine-valued data type, std_logic.
IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware. It defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.
Defines the inputs/outputs of a component (defines pins).
A way to represent modularity in VHDL.
Similar to a symbol in a schematic.
An entity declaration describes an entity, e.g.:

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.
Each port must have a name, a direction, and a type.
An entity may have NO port declaration.

Port directions

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment. (Buffer ports are not recommended.)

Architectures

Every entity has at least one architecture.
One entity can have several architectures.
Architectures can describe a design using:
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
A configuration declaration links an architecture to an entity.
E.g.:

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and architecture body together.
The concept of a default configuration is a bit messy in VHDL '87:
- the last architecture analyzed links to the entity.
Can be used to change simulation behavior without re-analyzing the VHDL source.
Complex configuration declarations are ignored in synthesis.
Some entities can have, e.g., a gate-level architecture and a behavioral architecture.
Configurations are always optional.

Packages

Packages contain information common to many design units.

1 Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2 Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units. A package consists of a declaration part and an optional body part.

Package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

Package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

A library is a collection of VHDL design units (a database):

1 Packages
- package declaration
- package body
2 Entities (entity declaration)
3 Architectures (architecture body)
4 Configurations (configuration declarations)

A library is usually a directory in a UNIX file system, but it can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.
It is possible to describe a circuit in a number of ways:

Structural -> Dataflow -> Behavioral (increasing level of abstraction)

Structural VHDL description

Circuit is described in terms of its components.
Ranges from a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).
For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system.
In the dataflow style, you describe how information flows between registers in the system.
The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.
The behavior of the system over time is defined by registers.
There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed.
The lower-level register descriptions must be created or obtained.
If there are no third-party models for registers, you must write the behavioral description of registers.
The behavioral description can be provided in the form of subprograms (functions or procedures).

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time.
Representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions.
The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).
If no actual delay is used, the order of sequential operations is defined.
In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.
There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL.
A VHDL description can always be broken up into interconnected processes.
Quite similar to a Unix process.
"process" is a keyword in VHDL.
A process statement is a concurrent statement.
Statements inside process statements are sequential statements.
A process must contain either a sensitivity list or wait statement(s), but NOT both.
The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
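The two permitted forms can be compared side by side. The sketch below, with hypothetical entity and signal names, describes the same AND function twice: once with a sensitivity list and once with an explicit wait statement at the end of the process. In simulation the two processes behave identically.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical demo entity: one function, two process styles.
entity and_gate_demo is
  port (
    a, b   : in  std_logic;
    y_list : out std_logic;  -- driven by the sensitivity-list process
    y_wait : out std_logic   -- driven by the wait-statement process
  );
end entity and_gate_demo;

architecture rtl of and_gate_demo is
begin
  -- Form 1: the sensitivity list wakes the process when a or b changes.
  with_list : process (a, b)
  begin
    y_list <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list; the process suspends at the wait
  -- statement and resumes when a or b changes.
  with_wait : process
  begin
    y_wait <= a and b;
    wait on a, b;
  end process with_wait;
end architecture rtl;
```

Synthesis tools generally expect the sensitivity-list form for combinational logic, while wait statements are more common in testbenches.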


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

-- Top level: divides the 50 MHz board clock by two and connects the
-- synchronization unit, the pixel generator and the picture memory.
entity vga_controller is
  port (
    clk50m : in  std_logic;
    rstn   : in  std_logic;   -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;

  -- Port modes match the sync_unit entity declaration.
  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;   -- horizontal sync pulse (row)
      vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;
      c_hs       : out integer);
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the 50 MHz input clock by two to obtain the 25 MHz pixel clock.
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;   -- horizontal sync pulse (row)
    vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks 0..793 within one line
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter (vertical line counter): counts lines 0..515,
  -- advancing on each rising edge of the horizontal enable
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: low for counts 0..95
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information is sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse (vertical sync, front porch, back porch): low for lines 0..1
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- One-clock delayed copy of v_en, used to detect its rising edge
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process (clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: restart reading from the first pixel
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: output the 1-bit pixel on all three colors
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Right of the picture: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the picture: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address at the end of the picture memory
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

• IC costs are rising aggressively
• ASIC complexity has bolstered development time and costs
• R&D resources and headcount are decreasing
• Revenue losses for slow time-to-market are increasing
• Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them), but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the state where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attack of cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software, and by the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The example FPGA described here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Figure: Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they are managed separately from other signals.
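The logic block described above can be modeled in a few lines of VHDL. This is only a behavioral sketch, not vendor code: the entity name logic_block and the generics INIT and USE_FF are hypothetical stand-ins for the configuration bits that a real device stores in its configuration memory.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: a 4-input LUT followed by
-- an optional flip-flop. INIT and USE_FF stand in for configuration bits.
entity logic_block is
  generic (
    INIT   : std_logic_vector(15 downto 0) := x"8000";  -- truth table (default: 4-input AND)
    USE_FF : boolean := false                            -- true selects the registered output
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);
    o   : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The four inputs form an index into the 16-entry truth table.
  lut_out <= INIT(to_integer(unsigned(i)));

  -- The flip-flop samples the LUT output on every rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- A single output: either the registered or the unregistered LUT value.
  o <= ff_out when USE_FF else lut_out;
end architecture rtl;
```

Changing INIT changes the logic function without changing the structure, which is the essence of a lookup table: any function of four inputs is just a different 16-bit constant.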

For this example architecture, the locations of the FPGA logic block pins are shown below.

Figure: Logic block pin locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Figure: Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) description or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to an actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on FPGAs has become a good method for doing this.

Reasons why prototyping is important

1 Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is in contrast to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2 Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows for much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be caught.

3 Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4 Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Fig: VGA CONTROLLER (gate level)

Fig: SYNC UNIT (gate level)

Fig: TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



the internal line enable signal en generated by the Synchronization Unit of the Controller

DATA (I, 8 bits): Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette, as follows. Black & White mode: 8 1-bit pixels. 16 Colors mode: 2 4-bit pixels. 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: in 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL (O, 19 bits): Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC (O, Falling): Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC (O, Falling): Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R0, R1 (O, High/Low): Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G0, G1 (O, High/Low): Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B0, B1 (O, High/Low): Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).
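As a point of reference, the widely published timing for 640x480 at 60 Hz (pixel clock of about 25.175 MHz) can be captured as constants. These numbers are the commonly quoted industry values rather than figures taken from this report, and the package and constant names are illustrative only.

```vhdl
-- Commonly published timing for 640x480 at 60 Hz (about 25.175 MHz pixel
-- clock). The package name and constant names are illustrative only.
package vga_640x480_timing is
  -- Horizontal timing, in pixel clocks
  constant H_VISIBLE : natural := 640;
  constant H_FRONT   : natural := 16;   -- front porch
  constant H_SYNC    : natural := 96;   -- sync pulse width
  constant H_BACK    : natural := 48;   -- back porch
  constant H_TOTAL   : natural := H_VISIBLE + H_FRONT + H_SYNC + H_BACK;  -- 800

  -- Vertical timing, in lines
  constant V_VISIBLE : natural := 480;
  constant V_FRONT   : natural := 10;   -- front porch
  constant V_SYNC    : natural := 2;    -- sync pulse width
  constant V_BACK    : natural := 33;   -- back porch
  constant V_TOTAL   : natural := V_VISIBLE + V_FRONT + V_SYNC + V_BACK;  -- 525
end package vga_640x480_timing;
```

With these values, one line lasts 800 pixel clocks and one frame lasts 525 lines, so the frame rate is the pixel clock divided by 800 x 525 = 420,000, which is close to 60 Hz.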


9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode, one byte of data in RAM contains 8 1-bit pixels

• in 16 Colors mode, one byte of data in RAM contains 2 4-bit pixels

• in 64 Colors mode, one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
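The formula can be written as a small VHDL function, shown here as a sketch. The package and function names are hypothetical, and rounding up to a whole byte is an added assumption for picture sizes that do not divide evenly.

```vhdl
-- Hypothetical helper package; names are illustrative only.
package picture_memory is
  function picture_bytes(pix_total, lin_total, pixels_per_byte : positive)
    return natural;
end package picture_memory;

package body picture_memory is
  function picture_bytes(pix_total, lin_total, pixels_per_byte : positive)
    return natural is
  begin
    -- Round up so that a partially filled last byte is still counted
    -- (an assumption; the formula in the text is a plain division).
    return (pix_total * lin_total + pixels_per_byte - 1) / pixels_per_byte;
  end function picture_bytes;
end package body picture_memory;
```

For a 640 x 480 picture (307,200 pixels), this gives 38,400 bytes in Black & White mode (8 pixels per byte), 153,600 bytes in 16 Colors mode (2 pixels per byte), and 307,200 bytes in 64 Colors mode (1 pixel per byte).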

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt, that is, by right shifting the contents by three:

ADDR_PIXEL = "000" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt, that is, by right shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
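The three addressing rules amount to a right shift selected by the color mode. A minimal sketch follows; the entity name and the CMOD encoding ("00" for Black & White, "01" for 16 Colors, anything else for 64 Colors) are assumptions, since the encoding is not given in this section. The shifted values are padded with zero bits so that the result fills the 19-bit ADDR_PIXEL output.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch of the per-mode address selection. The CMOD encoding used here
-- ("00" B&W, "01" 16 Colors, anything else 64 Colors) is an assumption.
entity pixel_addr is
  port (
    cmod       : in  std_logic_vector(1 downto 0);
    hvcnt      : in  std_logic_vector(18 downto 0);
    addr_pixel : out std_logic_vector(18 downto 0)
  );
end entity pixel_addr;

architecture rtl of pixel_addr is
begin
  with cmod select
    addr_pixel <= "000" & hvcnt(18 downto 3) when "00",    -- 8 pixels per byte
                  "0"   & hvcnt(18 downto 1) when "01",    -- 2 pixels per byte
                  hvcnt                      when others;  -- 1 pixel per byte
end architecture rtl;
```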

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
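The enex definition translates almost directly into a concurrent VHDL assignment. The sketch below assumes that the counters and limits are available as natural-typed ports and that RD simply follows enex; the entity and port names are illustrative, not taken from the Controller source.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch of the enex/RD logic. Port names and the use of natural-typed
-- counters are assumptions made for illustration.
entity read_enable is
  port (
    hcount, vcount            : in  natural;
    pix_total, lin_total      : in  natural;  -- viewable pixels per line, lines per frame
    line_period, frame_period : in  natural;
    enex, rd                  : out std_logic
  );
end entity read_enable;

architecture rtl of read_enable is
  signal enex_i : std_logic;
begin
  -- Low outside the viewable area of the line or of the frame.
  enex_i <= '0' when (hcount > pix_total + 1 and hcount < line_period)
                  or (vcount > lin_total + 1 and vcount < frame_period)
            else '1';

  enex <= enex_i;
  rd   <= enex_i;  -- memory reads are disabled while enex is Low
end architecture rtl;
```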

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors, or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a 0.

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode, each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
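The load-and-shift behavior is easiest to see for a single mode. The following sketch covers Black & White mode only; the entity name, the port names, and the active-high polarity of the reset are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch of PIXREG in Black & White mode only. An active-high reset and
-- the port names are assumptions.
entity pixreg_bw is
  port (
    clk, rst : in  std_logic;
    hvcnt    : in  std_logic_vector(18 downto 0);
    data     : in  std_logic_vector(7 downto 0);
    pixel    : out std_logic  -- the active pixel, bit 0 of the register
  );
end entity pixreg_bw;

architecture rtl of pixreg_bw is
  signal pixreg : std_logic_vector(7 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      pixreg <= (others => '0');
    elsif rising_edge(clk) then
      if hvcnt(2 downto 0) = "000" then
        pixreg <= data;                      -- load the next 8 pixels
      else
        pixreg <= '0' & pixreg(7 downto 1);  -- shift right by one bit
      end if;
    end if;
  end process;

  pixel <= pixreg(0);
end architecture rtl;
```

The 16 Colors case would follow the same pattern, loading when hvcnt(0) is '0' and otherwise shifting right by four bits, while the 64 Colors case loads on every clock edge.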

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green, and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded depends on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.
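The reset and blanking rules can be sketched as follows. Several details here are assumptions and should be checked against the mapping table: the bit order of the RGB code (red in bits 5..4, green in 3..2, blue in 1..0), the Black & White mapping (0 to black, 1 to white), the active-high reset, and the single bw_mode select input. The 16 Colors mapping is omitted because it depends on table values not reproduced in this text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch of RGBREG for Black & White and 64 Colors modes only. The bit
-- order, the B&W mapping, and the port names are assumptions.
entity rgbreg_sketch is
  port (
    clk, rst : in  std_logic;
    en       : in  std_logic;                     -- line display enable
    bw_mode  : in  std_logic;                     -- '1' B&W, '0' 64 Colors
    pixel    : in  std_logic_vector(5 downto 0);  -- active pixel value
    r, g, b  : out std_logic_vector(1 downto 0)
  );
end entity rgbreg_sketch;

architecture rtl of rgbreg_sketch is
  signal rgbreg : std_logic_vector(5 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      rgbreg <= "000000";                 -- reset value is black
    elsif rising_edge(clk) then
      if en = '0' then
        rgbreg <= "000000";               -- blanking: load black
      elsif bw_mode = '1' then
        rgbreg <= (others => pixel(0));   -- assumed: 0 is black, 1 is white
      else
        rgbreg <= pixel;                  -- 64 Colors: value used directly
      end if;
    end if;
  end process;

  r <= rgbreg(5 downto 4);
  g <= rgbreg(3 downto 2);
  b <= rgbreg(1 downto 0);
end architecture rtl;
```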


The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0, and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 4 x 4 x 4 = 64 different color permutations.

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable, and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
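Besides CORE Generator and direct instantiation, a synchronous RAM can also be described behaviorally. The sketch below models a 16K x 1 single-port memory, matching the 16K data bits of one block; the entity and port names are hypothetical. Synthesis tools commonly map this coding style onto block RAM, but whether they do so depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 16K x 1 single-port RAM with synchronous read and write.
-- Mapping onto a block RAM is tool dependent and not guaranteed.
entity ram_16kx1 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(13 downto 0);
    din  : in  std_logic;
    dout : out std_logic
  );
end entity ram_16kx1;

architecture rtl of ram_16kx1 is
  type ram_t is array (0 to 2**14 - 1) of std_logic;
  signal ram : ram_t := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: on a write cycle the old contents are returned.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```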


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High Speed Integrated Circuit)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

It was funded by the U.S. Department of Defense.

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, it was standardized as IEEE 1076-1987.

An improved version of the language was released in 1994 as IEEE Standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types, so VHDL was becoming nonstandard.

The IEEE 1164 standard was developed by the IEEE in response.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

It defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.

Defines the inputs/outputs of a component (defines pins).

A way to represent modularity in VHDL.

Similar to a symbol in a schematic.

An entity declaration describes an entity.

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.

Each port must have a name, a direction, and a type.

An entity may have no port declaration.

Port directions

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;
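To try the Comparator in a simulator, the entity and architecture can be combined with a small self-checking test bench. The test bench below is a sketch: the name Comparator_tb, the stimulus values, and the 10 ns settling time are arbitrary choices for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical self-checking test bench; the name Comparator_tb is
-- illustrative. A test bench entity has no ports.
entity Comparator_tb is
end Comparator_tb;

architecture sim of Comparator_tb is
  signal A, B : std_logic_vector(7 downto 0) := (others => '0');
  signal EQ   : std_logic;
begin
  -- Direct instantiation of the entity with an explicit architecture.
  dut : entity work.Comparator(Comparator1)
    port map (A => A, B => B, EQ => EQ);

  stimulus : process
  begin
    A <= x"3C";
    B <= x"3C";
    wait for 10 ns;
    assert EQ = '1' report "equal inputs should give EQ = '1'" severity error;

    B <= x"3D";
    wait for 10 ns;
    assert EQ = '0' report "different inputs should give EQ = '0'" severity error;

    wait;  -- suspend forever; the simulation then runs out of events
  end process stimulus;
end sim;
```

Naming the architecture in the instantiation plays the role that a configuration declaration would otherwise play: it selects which architecture of Comparator is simulated.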

Configurations

Link an entity declaration and an architecture body together.

The concept of a default configuration is a bit messy in VHDL '87: the last architecture analyzed links to the entity.

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate-level architecture and a behavioral architecture.

Are always optional.

Packages

Packages contain information common to many design units.

1 Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2 Package body
- is not always needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

A collection of VHDL design units (a database):

1 Packages
- package declaration
- package body

2 Entities (entity declaration)

3 Architectures (architecture body)

4 Configurations (configuration declarations)

Usually a directory in a UNIX file system.

Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

Structural

Dataflow

Behavioral

Structural VHDL description

The circuit is described in terms of its components.

The description can range from low level (e.g. transistor-level description) to high level (e.g. block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style, you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers, you must write the behavioral description of registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, only the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

The process keyword in VHDL

A process statement is a concurrent statement.

Statements inside process statements are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
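The two ways of waking a process up can be compared side by side. The sketch below describes the same register twice, once with a sensitivity list and once with a wait statement; the entity and signal names are illustrative.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative entity: two equivalent registers, one per process style.
entity process_styles is
  port (
    clk, d         : in  std_logic;
    q_list, q_wait : out std_logic
  );
end entity process_styles;

architecture rtl of process_styles is
begin
  -- Style 1: sensitivity list. The process wakes up on every event on clk.
  with_list : process (clk)
  begin
    if rising_edge(clk) then
      q_list <= d;
    end if;
  end process with_list;

  -- Style 2: wait statement and no sensitivity list.
  with_wait : process
  begin
    wait until rising_edge(clk);
    q_wait <= d;
  end process with_wait;
end architecture rtl;
```

Both processes are concurrent with each other, while the statements inside each one execute sequentially each time the process wakes up.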


11 VGA CONTROLLER CODE

library IEEE

use IEEESTD_LOGIC_1164ALL

use IEEESTD_LOGIC_ARITHALL

use IEEESTD_LOGIC_UNSIGNEDALL

use IEEEnumeric_stdALL

---- Uncomment the following library declaration if instantiating

---- any Xilinx primitives in this code

--library UNISIM

--use UNISIMVComponentsall

entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

-- picture memory: 18-bit address, 1-bit data
component memory
    port (
        addr  : IN  std_logic_VECTOR(17 downto 0);
        clk   : IN  std_logic;
        dout  : OUT std_logic_VECTOR(0 downto 0);
        en    : IN  std_logic;
        sinit : IN  std_logic);
END component;

component sync_unit port(
    clk        : in    std_logic;
    rstn       : in    std_logic;
    hsync      : out   std_logic;  ---horizontal synch pulse(row)
    vsync      : out   std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out   std_logic;
    h_en       : out   std_logic;
    v_en       : out   std_logic;
    line_count : inout integer;
    c_hs       : inout integer);
end component;

component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- divide the input clock by two to obtain the pixel clock
process(clk50m, rstn)
begin
    if(rstn = '0')then
        clk <= '0';
    elsif(clk50m = '1' and clk50m'event)then
        clk <= not clk;
    end if;
    --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
);

pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs <= c_hs_s;
line_count <= line_count_s;
h_en <= h_en_s;
v_en <= v_en_s;

-- hsync counter: counts pixel clocks 0..793 within a line
process(clk, rstn)
begin
    if(rstn = '0')then
        c_hs_s <= 0;
    else
        if(clk = '1' and clk'event)then
            if(c_hs_s = 793)then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s + 1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter: advances on each rising edge of h_en_s, wraps after 515
process(h_en_s, rstn)
begin
    if(rstn = '0')then
        line_count_s <= 0;
    elsif(h_en_s = '1' and h_en_s'event)then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515)then
            line_count_s <= 0;
        end if;
    end if;
    --end if;
end process;

--hsync pulse: low while c_hs_s is 0..95
process(clk, rstn)
begin
    if(rstn = '0')then
        hsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(c_hs_s <= 95)then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
    --end if;
end process;

----Horizontal Enable (Row Information Will be Sent in this period)
process(clk, rstn)
begin
    if(rstn = '0')then
        h_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((c_hs_s >= 143) and (c_hs_s <= 778))then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

-- Vertical Enable (Data is Valid in this Region)
process(clk, rstn)
begin
    if(rstn = '0')then
        v_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((line_count_s >= 35 and line_count_s <= 515))then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

--Vertical Sync, Front porch, Back Porch: vsync is low for lines 0 and 1
process(clk, rstn)
begin
    if(rstn = '0')then
        vsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(line_count_s < 2)then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

-- v_en delayed by one clock, used by pixelgen to detect the start of a frame
process(clk)
begin
    if(clk = '1' and clk'event)then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;
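The counter limits in the synchronization block fix the line and frame rates. The following Python sketch works the arithmetic through. The counter values are read from the listing above; the 50 MHz input frequency is an assumption based on the port name clk50m, and the variable names are illustrative only.

```python
# Assumption: clk50m is a 50 MHz clock (inferred from its name only).
BOARD_CLOCK_HZ = 50_000_000
# The top level toggles clk on every rising edge of clk50m, halving the rate.
PIXEL_CLOCK_HZ = BOARD_CLOCK_HZ // 2

H_COUNT_MAX = 793   # c_hs_s wraps to 0 after this value
V_COUNT_MAX = 515   # line_count_s wraps to 0 after this value

clocks_per_line = H_COUNT_MAX + 1
lines_per_frame = V_COUNT_MAX + 1

line_rate_hz = PIXEL_CLOCK_HZ / clocks_per_line
frame_rate_hz = line_rate_hz / lines_per_frame

hsync_low_clocks = 95 + 1          # hsync is low while c_hs_s <= 95
h_enable_clocks = 778 - 143 + 1    # h_en is high for 143 <= c_hs_s <= 778
vsync_low_lines = 2                # vsync is low while line_count_s < 2
v_enable_lines = 515 - 35 + 1      # v_en is high for 35 <= line_count_s <= 515

print(f"clocks per line : {clocks_per_line}")
print(f"lines per frame : {lines_per_frame}")
print(f"line rate       : {line_rate_hz / 1000:.2f} kHz")
print(f"frame rate      : {frame_rate_hz:.2f} Hz")
```

Under the 50 MHz assumption the result is close to a 60 Hz refresh, which is what a monitor expects for the 640x480 mode.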


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
    if(rstn = '0')then
        rd <= '0';
        red <= "0";
        green <= "0";
        blue <= "0";
        addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event)then
        -- rising edge of v_en: restart the picture address at the top of a frame
        if((v_en = '1') and (v_en_reg = '0'))then
            addr_s <= (others => '0');
        end if;
        if(v_en = '1' and h_en = '1')then
            if(line_count <= 514)then
                if(c_hs <= 655)then
                    -- inside the picture: read memory and drive all three colors
                    rd <= '1';
                    red <= data_in;
                    green <= data_in;
                    blue <= data_in;
                    addr_s <= addr_s + "000000000000000001";
                else
                    -- right of the picture: green
                    rd <= '0';
                    red <= "0";
                    green <= "1";
                    blue <= "0";
                end if;
            else
                -- below the picture: white
                rd <= '0';
                red <= "1";
                green <= "1";
                blue <= "1";
            end if;
            -- wrap the address after the last picture location
            if(addr_s = "111011111111111111")then
                addr_s <= (others => '0');
            end if;
        else
            -- blanking: black
            red <= "0";
            green <= "0";
            blue <= "0";
        end if;
    end if;
    -- end if;
    --end if;
end process;

end behv;
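The address handling in the pixel generator can be checked with a small Python model. This is a sketch of the counter only, not of the whole process: the wrap constant is the bit string compared against addr_s in the listing, and the function name next_address is hypothetical.

```python
ADDR_WIDTH = 18
# Bit string compared against addr_s in the pixel generator.
WRAP_ADDRESS = int("111011111111111111", 2)


def next_address(addr: int) -> int:
    """Address after one displayed pixel.

    In the VHDL the wrap test is the last assignment in the process, so it
    overrides the increment when the current address equals WRAP_ADDRESS.
    """
    if addr == WRAP_ADDRESS:
        return 0
    return (addr + 1) % (1 << ADDR_WIDTH)


# Number of distinct memory locations visited before the counter wraps.
picture_locations = WRAP_ADDRESS + 1

print(f"wrap address      : {WRAP_ADDRESS} (0x{WRAP_ADDRESS:05X})")
print(f"picture locations : {picture_locations}")
print(f"fits in {ADDR_WIDTH} bits    : {picture_locations <= 1 << ADDR_WIDTH}")
```

The count of 245,760 one-bit locations equals 512 x 480; whether that is the intended picture size is an assumption, since the listing does not state it.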


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates like AND and XOR. In most FPGAs the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable interconnects
between gates, the beginnings of a new technology and market. The XC2064
boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup
tables (LUTs). More than 20 years later, Freeman was entered into the
National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share.
By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in
volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs had found
their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a
step further by combining the logic blocks and interconnects of traditional
FPGAs with embedded microprocessors and related peripherals to form a
complete "system on a programmable chip". This work mirrors the architecture
by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which
combined a reconfigurable CPU architecture on a single chip called the SB24.
That work was done in 1982. Examples of such hybrid technologies can be found
in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more
PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC
is another such device, which uses an AVR processor in combination with
Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic.

Many modern FPGAs can be reprogrammed at run time, and this is leading to the
idea of reconfigurable computing or reconfigurable systems: CPUs that
reconfigure themselves to suit the task at hand. The Mitrion Virtual
Processor from Mitrionics is an example of a reconfigurable soft processor
implemented on FPGAs. However, it does not support dynamic reconfiguration at
run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can
also take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

• IC costs are rising aggressively

• ASIC complexity has increased development time and costs

• R&D resources and headcount are decreasing

• Revenue losses for slow time-to-market are increasing

• Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have historically been used for, to
which the company attributes the growing number of FPGA design starts (see
History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD
has a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that implement decoders or
mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4-8
hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or
the width of one column, as shown below. All the routing channels have the
same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

Typical logic block

There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are managed separately.
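The logic block just described (a 4-input LUT, a flip-flop, and a choice between the registered and unregistered output) can be modelled in a few lines of Python. This is a behavioural sketch for illustration only; the class name LogicBlock and its methods are hypothetical and do not correspond to any vendor primitive.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table: int, registered: bool):
        # 16 configuration bits: bit i holds the output for input pattern i.
        self.truth_table = truth_table & 0xFFFF
        self.registered = registered
        self.flip_flop = 0

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Look up the configured output for the four input bits."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Rising clock edge: the flip-flop captures the current LUT output."""
        self.flip_flop = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """Single block output: registered or unregistered LUT value."""
        return self.flip_flop if self.registered else self.lut(a, b, c, d)


# A 4-input AND uses only the last truth-table bit; a 4-input XOR is 0x6996.
and4 = LogicBlock(0x8000, registered=False)
xor4 = LogicBlock(0x6996, registered=True)

print(and4.output(1, 1, 1, 1))   # combinational result
xor4.clock(1, 0, 0, 0)           # capture the XOR of the inputs
print(xor4.output(0, 0, 0, 0))   # registered output holds the captured value
```

The same 16 configuration bits can realise any Boolean function of four inputs, which is why the LUT is the basic building block.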

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number one connects only to wires in
track number one in adjacent channel segments, wires in track number 2
connect only to other wires in track number 2, and so on. The figure below
illustrates the connections in a switch box.
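The planar (domain-based) topology can be enumerated programmatically. The Python sketch below builds the set of programmable switches for a switch box of a given channel width; the side names and function names are illustrative assumptions, not taken from any tool.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_box(channel_width: int) -> set:
    """Return every programmable switch as a pair of (side, track) wires.

    In the planar topology a wire on track t connects only to the wires on
    track t entering from the other three sides.
    """
    switches = set()
    for track in range(channel_width):
        for i, side_a in enumerate(SIDES):
            for side_b in SIDES[i + 1:]:
                switches.add(((side_a, track), (side_b, track)))
    return switches


def reachable(switches: set, wire: tuple) -> set:
    """Wires that can be connected to `wire` through a single switch."""
    result = set()
    for first, second in switches:
        if first == wire:
            result.add(second)
        elif second == wire:
            result.add(first)
    return result


box = planar_switch_box(channel_width=4)
print(len(box))                              # six switches per track
print(sorted(reachable(box, ("north", 2))))  # three wires, all on track 2
```

Each wire sees exactly three switches, matching the description above, and a signal that enters on one track stays on that track through the box.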

Switch box topology

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
include multipliers, generic DSP blocks, embedded processors, high-speed I/O
logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to specify them
numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device like
an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally, the design is laid out in the FPGA, at
which point propagation delays can be added and the simulation run again with
these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices
can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased, even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused
CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Hardware verification together with early software and firmware co-design has
become mainstream, and prototyping SoC and ASIC designs on an FPGA has become
a good method for doing both.

Reasons why prototyping is important:

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct, compared with relying only on software
simulations to verify that the hardware design is sound. Simulation speed and
modeling accuracy limitations hinder that approach.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows
much more time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear
due to today's array of operating systems, applications, and hardware, to be
found early.

3. Prototyping also allows the developer to ensure that all IP technologies
in the system work well together, off the simulation stage and in actual
form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.


13 SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated
picture (pixelated) is taken from memory space and the Controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.


Page 20: vga

VSYNC O Falling Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R1 O High Low Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G1 O High Low Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B1 O High Low Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels
and a frame of lines. The actual time values differ according to the
resolution selected (640x480 or 800x600), the processor-defined values for
DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal
(CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external
picture memory, reading in data a byte at a time and formatting each pixel to
be displayed. For each pixel, the 6-bit RGB value required for the control of
the monitor's three primary color electron guns is generated, so that the
pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host
microcontroller into external memory space (RAM). This memory space can be
located anywhere (e.g. using a block of RAM within the FPGA design, or using
the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a
byte depends on which color palette is being used for the image (selected by
the CMOD input):

• in Black & White mode - one byte of data in RAM contains 8 1-bit pixels

• in 16 Colors mode - one byte of data in RAM contains 2 4-bit pixels

• in 64 Colors mode - one byte of data in RAM contains 1 6-bit pixel
(DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total
number of viewable pixels in a line (determined by DISPSIZE_H), the total
number of viewable lines in a frame (determined by DISPSIZE_V), and the
number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
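As a worked example of the memory formula, the Python sketch below evaluates it for the three color modes. The 640 x 480 picture size is an illustrative choice, and the mode names and function name are hypothetical labels, not Controller signal names.

```python
# Pixels packed into each byte, as listed for the three palettes.
PIXELS_PER_BYTE = {
    "black_and_white": 8,
    "16_colors": 2,
    "64_colors": 1,
}


def picture_memory_bytes(pix_total: int, lin_total: int, mode: str) -> int:
    """Bytes needed for a picture of pix_total x lin_total viewable pixels."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    # Round up so that a partially filled last byte is still counted.
    return -(-pixels // per_byte)


for mode in PIXELS_PER_BYTE:
    size = picture_memory_bytes(640, 480, mode)
    print(f"{mode:16s}: {size} bytes")
```

For a 640 x 480 picture this gives 38,400 bytes in Black & White mode, 153,600 bytes in 16 Colors mode, and 307,200 bytes in 64 Colors mode.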

The address in RAM where the next pixel is stored is determined using an
internal signal provided by the Synchronization Unit - hvcnt - which reflects
the current contents of the MEMADDR register. The exact addressing is
described below.

9.2 Black & White mode

The picture memory address - the byte of data containing the next 8 pixels -
is determined by using bits 18..3 of hvcnt, which is equivalent to right
shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address - the byte of data containing the next 2 pixels -
is determined by using bits 18..1 of hvcnt, which is equivalent to right
shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address - the byte of data containing the next pixel - is
determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
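The three addressing rules reduce to a right shift of the pixel counter by the number of bits needed to index a pixel within a byte. A minimal Python sketch, assuming hvcnt is the 19-bit counter implied by the bit range 18..0 (the function and mode names are hypothetical):

```python
HVCNT_WIDTH = 19  # bits 18..0, as used in the address expressions

# Right shift applied to hvcnt in each mode: log2(pixels per byte).
ADDRESS_SHIFT = {
    "black_and_white": 3,   # 8 pixels per byte
    "16_colors": 1,         # 2 pixels per byte
    "64_colors": 0,         # 1 pixel per byte
}


def pixel_address(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory that holds pixel number hvcnt."""
    hvcnt &= (1 << HVCNT_WIDTH) - 1
    return hvcnt >> ADDRESS_SHIFT[mode]


for mode in ADDRESS_SHIFT:
    print(f"{mode:16s}: pixel 17 is in byte {pixel_address(17, mode)}")
```

Dropping the low bits is what makes consecutive pixels share a byte: pixels 16 to 23 all map to byte 2 in Black & White mode.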

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the
current address in memory space. The register PIXREG is reset to zero when
the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be
read from the memory space as long as the RD signal is active (High). The RD
signal is itself controlled by the external line display enable signal, enex.
This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT >
(LinTotal + 1)) and (VCOUNT < Frame Period)) then enex = 0

Else

enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
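The enex condition can be exercised with a short Python function. The parameter names mirror the quantities in the definition above; the numeric values used in the example (640 x 480 viewable area, 800 clocks per line, 525 lines per frame) are illustrative assumptions rather than values taken from this design.

```python
def enex(hcount: int, vcount: int, pix_total: int, lin_total: int,
         line_period: int, frame_period: int) -> int:
    """Line display enable: 0 during blanking, 1 while pixels are displayed."""
    h_blank = (hcount > pix_total + 1) and (hcount < line_period)
    v_blank = (vcount > lin_total + 1) and (vcount < frame_period)
    return 0 if (h_blank or v_blank) else 1


# Assumed example timing, for illustration only.
TIMING = dict(pix_total=640, lin_total=480, line_period=800, frame_period=525)

print(enex(100, 100, **TIMING))   # inside the viewable area
print(enex(700, 100, **TIMING))   # horizontal blanking
print(enex(100, 500, **TIMING))   # vertical blanking
```

RD follows enex, so memory reads stop as soon as either counter enters its blanking interval.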

The point at which data is loaded from memory into PIXREG depends on the
particular color palette that is chosen - Black & White, 16 Colors, or 64
Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest
three bits of the hvcnt signal - received from the Synchronization Unit - are
"000".

For the currently loaded byte, the active pixel is always in the lowest bit
position of the Pixel register. Each pixel in the data byte is moved into
this active pixel position by shifting the contents of the register right by
one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest
bit of the hvcnt signal - received from the Synchronization Unit - is a "0".

For the currently loaded byte, the active pixel is always in the low-order
nibble of the Pixel register. Remember that in this mode, each byte of data
contains two pixels. The second pixel is moved into this active pixel
position by shifting the contents of the register right by four bits on the
rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of
the external system clock signal (CLK). In this mode, the read of pixel data
does not depend on the status of the hvcnt signal received from the
Synchronization Unit.
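The shifting behaviour of the Pixel register in the three modes can be sketched in Python as a function that returns the pixels of one byte in display order. The function and mode names are hypothetical; the bit positions follow the description above (active pixel in the lowest bit or low-order nibble, right shifts between pixels).

```python
def pixels_from_byte(data: int, mode: str) -> list:
    """Pixels held in one byte of picture memory, in the order displayed."""
    data &= 0xFF
    if mode == "black_and_white":
        # Active pixel is bit 0; the register shifts right by one bit per clock.
        return [(data >> shift) & 0x1 for shift in range(8)]
    if mode == "16_colors":
        # Active pixel is the low nibble; one right shift by four bits follows.
        return [(data >> shift) & 0xF for shift in (0, 4)]
    if mode == "64_colors":
        # One 6-bit pixel per byte; DATA[7..6] are not used.
        return [data & 0x3F]
    raise ValueError(f"unknown mode: {mode}")


print(pixels_from_byte(0b10110001, "black_and_white"))
print([hex(p) for p in pixels_from_byte(0xA5, "16_colors")])
print(pixels_from_byte(0xFF, "64_colors"))
```

Because the active pixel is always taken from the low end of the register, the least significant pixel of each byte appears first on screen.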

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving
the red, green, and blue color guns of the target monitor. When the chosen
color palette is either Black & White or 16 Colors, these six bits are
obtained by mapping the value of the active pixel to a predefined RGB code.
When the chosen color palette is 64 Colors, the actual pixel value is used
directly.

The register RGBREG is reset to zero ("000000") when the VGA Controller
receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value
loaded is dependent on the state of the line display enable signal, en. When
en is Low, blanking is required and RGBREG is loaded with the code for black
("000000").

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a
4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA
Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0,
and B1).

The monitor itself expects analog signals as inputs to its electron gun
control circuits. This is achieved by using 2-bit digital-to-analog converter
circuitry located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen in one of 64 different color combinations.
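As a worked example of the figures above, and assuming the four levels are evenly spaced between 0 V and 0.7 V (the text gives only the two end points, so the intermediate values are an assumption), the level for a 2-bit code k and the resulting color count are:

```latex
V_k = \frac{k}{3} \times 0.7\,\mathrm{V}, \qquad k \in \{0, 1, 2, 3\}
\quad\Rightarrow\quad
V_k \approx 0\,\mathrm{V},\; 0.233\,\mathrm{V},\; 0.467\,\mathrm{V},\; 0.7\,\mathrm{V}

N_{\mathrm{colors}} = 4 \times 4 \times 4 = 4^{3} = 64
```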

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
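A single-port use of one block can be sketched in VHDL as below. This is an illustrative sketch, not taken from the application note: the entity name spram_16k_x1 and its port names are assumptions, and whether a given synthesis tool maps this description onto a block RAM depends on the tool and its settings. The 16K x 1 organization uses exactly the 16K data bits mentioned above, and both the write and the read happen on a clock edge while the enable is asserted.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical single-port synchronous RAM, 16384 words of 1 bit.
entity spram_16k_x1 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16384 locations
    din  : in  std_logic;
    dout : out std_logic);
end spram_16k_x1;

architecture rtl of spram_16k_x1 is
  type ram_t is array (0 to 16383) of std_logic;
  signal ram : ram_t := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read; on a write cycle this returns the old
        -- contents of the addressed location (read-first behaviour).
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end rtl;
```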


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability
Technology independence
Design reuse
Several levels of abstraction
Readability
Standard language
Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)
Design specification language
Design entry language
Design simulation language
Design documentation language
An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers
Funded by the US Department of Defense
The first publicly available version was released in 1985
In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL
In 1987, standardization => IEEE 1076-1987
An improved version of the language was released in 1994 => IEEE standard 1076-1993

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance
Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard
The IEEE 1164 standard was developed by the IEEE
IEEE 1164 contains definitions for a nine-valued data type, std_logic
IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware
Defines e.g. two numeric types: signed and unsigned

VHDL Environment

Design Units
Segments of VHDL code that can be compiled separately and stored in a library

Entities
A black box with interface definition
Defines the inputs/outputs of a component (define pins)
A way to represent modularity in VHDL
Similar to a symbol in a schematic
Entity declaration describes the entity
E.g.

  entity Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
  end Comparator;

Ports
Provide channels of communication between the component and its environment
Each port must have a name, direction and a type
An entity may have NO port declaration

Port directions
In: A value of a port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.
Out: Assignments can be made to a port but data from a port cannot be read. Multiple assignments are allowed.
In out: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).
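A common way to live with the "out ports cannot be read" rule (as it applies in VHDL-87 and VHDL-93) without resorting to buffer ports is to keep the value in an internal signal and copy it to the port. The sketch below illustrates that idiom; the entity counter4 and all of its names are hypothetical and chosen only for this example.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical 4-bit counter: the count must be read back to increment it,
-- so it is held in an internal signal rather than read from the out port.
entity counter4 is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;                      -- synchronous reset, active high
    count : out std_logic_vector(3 downto 0));
end counter4;

architecture rtl of counter4 is
  signal count_i : unsigned(3 downto 0);        -- readable internal copy
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        count_i <= (others => '0');
      else
        count_i <= count_i + 1;                 -- reads count_i, not the port
      end if;
    end if;
  end process;

  count <= std_logic_vector(count_i);           -- drive the out port
end rtl;
```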

Architectures
Every entity has at least one architecture
One entity can have several architectures
Architectures can describe a design using
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
Configuration declaration links architecture to entity
E.g.

  architecture Comparator1 of Comparator is
  begin
    EQ <= '1' when (A = B) else '0';
  end Comparator1;

Configurations
Links entity declaration and architecture body together
Concept of default configuration is a bit messy in VHDL '87
- Last architecture analyzed links to entity
Can be used to change simulation behavior without re-analyzing the VHDL source
Complex configuration declarations are ignored in synthesis
Some entities can have e.g. a gate level architecture and a behavioral architecture
Are always optional
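To make the role of a configuration concrete, the sketch below gives the Comparator entity a second architecture and uses a configuration declaration to pick one of them. Comparator2 and the configuration name Comparator_cfg are hypothetical additions for illustration; only the entity and Comparator1 follow the examples in this section.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- Dataflow-style architecture, as in the example above.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Hypothetical second architecture with the same behaviour,
-- written as a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end Comparator2;

-- Configuration declaration: binds the entity to Comparator2 explicitly,
-- instead of relying on whichever architecture was analyzed last.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end Comparator_cfg;
```

Simulating Comparator_cfg rather than the bare entity selects Comparator2 without touching either architecture's source.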

Packages
Packages contain information common to many design units
1 Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations
2 Package body
- is not necessarily needed
- function bodies
- procedure bodies
Packages are meant for encapsulating data which can be shared globally among several design units
Consists of a declaration part and an optional body part
Package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations
Package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
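The split between a package declaration and its body can be illustrated with a small sketch. The package name example_pkg and everything in it are hypothetical; the point is only that the declaration publishes the constant, subtype and function interface, while the body supplies the function's implementation.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Package declaration: what other design units can see.
package example_pkg is
  constant DATA_WIDTH : integer := 8;
  subtype data_t is std_logic_vector(DATA_WIDTH - 1 downto 0);
  function parity (v : std_logic_vector) return std_logic;
end example_pkg;

-- Package body: needed here because the function requires a body.
package body example_pkg is
  function parity (v : std_logic_vector) return std_logic is
    variable p : std_logic := '0';
  begin
    for i in v'range loop
      p := p xor v(i);      -- XOR of all bits: '1' for an odd number of ones
    end loop;
    return p;
  end parity;
end example_pkg;
```

A design unit compiled into the same library would make these declarations visible with `use work.example_pkg.all;`.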

Libraries
Collection of VHDL design units (database)
1 Packages
  package declaration
  package body
2 Entities (entity declaration)
3 Architectures (architecture body)
4 Configurations (configuration declarations)
Usually a directory in the UNIX file system
Can also be any other kind of database

Levels of Abstraction
VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the HW
It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:
Structural
Dataflow
Behavioral

Structural VHDL description
Circuit is described in terms of its components
From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)
For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system
In the dataflow style you describe how information flows between registers in the system
The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely
The behavior of the system over time is defined by registers
There are no built-in registers in the VHDL language
- Either a lower level description
- or a behavioral description of sequential elements is needed
The lower level register descriptions must be created or obtained
If there are no 3rd party models for registers => you must write the behavioral description of registers
The behavioral description can be provided in the form of subprograms (functions or procedures)
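One way to read the last point is sketched below: a register is described once as a procedure in a package, and a dataflow-style architecture then places registers by calling it concurrently. All names (reg_pkg, dff, two_stage) are hypothetical. The sketch is meant for simulation; whether a synthesis tool accepts registers written as procedures varies from tool to tool.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

package reg_pkg is
  -- Behavioral description of a D flip-flop as a procedure.
  procedure dff (signal clk : in  std_logic;
                 signal d   : in  std_logic;
                 signal q   : out std_logic);
end reg_pkg;

package body reg_pkg is
  procedure dff (signal clk : in  std_logic;
                 signal d   : in  std_logic;
                 signal q   : out std_logic) is
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end dff;
end reg_pkg;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use work.reg_pkg.all;

-- Two registers in series, each placed by a concurrent procedure call.
entity two_stage is
  port (clk, din : in  std_logic;
        dout     : out std_logic);
end two_stage;

architecture dataflow of two_stage is
  signal stage1 : std_logic;
begin
  r1 : dff(clk, din, stage1);
  r2 : dff(clk, stage1, dout);
end dataflow;
```

Each concurrent procedure call behaves like a process that is sensitive to its input signals, so the procedure body runs whenever clk or the data input changes.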

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time
Representation might include e.g. state diagrams, timing diagrams and algorithmic descriptions
The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)
If no actual delay is used, the order of sequential operations is defined
In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool
There are a few tools for behavioral synthesis

Concurrent Vs Sequential

Processes
Basic simulation concept in VHDL
A VHDL description can always be broken up into interconnected processes
Quite similar to a Unix process
Process keyword in VHDL
A process statement is a concurrent statement
Statements inside process statements are sequential statements
A process must contain either a sensitivity list or wait statement(s), but NOT both
The sensitivity list or wait statement(s) contain the signals which wake the process up
General format

  process [(sensitivity_list)]
    process_declarative_part
  begin
    process_statements
    [wait_statement]
  end process;
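The two legal forms of a process can be seen side by side in the simulation-only sketch below. The entity process_demo and its signal names are hypothetical; the 20 ns half period is just an example value.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity process_demo is
end process_demo;

architecture sim of process_demo is
  signal clk  : std_logic := '0';
  signal d, q : std_logic := '0';
begin
  -- Form 1: wait statements, no sensitivity list.
  -- Generates ten clock cycles, then suspends forever.
  clk_gen : process
  begin
    for i in 1 to 10 loop
      clk <= '0';
      wait for 20 ns;
      clk <= '1';
      wait for 20 ns;
    end loop;
    wait;
  end process clk_gen;

  -- Form 2: sensitivity list, no wait statement.
  -- Wakes up on every event on clk and samples d on the rising edge.
  reg : process (clk)
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end process reg;

  -- A concurrent assignment is itself an implicit process sensitive to q.
  d <= not q;
end sim;
```

The three statements in the architecture body run concurrently; only the statements inside each process execute in sequence.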


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;

  -- line_count and c_hs are declared "out" here to match the sync_unit entity
  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;
      c_hs       : out integer);
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- divide the board clock by two to obtain the pixel clock
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter, advanced once per line by the rising edge of h_en_s
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse
  -- Vertical Enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical Sync, Front Porch, Back Porch
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en delayed by one clock, used downstream to detect the start of a frame
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
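As a rough cross-check of the counter limits in this listing, the line and frame rates can be worked out by hand. The calculation assumes that clk50m is a 50 MHz clock (suggested by its name but not stated in the code), so that the divided clock clk runs at 25 MHz. The horizontal counter runs from 0 to 793 (794 clocks per line) and the line counter from 0 to 515 (516 lines per frame):

```latex
T_{\mathrm{line}} = \frac{794}{25\,\mathrm{MHz}} = 31.76\,\mu\mathrm{s}
\qquad\Rightarrow\qquad
f_{\mathrm{line}} = \frac{25\,\mathrm{MHz}}{794} \approx 31.49\,\mathrm{kHz}

T_{\mathrm{frame}} = 516 \times 31.76\,\mu\mathrm{s} \approx 16.39\,\mathrm{ms}
\qquad\Rightarrow\qquad
f_{\mathrm{frame}} \approx 61.0\,\mathrm{Hz}
```

Under that assumption the design refreshes at roughly 61 frames per second.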


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process (clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- rising edge of v_en: start of a new frame, restart the address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- inside the picture: read the next pixel from memory
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + 1;
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- wrap the address at the last stored pixel
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- blanking
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field programmable gate array in 1985, the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor, implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates
1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size
1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:
IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have been historically used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the state where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attack, of cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing applications where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software and the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
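The logic block just described (a 4-input LUT, a flip-flop, and a choice between the registered and unregistered LUT output) can be modelled in VHDL as below. This is an illustrative sketch only: the entity name logic_block, the generic names LUT_INIT and REGISTERED, and the port names are assumptions, and a real device selects these options through configuration bits rather than generics.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical model of a classic logic block.
entity logic_block is
  generic (
    -- 16-entry truth table; bit n is the output for input value n.
    -- x"8000" sets only bit 15, i.e. a 4-input AND.
    LUT_INIT   : std_logic_vector(15 downto 0) := x"8000";
    -- true selects the flip-flop output, false the raw LUT output.
    REGISTERED : boolean := false);
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);
    o   : out std_logic);
end logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16-to-1 lookup addressed by the four inputs.
  lut_out <= LUT_INIT(to_integer(unsigned(i)));

  -- The flip-flop registers the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Single output: registered or unregistered LUT output.
  o <= ff_out when REGISTERED else lut_out;
end rtl;
```

Changing LUT_INIT changes the logic function without changing the structure, which is the sense in which the block is programmable.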

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate level description where simulation is repeated to confirm the synthesis proceeded without errors. Finally the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method of doing this.

Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers only relying on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be uncovered.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement to the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Fig. Gate level: VGA CONTROLLER (figure not reproduced)

Fig. Gate level: SYNC UNIT (figure not reproduced)

Fig. TECHNOLOGY SCHEMATIC (figure not reproduced)

14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated picture (pixelated) is taken from memory space and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode - one byte of data in RAM contains 8 1-bit pixels

• in 16 Colors mode - one byte of data in RAM contains 2 4-bit pixels

• in 64 Colors mode - one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V) and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
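As a worked example of the formula above, the following Python sketch computes the picture memory for each palette. The function name and the 640x480 example size are illustrative assumptions, not part of the controller's documentation; the pixels-per-byte figures are the ones listed for the three CMOD palettes.

```python
# Pixels packed into one byte of picture memory for each palette.
PIXELS_PER_BYTE = {"bw": 8, "16_colors": 2, "64_colors": 1}


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes needed to store pix_total x lin_total pixels (rounded up)."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-pixels // per_byte)  # ceiling division


# Hypothetical example: a 640x480 picture in each palette.
for mode in PIXELS_PER_BYTE:
    print(mode, picture_memory_bytes(640, 480, mode))
```

For a 640x480 picture this gives 38400 bytes in Black & White mode, 153600 bytes in 16 Colors mode and 307200 bytes in 64 Colors mode.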

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit - hvcnt - which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address - the byte of data containing the next 8 pixels - is determined by using bits 18..3 of hvcnt, i.e. right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address - the byte of data containing the next 2 pixels - is determined by using bits 18..1 of hvcnt, i.e. right shifting the contents by one:

ADDR_PIXEL = '0' & hvcnt[18..1]

64 Colors mode

The picture memory address - the byte of data containing the next pixel - is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
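The three addressing rules above amount to a right shift of the pixel counter by 3, 1 or 0 bits. The Python sketch below models that behaviour on plain integers; the function name and the mode labels are illustrative assumptions, and the code is a behavioural model rather than the controller's HDL.

```python
# Right-shift applied to hvcnt in each palette (log2 of pixels per byte).
ADDRESS_SHIFT = {"bw": 3, "16_colors": 1, "64_colors": 0}


def pixel_byte_address(hvcnt, mode):
    """Byte address holding the pixel numbered hvcnt (a 19-bit counter)."""
    hvcnt &= 0x7FFFF  # keep bits 18..0 only
    return hvcnt >> ADDRESS_SHIFT[mode]


# Pixel number 19 lives in byte 2 (B&W), byte 9 (16 Colors), byte 19 (64 Colors).
for mode in ADDRESS_SHIFT:
    print(mode, pixel_byte_address(19, mode))
```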

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then

enex = 0

Else

enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
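The enex condition can be checked with a small Python model. This is a sketch under stated assumptions: the function signature is hypothetical, and the example values (640 visible pixels, 480 visible lines, a line period of 800 and a frame period of 525) are illustrative figures for a 640x480 display, not values taken from this report.

```python
def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Return 1 while memory reads are enabled, 0 during blanking."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


# Assumed example geometry (not from the source).
GEOMETRY = dict(pix_total=640, lin_total=480, line_period=800, frame_period=525)

print(enex(100, 100, **GEOMETRY))  # inside the visible area
print(enex(700, 100, **GEOMETRY))  # horizontal blanking
print(enex(100, 500, **GEOMETRY))  # vertical blanking
```

With these assumed values the three calls print 1, 0 and 0, so RD is only asserted inside the visible area.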

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen - Black & White, 16 Colors, or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal - received from the Synchronization Unit - are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal - received from the Synchronization Unit - is a 0.

For the currently loaded byte, the active pixel is always in the low order nibble of the Pixel register. Remember that in this mode, each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
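The shift behaviour of PIXREG in the three modes can be illustrated with a short Python sketch that lists the pixels of one data byte in the order they reach the active position. The function name is an illustrative assumption, and the model ignores clocking and the load conditions on hvcnt.

```python
def pixels_in_byte(data, mode):
    """Pixels of one byte in display order (lowest bits are shown first)."""
    data &= 0xFF
    if mode == "bw":
        # Eight 1-bit pixels, register shifted right by one bit per pixel.
        return [(data >> i) & 0x1 for i in range(8)]
    if mode == "16_colors":
        # Two 4-bit pixels: low nibble first, then the high nibble.
        return [data & 0xF, (data >> 4) & 0xF]
    if mode == "64_colors":
        # One 6-bit pixel; DATA[7..6] are not used.
        return [data & 0x3F]
    raise ValueError("unknown mode: " + mode)


print(pixels_in_byte(0xA5, "bw"))
print(pixels_in_byte(0xA5, "16_colors"))
print(pixels_in_byte(0xA5, "64_colors"))
```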

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital to analog converter circuitry, located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
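The count of 64 colors follows from three channels with four levels each (4 x 4 x 4). The Python sketch below enumerates the combinations and computes the analog levels, assuming the four levels are evenly spaced between 0 V and 0.7 V; the text only gives the two end points, so the even spacing and the function name are assumptions.

```python
from itertools import product

FULL_SCALE_V = 0.7  # maximum-brightness level quoted in the text


def dac_level(code):
    """Analog level for a 2-bit code, ASSUMING evenly spaced steps."""
    if code not in range(4):
        raise ValueError("code must be a 2-bit value (0..3)")
    return FULL_SCALE_V * (code / 3)


levels = [dac_level(code) for code in range(4)]
colors = list(product(range(4), repeat=3))  # every (R, G, B) code triple

print([round(v, 3) for v in levels])
print(len(colors))  # 64
```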

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large, on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
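As a rough sizing aid, the Python sketch below estimates how many block RAMs a picture buffer needs and which Spartan-3E device is the smallest that holds it. It assumes only the 16K data bits of each block are used and ignores aspect-ratio and routing constraints; the function names and the 640x480, 1-bit-per-pixel example are illustrative assumptions, while the block counts come from the device table above.

```python
DATA_BITS_PER_BLOCK = 16 * 1024  # data bits per block RAM (parity excluded)

# Total RAM blocks per device, in increasing order of size.
TOTAL_BLOCKS = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def blocks_needed(width, height, bits_per_pixel):
    """Block RAMs needed for a width x height picture (rounded up)."""
    bits = width * height * bits_per_pixel
    return -(-bits // DATA_BITS_PER_BLOCK)


def smallest_device(blocks):
    """First device in the table with enough blocks, or None."""
    for device, available in TOTAL_BLOCKS.items():
        if available >= blocks:
            return device
    return None


needed = blocks_needed(640, 480, 1)
print(needed, smallest_device(needed))
```

Under these assumptions a 640x480 one-bit picture needs 19 blocks, which first fits in the XC3S500E.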


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987 standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own, non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

It defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with interface definition.

Defines the inputs/outputs of a component (define pins).

A way to represent modularity in VHDL.

Similar to a symbol in a schematic.

Entity declaration describes the entity.

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.

Each port must have a name, direction and a type.

An entity may have NO port declaration.

Port directions

In: A value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe design using:
- Behavior
- Structure
- Dataflow

Architectures can describe design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

Configuration declaration links architecture to entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Links entity declaration and architecture body together.

Concept of default configuration is a bit messy in VHDL '87:
- Last architecture analyzed links to entity

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate level architecture and a behavioral architecture.

Are always optional.

Packages

Packages contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

Consists of a declaration part and an optional body part.

Package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

Package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

Collection of VHDL design units (database):

1. Packages
- package declaration
- package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in the UNIX file system.

Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

Structural

Dataflow

Behavioral

Structural VHDL description

Circuit is described in terms of its components.

From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system.

In the dataflow style you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:
- either a lower level description
- or a behavioral description of sequential elements is needed

The lower level register descriptions must be created or obtained.

If there are no 3rd party models for registers => you must write the behavioral description of registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time.

Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

Process keyword in VHDL

Process statement is a concurrent statement.

Statements inside process statements are sequential statements.

Process must contain either a sensitivity list or wait statement(s), but NOT both.

Sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0)
  );
end vga_controller;

architecture Behavioral of vga_controller is

  -- Picture memory (1 bit per address)
  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic
    );
  end component;

  component sync_unit
    port (
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  -- horizontal sync pulse (row)
      vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer
    );
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic
    );
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the input clock clk50m by two to obtain the internal clock clk
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer
  );
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts clock cycles 0..793 within one line
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    elsif (clk = '1' and clk'event) then
      if (c_hs_s = 793) then
        c_hs_s <= 0;
      else
        c_hs_s <= c_hs_s + 1;
      end if;
    end if;
  end process;

  -- vsync counter (vertical line counter): advances on each rising edge
  -- of h_en_s and wraps after line 515
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: low while c_hs_s is in 0..95
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information will be sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if (line_count_s >= 35 and line_count_s <= 515) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse (vertical sync, front porch, back porch)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en delayed by one clock, used downstream to detect the start of a frame
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
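The counter limits in the synchronization block fix the line and frame timing. The Python sketch below derives them, assuming that clk50m is a 50 MHz input (as its name suggests) so that the divided clock runs at 25 MHz; that frequency is an assumption, not a figure stated in the listing. The constants are read directly from the comparisons in the VHDL, and the one-clock register delays are ignored.

```python
H_TOTAL = 794               # c_hs_s counts 0..793
V_TOTAL = 516               # line_count_s counts 0..515
HSYNC_CLOCKS = 95 + 1       # hsync is low while c_hs_s <= 95
H_ACTIVE = 778 - 143 + 1    # h_en_s is high for c_hs_s in 143..778
V_ACTIVE = 515 - 35 + 1     # v_en_s is high for line_count_s in 35..515
VSYNC_LINES = 2             # vsync is low while line_count_s < 2


def rates(pixel_clock_hz):
    """Return (line rate, frame rate) in Hz for a given counter clock."""
    line_rate = pixel_clock_hz / H_TOTAL
    frame_rate = line_rate / V_TOTAL
    return line_rate, frame_rate


line_rate, frame_rate = rates(25_000_000)  # assumed 25 MHz internal clock
print(H_ACTIVE, V_ACTIVE)
print(round(line_rate), round(frame_rate, 2))
```

Under the 25 MHz assumption this gives a line rate of about 31.5 kHz and a frame rate of about 61 Hz, with an enable window of 636 clocks by 481 lines.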


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process (clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: restart reading at the first memory address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Picture area: output the memory bit on all three colors
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Right of the picture: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the picture: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last picture location
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;
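The wrap-around constant in the pixel generator determines how much picture memory the design addresses. The Python sketch below decodes it, assuming one 1-bit pixel per address (as the 1-bit data_in port suggests) and 16K data bits per block RAM; the variable names are illustrative.

```python
END_ADDR_BITS = "111011111111111111"  # last address before addr_s wraps
DATA_BITS_PER_BLOCK = 16 * 1024       # data bits in one block RAM

end_addr = int(END_ADDR_BITS, 2)
frame_pixels = end_addr + 1           # addresses 0..end_addr are used
blocks, remainder = divmod(frame_pixels, DATA_BITS_PER_BLOCK)

print(end_addr, frame_pixels)
print(blocks, remainder)
print(frame_pixels == 512 * 480)
```

The constant decodes to 245759, so 245760 one-bit locations are read per frame. That is exactly 15 block RAMs of 16K bits, and equals a 512 x 480 picture.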


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or a source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field programmable gate array in 1985 - the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market-share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at "run time", and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor, implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacture their final version so it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

• IC costs are rising aggressively

• ASIC complexity has bolstered development time and costs

• R&D resources and headcount are decreasing

• Revenue losses for slow time-to-market are increasing

• Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have been historically used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result of this is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio. The
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities and speed
increased, they began to take over larger and larger functions, to the point where
some are now marketed as full systems on chips (SoC). Particularly with the
introduction of dedicated multipliers into FPGA architectures in the late 1990s,
applications which had traditionally been the sole reserve of DSPs began to
incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is code
breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and the
extremely long turn-around times of current design tools, where a wait of 4-8
hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and
performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and I/Os
can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or the
width of one column, as shown below. All the routing channels have the same
width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high performance parts, claiming increased
performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered
LUT output. The logic block has four inputs for the LUT and a clock input.
Since clock signals (and often other high-fanout signals) are normally routed
via special-purpose dedicated routing networks in commercial FPGAs, they
and other signals are separately managed.
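The logic block described above can be sketched as a small behavioural model. This is only an illustration in Python, not vendor code: the class name LogicBlock, the 16-bit truth_table encoding (bit i holds the output for input value i) and the registered flag are assumptions introduced for the example.

```python
class LogicBlock:
    """Behavioural sketch of a 4-input LUT followed by one flip-flop.

    Hypothetical model: `truth_table` is a 16-bit mask whose bit i is the
    LUT output for the input combination with value i (a is the LSB).
    """

    def __init__(self, truth_table: int, registered: bool = False):
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("truth table must fit in 16 bits")
        self.truth_table = truth_table
        self.registered = registered  # selects registered or unregistered output
        self.ff = 0                   # flip-flop state, cleared initially

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Combinational LUT output for the four inputs."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.truth_table >> index) & 1

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """Block output between clock edges."""
        return self.ff if self.registered else self.lut(a, b, c, d)

    def clock(self, a: int, b: int, c: int, d: int) -> int:
        """Apply one rising clock edge, then return the block output."""
        self.ff = self.lut(a, b, c, d)
        return self.output(a, b, c, d)


# A 4-input AND gate: only input value 15 produces 1.
and4 = LogicBlock(0x8000)
print(and4.lut(1, 1, 1, 1), and4.lut(1, 1, 1, 0))
```

Loading a different 16-bit mask turns the same block into any other function of four inputs, which is why the LUT is the basic programmable element.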

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin
can connect to routing wires in both the channel to the right and the channel
below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In
this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in adjacent
channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number one connects only to wires in track
number one in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

Switch box topology
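The planar (domain-based) rule can be expressed in a few lines. The sketch below is illustrative only: the side names and the function planar_switch_targets are hypothetical, and tracks are numbered from 0 here rather than from 1.

```python
SIDES = ("north", "east", "south", "west")  # hypothetical side names


def planar_switch_targets(side: str, track: int, channel_width: int):
    """Return the (side, track) wires reachable from a wire entering the box.

    In the planar topology a wire may only connect to the wire with the
    same track number on each of the three other sides.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 0 <= track < channel_width:
        raise ValueError("track number outside the channel width")
    return [(other, track) for other in SIDES if other != side]


# A wire on track 2 entering from the west reaches track 2 on the other sides.
print(planar_switch_targets("west", 2, channel_width=4))
```

Because the track number never changes, a signal that starts on one track stays in that routing domain for its whole path, which keeps the switch box small at the cost of some routing flexibility.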

Modern FPGA families expand upon the above capabilities to include higher
level functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high speed I/O
logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows chip
companies to validate their design before the chip is produced in the factory,
reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it's possible to just specify them
numerically rather than having to draw every piece by hand. On the other hand,
schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction level
through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar license) and
other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate level
description, where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.

Basic Process Technology Types

• SRAM - based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

• Antifuse - One-time programmable. CMOS.

• EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

• EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

• Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

• Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verification of the hardware design together with early software and
firmware co-design has become mainstream practice. Prototyping SoC and ASIC
designs on an FPGA has become a good method to do this.

Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is
functionally correct. This is compared to designers relying only on software
simulations to verify that their hardware design is sound. Simulation speed and
modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from
the foundry to start on software tests. FPGA prototyping allows for much more
time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear due
to today's array of operating systems, applications and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the
system work well together, off the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients,
bringing in interest early. This speeds up the overall development cycle and
allows for more enhancement or improvement of the chip features than would
otherwise have been possible.

13 SIMULATION RESULTS

Fig. Gate level: VGA controller

Fig. Gate level: sync unit

Fig. Technology schematic

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the VGA Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in order
to correctly display the picture on a connected monitor.



9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external
picture memory, reading in data a byte at a time and formatting each pixel to
be displayed. For each pixel, the 6-bit RGB value required for the control of
the monitor's three primary color electron guns is generated, so that the pixel
is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written, by the host
microcontroller, into external memory space (RAM). This memory space can
be located anywhere (e.g. using a block of RAM within the FPGA design, or
using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a
byte depends on which color palette is being used for the image (selected by
the CMOD input):

• in Black & White mode: one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode: one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode: one byte of data in RAM contains 1 6-bit pixel
(DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total
number of viewable pixels in a line (determined by DISPSIZE_H), the total
number of viewable lines in a frame (determined by DISPSIZE_V) and the
number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / (number of pixels per byte)
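As a worked example of the memory formula, the short Python sketch below evaluates it for the three color modes. The 640 x 480 display size and the mode keys "bw", "16" and "64" are assumptions chosen for illustration; the text itself does not fix a display size.

```python
# Pixels packed into one byte for each color mode (from the list above).
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_memory_bytes(pix_total: int, lin_total: int, mode: str) -> int:
    """Bytes needed to store one picture of pix_total x lin_total pixels."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    # Round up so a partially filled last byte is still counted.
    return -(-pixels // per_byte)


for mode in ("bw", "16", "64"):
    print(mode, picture_memory_bytes(640, 480, mode))
```

For a 640 x 480 picture this gives 38,400 bytes in Black & White mode, 153,600 bytes in 16 Colors mode and 307,200 bytes in 64 Colors mode.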

The address in RAM where the next pixel is stored is determined using
an internal signal provided by the Synchronization Unit, hvcnt, which
reflects the current contents of the MEMADDR register. The exact addressing
is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels)
is determined by using bits 18..3 of hvcnt, i.e. right shifting the contents by
three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels)
is determined by using bits 18..1 of hvcnt, i.e. right shifting the contents by
one:

ADDR_PIXEL = '0' & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is
determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
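The three addressing rules amount to a right shift of the pixel counter by 3, 1 or 0 bits. The sketch below illustrates this under the assumption that hvcnt is a 19-bit pixel counter (bits 18..0); the function name pixel_address and the mode keys are hypothetical.

```python
# Right shift applied to hvcnt in each color mode.
ADDRESS_SHIFT = {"bw": 3, "16": 1, "64": 0}


def pixel_address(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory that holds pixel number hvcnt."""
    if not 0 <= hvcnt < (1 << 19):
        raise ValueError("hvcnt is assumed to be a 19-bit value")
    return hvcnt >> ADDRESS_SHIFT[mode]


# Pixel 17 lives in byte 2 (B&W), byte 8 (16 Colors) or byte 17 (64 Colors).
print([pixel_address(17, m) for m in ("bw", "16", "64")])
```

The bits dropped by the shift are exactly the bits that select a pixel within the byte, which is why the Pixel register reloads only when they are all zero.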

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the
current address in memory space. The register PIXREG is reset to zero when
the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be
read from the memory space as long as the RD signal is active (High). The RD
signal is itself controlled by the external line display enable signal, enex. This
internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or
   ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
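The enex condition can be checked numerically with a direct transcription into Python. The timing values used in the example call (640 visible pixels, 480 visible lines, a line period of 800 and a frame period of 525) are assumed, typical 640 x 480 figures and are not taken from the text.

```python
def enex(hcount: int, vcount: int, pix_total: int, lin_total: int,
         line_period: int, frame_period: int) -> int:
    """Return 1 while memory reads are enabled, 0 during blanking."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


# Assumed timing: 640 x 480 visible, 800 clocks per line, 525 lines per frame.
timing = dict(pix_total=640, lin_total=480, line_period=800, frame_period=525)
print(enex(100, 100, **timing), enex(700, 100, **timing), enex(100, 500, **timing))
```

With these values, reads are enabled inside the visible area and disabled once either counter moves past the visible region.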

The point at which data is loaded from memory into PIXREG depends on the
particular color palette that is chosen: Black & White, 16 Colors, or 64
Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the
lowest three bits of the hvcnt signal, received from the Synchronization Unit,
are "000".

For the currently loaded byte, the active pixel is always in the lowest bit
position of the Pixel register. Each pixel in the data byte is moved into this
active pixel position by shifting the contents of the register right by one bit, on
each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the
lowest bit of the hvcnt signal, received from the Synchronization Unit, is a
0.

For the currently loaded byte, the active pixel is always in the low order
nibble of the Pixel register. Remember that in this mode, each byte of data
contains two pixels. The second pixel is moved into this active pixel position
by shifting the contents of the register right by four bits on the rising edge of
CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge
of the external system clock signal (CLK). In this mode, the read of pixel data
does not depend on the status of the hvcnt signal received from the
Synchronization Unit.
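The shifting behaviour of the Pixel register in the three modes can be illustrated with a small software model. This is only a sketch: the function unpack_pixels and the mode keys are hypothetical, and it returns the pixels of one loaded byte in the order in which they would reach the active (lowest) position.

```python
# (bits per pixel, pixels per byte) for each color mode.
PIXEL_FORMAT = {"bw": (1, 8), "16": (4, 2), "64": (6, 1)}


def unpack_pixels(byte: int, mode: str) -> list:
    """Pixels of one PIXREG byte, in display order (lowest bits first)."""
    width, count = PIXEL_FORMAT[mode]
    reg = byte & 0xFF                # contents loaded into PIXREG
    mask = (1 << width) - 1          # selects the active pixel position
    pixels = []
    for _ in range(count):
        pixels.append(reg & mask)    # active pixel sits in the low bits
        reg >>= width                # shift the next pixel into place
    return pixels


print(unpack_pixels(0b10110010, "bw"))
print(unpack_pixels(0xA5, "16"))
print(unpack_pixels(0xFF, "64"))
```

In 64 Colors mode the mask also discards DATA[7..6], matching the note that those two bits are not used.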

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for
driving the red, green and blue color guns of the target monitor. When the
chosen color palette is either Black & White or 16 Colors, these six bits are
obtained by mapping the value of the active pixel to a predefined RGB code.
When the chosen color palette is 64 Colors, the actual pixel value is used
directly.

The register RGBREG is reset to zero (000000) when the VGA
Controller receives an external reset signal (RST). This RGB code represents
black.

The register is updated on each rising edge of the CLK signal, and the
value loaded is dependent on the state of the line display enable signal, en.
When en is Low, blanking is required and RGBREG is loaded with the code
for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)
and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA
Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0
and B1).

The monitor itself expects analog signals as inputs to its electron gun control
circuits. This is achieved by using 2-bit digital to analog converter circuitry,
located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be
converted into 4 distinct analog levels. These levels specify the intensity of
each of the three primary colors to use when displaying the pixel on the
monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum
brightness).

With each analog input being one of four possible levels, the monitor
can display each pixel on the screen with one of 64 different color
permutations (4 x 4 x 4).
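The arithmetic behind the 64 colors can be sketched as follows. Two assumptions are made for illustration only: the four analog levels are taken to be evenly spaced between 0 V and 0.7 V (the real levels depend on the converter circuit), and the 6-bit code is assumed to be ordered R1 R0 G1 G0 B1 B0 from the most significant bit down.

```python
def dac_level(code: int, full_scale: float = 0.7) -> float:
    """Analog level in volts for a 2-bit code, assuming even spacing."""
    if code not in range(4):
        raise ValueError("code must be a 2-bit value (0..3)")
    return full_scale * code / 3


def split_rgb(rgb6: int) -> tuple:
    """Split a 6-bit RGB code into 2-bit (r, g, b); bit order is assumed."""
    return (rgb6 >> 4) & 0b11, (rgb6 >> 2) & 0b11, rgb6 & 0b11


# Every 6-bit code maps to a distinct (r, g, b) triple: 4 * 4 * 4 = 64 colors.
colors = {split_rgb(value) for value in range(64)}
print(len(colors), [round(dac_level(c), 3) for c in range(4)])
```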

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large, on-chip memories, Spartan™-3
Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks.
Using various configuration options, SelectRAM blocks create RAM, ROM,
FIFOs, large look-up tables, data width converters, circular buffers, and shift
registers, each supporting various data widths and depths. This application note
describes the features and capabilities of block SelectRAM and illustrates how
to specify the various options using the Xilinx CORE Generator™ system or
via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648
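The totals in Table 2 follow from the number of block RAMs and the 18,432 bits held by each block. The short check below recomputes them in Python; the dictionary layout and the function name total_ram_bits are illustrative only.

```python
BITS_PER_BLOCK = 18432  # bits in one Spartan-3E block RAM, parity included

# device: (RAM columns, RAM blocks per column), as listed in Table 2
DEVICES = {
    "XC3S100E": (1, 4),
    "XC3S250E": (2, 6),
    "XC3S500E": (2, 10),
    "XC3S1200E": (2, 14),
    "XC3S1600E": (2, 18),
}


def total_ram_bits(device: str) -> int:
    """Total block RAM bits = columns x blocks per column x bits per block."""
    columns, blocks_per_column = DEVICES[device]
    return columns * blocks_per_column * BITS_PER_BLOCK


for name in DEVICES:
    bits = total_ram_bits(name)
    print(name, bits, bits // 1024)  # total bits and total Kbits
```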

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of
which is allocated to data storage and, in some memory configurations, an
additional 2K bits allocated to parity or additional "plus" data bits. Physically,
the block RAM memory has two completely independent access ports, labeled
Port A and Port B. The structure is fully symmetrical, and both ports are
interchangeable and both ports support data read and write operations. Each
memory port is synchronous, with its own clock, clock enable, and write
enable. Read operations are also synchronous and require a clock edge and
clock enable. Though physically a dual-port memory, block RAM simulates
single-port memory in an application, as shown in Figure 1. Furthermore, each
block memory supports multiple configurations or aspect ratios. Table 3
summarizes the essential SelectRAM features. Cascade multiple block RAMs
to create deeper and wider memory organizations, with a minimal timing
penalty incurred through specialized routing resources.

10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed
IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that
involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was
presented with a proposal to standardize VHDL.

In 1987: standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE
standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and
high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using
their own, non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE. IEEE 1164 contains
definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate
to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a
library.

Entities

A black box with interface definition

Defines the inputs/outputs of a component (define pins)

A way to represent modularity in VHDL

Similar to symbol in schematic

Entity declaration describes entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its
environment.

Each port must have a name, direction and a type.

An entity may have NO port declaration.

Port directions

In: A value of a port can be read inside the component, but cannot be assigned.
Multiple reads of port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read.
Multiple assignments are allowed.

In out: Bi-directional; assignments can be made and data can be read. Multiple
assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment
(not recommended).

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using:
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

Configuration declaration links architecture to entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Links entity declaration and architecture body together.

Concept of default configuration is a bit messy in VHDL '87:
- Last architecture analyzed links to entity

Can be used to change simulation behavior without re-analyzing the VHDL
source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate level architecture and a behavioral
architecture.

Are always optional.

Packages

Packages contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among
several design units.

Consists of a declaration part and an optional body part.

Package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

Package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

Collection of VHDL design units (database):

1. Packages
- package declaration
- package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in a UNIX file system.

Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ
primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways:

Structural -------
Dataflow   ------- Higher level of abstraction
Behavioral -------

Structural VHDL description

Circuit is described in terms of its components.

From a low-level description (e.g. transistor-level description) to a high-level
description (e.g. block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system.

In the dataflow style you describe how information flows between registers in
the system.

The combinational logic is described at a relatively high level; the placement
and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:
- either a lower level description
- or a behavioral description of sequential elements is needed

The lower level register descriptions must be created or obtained.

If there are no 3rd party models for registers => you must write the behavioral
description of registers.

The behavioral description can be provided in the form of
subprograms (functions or procedures).

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time.

Representation might include, e.g., state diagrams, timing diagrams and
algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B
after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed
timing specifications.

The actual timing results depend on the implementation technology and the
efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL

A VHDL description can always be broken up into interconnected processes

Quite similar to a Unix process

Process keyword in VHDL

A process statement is a concurrent statement

Statements inside process statements are sequential statements

A process must contain either a sensitivity list or wait statement(s), but NOT both

The sensitivity list or wait statement(s) contain the signals which wake the process up

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;

11 VGA CONTROLLER CODE

library IEEE

use IEEESTD_LOGIC_1164ALL

use IEEESTD_LOGIC_ARITHALL

use IEEESTD_LOGIC_UNSIGNEDALL

use IEEEnumeric_stdALL

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

-- picture memory: 18-bit address, 1 bit per pixel
component memory
    port (
        addr  : IN  std_logic_VECTOR(17 downto 0);
        clk   : IN  std_logic;
        dout  : OUT std_logic_VECTOR(0 downto 0);
        en    : IN  std_logic;
        sinit : IN  std_logic);
END component;

component sync_unit port(
    clk        : in    std_logic;
    rstn       : in    std_logic;
    hsync      : out   std_logic;   ---horizontal synch pulse(row)
    vsync      : out   std_logic;   --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out   std_logic;
    h_en       : out   std_logic;
    v_en       : out   std_logic;
    line_count : inout integer;
    c_hs       : inout integer);
end component;

component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- divide clk50m by two to obtain the clock used by the other blocks
process(clk50m, rstn)
begin
    if(rstn = '0')then
        clk <= '0';
    elsif(clk50m = '1' and clk50m'event)then
        clk <= not clk;
    end if;
    --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
);

pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;   ---horizontal synch pulse(row)
    vsync      : out std_logic;   --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs <= c_hs_s;
line_count <= line_count_s;
h_en <= h_en_s;
v_en <= v_en_s;

-- hsync counter: counts clocks within one line, 0 to 793
process(clk, rstn)
begin
    if(rstn = '0')then
        c_hs_s <= 0;
    else
        if(clk = '1' and clk'event)then
            if(c_hs_s=793)then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s +1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter: advances once per line, 0 to 515
process(h_en_s, rstn)
begin
    if(rstn = '0')then
        line_count_s <= 0;
    elsif(h_en_s = '1' and h_en_s'event)then
        line_count_s <= line_count_s + 1;
        if (line_count_s=515)then
            line_count_s <= 0;
        end if;
    end if;
    --end if;
end process;

--hsync pulse
process(clk, rstn)
begin
    if(rstn = '0')then
        hsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(c_hs_s <=95)then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) ----Horizontal Enable (Row Information Will be Sent in this period)
begin
    if(rstn = '0')then
        h_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((c_hs_s >=143) and (c_hs_s<=778))then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
    if(rstn = '0')then
        v_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((line_count_s>=35 and line_count_s<=515))then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) --Vertical Sync, Front porch, Back Porch
begin
    if(rstn = '0')then
        vsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(line_count_s <2)then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

-- one-clock delayed copy of v_en, used by pixelgen to detect the start of a frame
process(clk)
begin
    if(clk = '1' and clk'event)then
        v_en_reg<=v_en_s;
    end if;
end process;

end arch_sync_unit;
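The counter limits in the synchronization block determine the scan rates seen by the monitor. As a rough check, the Python sketch below derives them from the constants that appear in the listing. The 25 MHz pixel clock is an assumption (it presumes clk50m is a 50 MHz board clock, halved by the divider in vga_controller), and the function names are illustrative only.

```python
# Timing implied by the sync_unit counter limits (illustrative sketch).
PIXEL_CLOCK_HZ = 25_000_000   # assumption: 50 MHz clk50m divided by two

H_TOTAL_CLOCKS = 793 + 1      # c_hs_s counts 0..793
V_TOTAL_LINES = 515 + 1       # line_count_s counts 0..515
HSYNC_LOW_CLOCKS = 95 + 1     # hsync is driven low while c_hs_s <= 95
VSYNC_LOW_LINES = 2           # vsync is driven low while line_count_s < 2


def line_rate_hz() -> float:
    """Horizontal scan rate: one line every H_TOTAL_CLOCKS pixel clocks."""
    return PIXEL_CLOCK_HZ / H_TOTAL_CLOCKS


def frame_rate_hz() -> float:
    """Vertical refresh rate: one frame every V_TOTAL_LINES lines."""
    return line_rate_hz() / V_TOTAL_LINES


def hsync_width_us() -> float:
    """Width of the horizontal sync pulse in microseconds."""
    return HSYNC_LOW_CLOCKS / PIXEL_CLOCK_HZ * 1e6


if __name__ == "__main__":
    print(f"line rate  : {line_rate_hz():.1f} Hz")
    print(f"frame rate : {frame_rate_hz():.2f} Hz")
    print(f"hsync width: {hsync_width_us():.2f} us")
```

Under these assumptions the line rate comes out near 31.5 kHz and the frame rate near 61 Hz, close to the nominal 640 x 480 at 60 Hz timing, which uses 800 clocks per line and 525 lines per frame at 25.175 MHz.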


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
    if(rstn = '0')then
        rd <= '0';
        red <= "0";
        green <= "0";
        blue <= "0";
        addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event)then
        -- rising edge of v_en: start of a new frame, restart at address 0
        if((v_en='1') and (v_en_reg='0'))then
            addr_s<=(others =>'0');
        end if;
        if(v_en='1' and h_en='1')then
            if(line_count<=514)then
                if(c_hs <= 655)then
                    -- picture area: read the next 1-bit pixel and drive all three colors with it
                    rd <= '1';
                    red <= data_in;
                    green <= data_in;
                    blue <= data_in;
                    addr_s <=addr_s +"000000000000000001";
                else
                    -- right of the picture: green border
                    rd <= '0';
                    red <= "0";
                    green <= "1";
                    blue <= "0";
                end if;
            else
                -- below the picture: white
                rd <= '0';
                red <= "1";
                green <= "1";
                blue <= "1";
            end if;
            -- wrap after the last picture address
            if(addr_s="111011111111111111")then
                addr_s<= (others =>'0');
            end if;
        else
            -- blanking: black
            red <= "0";
            green <= "0";
            blue <= "0";
        end if;
    end if;
    -- end if;
    --end if;
end process;

end behv;
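The constants in the pixel generator hint at the size of the stored picture. The Python sketch below works this out from the limits in the listings. It assumes that h_en lags the horizontal counter by one clock because it is registered in the synchronization block; the constant and function names are illustrative, not part of the design.

```python
# Picture size implied by the pixelgen and sync_unit constants (illustrative sketch).
WRAP_ADDRESS = int("111011111111111111", 2)  # addr_s value at which the address wraps to zero
ADDRESS_BITS = 18                            # addr is std_logic_vector(17 downto 0)

H_EN_FIRST = 143   # sync_unit raises h_en when c_hs >= 143
READ_LAST = 655    # pixelgen reads memory while c_hs <= 655
V_FIRST = 35       # sync_unit raises v_en when line_count >= 35
V_LAST = 514       # pixelgen shows picture data while line_count <= 514


def pixels_per_line(register_delay: int = 1) -> int:
    """Clocks per line on which a pixel is read (h_en assumed one clock late)."""
    first = H_EN_FIRST + register_delay
    return READ_LAST - first + 1


def visible_lines() -> int:
    """Lines per frame on which picture data is shown."""
    return V_LAST - V_FIRST + 1


if __name__ == "__main__":
    print(pixels_per_line(), "x", visible_lines(), "pixels")
    print("last picture address:", WRAP_ADDRESS)
```

With these assumptions the read window is 512 pixels by 480 lines, that is 245,760 one-bit pixels, which matches the wrap address of 245,759 and fits within the 18-bit address space.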


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History: The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985, the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

Many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The example FPGA described here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

Figure 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
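The behaviour of such a logic block can be modelled in a few lines. The Python sketch below is only a behavioural illustration under simple assumptions: the LUT is a 16-entry truth table indexed by the four inputs, and a flag selects the registered or unregistered output. The class and method names are hypothetical and do not correspond to any vendor primitive.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by a D flip-flop."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 entries")
        self.truth_table = list(truth_table)
        self.registered = registered  # True: output comes from the flip-flop
        self.q = 0                    # flip-flop state

    def lut(self, a, b, c, d):
        """Combinational LUT output: the inputs form the table index."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Rising clock edge: the flip-flop captures the LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Single block output, registered or unregistered."""
        return self.q if self.registered else self.lut(a, b, c, d)


# Configure the LUT as a 4-input XOR (odd parity of the index bits).
XOR4 = [bin(i).count("1") & 1 for i in range(16)]

if __name__ == "__main__":
    block = LogicBlock(XOR4)
    print(block.output(1, 0, 0, 0))
```

Changing the 16 table entries changes the logic function without changing the structure, which is the sense in which the block is programmable.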

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.
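The planar rule can be stated compactly: a wire keeps its track number when it crosses a switch box. The Python sketch below lists the three wires reachable from a given wire under that rule. The side names and the function name are illustrative assumptions; tracks are numbered from 1 as in the text.

```python
SIDES = ("north", "east", "south", "west")


def planar_connections(side, track, channel_width):
    """Wires reachable from (side, track) in a planar switch box.

    Each wire has three programmable switches, one to the same-numbered
    track on each of the other three sides of the box.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= channel_width:
        raise ValueError("track number outside the channel")
    return [(other, track) for other in SIDES if other != side]


if __name__ == "__main__":
    print(planar_connections("west", 2, channel_width=4))
```

Because the track number never changes, a signal that enters the routing on track 2 stays on track 2 along its whole path, which simplifies the switch box at the cost of some routing flexibility.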

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs alongside early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why prototyping is important

1 Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2 Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows for much more time for software development and testing at the software-hardware integration stage. This helps expose many unforeseen software bugs that appear due to today's array of operating systems, applications, and hardware.

3 Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4 Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL

Fig: VGA CONTROLLER

GATE LEVEL

Fig: SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the VGA Controller takes a processor-generated picture (pixelated) from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
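As a worked example of this formula, the Python sketch below evaluates it for a 640 x 480 picture in each color mode. The resolution is an assumed example, and the pixels-per-byte values (8, 2 and 1) follow the mode descriptions in this section; the function and dictionary names are illustrative.

```python
# Pixels packed into one byte for each color mode.
PIXELS_PER_BYTE = {"bw": 8, "16_colors": 2, "64_colors": 1}


def picture_memory_bytes(pix_total, lin_total, pixels_per_byte):
    """Bytes needed for one picture, rounded up to a whole byte."""
    total_pixels = pix_total * lin_total
    return -(-total_pixels // pixels_per_byte)  # ceiling division


if __name__ == "__main__":
    for mode, per_byte in PIXELS_PER_BYTE.items():
        size = picture_memory_bytes(640, 480, per_byte)
        print(f"{mode:10s}: {size} bytes")
```

For the assumed resolution this gives 38,400 bytes in Black & White mode, 153,600 bytes in 16 Colors mode and 307,200 bytes in 64 Colors mode.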

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt and right shifting the contents by three.

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt and right shifting the contents by one.

ADDR_PIXEL = '0' & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt.

ADDR_PIXEL = hvcnt[18..0]
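The three addressing rules amount to a right shift of the pixel counter by three, one or zero bits. The Python sketch below illustrates this, assuming hvcnt is a 19-bit pixel index (bits 18 down to 0); the function name and mode keys are hypothetical.

```python
HVCNT_BITS = 19  # hvcnt is indexed from bit 18 down to bit 0

# Right shift applied to hvcnt in each color mode.
ADDRESS_SHIFT = {"bw": 3, "16_colors": 1, "64_colors": 0}


def pixel_address(hvcnt, mode):
    """Byte address of the pixel data for the current value of hvcnt."""
    if not 0 <= hvcnt < (1 << HVCNT_BITS):
        raise ValueError("hvcnt does not fit in 19 bits")
    return hvcnt >> ADDRESS_SHIFT[mode]


if __name__ == "__main__":
    for mode in ADDRESS_SHIFT:
        print(mode, [pixel_address(n, mode) for n in range(10)])
```

In Black & White mode eight consecutive pixel counts map to the same byte, in 16 Colors mode two do, and in 64 Colors mode every pixel has its own byte.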

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal+1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal+1)) and (VCOUNT < Frame Period)) then enex = 0

Else

enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
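The enable condition can be written as a small function. The Python sketch below follows the condition as stated; the numeric values used in the example (640 pixels, 800 clocks per line, 480 lines, 525 lines per frame) are assumed figures for a 640 x 480 display and are not taken from this document.

```python
def enex(hcount, vcount, pix_total, line_period, lin_total, frame_period):
    """External line display enable: 0 during blanking, 1 otherwise."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


def rd(hcount, vcount, pix_total, line_period, lin_total, frame_period):
    """Memory read strobe: reads are disabled whenever enex is low."""
    return enex(hcount, vcount, pix_total, line_period, lin_total, frame_period)


if __name__ == "__main__":
    timing = dict(pix_total=640, line_period=800, lin_total=480, frame_period=525)
    print(enex(100, 100, **timing))  # inside the picture
    print(enex(700, 100, **timing))  # horizontal blanking
    print(enex(100, 500, **timing))  # vertical blanking
```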

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors, or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a "0".

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
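The shifting behaviour in the two packed modes can be sketched as follows. This Python model only illustrates the order in which pixels leave the register, assuming one shift per clock; the function names are hypothetical.

```python
def bw_pixels(byte):
    """Black & White mode: eight 1-bit pixels, lowest bit first."""
    reg = byte & 0xFF
    pixels = []
    for _ in range(8):
        pixels.append(reg & 0x1)  # active pixel is the lowest bit
        reg >>= 1                 # shift right by one bit per clock
    return pixels


def nibble_pixels(byte):
    """16 Colors mode: two 4-bit pixels, low-order nibble first."""
    reg = byte & 0xFF
    first = reg & 0xF             # active pixel is the low-order nibble
    reg >>= 4                     # shift right by four bits
    return [first, reg & 0xF]


if __name__ == "__main__":
    print(bw_pixels(0b10110001))
    print([hex(p) for p in nibble_pixels(0xA3)])
```

In 64 Colors mode no shifting is needed, since each byte read from memory carries exactly one pixel.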

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green, and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0, and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the Nano Board itself, as shown in Figure 9.1.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
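The count of 64 colors follows from three channels with four levels each. The Python sketch below works through the arithmetic. The even spacing of the analog levels between 0 V and 0.7 V is an assumption made for illustration; the actual levels depend on the resistor network on the board.

```python
MAX_LEVEL_VOLTS = 0.7   # maximum brightness
BITS_PER_CHANNEL = 2    # R1..R0, G1..G0, B1..B0


def channel_levels():
    """Analog level for each 2-bit code, assuming evenly spaced steps."""
    steps = (1 << BITS_PER_CHANNEL) - 1
    return [MAX_LEVEL_VOLTS * code / steps for code in range(steps + 1)]


def color_count():
    """Number of distinct colors: levels per channel to the third power."""
    return (1 << BITS_PER_CHANNEL) ** 3


if __name__ == "__main__":
    print([round(v, 3) for v in channel_levels()])
    print(color_count(), "colors")
```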

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
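These figures make it possible to estimate how many block RAMs a picture memory needs. The Python sketch below does so for a memory with an 18-bit address and a 1-bit data word, as declared for the memory component in the controller listing. It assumes that only the 16K data bits of each block are used and that the memory is built entirely from block RAM; the block counts per device are those of Table 2, and the function names are illustrative.

```python
DATA_BITS_PER_BLOCK = 16 * 1024  # data bits per block RAM, parity bits ignored

# Total RAM blocks per Spartan-3E device, smallest device first (Table 2).
TOTAL_BLOCKS = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def blocks_needed(depth, width=1):
    """Block RAMs needed for a depth x width memory (simple bit count)."""
    bits = depth * width
    return -(-bits // DATA_BITS_PER_BLOCK)  # ceiling division


def smallest_device(blocks):
    """Smallest listed device with at least the requested number of blocks."""
    for device, available in TOTAL_BLOCKS.items():
        if available >= blocks:
            return device
    return None


if __name__ == "__main__":
    need = blocks_needed(2 ** 18)
    print(need, "blocks ->", smallest_device(need))
```

Under these assumptions a full 2^18 x 1 memory takes 16 blocks, so among the devices in Table 2 the XC3S500E is the smallest that could hold it.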


10HARDWARE DESCRIPTIVE LANGUAGE (VHDL)

Why (V) HDL

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987 the language was standardized as IEEE 1076-1987.

An improved version of the language was released in 1994 as IEEE Standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types, so VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE. IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

It defines, e.g., two numeric types: signed and unsigned.
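As a small illustration of the two numeric types, the sketch below adds the same pair of 8-bit vectors twice, once interpreted as unsigned and once as signed (two's complement). The entity name add_demo and its ports are invented for this example; only the numeric_std package and its resize function are standard.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical demo entity: same bits, two numeric interpretations.
entity add_demo is
  port (
    a, b  : in  std_logic_vector(7 downto 0);
    sum_u : out std_logic_vector(8 downto 0);  -- operands read as unsigned
    sum_s : out std_logic_vector(8 downto 0)   -- operands read as two's complement
  );
end entity add_demo;

architecture rtl of add_demo is
begin
  -- resize zero-extends unsigned values and sign-extends signed values,
  -- so the 9-bit results cannot overflow.
  sum_u <= std_logic_vector(resize(unsigned(a), 9) + resize(unsigned(b), 9));
  sum_s <= std_logic_vector(resize(signed(a), 9) + resize(signed(b), 9));
end architecture rtl;
```

A plain std_logic_vector carries no numeric meaning by itself; the explicit conversion to signed or unsigned is what tells the tools how to extend and add the bits.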

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition
Defines the inputs/outputs of a component (defines pins)
A way to represent modularity in VHDL
Similar to a symbol in a schematic
An entity declaration describes the entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment
Each port must have a name, a direction and a type
An entity may have no port declaration

Port directions

In: The value of a port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to the port, but data from the port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture
One entity can have several architectures
Architectures can describe a design using:
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
A configuration declaration links an architecture to an entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link an entity declaration and an architecture body together
The concept of a default configuration is a bit messy in VHDL '87:
- The last architecture analyzed links to the entity
Can be used to change simulation behavior without re-analyzing the VHDL source
Complex configuration declarations are ignored in synthesis
Some entities can have, e.g., a gate-level architecture and a behavioral architecture
Are always optional
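To make the role of a configuration concrete, the following sketch gives the Comparator entity a second architecture and then selects it with a configuration declaration. The architecture Comparator2 and the configuration name Comparator_cfg are hypothetical additions for illustration only.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

-- Dataflow architecture (as in the example above).
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

-- Hypothetical second architecture: same function, written as a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture Comparator2;

-- The configuration picks which architecture is bound to the entity.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end configuration Comparator_cfg;
```

Simulating Comparator_cfg instead of the bare entity swaps the architecture without touching either architecture body, which is the use case mentioned in the list above.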

Packages

Packages contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
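A minimal sketch of the declaration/body split is shown below. The package name vga_types, its constants and the helper function to_pixel are all invented for this example and do not appear in the project code.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical package: the declaration is the public interface.
package vga_types is
  constant H_VISIBLE : natural := 640;
  constant V_VISIBLE : natural := 480;

  subtype pixel_t is std_logic_vector(2 downto 0);  -- one bit each for R, G, B

  -- Only the function header is declared here.
  function to_pixel (mono : std_logic) return pixel_t;
end package vga_types;

-- The body holds the implementation of what the declaration promised.
package body vga_types is
  function to_pixel (mono : std_logic) return pixel_t is
  begin
    -- Copy the single monochrome bit onto all three colour bits.
    return pixel_t'(others => mono);
  end function to_pixel;
end package body vga_types;
```

A design unit would make these names visible with "use work.vga_types.all;". A package that contains only constants and types needs no body at all, which is why the body is listed as optional.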

Libraries

Collection of VHDL design units (database):

1. Packages
   - package declaration
   - package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in a UNIX file system
Can also be any other kind of database

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

Structural
Dataflow
Behavioral

Structural VHDL description

Circuit is described in terms of its components
From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)
For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system
In the dataflow style you describe how information flows between registers in the system
The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely
The behavior of the system over time is defined by registers
There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed
The lower-level register descriptions must be created or obtained
If there are no third-party models for registers, you must write the behavioral description of the registers
The behavioral description can be provided in the form of subprograms (functions or procedures)
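Since the language has no built-in register, a register is usually described behaviorally with a clocked process. The following is one common way to write it, assuming an active-low asynchronous reset like the rstn signal used later in this report; the entity name reg8 and the 8-bit width are arbitrary choices for the example.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical 8-bit register with enable, described behaviorally.
entity reg8 is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;                     -- active-low asynchronous reset
    en   : in  std_logic;                     -- load enable
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end entity reg8;

architecture behavioral of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');
    elsif rising_edge(clk) then
      -- q keeps its old value when en = '0'; that implied memory
      -- is what makes this a register.
      if en = '1' then
        q <= d;
      end if;
    end if;
  end process;
end architecture behavioral;
```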

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time
Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions
The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)
If no actual delay is used, only the order of sequential operations is defined
In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool
There are a few tools for behavioral synthesis
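The delayed assignment mentioned above can be tried in a simulator with a few lines. The sketch below is simulation-only (the after clause would be ignored by synthesis); the entity name delay_demo and the chosen times are assumptions made for the example.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical simulation-only demo of "A <= B after 10 ns".
entity delay_demo is
end entity delay_demo;

architecture behavioral of delay_demo is
  signal B : std_logic := '0';
  signal A : std_logic;
begin
  -- A follows B ten nanoseconds later.
  A <= B after 10 ns;

  stimulus : process
  begin
    wait for 20 ns;
    B <= '1';          -- B changes at 20 ns, so A changes at 30 ns
    wait for 20 ns;    -- now at 40 ns
    assert A = '1' report "A did not follow B" severity error;
    wait;              -- suspend this process forever
  end process stimulus;
end architecture behavioral;
```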

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL
A VHDL description can always be broken up into interconnected processes
Quite similar to a Unix process
process is a keyword in VHDL
A process statement is a concurrent statement
Statements inside a process statement are sequential statements
A process must contain either a sensitivity list or wait statement(s), but not both
The sensitivity list or wait statement(s) contain the signals which wake the process up

General format

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
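The rule "sensitivity list or wait statement, but not both" can be illustrated with two processes that behave identically. The entity and_two_ways and its port names are made up for this sketch.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical entity with the same AND function written two ways.
entity and_two_ways is
  port (
    a, b   : in  std_logic;
    y1, y2 : out std_logic
  );
end entity and_two_ways;

architecture rtl of and_two_ways is
begin
  -- Form 1: a sensitivity list wakes the process when a or b changes.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list; an explicit wait at the end of the
  -- process suspends it until a or b changes.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process with_wait;
end architecture rtl;
```

Both processes run once at the start of simulation and then again on every event on a or b, so y1 and y2 always carry the same value.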

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0)
  );
end vga_controller;

architecture Behavioral of vga_controller is

  -- Image memory (1 bit per pixel, 18-bit address)
  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic
    );
  end component;

  -- Port modes match the sync_unit entity declaration
  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;
      c_hs       : out integer
    );
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic
    );
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the 50 MHz board clock by two to obtain the pixel clock
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer
  );
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks 0 to 793 within each line
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s, 0 to 515
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: low while the horizontal counter is 0 to 95
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse
  -- Vertical Enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical Sync, Front Porch, Back Porch: vsync is low for lines 0 and 1
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en delayed by one clock, used by pixelgen to detect the start of a frame
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
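The counter limits in the synchronization block fix the scan rates. As a rough check, the simulation-only sketch below computes them, assuming the 50 MHz input is halved to a 25 MHz pixel clock (as the top-level divider does), that c_hs counts 0 to 793 (794 clocks per line) and that line_count counts 0 to 515 (516 lines per frame). The entity name vga_timing_check and the constant names are invented for this example.

```vhdl
-- Hypothetical simulation-only check of the timing implied by the counters.
entity vga_timing_check is
end entity vga_timing_check;

architecture sim of vga_timing_check is
  constant PIXEL_CLK_HZ : real    := 25.0e6;  -- assumed: 50 MHz divided by two
  constant H_TOTAL      : natural := 794;     -- c_hs counts 0 to 793
  constant V_TOTAL      : natural := 516;     -- line_count counts 0 to 515

  constant LINE_RATE_HZ  : real := PIXEL_CLK_HZ / real(H_TOTAL);
  constant FRAME_RATE_HZ : real := LINE_RATE_HZ / real(V_TOTAL);
begin
  process
  begin
    report "line rate (Hz): "  & real'image(LINE_RATE_HZ);
    report "frame rate (Hz): " & real'image(FRAME_RATE_HZ);
    wait;  -- print once, then stop
  end process;
end architecture sim;
```

With these numbers the line rate comes out at roughly 31.5 kHz and the frame rate at roughly 61 Hz, close to the nominal 640x480 VGA rates.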

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process (clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: start of a new frame, restart the address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the image: read the next pixel from memory
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Right of the image: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the image: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap after the last pixel address (245759)
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be wired together, somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985, the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, which the company credits for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as the ability to have logic blocks implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

Fig 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
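The logic block just described (a 4-input LUT, a flip-flop, and a single output that is either registered or unregistered) can be modelled in a few lines of VHDL. This is a behavioural sketch for understanding only, not a vendor primitive: the entity name logic_block and the generics INIT and REGISTERED are assumptions standing in for the configuration bits of a real device.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical model of a classic logic block: LUT4 plus flip-flop.
entity logic_block is
  generic (
    -- 16-entry truth table; x"8000" sets only entry 15, i.e. a 4-input AND.
    INIT       : std_logic_vector(15 downto 0) := x"8000";
    -- Selects whether the flip-flop output or the raw LUT output is used.
    REGISTERED : boolean := false
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);
    o   : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16 x 1 memory addressed by the four inputs.
  lut_out <= INIT(to_integer(unsigned(i)));

  -- The flip-flop samples the LUT output on every rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output multiplexer: registered or unregistered LUT output.
  o <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

Changing INIT changes the logic function without changing the structure, which is the sense in which the block is programmable.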

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free and released under the GPL, BSD or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
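As an illustration of the first simulation stage, the sketch below shows a minimal RTL test bench. The design under test is a divide-by-two clock divider similar to the one in the VGA controller top level; the entity names div2 and div2_tb, the 50 MHz stimulus clock and the checks are assumptions made for this example.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical design under test: divide-by-two clock divider.
entity div2 is
  port (
    clk_in  : in  std_logic;
    rstn    : in  std_logic;   -- active-low asynchronous reset
    clk_out : out std_logic
  );
end entity div2;

architecture rtl of div2 is
  signal q : std_logic;
begin
  process (clk_in, rstn)
  begin
    if rstn = '0' then
      q <= '0';
    elsif rising_edge(clk_in) then
      q <= not q;
    end if;
  end process;
  clk_out <= q;
end architecture rtl;

library IEEE;
use IEEE.std_logic_1164.all;

-- Test bench: no ports, generates stimulus and checks the response.
entity div2_tb is
end entity div2_tb;

architecture sim of div2_tb is
  signal clk_in  : std_logic := '0';
  signal rstn    : std_logic := '0';
  signal clk_out : std_logic;
begin
  dut : entity work.div2
    port map (clk_in => clk_in, rstn => rstn, clk_out => clk_out);

  -- 50 MHz stimulus clock (20 ns period); it runs until the simulator is stopped.
  clk_in <= not clk_in after 10 ns;

  stimulus : process
  begin
    wait for 25 ns;
    rstn <= '1';                       -- release reset
    wait until rising_edge(clk_in);    -- first active edge after reset
    wait for 1 ns;
    assert clk_out = '1' report "clk_out should be high" severity error;
    wait until rising_edge(clk_in);    -- second active edge
    wait for 1 ns;
    assert clk_out = '0' report "clk_out should be low again" severity error;
    report "div2_tb finished";
    wait;
  end process stimulus;
end architecture sim;
```

The same test bench can in principle be re-run against the post-synthesis and post-layout netlists, which is the repeated simulation the paragraph above describes.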

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This exposes many unforeseen software bugs that appear due to today's array of operating systems, applications and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13. SIMULATION RESULTS

GATE LEVEL

Fig: VGA CONTROLLER

GATE LEVEL

Fig: SYNC UNIT

TECHNOLOGY SCHEMATIC

14. APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated picture (pixelated) is taken from memory space, and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
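The enex rule can be checked with a small behavioral model. This is only a sketch of the condition as stated: the function name is hypothetical, and the 640x480 numbers used in the usage lines (PixTotal = 640, Line Period = 800, LinTotal = 480, Frame Period = 525) are assumptions chosen for illustration, not values given in this report.

```python
def display_enable(hcount, vcount, pix_total, line_period, lin_total, frame_period):
    """Return enex (1 = memory reads allowed, 0 = blanking) per the stated rule."""
    in_h_blank = (hcount > pix_total + 1) and (hcount < line_period)
    in_v_blank = (vcount > lin_total + 1) and (vcount < frame_period)
    return 0 if (in_h_blank or in_v_blank) else 1


# Assumed 640x480 timing, for illustration only.
TIMING = dict(pix_total=640, line_period=800, lin_total=480, frame_period=525)

rd_visible = display_enable(100, 100, **TIMING)   # inside the visible area
rd_h_blank = display_enable(700, 100, **TIMING)   # horizontal blanking
rd_v_blank = display_enable(100, 500, **TIMING)   # vertical blanking
```

Because RD follows enex, the three sample calls correspond to a read being allowed, then blocked during horizontal blanking, then blocked during vertical blanking.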

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors, or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a "0".

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode, each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
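The three load-and-shift schemes can be compared side by side with a short model of the Pixel register. This is a sketch, not the controller's implementation: the function name and mode labels are hypothetical, and for 64 Colors mode it assumes the pixel occupies the low six bits of the byte, which the text does not state.

```python
def unpack_pixels(byte, mode):
    """Return the pixels held in one memory byte, in the order they are displayed."""
    if mode == "bw":      # 8 one-bit pixels, shift right by 1 per clock
        width, count = 1, 8
    elif mode == "16":    # 2 four-bit pixels, shift right by 4
        width, count = 4, 2
    elif mode == "64":    # 1 pixel per byte (low six bits assumed)
        width, count = 6, 1
    else:
        raise ValueError("unknown palette mode: %r" % mode)

    mask = (1 << width) - 1
    pixels = []
    reg = byte & 0xFF               # model of PIXREG right after a load
    for _ in range(count):
        pixels.append(reg & mask)   # active pixel sits in the lowest bits
        reg >>= width               # shift performed on a rising CLK edge
    return pixels
```

The number of pixels returned per byte (8, 2, 1) matches how often a new byte is loaded in each mode: every eighth clock, every second clock, and every clock.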

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green, and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0, and B1).
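The reset and blanking behavior of RGBREG can be summarized in a few lines. The sketch below models only what the text states (reset and en Low both give black); the function names are hypothetical, and the RRGGBB bit order used to split the code into 2-bit fields is an assumption, since the report does not give the bit assignment.

```python
BLACK = 0b000000

def next_rgbreg(rst, en, rgb_code):
    """Value held by RGBREG after a rising CLK edge (6-bit code)."""
    if rst:                  # external reset: register cleared to black
        return BLACK
    if not en:               # line display enable Low: blanking, load black
        return BLACK
    return rgb_code & 0x3F   # otherwise load the mapped (or direct) colour code

def split_rgb(code):
    """Split a 6-bit code into 2-bit (R, G, B) fields; RRGGBB order is assumed."""
    return (code >> 4) & 0x3, (code >> 2) & 0x3, code & 0x3
```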

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the Nano Board itself, as shown in Figure 9.1.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
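The arithmetic behind the four levels and the 64 colors is easy to verify. The sketch below assumes the four levels are evenly spaced between 0 V and 0.7 V; the actual voltages depend on the converter's resistor network, which is not described here, and the names are hypothetical.

```python
V_MAX = 0.7  # volts, maximum brightness per the text

def dac_level(bits2):
    """Analog level for a 2-bit colour value, assuming four evenly spaced steps."""
    if not 0 <= bits2 <= 3:
        raise ValueError("expected a 2-bit value")
    return V_MAX * bits2 / 3

levels = [round(dac_level(v), 3) for v in range(4)]
colour_count = 4 ** 3   # four levels on each of R, G and B
```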

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan-3 Generation FPGAs provide plentiful, efficient SelectRAM memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
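The totals in the device table follow directly from the per-block size. The sketch below recomputes them from the column and blocks-per-column figures, and adds a rough estimate of how many blocks a given memory needs. The helper names are hypothetical, and the estimate counts data bits only, ignoring parity bits and any aspect-ratio constraints.

```python
BITS_PER_BLOCK = 18432            # 16K data bits + 2K parity bits
DATA_BITS_PER_BLOCK = 16 * 1024

# (RAM columns, RAM blocks per column) from the device table
DEVICES = {
    "XC3S100E": (1, 4),
    "XC3S250E": (2, 6),
    "XC3S500E": (2, 10),
    "XC3S1200E": (2, 14),
    "XC3S1600E": (2, 18),
}

def block_ram_totals(device):
    """Return (total blocks, total bits, total Kbits) for a device."""
    columns, per_column = DEVICES[device]
    blocks = columns * per_column
    bits = blocks * BITS_PER_BLOCK
    return blocks, bits, bits // 1024

def blocks_needed(depth, width=1):
    """Blocks needed for a depth x width memory, data bits only (ceiling division)."""
    return -(-(depth * width) // DATA_BITS_PER_BLOCK)
```

By this estimate, a 1-bit-wide memory with an 18-bit address, like the one instantiated in the controller code later in this report, would occupy 16 blocks.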

10. HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.
Defines the inputs/outputs of a component (defines pins).
A way to represent modularity in VHDL.
Similar to a symbol in a schematic.
An entity declaration describes the entity.

E.g.:

entity Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.
Each port must have a name, a direction, and a type.
An entity may have NO port declaration.

Port directions

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture.
One entity can have several architectures.
Architectures can describe a design using behavior, structure, or dataflow.
Architectures can describe a design on many levels: gate level, RTL (Register Transfer Level), and behavioral level.
A configuration declaration links an architecture to an entity.

E.g.:

architecture Comparator1 of Comparator is
begin
    EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and architecture body together.
The concept of a default configuration is a bit messy in VHDL '87: the last architecture analyzed links to the entity.
Can be used to change simulation behavior without re-analyzing the VHDL source.
Complex configuration declarations are ignored in synthesis.
Some entities can have, e.g., a gate-level architecture and a behavioral architecture.
Are always optional.

Packages

Packages contain information common to many design units.

1. Package declaration
   - constant declarations
   - type and subtype declarations
   - function and procedure declarations
   - global signal declarations
   - file declarations
   - component declarations

2. Package body
   - is not necessarily needed
   - function bodies
   - procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain:
   - type and subtype declarations
   - subprograms
   - constants
   - alias declarations
   - global signal declarations
   - file declarations
   - component declarations

A package body consists of:
   - subprogram declarations and bodies
   - type and subtype declarations
   - deferred constants
   - file declarations

Libraries

Collection of VHDL design units (database).

1. Packages
   - package declaration
   - package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in a UNIX file system.
Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here in order of increasing level of abstraction:

Structural
Dataflow
Behavioral

Structural VHDL Description

The circuit is described in terms of its components.
From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).
For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.
In the dataflow style, you describe how information flows between registers in the system.
The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.
The behavior of the system over time is defined by registers.
There are no built-in registers in the VHDL language:
   - either a lower-level description
   - or a behavioral description of sequential elements is needed.
The lower-level register descriptions must be created or obtained.
If there are no third-party models for registers => you must write the behavioral description of the registers.
The behavioral description can be provided in the form of subprograms (functions or procedures).

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.
Representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions.
The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).
If no actual delay is used, the order of sequential operations is defined.
In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.
There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL.
A VHDL description can always be broken up into interconnected processes.
Quite similar to a Unix process.
Process keyword in VHDL.
A process statement is a concurrent statement.
Statements inside process statements are sequential statements.
A process must contain either a sensitivity list or wait statement(s), but NOT both.
The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
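The wake-up rule for a process with a sensitivity list can be illustrated with a toy event loop. This is a deliberately simplified sketch with hypothetical names: real VHDL simulators schedule signal updates with delta cycles and let one process trigger another, both of which are left out here.

```python
class Process:
    """A process with a sensitivity list; `body` plays the role of process_statements."""
    def __init__(self, sensitivity, body):
        self.sensitivity = set(sensitivity)
        self.body = body
        self.wakeups = 0

def apply_events(signals, processes, changes):
    """Apply (name, value) changes in order, waking every process sensitive to an event."""
    for name, value in changes:
        if signals.get(name) == value:
            continue                    # same value: no event, nobody wakes up
        signals[name] = value
        for proc in processes:
            if name in proc.sensitivity:
                proc.body(signals)      # statements inside run sequentially
                proc.wakeups += 1
    return signals

def and_gate(sig):
    sig["y"] = sig["a"] & sig["b"]

signals = {"a": 0, "b": 0, "c": 0, "y": 0}
gate = Process(["a", "b"], and_gate)
apply_events(signals, [gate], [("a", 1), ("c", 1), ("b", 1), ("b", 1)])
```

In the usage lines, the process wakes twice: once for the change on a and once for the first change on b. The change on c is ignored because c is not in the sensitivity list, and the repeated assignment to b creates no event.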

11. VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
    port (
        addr  : IN  std_logic_VECTOR(17 downto 0);
        clk   : IN  std_logic;
        dout  : OUT std_logic_VECTOR(0 downto 0);
        en    : IN  std_logic;
        sinit : IN  std_logic);
END component;

component sync_unit port(
    clk        : in    std_logic;
    rstn       : in    std_logic;
    hsync      : out   std_logic;  ---horizontal synch pulse(row)
    vsync      : out   std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out   std_logic;
    h_en       : out   std_logic;
    v_en       : out   std_logic;
    line_count : inout integer;
    c_hs       : inout integer);
end component;

component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- Divide the input clock clk50m by two to produce the internal clock clk.
process(clk50m, rstn)
begin
    if (rstn = '0') then
        clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
        clk <= not clk;
    end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
);

pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;
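The first process in the top-level architecture toggles clk on every rising edge of clk50m, which halves the clock frequency. The sketch below models that behavior. The 50 MHz figure is an assumption inferred from the port name clk50m, and the function name is hypothetical.

```python
def divide_by_two(rising_edges, rstn=1):
    """Return the value of clk after each rising edge of clk50m."""
    clk = 0
    trace = []
    for _ in range(rising_edges):
        clk = 0 if rstn == 0 else clk ^ 1   # models: clk <= not clk
        trace.append(clk)
    return trace

CLK50M_HZ = 50_000_000          # assumed input clock frequency
pixel_clock_hz = CLK50M_HZ // 2  # one full clk period spans two input edges
```

Under that assumption, the sync unit, pixel generator, and memory all run from a 25 MHz clock.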

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs       <= c_hs_s;
line_count <= line_count_s;
h_en       <= h_en_s;
v_en       <= v_en_s;

-- hsync counter: counts clocks within a line, 0 to 793
process(clk, rstn)
begin
    if (rstn = '0') then
        c_hs_s <= 0;
    else
        if (clk = '1' and clk'event) then
            if (c_hs_s = 793) then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s + 1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter: advances once per line, 0 to 515
process(h_en_s, rstn)
begin
    if (rstn = '0') then
        line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515) then
            line_count_s <= 0;
        end if;
    end if;
end process;

--hsync pulse
process(clk, rstn)
begin
    if (rstn = '0') then
        hsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (c_hs_s <= 95) then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
end process;

process(clk, rstn) ----Horizontal Enable (Row Information Will be Sent in this period)
begin
    if (rstn = '0') then
        h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
    if (rstn = '0') then
        v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((line_count_s >= 35 and line_count_s <= 515)) then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
end process;

process(clk, rstn) --Vertical Sync Front porch Back Porch
begin
    if (rstn = '0') then
        vsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (line_count_s < 2) then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
end process;

-- One-clock delayed copy of v_en, used to detect the start of the active region
process(clk)
begin
    if (clk = '1' and clk'event) then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;
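The constants in the synchronization block fix the line and frame timing. The sketch below evaluates the same comparisons over one full line and one full frame to count how long each signal is active. It ignores the one-clock delay introduced by registering the outputs, the names are hypothetical, and the 25 MHz clock used for the rate estimate is an assumption (a 50 MHz input divided by two).

```python
H_LAST, V_LAST = 793, 515          # the counters wrap after these values

def h_signals(c_hs):
    """(hsync, h_en) for a given horizontal counter value."""
    hsync = 0 if c_hs <= 95 else 1
    h_en = 1 if 143 <= c_hs <= 778 else 0
    return hsync, h_en

def v_signals(line_count):
    """(vsync, v_en) for a given line counter value."""
    vsync = 0 if line_count < 2 else 1
    v_en = 1 if 35 <= line_count <= 515 else 0
    return vsync, v_en

clocks_per_line = H_LAST + 1
lines_per_frame = V_LAST + 1
hsync_low_clocks = sum(1 for c in range(clocks_per_line) if h_signals(c)[0] == 0)
h_en_clocks = sum(h_signals(c)[1] for c in range(clocks_per_line))
vsync_low_lines = sum(1 for n in range(lines_per_frame) if v_signals(n)[0] == 0)
v_en_lines = sum(v_signals(n)[1] for n in range(lines_per_frame))

PIXEL_CLOCK_HZ = 25_000_000        # assumed clock rate
line_rate_hz = PIXEL_CLOCK_HZ / clocks_per_line
frame_rate_hz = line_rate_hz / lines_per_frame
```

With these constants a line lasts 794 clocks with a 96-clock sync pulse, a frame lasts 516 lines with a 2-line sync pulse, and the assumed clock gives a refresh rate of roughly 61 Hz.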

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
    if (rstn = '0') then
        rd     <= '0';
        red    <= "0";
        green  <= "0";
        blue   <= "0";
        addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
        -- Rising edge of v_en: restart reading at address 0 for the new frame
        if ((v_en = '1') and (v_en_reg = '0')) then
            addr_s <= (others => '0');
        end if;
        if (v_en = '1' and h_en = '1') then
            if (line_count <= 514) then
                if (c_hs <= 655) then
                    -- Inside the image window: show the memory bit on all three colors
                    rd     <= '1';
                    red    <= data_in;
                    green  <= data_in;
                    blue   <= data_in;
                    addr_s <= addr_s + "000000000000000001";
                else
                    -- Right of the image window: green
                    rd    <= '0';
                    red   <= "0";
                    green <= "1";
                    blue  <= "0";
                end if;
            else
                -- Below the image window: white
                rd    <= '0';
                red   <= "1";
                green <= "1";
                blue  <= "1";
            end if;
            -- Wrap the address after the last image location
            if (addr_s = "111011111111111111") then
                addr_s <= (others => '0');
            end if;
        else
            -- Blanking: black
            red   <= "0";
            green <= "0";
            blue  <= "0";
        end if;
    end if;
end process;

end behv;
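The binary constant that the pixel generator compares the address against determines how many memory locations are read per frame. The sketch below decodes it and models the increment-and-wrap step for a displayed pixel; the function name is hypothetical. The resulting count equals 512 x 480, which suggests, although the report does not say so, that the image window is 512 pixels wide and 480 lines high.

```python
ADDR_BITS = 18
WRAP_AT = int("111011111111111111", 2)   # comparison constant used in the process

def next_addr(addr):
    """Address after one displayed pixel: wrap to 0 at WRAP_AT, else increment."""
    if addr == WRAP_AT:
        return 0                          # the later assignment in the process wins
    return (addr + 1) % (1 << ADDR_BITS)

locations_per_frame = WRAP_AT + 1
```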

12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985, the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have been historically used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration that lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the state where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

Figure 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a

flip-flop as shown below In recent years manufacturers have started moving

to 6-input LUTs in their high performance parts claiming increased

performance

Typical logic block

59

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
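
The logic block described above can be sketched as a small behavioral model. This is only an illustration: the class name LogicBlock, the bit ordering of the truth table, and the choice of a 4-input XOR as the configured function are assumptions made for the example and do not come from any vendor documentation.

```python
class LogicBlock:
    """Behavioral model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table, registered=False):
        # truth_table holds 16 output bits; the index is formed from the
        # four inputs as (d << 3) | (c << 2) | (b << 1) | a (assumed ordering).
        if len(truth_table) != 16 or any(bit not in (0, 1) for bit in truth_table):
            raise ValueError("a 4-input LUT needs exactly 16 bits, each 0 or 1")
        self.truth_table = list(truth_table)
        self.registered = registered  # selects registered or unregistered output
        self.q = 0                    # flip-flop state

    def lut(self, a, b, c, d):
        """Combinational LUT output for the given inputs."""
        return self.truth_table[(d << 3) | (c << 2) | (b << 1) | a]

    def clock(self, a, b, c, d):
        """Model a rising clock edge: the flip-flop captures the LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """The single block output: flip-flop value or raw LUT value."""
        return self.q if self.registered else self.lut(a, b, c, d)


# Configure the LUT as a 4-input XOR (odd parity of the inputs).
xor4 = [bin(i).count("1") & 1 for i in range(16)]
comb = LogicBlock(xor4, registered=False)
reg = LogicBlock(xor4, registered=True)
```

The unregistered block follows its inputs immediately, while the registered block only changes its output after clock() is called, which mirrors the choice between the two outputs described in the text.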

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology
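
The planar (domain-based) connectivity rule can be made concrete with a short sketch. The side names, the function names, and the channel width W = 4 are assumptions chosen for illustration; only the rule itself (a wire on track t connects to track t on the other three sides) is taken from the description above.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_box(width):
    """Return the programmable switches of a planar switch box.

    Each switch is a frozenset of two (side, track) endpoints. A switch only
    ever joins wires that carry the same track number.
    """
    switches = set()
    for track in range(width):
        for i, side_a in enumerate(SIDES):
            for side_b in SIDES[i + 1:]:
                switches.add(frozenset({(side_a, track), (side_b, track)}))
    return switches


def neighbours(switches, endpoint):
    """Wires reachable from `endpoint` through a single switch."""
    return sorted(
        other
        for switch in switches
        if endpoint in switch
        for other in switch
        if other != endpoint
    )


W = 4  # assumed channel width
box = planar_switch_box(W)
```

For any wire entering the box, neighbours() returns exactly three endpoints, all on the same track number, and the box holds six switches per track.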

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs, together with early software and firmware co-design, has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows for much more time in the area of software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement to the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated picture (pixelated) from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.


64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green, and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).
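
The update rule for RGBREG can be summarised in a few lines. The sketch below is a behavioral illustration only: the function name next_rgbreg and its argument names are invented for the example, and the Black & White palette shown is a hypothetical placeholder (the actual codes are those listed in Table 2, which this sketch does not reproduce).

```python
BLACK = 0b000000  # reset and blanking value stated in the text


def next_rgbreg(rst, en, pixel, palette=None):
    """Value loaded into the 6-bit RGB register on a rising CLK edge.

    rst     -- True while the external reset (RST) is asserted
    en      -- line display enable; False means blanking is required
    pixel   -- value of the active pixel
    palette -- mapping from pixel value to 6-bit RGB code, or None for
               64 Colors mode, where the pixel value is used directly
    """
    if rst:
        return BLACK
    if not en:
        return BLACK  # blanking: load the code for black
    if palette is None:
        return pixel & 0b111111
    return palette[pixel]


# Hypothetical 1-bit palette, for illustration only (not taken from Table 2).
bw_palette = {0: 0b000000, 1: 0b111111}
```

Passing a different dictionary as palette models the 16 Colors mode in the same way, once the sixteen predefined codes are known.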

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0, and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the Nano Board itself, as shown in Figure 9.1.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 (4 x 4 x 4) different color permutations.
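
The arithmetic behind the four levels and the 64 colors can be checked with a short sketch. It assumes the four levels are evenly spaced between 0 V and 0.7 V; the text only gives the two end points, so the intermediate voltages are an assumption, as is the function name dac_level.

```python
from itertools import product

V_MAX = 0.7  # volts, maximum brightness as stated in the text


def dac_level(code):
    """Convert a 2-bit code (0..3) to a voltage, assuming even spacing."""
    if code not in range(4):
        raise ValueError("a 2-bit code must be in the range 0..3")
    return V_MAX * code / 3


# The four analog levels available per color channel.
levels = [dac_level(code) for code in range(4)]

# Every (R, G, B) combination of 2-bit codes is one displayable color.
colors = list(product(range(4), repeat=3))
```

Here levels holds four distinct voltages from 0 V up to 0.7 V, and colors enumerates all 4 x 4 x 4 combinations of the three channels.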

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large, on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
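
The totals in the block RAM table follow directly from the per-block size given above. The following sketch recomputes them from the column and block counts; the dictionary and function names are illustrative only.

```python
BITS_PER_BLOCK = 18432            # total bits in one block RAM
DATA_BITS_PER_BLOCK = 16 * 1024   # bits allocated to data storage
EXTRA_BITS_PER_BLOCK = 2 * 1024   # parity or "plus" data bits

# (RAM columns, RAM blocks per column) for each Spartan-3E device.
devices = {
    "XC3S100E": (1, 4),
    "XC3S250E": (2, 6),
    "XC3S500E": (2, 10),
    "XC3S1200E": (2, 14),
    "XC3S1600E": (2, 18),
}


def totals(columns, blocks_per_column):
    """Return (total blocks, total bits, total Kbits) for a device."""
    blocks = columns * blocks_per_column
    bits = blocks * BITS_PER_BLOCK
    return blocks, bits, bits // 1024


for name, (columns, blocks_per_column) in devices.items():
    print(name, totals(columns, blocks_per_column))
```

The data and extra bits add up to the 18,432 bits of one block, and each device's totals are simply that figure multiplied by its number of blocks.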

10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own, non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition
Defines the inputs/outputs of a component (defines pins)
A way to represent modularity in VHDL
Similar to a symbol in a schematic
An entity declaration describes an entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment
Each port must have a name, a direction, and a type
An entity may have NO port declaration

Port directions

In: A value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture
One entity can have several architectures
Architectures can describe a design using:
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
A configuration declaration links an architecture to an entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Links entity declaration and architecture body together
The concept of default configuration is a bit messy in VHDL '87
- The last architecture analyzed links to the entity
Can be used to change simulation behavior without re-analyzing the VHDL source
Complex configuration declarations are ignored in synthesis
Some entities can have, e.g., a gate-level architecture and a behavioral architecture
Are always optional

Packages

Packages contain information common to many design units

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units
Consists of a declaration part and an optional body part

Package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

Package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

Collection of VHDL design units (database)

1. Packages
   - package declaration
   - package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in the UNIX file system
Can also be any other kind of database

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from the lowest to the highest level of abstraction:

Structural
Dataflow
Behavioral

Structural VHDL description

Circuit is described in terms of its components
From a low-level description (e.g., transistor-level description) to a high-level description (e.g., block diagram)
For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system
In the dataflow style, you describe how information flows between registers in the system
The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely
The behavior of the system over time is defined by registers
There are no built-in registers in the VHDL language
- Either a lower-level description
- or a behavioral description of sequential elements is needed
The lower-level register descriptions must be created or obtained
If there are no third-party models for registers => you must write the behavioral description of registers
The behavioral description can be provided in the form of subprograms (functions or procedures)

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time
Representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions
The concept of time may be expressed precisely using delays (e.g., A <= B after 10 ns)
If no actual delay is used, the order of sequential operations is defined
In the lower levels of abstraction (e.g., RTL), synthesis tools ignore detailed timing specifications
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool
There are a few tools for behavioral synthesis

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL
A VHDL description can always be broken up into interconnected processes
Quite similar to a Unix process
Process keyword in VHDL
A process statement is a concurrent statement
Statements inside process statements are sequential statements
A process must contain either a sensitivity list or wait statement(s), but NOT both
The sensitivity list or wait statement(s) contain the signals which wake the process up

General format

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (clk50m : in  std_logic;
        rstn   : in  std_logic;
        hsync  : out std_logic;
        vsync  : out std_logic;
        red    : out std_logic_vector(0 downto 0);
        green  : out std_logic_vector(0 downto 0);
        blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic);
  END component;

  component sync_unit
    port (
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  -- horizontal sync pulse (row)
      vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer);
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide clk50m by two to generate the internal clock clk.
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
    --end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  -- Synchronization unit: generates sync pulses, enables, and counters.
  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  -- Pixel generator: reads pixel data and drives the color outputs.
  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  -- Picture memory, addressed by the pixel generator.
  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts clocks within a line (0 to 793)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
    --end if;
  end process;

  -- hsync pulse
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  -- Vertical Enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  -- vsync pulse (Vertical Sync, Front Porch, Back Porch)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  -- Registered copy of v_en_s, delayed by one clock
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
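
The timing implied by the counter limits in the synchronization block can be worked out with a few lines of arithmetic. The sketch below uses only the constants that appear in the listing; the one assumption is that clk50m is a 50 MHz clock (suggested by its name), so that the divided clock clk runs at 25 MHz.

```python
# Constants read off the sync_unit listing.
CLOCKS_PER_LINE = 794            # c_hs counts 0..793
HSYNC_LOW_CLOCKS = 96            # hsync is driven low while c_hs <= 95
H_EN_FIRST, H_EN_LAST = 143, 778  # h_en window within a line
LINES_PER_FRAME = 516            # line_count counts 0..515
VSYNC_LOW_LINES = 2              # vsync is driven low while line_count < 2
V_EN_FIRST, V_EN_LAST = 35, 515   # v_en window within a frame

# Assumption: clk50m is 50 MHz, so clk (clk50m divided by two) is 25 MHz.
PIXEL_CLOCK_HZ = 50_000_000 / 2

line_rate_hz = PIXEL_CLOCK_HZ / CLOCKS_PER_LINE
frame_rate_hz = line_rate_hz / LINES_PER_FRAME
h_en_clocks = H_EN_LAST - H_EN_FIRST + 1
v_en_lines = V_EN_LAST - V_EN_FIRST + 1
hsync_pulse_us = HSYNC_LOW_CLOCKS / PIXEL_CLOCK_HZ * 1e6

print(f"line rate  : {line_rate_hz:.1f} Hz")
print(f"frame rate : {frame_rate_hz:.2f} Hz")
print(f"h_en width : {h_en_clocks} clocks, v_en height: {v_en_lines} lines")
print(f"hsync pulse: {hsync_pulse_us:.2f} us")
```

Under that assumption the line rate comes out at roughly 31.5 kHz and the frame rate at roughly 61 Hz, which is in the neighbourhood of the usual 640x480 VGA mode.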

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: restart the memory address at the top of a frame.
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture area: show the pixel read from memory.
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Right of the picture area: green.
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the picture area: white.
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address at the end of the stored picture.
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: black.
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
    -- end if;
    --end if;
  end process;

end behv;
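
The address wrap constant used in the pixel generator can be decoded with a short calculation. The sketch only interprets the binary literal from the listing; reading the result as a 512 x 480 picture is an inference from the arithmetic, not something stated in the report.

```python
ADDR_BITS = 18
WRAP_ADDRESS = int("111011111111111111", 2)  # last address before wrapping to 0

# Addresses 0..WRAP_ADDRESS are used, so the picture holds this many 1-bit pixels.
picture_pixels = WRAP_ADDRESS + 1

# For comparison: a full 640 x 480 frame at one bit per pixel.
full_vga_pixels = 640 * 480

print(picture_pixels, 2 ** ADDR_BITS, full_vga_pixels)
```

The stored picture therefore fits within the 18-bit address space, whereas a full 640 x 480 frame at one bit per pixel would not.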

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at "run time", and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

122 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers have largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing the final version so that it can no longer be modified after the
design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

- IC costs are rising aggressively
- ASIC complexity has increased development time and costs
- R&D resources and headcount are decreasing
- Revenue losses for slow time-to-market are increasing
- Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have historically been used
for, to which the company attributes the growing number of FPGA design starts
(see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical to implement within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs are capable of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at a low clock rate. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4 to
8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have broadened
the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
routing channels have the same width (number of wires). Multiple I/O pads may
fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.
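
Rent's rule is commonly written as T = t * g^p, where T is the number of external terminals of a block of logic, g is the number of gates (or logic blocks) inside it, t is the average number of terminals per gate and p is the Rent exponent. The sketch below shows how such an estimate might be computed. The default values of t and p are illustrative assumptions only, not figures from any FPGA vendor or from this report.

```python
def rent_terminals(gates: int, t: float = 4.0, p: float = 0.6) -> float:
    """Estimate the external terminal count T = t * gates**p (Rent's rule).

    t (terminals per gate) and p (Rent exponent) are hypothetical
    example values chosen for illustration.
    """
    if gates < 1:
        raise ValueError("gates must be a positive integer")
    return t * gates ** p


if __name__ == "__main__":
    # Terminal demand grows more slowly than gate count when p < 1.
    for g in (64, 1024, 16384):
        print(f"{g:6d} gates -> about {rent_terminals(g):.0f} terminals")
```

Because the exponent is below 1, the estimated number of terminals (and hence routing demand at the block boundary) grows sub-linearly with the amount of logic, which is why a fixed channel width can serve a range of design sizes.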

The example FPGA considered here is an array, or island-style, FPGA. It
consists of an array of logic blocks and routing channels. Two I/O pads fit
into the height of one row or the width of one column, as shown below. All the
routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high performance parts, claiming increased
performance.
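
The behavior of such a logic block can be modeled in a few lines of software. The following is a minimal sketch, assuming a 16-entry truth table indexed by the four inputs and a flip-flop that is cleared at start-up; the class and method names are hypothetical and do not correspond to any vendor primitive.

```python
class LogicBlock:
    """Software model of one 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 truth-table bits")
        self.truth_table = [int(bool(bit)) for bit in truth_table]
        self.registered = registered
        self.ff = 0  # flip-flop state, assumed cleared at start-up

    def lut(self, a, b, c, d):
        """Combinational lookup; the inputs form the table index (a is the LSB)."""
        return self.truth_table[a | (b << 1) | (c << 2) | (d << 3)]

    def clock(self, a, b, c, d):
        """Model one rising clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Block output: the flip-flop if registered, otherwise the raw LUT."""
        return self.ff if self.registered else self.lut(a, b, c, d)


# Example configuration: a 4-input XOR (odd parity) function.
XOR4 = [bin(i).count("1") & 1 for i in range(16)]
```

Loading a different 16-bit table turns the same block into any other function of four inputs, which is the sense in which the LUT is "configurable".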

Typical logic block

There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they are managed separately from other signals.

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right of and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in adjacent
channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number one connects only to wires in track
number one in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.
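
The planar rule described above can be stated compactly in code. The sketch below is illustrative only: the side names and the function `planar_connections` are hypothetical, and tracks are numbered from 1 to match the text.

```python
SIDES = ("north", "east", "south", "west")


def planar_connections(side: str, track: int, width: int):
    """List the (side, track) wires reachable from one wire entering a
    planar (domain-based) switch box with `width` tracks per channel.

    In the planar topology a wire on track i can only be switched onto
    track i of the three other sides.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= width:
        raise ValueError("track must be in the range 1..width")
    return [(other, track) for other in SIDES if other != side]


if __name__ == "__main__":
    print(planar_connections("west", 2, 4))
```

Each wire therefore needs exactly three programmable switches, and a signal that starts on a given track number stays on that track number for its whole route unless it passes through a logic block pin.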

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher
level functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high speed I/O
logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to specify them
numerically rather than having to draw every piece by hand. On the other hand,
schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

In going from schematic/HDL source files to an actual configuration, the
source files are fed to a software suite from the FPGA/CPLD vendor that,
through different steps, produces a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
such as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar
license), and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate level
description, where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally, the design is laid out in the FPGA, at which
point propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.

Basic Process Technology Types

- SRAM: based on static memory technology. In-system programmable and
  re-programmable. Requires external boot devices. CMOS.
- Antifuse: one-time programmable. CMOS.
- EPROM: Erasable Programmable Read-Only Memory technology. Usually
  one-time programmable in production because of plastic packaging.
  Windowed devices can be erased with ultraviolet (UV) light. CMOS.
- EEPROM: Electrically Erasable Programmable Read-Only Memory
  technology. Can be erased even in plastic packages. Some, but not all,
  EEPROM devices can be in-system programmed. CMOS.
- Flash: Flash-erase EPROM technology. Can be erased even in plastic
  packages. Some, but not all, flash devices can be in-system programmed.
  Usually a flash cell is smaller than an equivalent EEPROM cell and is
  therefore less expensive to manufacture. CMOS.
- Fuse: one-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on FPGAs for
hardware verification and early software development.

Verification of hardware designs alongside early software and firmware
co-design has become mainstream, and prototyping SoC and ASIC designs on
FPGAs has become a good method of doing this.

Reasons why prototyping is important:

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that
   it is functionally correct, compared with designers relying only on software
   simulations to verify that their hardware design is sound. Simulation speed
   and modeling accuracy limitations hinder such verification.
2. Due to time constraints, many projects cannot wait until the silicon is back
   from the foundry to start on software tests. FPGA prototyping allows much
   more time for software development and for testing at the software-hardware
   integration stage. This allows many unforeseen software bugs, which appear
   due to today's array of operating systems, applications and hardware, to be
   uncovered early.
3. Prototyping also allows the developer to ensure that all IP technologies in
   the system work well together, away from the simulation stage and in actual
   form.
4. Prototyping has the added advantage of serving as a demo platform for SoC
   clients, bringing in interest early. This speeds up the overall development
   cycle and allows for more enhancement or improvement of the chip features
   than would otherwise have been possible.

13 SIMULATION RESULTS

Fig: Gate level, VGA controller

Fig: Gate level, sync unit

Fig: Technology schematic

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated
(pixelated) picture is taken from memory space, and the Controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.

The RGB color code stored in the RGB register is output from the VGA
Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0
and B1).

The monitor itself expects analog signals as inputs to its electron gun control
circuits. This is achieved by using 2-bit digital-to-analog converter circuitry
located on the Nano Board itself, as shown in Figure 9.1.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be
converted into 4 distinct analog levels. These levels specify the intensity of
each of the three primary colors to use when displaying the pixel on the
monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum
brightness).

With each analog input being one of four possible levels, the monitor
can display each pixel on the screen with one of 64 different color
permutations (4 x 4 x 4).
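
The color count follows directly from the bit widths, and can be checked with a short script. This is a sketch only: it assumes the four analog levels are evenly spaced between 0 V and 0.7 V, which the text does not state, and the function names are hypothetical.

```python
from itertools import product

BITS_PER_CHANNEL = 2   # R, G and B are each 2 bits wide
V_MAX = 0.7            # volts at maximum brightness, per the text


def dac_level(code: int) -> float:
    """Map a 2-bit code (0..3) to a voltage.

    Assumption: the levels are evenly spaced from 0 V to V_MAX.
    """
    levels = 2 ** BITS_PER_CHANNEL
    if not 0 <= code < levels:
        raise ValueError("code out of range for a 2-bit channel")
    return V_MAX * code / (levels - 1)


def all_colors():
    """Enumerate every (R, G, B) code combination."""
    codes = range(2 ** BITS_PER_CHANNEL)
    return list(product(codes, repeat=3))


if __name__ == "__main__":
    print(len(all_colors()), "colors")
    print([round(dac_level(c), 3) for c in range(4)], "volts per code")
```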

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan-3 Generation
FPGAs provide plentiful, efficient SelectRAM memory blocks. Using various
configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large
look-up tables, data width converters, circular buffers, and shift registers,
each supporting various data widths and depths. This application note
describes the features and capabilities of block SelectRAM and illustrates how
to specify the various options using the Xilinx CORE Generator system or
via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM       RAM Blocks   Total RAM   Total RAM   Total RAM
            Columns   per Column   Blocks      Bits        Kbits
XC3S100E    1         4            4           73,728      72
XC3S250E    2         6            12          221,184     216
XC3S500E    2         10           20          368,640     360
XC3S1200E   2         14           28          516,096     504
XC3S1600E   2         18           36          663,552     648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of
which are allocated to data storage and, in some memory configurations, an
additional 2K bits allocated to parity or additional "plus" data bits.
Physically, the block RAM memory has two completely independent access ports,
labeled Port A and Port B. The structure is fully symmetrical; both ports are
interchangeable, and both support data read and write operations. Each
memory port is synchronous, with its own clock, clock enable, and write
enable. Read operations are also synchronous and require a clock edge and
clock enable. Though physically a dual-port memory, block RAM can emulate a
single-port memory in an application, as shown in Figure 1. Furthermore, each
block memory supports multiple configurations or aspect ratios. Table 3
summarizes the essential SelectRAM features. Multiple block RAMs can be
cascaded to create deeper and wider memory organizations, with a minimal
timing penalty incurred through specialized routing resources.
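
The figures in Table 2 can be cross-checked from the per-block size given above. In the sketch below, the block counts are those listed in Table 2; the list of depth x width organizations is an assumption based on the usual Spartan-3 block RAM aspect ratios and should be confirmed against the device data sheet.

```python
BLOCK_BITS = 18432           # bits per block RAM, from the text
DATA_BITS = 16 * 1024        # 16K bits of data storage
PARITY_BITS = 2 * 1024       # 2K parity (or "plus" data) bits

# (depth, width) organizations; assumed list, widths 9/18/36 include parity.
ASPECT_RATIOS = [(16384, 1), (8192, 2), (4096, 4),
                 (2048, 9), (1024, 18), (512, 36)]

# Total RAM blocks per device, as listed in Table 2.
BLOCKS_PER_DEVICE = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def total_bits(device: str) -> int:
    """Total block RAM bits in a device: blocks x 18,432."""
    return BLOCKS_PER_DEVICE[device] * BLOCK_BITS


if __name__ == "__main__":
    for name in BLOCKS_PER_DEVICE:
        bits = total_bits(name)
        print(f"{name:10s} {bits:8,d} bits = {bits // 1024} Kbits")
```

Organizations narrower than 9 bits use only the 16K data bits, while the 9-, 18- and 36-bit organizations use all 18,432 bits of the block.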

10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

- Interoperability
- Technology independence
- Design reuse
- Several levels of abstraction
- Readability
- Standard language
- Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

- Design specification language
- Design entry language
- Design simulation language
- Design documentation language
- An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that
involved large circuits and multiple teams of engineers.

- Funded by the US Department of Defense
- The first publicly available version was released in 1985
- In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.)
  was presented with a proposal to standardize VHDL
- In 1987, standardization => IEEE 1076-1987
- An improved version of the language was released in 1994 => IEEE
  standard 1076-1993

10.2 Related Standards

- IEEE 1076 doesn't support simulation conditions such as unknown and
  high-impedance
- Soon after IEEE 1076-1987 was released, simulator companies began using
  their own non-standard types => VHDL was becoming non-standard
- The IEEE 1164 standard was developed by the IEEE. IEEE 1164 contains
  definitions for a nine-valued data type, std_logic
- IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they
  relate to actual hardware. It defines, e.g., two numeric types: signed and
  unsigned

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a
library.

Entities

- A black box with an interface definition
- Defines the inputs/outputs of a component (defines pins)
- A way to represent modularity in VHDL
- Similar to a symbol in a schematic
- An entity declaration describes an entity

E.g.

  entity Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
  end Comparator;

Ports

- Provide channels of communication between the component and its
  environment
- Each port must have a name, a direction and a type
- An entity may have NO port declaration

Port directions

- In: the value of a port can be read inside the component, but cannot be
  assigned. Multiple reads of the port are allowed.
- Out: assignments can be made to a port, but data from the port cannot be
  read. Multiple assignments are allowed.
- Inout: bi-directional; assignments can be made and data can be read.
  Multiple assignments are allowed.
- Buffer: an out port with read capability. May have at most one assignment
  (not recommended).

Architectures

- Every entity has at least one architecture
- One entity can have several architectures
- Architectures can describe a design using behavior, structure or dataflow
- Architectures can describe a design on many levels: gate level, RTL
  (Register Transfer Level) or behavioral level
- A configuration declaration links an architecture to an entity

E.g.

  architecture Comparator1 of Comparator is
  begin
    EQ <= '1' when (A = B) else '0';
  end Comparator1;

Configurations

- Link the entity declaration and architecture body together
- The concept of a default configuration is a bit messy in VHDL '87: the last
  architecture analyzed links to the entity
- Can be used to change simulation behavior without re-analyzing the VHDL
  source
- Complex configuration declarations are ignored in synthesis
- Some entities can have, e.g., a gate level architecture and a behavioral
  architecture
- Are always optional

Packages

Packages contain information common to many design units.

1. Package declaration
   - constant declarations
   - type and subtype declarations
   - function and procedure declarations
   - global signal declarations
   - file declarations
   - component declarations
2. Package body
   - is not necessarily needed
   - function bodies
   - procedure bodies

Packages are meant for encapsulating data which can be shared globally among
several design units. A package consists of a declaration part and an optional
body part.

A package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

A library is a collection of VHDL design units (a database):

1. Packages (package declaration, package body)
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

A library is usually a directory in the UNIX file system, but can also be any
other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ
primarily in how closely they relate to the hardware. It is possible to
describe a circuit in a number of ways, listed here from lower to higher level
of abstraction:

- Structural
- Dataflow
- Behavioral

Structural VHDL description

- The circuit is described in terms of its components
- Ranges from a low-level description (e.g. transistor-level description) to a
  high-level description (e.g. block diagram)
- For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

- The circuit is described in terms of how data moves through the system
- In the dataflow style, you describe how information flows between registers
  in the system
- The combinational logic is described at a relatively high level; the
  placement and operation of registers is specified quite precisely
- The behavior of the system over time is defined by registers
- There are no built-in registers in the VHDL language: either a lower level
  description or a behavioral description of sequential elements is needed
- The lower level register descriptions must be created or obtained
- If there are no third-party models for registers, you must write the
  behavioral description of the registers
- The behavioral description can be provided in the form of subprograms
  (functions or procedures)

10.4 Behavioral VHDL Description

- The circuit is described in terms of its operation over time
- The representation might include, e.g., state diagrams, timing diagrams and
  algorithmic descriptions
- The concept of time may be expressed precisely using delays (e.g. A <= B
  after 10 ns)
- If no actual delay is used, the order of sequential operations is defined
- In the lower levels of abstraction (e.g. RTL), synthesis tools ignore
  detailed timing specifications
- The actual timing results depend on the implementation technology and the
  efficiency of the synthesis tool
- There are a few tools for behavioral synthesis

Concurrent vs. Sequential

Processes

- Basic simulation concept in VHDL
- A VHDL description can always be broken up into interconnected processes
- Quite similar to a Unix process
- Declared with the process keyword in VHDL
- A process statement is a concurrent statement
- Statements inside process statements are sequential statements
- A process must contain either a sensitivity list or wait statement(s), but
  NOT both
- The sensitivity list or wait statement(s) contain the signals which wake the
  process up

General format:

  process [(sensitivity_list)]
    process_declarative_part
  begin
    process_statements
    [wait_statement]
  end process;

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;
-- Note: numeric_std and STD_LOGIC_ARITH both declare signed/unsigned;
-- neither type is used directly in this file.

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
  clk50m : in  std_logic;
  rstn   : in  std_logic;                      -- active-low reset
  hsync  : out std_logic;
  vsync  : out std_logic;
  red    : out std_logic_vector(0 downto 0);
  green  : out std_logic_vector(0 downto 0);
  blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic);
  END component;

  component sync_unit port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;    -- mode matches the sync_unit entity
    c_hs       : out integer);
  end component;

  component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the input clock by two to obtain the pixel clock.
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
    --end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  -- Note: sinit is driven directly by the active-low reset; check this
  -- polarity against the configuration of the memory core.
  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : out std_logic;  -- horizontal sync pulse (row)
  vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
  v_en_reg   : out std_logic;
  h_en       : out std_logic;
  v_en       : out std_logic;
  line_count : out integer;
  c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;

begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks 0..793 within a line
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: counts lines 0..515, clocked by the rising
  -- edge of the horizontal enable
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
    --end if;
  end process;

  -- hsync pulse
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  -- vsync pulse
  -- Vertical Enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  -- Vertical Sync, Front Porch, Back Porch
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  -- One-clock delayed copy of the vertical enable
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
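
The timing implied by the counter limits in the listing above can be tabulated with a small software model. This is a sketch under stated assumptions: the constants are read off the comparisons in the VHDL, the pixel clock is assumed to be 25 MHz (a 50 MHz `clk50m` halved by the toggle process in the top level), and the one-clock register delay on each output is ignored. The Python names are hypothetical.

```python
PIXEL_CLOCK_HZ = 25_000_000  # assumption: 50 MHz input clock divided by two

H_TOTAL = 794                # c_hs_s counts 0..793
H_SYNC_LAST = 95             # hsync is low while c_hs_s <= 95
H_EN_FIRST, H_EN_LAST = 143, 778
V_TOTAL = 516                # line_count_s counts 0..515
V_SYNC_LINES = 2             # vsync is low while line_count_s < 2
V_EN_FIRST, V_EN_LAST = 35, 515


def hsync_level(c_hs: int) -> int:
    return 0 if c_hs <= H_SYNC_LAST else 1


def h_enable(c_hs: int) -> int:
    return 1 if H_EN_FIRST <= c_hs <= H_EN_LAST else 0


def vsync_level(line: int) -> int:
    return 0 if line < V_SYNC_LINES else 1


def v_enable(line: int) -> int:
    return 1 if V_EN_FIRST <= line <= V_EN_LAST else 0


def summary() -> dict:
    """Count clocks/lines in each region and derive line and frame rates."""
    return {
        "hsync_low_clocks": sum(1 - hsync_level(c) for c in range(H_TOTAL)),
        "h_enable_clocks": sum(h_enable(c) for c in range(H_TOTAL)),
        "vsync_low_lines": sum(1 - vsync_level(n) for n in range(V_TOTAL)),
        "v_enable_lines": sum(v_enable(n) for n in range(V_TOTAL)),
        "line_time_us": H_TOTAL / PIXEL_CLOCK_HZ * 1e6,
        "refresh_hz": PIXEL_CLOCK_HZ / (H_TOTAL * V_TOTAL),
    }


if __name__ == "__main__":
    for key, value in summary().items():
        print(f"{key}: {value}")
```

Under the 25 MHz assumption, a line lasts 31.76 microseconds and the frame rate comes out at roughly 61 Hz.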

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

  signal addr_s : std_logic_vector(17 downto 0);

begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: restart the read address at the top of a frame
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the image window: show the 1-bit memory data on all
            -- three channels and advance the read address
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Right of the image window: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the image window: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last memory location
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: all channels off
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
    -- end if;
    --end if;
  end process;

end behv;
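
The address wrap value used in the pixel generator is easier to read in decimal. The short check below decodes the 18-bit literal from the listing; the observations about what the resulting size corresponds to are inferences, not statements from the report, and the function name is hypothetical.

```python
WRAP_PATTERN = "111011111111111111"  # literal compared with addr_s in the listing
ADDR_WIDTH = 18                      # width of addr_s


def wrap_info() -> dict:
    """Decode the wrap literal and report the addressable range."""
    last = int(WRAP_PATTERN, 2)
    return {
        "last_address": last,
        "locations": last + 1,
        "address_space": 2 ** ADDR_WIDTH,
    }


if __name__ == "__main__":
    info = wrap_info()
    print(info)
    # 245,760 one-bit locations equal fifteen 16K x 1 memories, or 640 x 384 pixels.
    print(info["locations"] == 15 * 16384, info["locations"] == 640 * 384)
```

The counter therefore cycles through 245,760 one-bit locations. A full 640 x 480 one-bit frame would need 307,200 locations, which exceeds what an 18-bit address can reach, so the stored image is necessarily smaller than the full visible frame.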

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks", and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions, or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History The FPGA industry sprouted from programmable read only

memory (PROM) and programmable logic devices (PLDs) PROMs and PLDs

both had the option of being programmed in batches in a factory or in the field

(field programmable) however programmable logic was hard-wired between

logic gates

Xilinx Co-Founders Ross Freeman and Bernard Vonderschmitt invented the

first commercially viable field programmable gate array in 1985 ndash the

XC2064 The XC2064 had programmable gates and programmable

interconnects between gates the beginnings of a new technology and market

The XC2064 boasted a mere 64 configurable logic blocks (CLBs) with two 3-

input lookup tables (LUTs) More than 20 years later Freeman was entered

into the National Inventors Hall of Fame for his invention

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternative approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

Many modern FPGAs can be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications that had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The example FPGA considered here is an array (island-style) FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

Figure 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
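The logic block described above can be sketched in VHDL as a 16-entry truth table followed by a flip-flop and an output selector. This is only an illustrative model: the entity name logic_block, the generics INIT and REGISTERED, and the port names are assumptions made for this sketch, not a vendor primitive.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity logic_block is
  generic (
    INIT       : std_logic_vector(15 downto 0) := x"8000";  -- truth table (default: 4-input AND)
    REGISTERED : boolean := true                             -- choose registered or direct output
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    o   : out std_logic                      -- the single block output
  );
end logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16-entry table indexed by the input vector.
  lut_out <= INIT(to_integer(unsigned(i)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output selector: registered or unregistered LUT output.
  o <= ff_out when REGISTERED else lut_out;
end rtl;
```

Changing INIT changes the logic function without touching the structure, which is the essence of how a LUT-based block is "programmed".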

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
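As a rough illustration of the first simulation stage, the sketch below shows a minimal RTL test bench. It reuses the Comparator entity that serves as the example in the VHDL section of this document; the test bench name comparator_tb, the signal names, and the stimulus values are assumptions chosen for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

library ieee;
use ieee.std_logic_1164.all;

-- A test bench has no ports: it only drives and observes the design under test.
entity comparator_tb is
end comparator_tb;

architecture sim of comparator_tb is
  signal a_s, b_s : std_logic_vector(7 downto 0) := (others => '0');
  signal eq_s     : std_logic;
begin
  dut : entity work.Comparator(Comparator1)
    port map (A => a_s, B => b_s, EQ => eq_s);

  stimulus : process
  begin
    a_s <= x"3C"; b_s <= x"3C";
    wait for 10 ns;
    assert eq_s = '1' report "equal inputs should give EQ = '1'" severity error;

    a_s <= x"3C"; b_s <= x"3D";
    wait for 10 ns;
    assert eq_s = '0' report "different inputs should give EQ = '0'" severity error;

    report "comparator_tb finished";
    wait;  -- suspend the process forever
  end process;
end sim;
```

The same test bench can in principle be rerun against the gate-level and back-annotated netlists, so that each stage is checked against identical stimulus.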

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
Antifuse - One-time programmable. CMOS.
EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on FPGAs for hardware verification and early software development.

Verifying the hardware design while co-designing early software and firmware has become mainstream, and prototyping SoC and ASIC designs on FPGAs has become a good method for doing this.

Reasons why prototyping is important:

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with relying only on software simulations to verify that the hardware design is sound. Simulation speed and modeling accuracy limitations hinder development.
2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This helps uncover many unforeseen software bugs that appear due to today's array of operating systems, applications, and hardware.
3. Prototyping also allows the developer to ensure that all IP technologies in the system work well together, off the simulation stage and in actual form.
4. Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13. SIMULATION RESULTS

Fig. Gate level: VGA controller

Fig. Gate level: sync unit

Fig. Technology schematic

14. APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

By implementing the VGA Controller on an FPGA, a processor-generated (pixelated) picture is taken from memory space, and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0, and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the Nano Board itself, as shown in Figure 9.1.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen in one of 64 different colors (4 x 4 x 4 combinations).
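The 64-color figure follows from the width of the color code: 2 bits for each of the 3 channels gives 6 bits, and 2**6 = 64. A minimal sketch of splitting a 6-bit color code into the three 2-bit channel outputs is shown below. The entity name rgb_split and the bit ordering inside the 6-bit code are assumptions for illustration; the text does not specify how the RGB register is laid out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity rgb_split is
  port (
    rgb : in  std_logic_vector(5 downto 0);  -- assumed layout: R1 R0 G1 G0 B1 B0
    r   : out std_logic_vector(1 downto 0);  -- R1, R0
    g   : out std_logic_vector(1 downto 0);  -- G1, G0
    b   : out std_logic_vector(1 downto 0)   -- B1, B0
  );
end rgb_split;

architecture rtl of rgb_split is
begin
  -- Each 2-bit slice selects one of four analog levels in the external DAC.
  r <= rgb(5 downto 4);
  g <= rgb(3 downto 2);
  b <= rgb(1 downto 0);
end rtl;
```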

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan-3 Generation FPGAs provide plentiful, efficient SelectRAM memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
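The single-port, fully synchronous behavior described above (clock, clock enable, write enable, clocked read) can be sketched as a generic VHDL memory. This is a behavioral model written under assumptions: the entity name sync_ram, its generics, and its port names are invented for the example, and whether a given synthesis tool maps such a description onto block RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sync_ram is
  generic (
    ADDR_WIDTH : natural := 14;  -- 2**14 = 16K locations
    DATA_WIDTH : natural := 1    -- 16K x 1 matches the 16K data bits of one block
  );
  port (
    clk  : in  std_logic;
    en   : in  std_logic;  -- clock enable
    we   : in  std_logic;  -- write enable
    addr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    din  : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
    dout : out std_logic_vector(DATA_WIDTH - 1 downto 0)
  );
end sync_ram;

architecture rtl of sync_ram is
  type ram_t is array (0 to 2**ADDR_WIDTH - 1)
    of std_logic_vector(DATA_WIDTH - 1 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: needs a clock edge and the clock enable.
        -- During a write, the previously stored value is returned.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end rtl;
```

Other aspect ratios are obtained by trading ADDR_WIDTH against DATA_WIDTH, for example 11 address bits with 8 data bits for the same 16K data bits.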


10. HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability
Technology independence
Design reuse
Several levels of abstraction
Readability
Standard language
Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)
Design specification language
Design entry language
Design simulation language
Design documentation language
An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers
Funded by the US Department of Defense
The first publicly available version was released in 1985
In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL
In 1987, standardization => IEEE 1076-1987
An improved version of the language was released in 1994 => IEEE Standard 1076-1993

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance
Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard
The IEEE 1164 standard was developed by the IEEE to address this
IEEE 1164 contains definitions for a nine-valued data type, std_logic
IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware
It defines, e.g., two numeric types: signed and unsigned
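The following simulation-only sketch illustrates the two related standards. The nine std_logic values defined by IEEE 1164 are 'U', 'X', '0', '1', 'Z', 'W', 'L', 'H', and '-'; the numeric_std package of IEEE 1076.3 gives the same bit pattern a different numeric meaning depending on whether it is declared signed or unsigned. The entity name types_demo and the signal names are assumptions for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity types_demo is
end types_demo;

architecture sim of types_demo is
  signal bus_line : std_logic := 'Z';                -- high impedance, not expressible with type bit
  signal u        : unsigned(3 downto 0) := "1111";  -- read as 15
  signal s        : signed(3 downto 0)   := "1111";  -- read as -1 (two's complement)
begin
  process
  begin
    wait for 1 ns;
    assert bus_line = 'Z'    report "bus_line should be undriven"    severity error;
    assert to_integer(u) = 15 report "unsigned interpretation failed" severity error;
    assert to_integer(s) = -1 report "signed interpretation failed"   severity error;
    wait;  -- suspend the process forever
  end process;
end sim;
```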

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library

Entities

A black box with an interface definition
Defines the inputs/outputs of a component (defines pins)
A way to represent modularity in VHDL
Similar to a symbol in a schematic
An entity declaration describes the entity

E.g.:

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment
Each port must have a name, a direction, and a type
An entity may have NO port declaration

Port directions

In: The value of a port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.
Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.
Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
Buffer: An out port with read capability. May have at most one assignment (not recommended).
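One practical consequence of these rules is the common idiom of keeping an internal signal that can be read, and copying it to the out port. The pixelgen code in this document does the same with addr_s and addr. The sketch below is a hypothetical 4-bit counter (the name counter4 and its ports are assumptions) that follows this idiom instead of using a buffer port. Note that the restriction on reading out ports applies to the VHDL versions discussed here; VHDL-2008 relaxed it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity counter4 is
  port (
    clk   : in  std_logic;                    -- in: may be read, never assigned
    rstn  : in  std_logic;                    -- active-low reset
    count : out std_logic_vector(3 downto 0)  -- out: assigned, never read here
  );
end counter4;

architecture rtl of counter4 is
  signal count_s : unsigned(3 downto 0);  -- internal copy that can be read
begin
  count <= std_logic_vector(count_s);

  process (clk, rstn)
  begin
    if rstn = '0' then
      count_s <= (others => '0');
    elsif rising_edge(clk) then
      count_s <= count_s + 1;  -- reads count_s, not the out port
    end if;
  end process;
end rtl;
```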

Architectures

Every entity has at least one architecture
One entity can have several architectures
Architectures can describe a design using
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
A configuration declaration links an architecture to an entity

E.g.:

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and architecture body together
The concept of a default configuration is a bit messy in VHDL '87
- The last architecture analyzed links to the entity
Can be used to change simulation behavior without re-analyzing the VHDL source
Complex configuration declarations are ignored in synthesis
Some entities can have, e.g., a gate-level architecture and a behavioral architecture
Are always optional
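A minimal sketch of a configuration declaration is given below, reusing the Comparator entity and its Comparator1 architecture from the example above. The second architecture, Comparator2, and the configuration name Comparator_cfg are hypothetical additions made only to show how one of several architectures is selected.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- Dataflow architecture, as in the example above.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Hypothetical second architecture, written as a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end Comparator2;

-- The configuration declaration binds one architecture to the entity.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end Comparator_cfg;
```

Simulating Comparator_cfg uses Comparator2 regardless of which architecture was analyzed last, which avoids relying on the default binding.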

Packages

Packages contain information common to many design units

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations
2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data that can be shared globally among several design units
A package consists of a declaration part and an optional body part

A package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
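The split between package declaration and package body can be sketched as follows. The package name vga_pkg, the subtype pixel_t, and the function invert are hypothetical names introduced for illustration. The value 794 corresponds to the horizontal counter in the synchronization block code of this document, which counts from 0 to 793.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the publicly visible part.
package vga_pkg is
  constant H_TOTAL : natural;                       -- deferred constant, value given in the body
  subtype pixel_t is std_logic_vector(0 downto 0);  -- 1-bit pixel, as used by the memory data port
  function invert(p : pixel_t) return pixel_t;      -- function declaration only
end vga_pkg;

-- Package body: values of deferred constants and subprogram bodies.
package body vga_pkg is
  constant H_TOTAL : natural := 794;  -- pixel clocks per line (counter runs 0 to 793)

  function invert(p : pixel_t) return pixel_t is
  begin
    return not p;
  end invert;
end vga_pkg;
```

A design unit would make these names visible with "use work.vga_pkg.all;" placed before its entity or architecture.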

Libraries

Collection of VHDL design units (database)

1. Packages
- package declaration
- package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in a UNIX file system
Can also be any other kind of database

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware
It is possible to describe a circuit in a number of ways:

Structural -> Dataflow -> Behavioral (increasing level of abstraction)

Structural VHDL Description

Circuit is described in terms of its components
From a low-level description (e.g., transistor-level description) to a high-level description (e.g., block diagram)
For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system
In the dataflow style, you describe how information flows between registers in the system
The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely
The behavior of the system over time is defined by registers
There are no built-in registers in the VHDL language
- Either a lower-level description
- or a behavioral description of sequential elements is needed
The lower-level register descriptions must be created or obtained
If there are no 3rd-party models for registers => you must write the behavioral description of the registers
The behavioral description can be provided in the form of subprograms (functions or procedures)
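Since the language has no built-in register, a register is usually written as a small edge-triggered process. The sketch below combines a concurrent (dataflow) assignment for the combinational logic with such a behavioral register description. The entity name reg_xor and its ports are assumptions for the example, and the process form shown is one common way to describe a register, not the only one.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity reg_xor is
  port (
    clk  : in  std_logic;
    a, b : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end reg_xor;

architecture dataflow of reg_xor is
  signal d : std_logic_vector(7 downto 0);
begin
  -- Combinational logic, described at a relatively high level.
  d <= a xor b;

  -- Behavioral description of the register: q takes the value of d
  -- on each rising clock edge and holds it otherwise.
  process (clk)
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end process;
end dataflow;
```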

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time
Representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions
The concept of time may be expressed precisely using delays (e.g., A <= B after 10 ns)
If no actual delay is used, the order of sequential operations is defined
At the lower levels of abstraction (e.g., RTL), synthesis tools ignore detailed timing specifications
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool
There are a few tools for behavioral synthesis
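The delay example from the list above can be tried in a simulator with a sketch like the following. It is simulation-only, since synthesis tools ignore the "after" clause as noted; the entity name delay_demo and the check times are assumptions chosen for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity delay_demo is
end delay_demo;

architecture sim of delay_demo is
  signal a : std_logic := '0';
  signal b : std_logic := '0';
begin
  -- Behavioral timing: a follows b 10 ns later.
  a <= b after 10 ns;

  process
  begin
    b <= '1';
    wait for 5 ns;
    assert a = '0' report "a changed before the 10 ns delay elapsed" severity error;
    wait for 10 ns;  -- simulation time is now 15 ns
    assert a = '1' report "a did not follow b after the delay" severity error;
    wait;  -- suspend the process forever
  end process;
end sim;
```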

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL
A VHDL description can always be broken up into interconnected processes
Quite similar to a Unix process
Process keyword in VHDL
A process statement is a concurrent statement
Statements inside process statements are sequential statements
A process must contain either a sensitivity list or wait statement(s), but NOT both
The sensitivity list or wait statement(s) contain the signals that wake the process up

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
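The two alternatives in the general format can be compared side by side. In the sketch below, both processes describe the same AND function: one is woken by its sensitivity list, the other by an explicit wait statement at the end of its body. The entity name and_gate and the port names are assumptions for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_gate is
  port (
    x, y   : in  std_logic;
    z1, z2 : out std_logic
  );
end and_gate;

architecture behv of and_gate is
begin
  -- Form 1: sensitivity list. The process runs whenever x or y changes.
  p1 : process (x, y)
  begin
    z1 <= x and y;
  end process;

  -- Form 2: no sensitivity list. The process runs once, then suspends
  -- at the wait statement until x or y changes.
  p2 : process
  begin
    z2 <= x and y;
    wait on x, y;
  end process;
end behv;
```

The statements inside each process execute sequentially, while p1 and p2 themselves run concurrently with each other.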

11. VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0)
  );
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic
    );
  END component;

  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;    -- mode matches the sync_unit entity declaration
      c_hs       : out integer
    );
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic
    );
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide clk50m by two to generate the internal clock clk.
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;   -- horizontal sync pulse (row)
    vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

    signal h_en_s, v_en_s : std_logic;
    signal line_count_s, c_hs_s : integer;

begin

    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- hsync counter: counts pixel clocks 0..793 within each line
    process(clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        else
            if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                    c_hs_s <= 0;
                else
                    c_hs_s <= c_hs_s + 1;
                end if;
            end if;
        end if;
    end process;

    -- vsync counter
    -- vertical line counter, advanced on each rising edge of h_en_s
    process(h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            if (line_count_s = 515) then
                line_count_s <= 0;
            end if;
        end if;
        --end if;
    end process;

    -- hsync pulse: low while c_hs_s <= 95
    process(clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
        --end if;
    end process;

    -- Horizontal Enable (row information will be sent in this period)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
        --end if;
    end process;

    -- Vertical Enable (data is valid in this region)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35 and line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
        --end if;
    end process;

    -- vsync pulse (vertical sync, front porch, back porch): low while line_count_s < 2
    process(clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
        --end if;
    end process;

    -- v_en delayed by one clock; pixelgen compares it with v_en to detect the start of a frame
    process(clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;
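
The counter limits in the synchronization block appear as bare literals. As a minimal sketch, assuming one wanted to name them, they could be collected in a package. The package name and constant names below are hypothetical; the values are simply the literals used in sync_unit.

```vhdl
-- Hypothetical package: names are illustrative, values are the literals
-- that appear in sync_unit.
package vga_timing_pkg is
    -- Horizontal timing, in pixel clocks (clk is clk50m divided by two)
    constant H_COUNT_MAX  : integer := 793;  -- c_hs_s wraps to 0 after this value
    constant H_SYNC_END   : integer := 95;   -- hsync is driven low while c_hs_s <= 95
    constant H_EN_START   : integer := 143;  -- first count for which h_en is set
    constant H_EN_END     : integer := 778;  -- last count for which h_en is set
    -- Vertical timing, in lines
    constant V_COUNT_MAX  : integer := 515;  -- line_count_s wraps to 0 after this value
    constant V_SYNC_LINES : integer := 2;    -- vsync is driven low while line_count_s < 2
    constant V_EN_START   : integer := 35;   -- first line for which v_en is set
    constant V_EN_END     : integer := 515;  -- last line for which v_en is set
end package vga_timing_pkg;
```

With "use work.vga_timing_pkg.all;" ahead of the entity, a comparison such as (c_hs_s = 793) could then be written as (c_hs_s = H_COUNT_MAX) without changing behavior.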


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

    signal addr_s : std_logic_vector(17 downto 0);

begin

    addr <= addr_s;

    process(clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- Rising edge of v_en (start of a frame): restart the read address
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- Picture area: show the memory bit on all three colors
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        -- Right of the picture area: green
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    -- Below the picture area: white
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- Wrap the address at the last memory location used
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                -- Blanking: all colors off
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
        -- end if;
        --end if;
    end process;

end behv;


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array, the XC2064, in 1985. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternative approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, which the company credits for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software, and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

Figure 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block


There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they are managed separately from other signals.
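
The logic block described above (a 4-input LUT, a flip-flop, and a single output that is either registered or unregistered) can be modeled behaviorally. The following is only an illustrative sketch: the entity name logic_block and the generics LUT_INIT and REGISTERED are assumptions made for this example, and it is not a vendor primitive.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical model of a classic logic block: 4-input LUT plus flip-flop.
entity logic_block is
    generic (
        -- 16-entry truth table; x"8000" sets only entry 15, i.e. a 4-input AND
        LUT_INIT   : std_logic_vector(15 downto 0) := x"8000";
        -- true selects the flip-flop output, false the raw LUT output
        REGISTERED : boolean := true
    );
    port (
        clk : in  std_logic;
        i   : in  std_logic_vector(3 downto 0);
        o   : out std_logic
    );
end logic_block;

architecture rtl of logic_block is
    signal lut_out : std_logic;
    signal ff_out  : std_logic := '0';
begin
    -- The LUT is a 16 x 1 truth table addressed by the four inputs.
    lut_out <= LUT_INIT(to_integer(unsigned(i)));

    -- The flip-flop samples the LUT output on each rising clock edge.
    process (clk)
    begin
        if rising_edge(clk) then
            ff_out <= lut_out;
        end if;
    end process;

    -- Single output: registered or unregistered LUT output.
    o <= ff_out when REGISTERED else lut_out;
end rtl;
```

Changing LUT_INIT changes the logic function without changing the structure, which is the sense in which the block is programmable.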

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although, in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Hardware design verification together with early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.


Reasons why prototyping is important:

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with relying only on software simulations to verify that the hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This helps uncover many unforeseen software bugs that appear due to today's array of operating systems, applications, and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.


13 SIMULATION RESULTS

GATE LEVEL

Fig VGA CONTROLLER

GATE LEVEL

Fig SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, a processor-generated picture (pixelated) is taken from memory space, and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.




The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0, and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the Nano Board itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations (4 x 4 x 4 = 64).

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data-width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.


Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
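
As a rough illustration of the synchronous, single-port behavior described above, a 16K x 1 memory can be written behaviorally as follows. This is a sketch under assumptions: the entity name spram_16kx1 and its port names are invented for this example, and whether a synthesis tool maps the description onto block RAM depends on the tool and its settings.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical single-port synchronous RAM, 16,384 locations of 1 bit.
entity spram_16kx1 is
    port (
        clk  : in  std_logic;
        en   : in  std_logic;                      -- clock enable
        we   : in  std_logic;                      -- write enable
        addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16,384 locations
        din  : in  std_logic;
        dout : out std_logic
    );
end spram_16kx1;

architecture rtl of spram_16kx1 is
    type ram_t is array (0 to 16383) of std_logic;
    signal ram : ram_t := (others => '0');
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if en = '1' then
                if we = '1' then
                    ram(to_integer(unsigned(addr))) <= din;
                end if;
                -- Synchronous read: dout only changes on an enabled clock edge.
                -- During a write, the previously stored value is read out.
                dout <= ram(to_integer(unsigned(addr)));
            end if;
        end if;
    end process;
end rtl;
```

Both the write and the read are gated by the clock edge and the clock enable, which matches the statement that read operations are synchronous as well.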


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers

Funded by the U.S. Department of Defense

The first publicly available version was released in 1985

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL

In 1987 standardization => IEEE 1076-1987

An improved version of the language was released in 1994 => IEEE standard 1076-1993

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard

The IEEE 1164 standard was developed by the IEEE

IEEE 1164 contains definitions for a nine-valued data type, std_logic

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware

Defines e.g. two numeric types: signed and unsigned
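
To make the role of these standards concrete, the sketch below uses std_logic from IEEE 1164 (including its high-impedance value 'Z') and the unsigned type from IEEE 1076.3 through the numeric_std package. The entity name std_types_demo and its ports are hypothetical and serve only as an illustration.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;  -- std_logic, std_logic_vector (IEEE 1164)
use IEEE.numeric_std.all;     -- signed, unsigned (IEEE 1076.3)

entity std_types_demo is
    port (
        oe   : in  std_logic;
        a, b : in  std_logic_vector(7 downto 0);
        sum  : out std_logic_vector(7 downto 0);
        pad  : out std_logic
    );
end std_types_demo;

architecture rtl of std_types_demo is
begin
    -- 'Z' (high impedance) is one of the nine std_logic values.
    pad <= a(0) when oe = '1' else 'Z';

    -- Arithmetic is done on the unsigned type, then converted back.
    -- The result has the same width as the operands, so it wraps modulo 256.
    sum <= std_logic_vector(unsigned(a) + unsigned(b));
end rtl;
```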

VHDL Environment

Design Units


Segments of VHDL code that can be compiled separately and stored in a

library

Entities

A black box with interface definition

Defines the inputs/outputs of a component (define pins)

A way to represent modularity in VHDL

Similar to symbol in schematic

Entity declaration describes entity

E.g.

entity Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
end Comparator;

Ports: Provide channels of communication between the component and its environment

Each port must have a name, a direction, and a type

An entity may have NO port declaration

Port directions:

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).
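
The four port directions can be seen side by side in a small example. This is an illustrative sketch; the entity port_modes_demo and all of its port names are made up for the purpose.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity port_modes_demo is
    port (
        clk   : in     std_logic;
        d     : in     std_logic;   -- in: may be read, never assigned
        oe    : in     std_logic;
        q     : out    std_logic;   -- out: assigned here, not read back
        io    : inout  std_logic;   -- inout: both read and assigned
        q_buf : buffer std_logic    -- buffer: an output that can also be read
    );
end port_modes_demo;

architecture rtl of port_modes_demo is
begin
    process (clk)
    begin
        if rising_edge(clk) then
            q_buf <= d;              -- the single assignment to the buffer port
        end if;
    end process;

    q  <= q_buf xor io;              -- reads a buffer port and an inout port
    io <= d when oe = '1' else 'Z';  -- drives the inout port, or releases it
end rtl;
```

A common alternative to buffer ports is an internal signal that is both read inside the architecture and copied to an out port.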

Architectures: Every entity has at least one architecture

One entity can have several architectures

Architectures can describe design using
- Behavior
- Structure
- Dataflow

Architectures can describe design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

Configuration declaration links architecture to entity

E.g.

Architecture Comparator1 of Comparator is
Begin
    EQ <= '1' when (A = B) else '0';
End Comparator1;

Configurations: Link entity declaration and architecture body together

The concept of default configuration is a bit messy in VHDL '87
- The last architecture analyzed links to the entity

Can be used to change simulation behavior without re-analyzing the VHDL source

Complex configuration declarations are ignored in synthesis

Some entities can have e.g. a gate-level architecture and a behavioral architecture

Are always optional
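
A configuration declaration that selects one of two architectures can be sketched as follows. The Comparator entity is repeated so that the example is self-contained; the architecture names Dataflow and Behavioral and the configuration name Comparator_cfg are chosen only for illustration.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
end Comparator;

-- First architecture: concurrent (dataflow) description
architecture Dataflow of Comparator is
begin
    EQ <= '1' when (A = B) else '0';
end Dataflow;

-- Second architecture: the same function described with a process
architecture Behavioral of Comparator is
begin
    process (A, B)
    begin
        if A = B then
            EQ <= '1';
        else
            EQ <= '0';
        end if;
    end process;
end Behavioral;

-- The configuration declaration picks the architecture bound to the entity.
configuration Comparator_cfg of Comparator is
    for Behavioral
    end for;
end Comparator_cfg;
```

Replacing "for Behavioral" with "for Dataflow" switches the bound architecture without editing either architecture body.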

Packages: Packages contain information common to many design units

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units

Consists of a declaration part and an optional body part

Package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

Package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
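
The split between package declaration and package body can be illustrated with a small package. The names demo_pkg, DATA_WIDTH, byte_t, and parity are hypothetical and exist only for this sketch.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Package declaration: what other design units can see
package demo_pkg is
    constant DATA_WIDTH : integer := 8;                            -- constant declaration
    subtype byte_t is std_logic_vector(DATA_WIDTH - 1 downto 0);   -- subtype declaration
    function parity (v : byte_t) return std_logic;                 -- function declaration
end package demo_pkg;

-- Package body: the implementation of the declared function
package body demo_pkg is
    function parity (v : byte_t) return std_logic is
        variable p : std_logic := '0';
    begin
        -- XOR of all bits: '1' when an odd number of bits are '1'
        for i in v'range loop
            p := p xor v(i);
        end loop;
        return p;
    end function parity;
end package body demo_pkg;
```

A design unit compiled into the same library would make these names visible with "use work.demo_pkg.all;".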

Libraries: Collection of VHDL design units (database)

1. Packages
- package declaration
- package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in a UNIX file system

Can also be any other kind of database

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware. A circuit can be described in a number of ways, listed here from the lowest to the highest level of abstraction:

- Structural
- Dataflow
- Behavioral

Structural VHDL Description

The circuit is described in terms of its components. The description can range from low level (e.g. a transistor-level description) to high level (e.g. a block diagram). For large circuits, a low-level description quickly becomes impractical.
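A structural description can be sketched with a half adder built from two gate components. The entities and2_gate, xor2_gate and half_adder are hypothetical examples, not part of the VGA controller.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity and2_gate is
  port (a, b : in std_logic; y : out std_logic);
end and2_gate;

architecture rtl of and2_gate is
begin
  y <= a and b;
end rtl;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity xor2_gate is
  port (a, b : in std_logic; y : out std_logic);
end xor2_gate;

architecture rtl of xor2_gate is
begin
  y <= a xor b;
end rtl;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity half_adder is
  port (a, b : in std_logic; s, c : out std_logic);
end half_adder;

-- Structural architecture: only component instances and their wiring.
architecture structural of half_adder is
  component and2_gate
    port (a, b : in std_logic; y : out std_logic);
  end component;
  component xor2_gate
    port (a, b : in std_logic; y : out std_logic);
  end component;
begin
  u_xor : xor2_gate port map (a => a, b => b, y => s);  -- sum bit
  u_and : and2_gate port map (a => a, b => b, y => c);  -- carry bit
end structural;
```

The architecture contains no logic of its own; its behavior follows entirely from the components and how they are connected.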

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system. In the dataflow style you describe how information flows between the registers in the system. The combinational logic is described at a relatively high level, while the placement and operation of the registers is specified quite precisely. The behavior of the system over time is defined by the registers.

There are no built-in registers in the VHDL language, so one of the following is needed:
- a lower-level description of the registers, or
- a behavioral description of the sequential elements.

The lower-level register descriptions must be created or obtained. If there are no third-party models for the registers, you must write the behavioral description of the registers yourself. The behavioral description can be provided in the form of subprograms (functions or procedures).
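A minimal sketch of the dataflow style, assuming a hypothetical entity masked_reg: the combinational logic is a single concurrent assignment, and the register is written out behaviorally as a clocked process because VHDL has no built-in register primitive.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity masked_reg is
  port (clk  : in  std_logic;
        rstn : in  std_logic;  -- active-low asynchronous reset
        a, b : in  std_logic_vector(7 downto 0);
        q    : out std_logic_vector(7 downto 0));
end masked_reg;

architecture dataflow of masked_reg is
  signal d : std_logic_vector(7 downto 0);
begin
  -- Combinational logic, described at a high level.
  d <= a and b;

  -- Register, described behaviorally: q follows d on each rising clock edge.
  reg : process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');
    elsif rising_edge(clk) then
      q <= d;
    end if;
  end process reg;
end dataflow;
```

The process is used here instead of a subprogram only because it is the most common way to write a register; a procedure wrapping the same statements would serve equally well.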

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time. The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns). If no actual delay is used, only the order of the sequential operations is defined.

At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications. The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.
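The delay example from the text can be placed in a small simulation-only sketch. The entity name delay_demo and the stimulus timing are assumptions made for illustration.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity delay_demo is
end delay_demo;

architecture behavioral of delay_demo is
  signal B : std_logic := '0';
  signal A : std_logic;
begin
  -- Explicit delay: A takes the value of B 10 ns later.
  -- A synthesis tool would ignore the "after" clause.
  A <= B after 10 ns;

  -- Sequential stimulus: the wait statements define both order and time.
  stimulus : process
  begin
    B <= '1';
    wait for 25 ns;
    B <= '0';
    wait for 25 ns;
    wait;  -- suspend the process for the rest of the simulation
  end process stimulus;
end behavioral;
```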

Concurrent vs. Sequential

Processes

The process is the basic simulation concept in VHDL. A VHDL description can always be broken up into interconnected processes, which are quite similar to Unix processes.

The process keyword in VHDL

A process statement is a concurrent statement.

Statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
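The two allowed forms of a process can be compared side by side. In this sketch (the entity name process_demo is hypothetical) both processes describe the same AND gate, once with a sensitivity list and once with an equivalent wait statement at the end.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity process_demo is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end process_demo;

architecture behavioral of process_demo is
begin
  -- Form 1: sensitivity list, no wait statement.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, explicit wait statement.
  -- "wait on a, b" as the last statement behaves like the list above.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process with_wait;
end behavioral;
```

Both processes wake up whenever a or b changes, so y1 and y2 always carry the same value.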

11. VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic);
  END component;

  component sync_unit
    port(
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  -- horizontal sync pulse (row)
      vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer);
  end component;

  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the 50 MHz input clock by two to obtain the pixel clock.
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
    --end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

  signal h_en_s, v_en_s       : std_logic;
  signal line_count_s, c_hs_s : integer;

begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
    --end if;
  end process;

  -- hsync pulse
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  -- vsync pulse
  -- Vertical Enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  -- Vertical Sync, Front Porch, Back Porch
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  -- Registered copy of the vertical enable (delayed by one clock)
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
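The numeric limits used in the synchronization block can be gathered into one package so that the timing is easier to read and check. This is only a sketch: the package name sync_timing_pkg and all constant names are hypothetical, and the 25 MHz pixel clock assumes that clk50m really is a 50 MHz clock divided by two, as its name suggests.

```vhdl
package sync_timing_pkg is
  -- Assumption: clk50m is 50 MHz, so the divided clock runs at 25 MHz.
  constant PIXEL_CLK_HZ : natural := 25_000_000;

  -- Horizontal timing, in pixel clocks (values taken from sync_unit).
  constant H_TOTAL    : natural := 794;  -- c_hs counts 0 .. 793
  constant H_SYNC_END : natural := 95;   -- hsync is low while c_hs <= 95
  constant H_EN_FIRST : natural := 143;  -- h_en is high for 143 .. 778
  constant H_EN_LAST  : natural := 778;

  -- Vertical timing, in lines (values taken from sync_unit).
  constant V_TOTAL      : natural := 516;  -- line_count counts 0 .. 515
  constant V_SYNC_LINES : natural := 2;    -- vsync is low while line_count < 2
  constant V_EN_FIRST   : natural := 35;   -- v_en is high from line 35 onward

  -- Derived rates (integer division, so the results are truncated).
  constant LINE_RATE_HZ  : natural := PIXEL_CLK_HZ / H_TOTAL;
  constant FRAME_RATE_HZ : natural := PIXEL_CLK_HZ / (H_TOTAL * V_TOTAL);
end sync_timing_pkg;
```

Under the 25 MHz assumption these counts give a line rate of roughly 31.5 kHz and a frame rate of roughly 61 Hz, close to the nominal 60 Hz of the 640x480 VGA mode.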

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is

  signal addr_s : std_logic_vector(17 downto 0);

begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: restart the memory address at the top of the frame.
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture area: read the next pixel from memory.
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last stored pixel.
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
    -- end if;
    --end if;
  end process;

end behv;

12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field-programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternative to hard-macro processors is to use soft processor cores that are implemented within the FPGA logic.

Many modern FPGAs can be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

- 1987: 9,000 gates, Xilinx
- 1992: 600,000 gates, Naval Surface Warfare Department
- Early 2000s: millions

Market size

- 1985: first commercial FPGA technology invented by Xilinx
- 1987: $14 million
- ~1993: >$385 million
- 2005: $1.9 billion
- 2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages of FPGAs include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

- IC costs are rising aggressively
- ASIC complexity has increased development time and costs
- R&D resources and headcount are decreasing
- Revenue losses for slow time-to-market are increasing
- Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, and the company attributes the growing number of FPGA design starts to them (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA described here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
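The logic block just described can be modeled behaviorally as a 16-entry truth table followed by an optional flip-flop. This is a simulation sketch under stated assumptions, not a vendor primitive: the entity logic_block and its generics INIT and REGISTERED are hypothetical names.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity logic_block is
  generic (
    -- Truth table of the 4-input LUT: bit n is the output for input value n.
    -- x"8000" sets only entry 15, i.e. a 4-input AND.
    INIT       : std_logic_vector(15 downto 0) := x"8000";
    -- Selects the registered or the unregistered LUT output.
    REGISTERED : boolean := true);
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);
    o   : out std_logic);
end logic_block;

architecture behavioral of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a table lookup: the four inputs form the table index.
  lut_out <= INIT(to_integer(unsigned(i)));

  -- The flip-flop samples the LUT output on the rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Single output: either the registered or the combinational value.
  o <= ff_out when REGISTERED else lut_out;
end behavioral;
```

Changing INIT changes the logic function without changing the structure, which is the essence of how a LUT-based FPGA is configured.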

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to the actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, produces a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog. In an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or a similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate level description where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
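The first simulation stage, an RTL test bench, can be sketched for the Comparator example used in the VHDL chapter of this report. The test bench name Comparator_tb, the stimulus values and the 10 ns settling time are assumptions made for illustration.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Design under test: the Comparator example.
entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- A test bench has no ports: it generates stimuli and checks results itself.
entity Comparator_tb is
end Comparator_tb;

architecture test of Comparator_tb is
  signal A, B : std_logic_vector(7 downto 0) := (others => '0');
  signal EQ   : std_logic;
begin
  dut : entity work.Comparator(Comparator1)
    port map (A => A, B => B, EQ => EQ);

  stimulus : process
  begin
    -- Equal inputs must raise EQ.
    A <= x"3C"; B <= x"3C";
    wait for 10 ns;
    assert EQ = '1' report "equal inputs should give EQ = 1" severity error;

    -- Different inputs must clear EQ.
    A <= x"3C"; B <= x"3D";
    wait for 10 ns;
    assert EQ = '0' report "different inputs should give EQ = 0" severity error;

    report "Comparator test bench finished";
    wait;  -- stop the stimulus process
  end process stimulus;
end test;
```

The same test bench can in principle be reused against the gate-level netlist after synthesis, which is what makes the repeated simulation described above practical.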

Basic Process Technology Types

- SRAM: based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
- Antifuse: one-time programmable. CMOS.
- EPROM: Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
- EEPROM: Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
- Flash: flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
- Fuse: one-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method of doing this.

Reasons why prototyping is important:

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be caught.

3. Prototyping also allows the developer to ensure that all IP technologies in the system work well together, off the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13. SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14. APPLICATIONS

- Motion tracking systems
- Image processing systems
- Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the VGA Controller takes a processor-generated picture (pixelated) from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array
2. Eduardo Sanchez, "A VGA display controller"
3. Hinner, "VGA Interface and video signal documents"


For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each of the three analog inputs being one of four possible levels, the monitor can display each pixel on the screen in one of 4 x 4 x 4 = 64 different color combinations.

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan-3 Generation FPGAs provide plentiful, efficient SelectRAM memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator system or via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits are allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM can also act as a single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
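The synchronous single-port behavior described above can be sketched as a generic VHDL memory. This is a behavioral description under stated assumptions, not the CORE Generator output used in the project: the entity name single_port_ram and its port names are hypothetical, and whether a given synthesis tool maps it onto block RAM depends on the tool and its settings.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity single_port_ram is
  generic (
    -- 2**14 x 1 bit = 16K bits, the data capacity of one block RAM.
    ADDR_WIDTH : natural := 14);
  port (
    clk  : in  std_logic;
    en   : in  std_logic;   -- clock enable
    we   : in  std_logic;   -- write enable
    addr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    din  : in  std_logic_vector(0 downto 0);
    dout : out std_logic_vector(0 downto 0));
end single_port_ram;

architecture rtl of single_port_ram is
  type mem_t is array (0 to 2**ADDR_WIDTH - 1) of std_logic_vector(0 downto 0);
  -- Placeholder contents: all zeros.
  signal mem : mem_t := (others => "0");
begin
  -- Reads and writes both happen on the clock edge and need the enable,
  -- matching the synchronous behavior described in the text.
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          mem(to_integer(unsigned(addr))) <= din;
        end if;
        -- The old value is read when a write hits the same address.
        dout <= mem(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end rtl;
```

Because the read is registered, data appears on dout one clock after the address is applied, which a pixel generator reading from such a memory has to account for.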

10. HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

- Interoperability
- Technology independence
- Design reuse
- Several levels of abstraction
- Readability
- Standard language
- Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

- Design specification language
- Design entry language
- Design simulation language
- Design documentation language
- An alternative to schematics

10.1 Brief History

- VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.
- It was funded by the US Department of Defense.
- The first publicly available version was released in 1985.
- In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.
- In 1987, standardization => IEEE 1076-1987.
- An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

- IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.
- Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard.
- The IEEE 1164 standard was developed by the IEEE.
- IEEE 1164 contains definitions for a nine-valued data type, std_logic.
- IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.
- It defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

- A black box with an interface definition
- Defines the inputs/outputs of a component (defines pins)
- A way to represent modularity in VHDL
- Similar to a symbol in a schematic
- An entity declaration describes an entity

E.g.:

    entity Comparator is
      port (A, B : in  std_logic_vector(7 downto 0);
            EQ   : out std_logic);
    end Comparator;

Ports

- Provide channels of communication between the component and its environment.
- Each port must have a name, a direction and a type.
- An entity may have NO port declaration.

Port directions

- In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.
- Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.
- Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
- Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

- Every entity has at least one architecture.
- One entity can have several architectures.
- Architectures can describe a design using: Behavior, Structure, Dataflow.
- Architectures can describe a design on many levels: Gate level, RTL (Register Transfer Level), Behavioral level.
- A configuration declaration links an architecture to an entity.

E.g.:

    architecture Comparator1 of Comparator is
    begin
      EQ <= '1' when (A = B) else '0';
    end Comparator1;

Configurations

- Link the entity declaration and the architecture body together.
- The concept of a default configuration is a bit messy in VHDL '87: the last architecture analyzed links to the entity.
- Can be used to change simulation behavior without re-analyzing the VHDL source.
- Complex configuration declarations are ignored in synthesis.
- Some entities can have, e.g., a gate-level architecture and a behavioral architecture.
- Are always optional.
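As a sketch of how a configuration declaration selects one of several architectures, the following reuses the Comparator entity from the earlier example. The second architecture Comparator2 and the configuration name Comparator_cfg are hypothetical names introduced only for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

-- First architecture: dataflow style, as in the document's own example.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

-- Second (hypothetical) architecture: behavioral style with a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture Comparator2;

-- Configuration declaration: binds the entity to Comparator2
-- regardless of which architecture was analyzed last.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end configuration Comparator_cfg;
```

Simulating Comparator_cfg instead of the bare entity makes the choice of architecture explicit, which avoids relying on the "last architecture analyzed" default.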

Packages

Packages contain information common to many design units.

1. Package declaration
   - constant declarations
   - type and subtype declarations
   - function and procedure declarations
   - global signal declarations
   - file declarations
   - component declarations

2. Package body
   - is not necessarily needed
   - function bodies
   - procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units. A package consists of a declaration part and an optional body part.

A package declaration can contain:
   - type and subtype declarations
   - subprograms
   - constants
   - alias declarations
   - global signal declarations
   - file declarations
   - component declarations

A package body consists of:
   - subprogram declarations and bodies
   - type and subtype declarations
   - deferred constants
   - file declarations
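A minimal sketch of a package with a declaration part and a body follows. The package name vga_types_pkg, the constant PIXEL_BITS, the subtype pixel_t and the function invert are all hypothetical and exist only to show where each kind of item goes.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the items visible to other design units.
package vga_types_pkg is
  -- Constant declaration (illustrative value).
  constant PIXEL_BITS : natural := 1;
  -- Subtype declaration built from the constant.
  subtype pixel_t is std_logic_vector(PIXEL_BITS - 1 downto 0);
  -- Function declaration; its body is placed in the package body.
  function invert(p : pixel_t) return pixel_t;
end package vga_types_pkg;

-- Package body: needed here only because a function was declared.
package body vga_types_pkg is
  function invert(p : pixel_t) return pixel_t is
  begin
    return not p;
  end function invert;
end package body vga_types_pkg;
```

A design unit would make these names visible with "use work.vga_types_pkg.all;", assuming the package was compiled into the work library.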

Libraries

A collection of VHDL design units (a database):

1. Packages
   - package declaration
   - package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

A library is usually a directory in the UNIX file system, but it can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware. It is possible to describe a circuit in a number of ways:

    Structural -> Dataflow -> Behavioral   (increasing level of abstraction)

Structural VHDL Description

- The circuit is described in terms of its components.
- Ranges from a low-level description (e.g. a transistor-level description) to a high-level description (e.g. a block diagram).
- For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

- The circuit is described in terms of how data moves through the system.
- In the dataflow style you describe how information flows between registers in the system.
- The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.
- The behavior of the system over time is defined by registers.
- There are no built-in registers in the VHDL language:
   - either a lower-level description
   - or a behavioral description of sequential elements is needed.
- The lower-level register descriptions must be created or obtained.
- If there are no third-party models for registers => you must write the behavioral description of registers.
- The behavioral description can be provided in the form of subprograms (functions or procedures).
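Since the language has no built-in register, a behavioral description such as the following is the usual way to obtain one. This is a sketch under assumed names (reg8, d, q, en); the active-low asynchronous reset follows the rstn convention used by the controller code later in this document.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register with clock enable.
entity reg8 is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;                      -- active-low asynchronous reset
    en   : in  std_logic;                      -- load enable
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end entity reg8;

architecture behavioral of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');
    elsif rising_edge(clk) then
      -- q keeps its value when en is low, which is what makes this storage.
      if en = '1' then
        q <= d;
      end if;
    end if;
  end process;
end architecture behavioral;
```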

10.4 Behavioral VHDL Description

- The circuit is described in terms of its operation over time.
- The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.
- The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns;).
- If no actual delay is used, the order of sequential operations is defined.
- In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.
- The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.
- There are a few tools for behavioral synthesis.
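The delay mechanism mentioned above can be tried in a small simulation-only model. The entity name delay_demo and the 50 ns toggle period are arbitrary choices for illustration; the entity has no ports, which is legal as noted in the section on ports.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Simulation-only sketch: an entity without ports.
entity delay_demo is
end entity delay_demo;

architecture sim of delay_demo is
  signal a, b : std_logic := '0';
begin
  -- Stimulus: b toggles every 50 ns for as long as the simulation runs.
  b <= not b after 50 ns;

  -- a follows b 10 ns later. A synthesis tool would ignore this delay;
  -- real timing comes from the implementation technology.
  a <= b after 10 ns;
end architecture sim;
```

Because the stimulus never stops, the simulator has to be given an explicit run time.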

Concurrent vs. Sequential

Processes

- The basic simulation concept in VHDL.
- A VHDL description can always be broken up into interconnected processes.
- Quite similar to a Unix process.
- Introduced by the process keyword in VHDL.
- A process statement is a concurrent statement.
- Statements inside process statements are sequential statements.
- A process must contain either a sensitivity list or wait statement(s), but NOT both.
- The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

    process [(sensitivity_list)]
      process_declarative_part
    begin
      process_statements
      [wait_statement]
    end process;
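The two ways of waking a process up can be put side by side. In this sketch (entity and label names are hypothetical) both processes describe the same AND gate: one uses a sensitivity list, the other an explicit wait statement at the end of the process.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_gate_demo is
  port (
    a, b : in  std_logic;
    y1   : out std_logic;
    y2   : out std_logic
  );
end entity and_gate_demo;

architecture behavioral of and_gate_demo is
begin
  -- Variant 1: sensitivity list. The process runs whenever a or b changes.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Variant 2: no sensitivity list. The process runs once at start-up,
  -- then suspends at the wait statement until a or b changes.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process with_wait;
end architecture behavioral;
```

Mixing the two forms in a single process is not allowed, which is the "NOT both" rule stated above.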

11. VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
  clk50m : in  std_logic;                     -- 50 MHz board clock
  rstn   : in  std_logic;                     -- active-low reset
  hsync  : out std_logic;
  vsync  : out std_logic;
  red    : out std_logic_vector(0 downto 0);
  green  : out std_logic_vector(0 downto 0);
  blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
  port (
    addr  : IN  std_logic_VECTOR(17 downto 0);
    clk   : IN  std_logic;
    dout  : OUT std_logic_VECTOR(0 downto 0);
    en    : IN  std_logic;
    sinit : IN  std_logic);
END component;

component sync_unit port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : out std_logic;   ---horizontal synch pulse(row)
  vsync      : out std_logic;   --vertical sync pulse(frame) 111110100000000000
  v_en_reg   : out std_logic;
  h_en       : out std_logic;
  v_en       : out std_logic;
  line_count : out integer;     -- mode matches the sync_unit entity declaration
  c_hs       : out integer);
end component;

component pixelgen port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- Divide the 50 MHz input clock by two to obtain the pixel clock.
process(clk50m, rstn)
begin
  if(rstn = '0') then
    clk <= '0';
  elsif(clk50m = '1' and clk50m'event) then
    clk <= not clk;
  end if;
  --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
  clk        => clk,
  rstn       => rstn,
  hsync      => hsync_s,
  vsync      => vsync_s,
  v_en_reg   => v_en_reg_s,
  line_count => line_count_s,
  h_en       => h_en_s,
  v_en       => v_en_s,
  c_hs       => c_hs_s
);

pixel : pixelgen port map(
  clk        => clk,
  rstn       => rstn,
  hsync      => hsync_s,
  vsync      => vsync_s,
  h_en       => h_en_s,
  v_en       => v_en_s,
  v_en_reg   => v_en_reg_s,
  data_in    => data_s,
  line_count => line_count_s,
  c_hs       => c_hs_s,
  red        => red,
  green      => green,
  blue       => blue,
  addr       => addr_s,
  rd         => rd_s
);

mem12 : memory port map(
  addr  => addr_s,
  clk   => clk,
  dout  => data_s,
  en    => rd_s,
  sinit => rstn
);

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : out std_logic;   ---horizontal synch pulse(row)
  vsync      : out std_logic;   --vertical sync pulse(frame) 111110100000000000
  v_en_reg   : out std_logic;
  h_en       : out std_logic;
  v_en       : out std_logic;
  line_count : out integer;
  c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs       <= c_hs_s;
line_count <= line_count_s;
h_en       <= h_en_s;
v_en       <= v_en_s;

-- hsync counter: counts pixel clocks 0..793 within one line
process(clk, rstn)
begin
  if(rstn = '0') then
    c_hs_s <= 0;
  else
    if(clk = '1' and clk'event) then
      if(c_hs_s = 793) then
        c_hs_s <= 0;
      else
        c_hs_s <= c_hs_s + 1;
      end if;
    end if;
  end if;
end process;

--vsync counter
--vertical line counter: advances on each rising edge of h_en_s, wraps after 515
process(h_en_s, rstn)
begin
  if(rstn = '0') then
    line_count_s <= 0;
  elsif(h_en_s = '1' and h_en_s'event) then
    line_count_s <= line_count_s + 1;
    if (line_count_s = 515) then
      line_count_s <= 0;
    end if;
  end if;
  --end if;
end process;

--hsync pulse
process(clk, rstn)
begin
  if(rstn = '0') then
    hsync <= '1';
  elsif(clk = '1' and clk'event) then
    if(c_hs_s <= 95) then
      hsync <= '0';
    else
      hsync <= '1';
    end if;
  end if;
  --end if;
end process;

process(clk, rstn) ----Horizontal Enable (Row Information Will be Sent in this period)
begin
  if(rstn = '0') then
    h_en_s <= '0';
  elsif(clk = '1' and clk'event) then
    if((c_hs_s >= 143) and (c_hs_s <= 778)) then
      h_en_s <= '1';
    else
      h_en_s <= '0';
    end if;
  end if;
  --end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
  if(rstn = '0') then
    v_en_s <= '0';
  elsif(clk = '1' and clk'event) then
    if((line_count_s >= 35 and line_count_s <= 515)) then
      v_en_s <= '1';
    else
      v_en_s <= '0';
    end if;
  end if;
  --end if;
end process;

process(clk, rstn) --Vertical Sync Front porch Back Porch
begin
  if(rstn = '0') then
    vsync <= '1';
  elsif(clk = '1' and clk'event) then
    if(line_count_s < 2) then
      vsync <= '0';
    else
      vsync <= '1';
    end if;
  end if;
  --end if;
end process;

-- One-clock delayed copy of v_en, used by pixelgen to detect the start of a frame
process(clk)
begin
  if(clk = '1' and clk'event) then
    v_en_reg <= v_en_s;
  end if;
end process;

end arch_sync_unit;
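The constants in the synchronization block fix the line and frame rates. The following short derivation is a worked check based only on the values that appear in the code (the divide-by-two clock process in the top level, the counter limits 793 and 515, and the hsync threshold 95); the symbol names are introduced here for illustration.

```latex
\begin{align*}
f_{\mathrm{clk}}   &= \frac{50\ \mathrm{MHz}}{2} = 25\ \mathrm{MHz}
  && \text{(clk toggles on every rising edge of clk50m)} \\
N_h                &= 793 + 1 = 794
  && \text{(c\_hs\_s counts } 0 \ldots 793\text{)} \\
f_{\mathrm{line}}  &= \frac{f_{\mathrm{clk}}}{N_h} = \frac{25\,000\,000}{794} \approx 31.49\ \mathrm{kHz} \\
N_v                &= 515 + 1 = 516
  && \text{(line\_count\_s counts } 0 \ldots 515\text{)} \\
f_{\mathrm{frame}} &= \frac{f_{\mathrm{line}}}{N_v} \approx \frac{31\,486}{516} \approx 61.0\ \mathrm{Hz} \\
t_{\mathrm{hsync}} &= \frac{95 + 1}{f_{\mathrm{clk}}} = \frac{96}{25\ \mathrm{MHz}} = 3.84\ \mu\mathrm{s}
\end{align*}
```

For comparison, the nominal 640x480 VGA mode uses a 25.175 MHz pixel clock with 800 clocks per line and 525 lines per frame, giving about 31.47 kHz and 59.94 Hz, so the values above are close to, but not identical with, the nominal timing.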

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
  if(rstn = '0') then
    rd     <= '0';
    red    <= "0";
    green  <= "0";
    blue   <= "0";
    addr_s <= x"0000" & "00";
  elsif (clk = '1' and clk'event) then
    -- Rising edge of v_en (start of the visible frame): restart the address
    if((v_en = '1') and (v_en_reg = '0')) then
      addr_s <= (others => '0');
    end if;
    if(v_en = '1' and h_en = '1') then
      if(line_count <= 514) then
        if(c_hs <= 655) then
          -- Inside the picture area: read memory and show the pixel on all three colours
          rd     <= '1';
          red    <= data_in;
          green  <= data_in;
          blue   <= data_in;
          addr_s <= addr_s + "000000000000000001";
        else
          rd    <= '0';
          red   <= "0";
          green <= "1";
          blue  <= "0";
        end if;
      else
        rd    <= '0';
        red   <= "1";
        green <= "1";
        blue  <= "1";
      end if;
      -- Wrap after address 245759 (binary 111011111111111111)
      if(addr_s = "111011111111111111") then
        addr_s <= (others => '0');
      end if;
    else
      -- Blanking: drive black
      red   <= "0";
      green <= "0";
      blue  <= "0";
    end if;
  end if;
  -- end if;
  --end if;
end process;

end behv;

12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

Many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

- 1987: 9,000 gates, Xilinx
- 1992: 600,000, Naval Surface Warfare Department
- Early 2000s: millions

Market size

- 1985: First commercial FPGA technology invented by Xilinx
- 1987: $14 million
- ~1993: >$385 million
- 2005: $1.9 billion
- 2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

- IC costs are rising aggressively
- ASIC complexity has bolstered development time and costs
- R&D resources and headcount are decreasing
- Revenue losses for slow time-to-market are increasing
- Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, and the company credits them for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration that lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the state where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

Fig 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
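The classic logic block described above (a 4-input LUT, a flip-flop, and a single output that is either registered or unregistered) can be modelled in a few lines of VHDL. This is a behavioral sketch, not a vendor primitive: the entity name logic_block_sketch, the generics INIT and REGISTERED, and the port names are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity logic_block_sketch is
  generic (
    -- 16-entry truth table; bit i is the output for input value i.
    -- x"8000" sets only bit 15, i.e. a 4-input AND.
    INIT       : std_logic_vector(15 downto 0) := x"8000";
    -- Selects the registered (true) or unregistered (false) LUT output.
    REGISTERED : boolean := false
  );
  port (
    clk    : in  std_logic;
    lut_in : in  std_logic_vector(3 downto 0);
    y      : out std_logic
  );
end entity logic_block_sketch;

architecture rtl of logic_block_sketch is
  signal lut_out : std_logic;
  signal ff_q    : std_logic := '0';
begin
  -- The LUT is a 16x1 memory addressed by the four inputs.
  lut_out <= INIT(to_integer(unsigned(lut_in)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_q <= lut_out;
    end if;
  end process;

  -- Output multiplexer: only one output leaves the block.
  y <= ff_q when REGISTERED else lut_out;
end architecture rtl;
```

In a real device the truth table and the output selection are set by configuration bits rather than generics; generics are used here only because they play the same "fixed at configuration time" role in a simulation model.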

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
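The first simulation stage, an RTL test bench, might look like the following sketch. It exercises the small Comparator entity used as an example earlier in this document; the test bench name Comparator_tb, the stimulus values and the 10 ns settling time are arbitrary assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Design under test: the Comparator example, repeated so the file is self-contained.
entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: an entity without ports that drives the DUT and checks EQ.
entity Comparator_tb is
end entity Comparator_tb;

architecture sim of Comparator_tb is
  signal a, b : std_logic_vector(7 downto 0) := (others => '0');
  signal eq   : std_logic;
begin
  dut : entity work.Comparator(Comparator1)
    port map (A => a, B => b, EQ => eq);

  stimulus : process
  begin
    -- Equal inputs: EQ is expected to be '1'.
    a <= x"3C"; b <= x"3C";
    wait for 10 ns;
    assert eq = '1' report "EQ should be 1 for equal inputs" severity error;

    -- Different inputs: EQ is expected to be '0'.
    a <= x"3C"; b <= x"3D";
    wait for 10 ns;
    assert eq = '0' report "EQ should be 0 for different inputs" severity error;

    report "Comparator test bench finished";
    wait;  -- suspend forever so the simulation can end
  end process stimulus;
end architecture sim;
```

The same test bench can in principle be reused against the post-synthesis gate-level netlist, which is how the later simulation stages described above confirm that synthesis preserved the behaviour.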

Basic Process Technology Types

- SRAM: based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
- Antifuse: One-time programmable. CMOS.
- EPROM: Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
- EEPROM: Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
- Flash: Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
- Fuse: One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping an SoC or ASIC design on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on FPGAs has become a good method to do this.

Reasons why prototyping is important:

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This compares with designers relying only on software simulations to verify that their hardware design is sound, where simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This helps expose many unforeseen software bugs that appear due to today's array of operating systems, applications and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, away from the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

65

13 SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated (pixelated) picture is taken from memory space, and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



Table 2: Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits are allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable, and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
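The capacity figures above can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch, not vendor code: the per-device column and block counts are the ones listed in Table 2, and the names `BLOCK_BITS`, `DEVICES`, and `total_bits` are illustrative choices made here.

```python
# Cross-check of the Spartan-3E block RAM totals quoted in Table 2.
# Per-block figures come from the text: 16K data bits + 2K parity bits.
DATA_BITS = 16 * 1024
PARITY_BITS = 2 * 1024
BLOCK_BITS = DATA_BITS + PARITY_BITS  # 18,432 bits per block RAM

# device -> (RAM columns, RAM blocks per column), as listed in Table 2
DEVICES = {
    "XC3S100E": (1, 4),
    "XC3S250E": (2, 6),
    "XC3S500E": (2, 10),
    "XC3S1200E": (2, 14),
    "XC3S1600E": (2, 18),
}


def total_blocks(device):
    """Total block RAMs = columns x blocks per column."""
    columns, per_column = DEVICES[device]
    return columns * per_column


def total_bits(device):
    """Total block RAM bits, including the parity bits."""
    return total_blocks(device) * BLOCK_BITS


def total_kbits(device):
    """Total block RAM capacity in Kbits (1 Kbit = 1024 bits)."""
    return total_bits(device) // 1024


for name in DEVICES:
    print(name, total_blocks(name), total_bits(name), total_kbits(name))
```

Each row of the table is simply the block count multiplied by 18,432, which is a quick way to spot a mistyped entry.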


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE. It contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

It defines, e.g., two numeric types: signed and unsigned.
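To illustrate why a nine-valued type is useful, the sketch below models in Python how several drivers on one std_logic signal are combined into a single value. It is an illustrative model only: the table is transcribed from the resolution table of the std_logic_1164 package as commonly published and should be checked against the package source, and the names `VALUES`, `TABLE`, and `resolve` are chosen here.

```python
# Model of std_logic resolution: combining multiple drivers of one signal.
# The nine values: Uninitialized, forcing unknown, forcing 0, forcing 1,
# high impedance, Weak unknown, weak Low, weak High, don't care.
VALUES = "UX01ZWLH-"

# TABLE[row][col] is the result of combining VALUES[row] with VALUES[col].
# Transcribed from the std_logic_1164 resolution table (verify before reuse).
TABLE = [
    "UUUUUUUUU",  # U
    "UXXXXXXXX",  # X
    "UX0X0000X",  # 0
    "UXX11111X",  # 1
    "UX01ZWLHX",  # Z
    "UX01WWWWX",  # W
    "UX01LWLWX",  # L
    "UX01HWWHX",  # H
    "UXXXXXXXX",  # -
]


def resolve(drivers):
    """Return the resolved value for a sequence of driver values."""
    if len(drivers) == 1:
        return drivers[0]  # a single driver passes through unchanged
    result = "Z"  # weakest value; also the result when there are no drivers
    for value in drivers:
        result = TABLE[VALUES.index(result)][VALUES.index(value)]
    return result


print(resolve("01"))  # two strong drivers in conflict
print(resolve("0Z"))  # one active driver, one tri-stated
print(resolve("LH"))  # two weak drivers in conflict
```

The model shows the two conditions that plain IEEE 1076 bit types cannot express: a bus conflict resolves to an unknown, and a tri-stated driver ('Z') gives way to any other driver.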

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.

Defines the inputs/outputs of a component (defines pins).

A way to represent modularity in VHDL.

Similar to a symbol in a schematic.

An entity declaration describes the entity.

E.g.:

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.

Each port must have a name, a direction, and a type.

An entity may have NO port declaration.

Port directions

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using:

- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels:

- Gate level
- RTL (Register Transfer Level)
- Behavioral level

A configuration declaration links an architecture to an entity.

E.g.:

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and architecture body together.

The concept of default configuration is a bit messy in VHDL '87:

- The last architecture analyzed links to the entity.

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate level architecture and a behavioral architecture.

Are always optional.

Packages

Packages contain information common to many design units.

1. Package declaration

- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body

- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain:

- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of:

- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

A collection of VHDL design units (database):

1. Packages (package declaration, package body)

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in a UNIX file system.

Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways:

Structural -> Dataflow -> Behavioral (increasing level of abstraction)

Structural VHDL description

The circuit is described in terms of its components.

From a low-level description (e.g., transistor-level description) to a high-level description (e.g., block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style, you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:

- either a lower level description
- or a behavioral description of sequential elements is needed.

The lower level register descriptions must be created or obtained.

If there are no 3rd party models for registers => you must write the behavioral description of registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g., A <= B after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

In the lower levels of abstraction (e.g., RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

The process keyword in VHDL

A process statement is a concurrent statement.

Statements inside process statements are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;  -- board clock, divided by two below
    rstn   : in  std_logic;  -- reset, active low
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  -- Picture memory: one bit per pixel, 18-bit address
  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;

  component sync_unit
    port (
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  -- horizontal sync pulse (row)
      vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer);
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the input clock by two to obtain the pixel clock
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;

begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks within a line, 0 to 793
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s, 0 to 515
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: low for the first 96 counts of each line
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information will be sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse (vertical sync, front porch, back porch): low for the first two lines
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- One-clock delayed copy of the vertical enable
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
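The counter limits in the sync_unit listing determine the line and frame rates that the monitor sees. The short Python sketch below works these out as a sanity check. It assumes that clk50m is a 50 MHz clock (suggested by the port name, but not stated in the listing), so the divided pixel clock is 25 MHz; all constant names are illustrative.

```python
# Timing implied by the counter limits in the sync_unit listing.
# Assumption: clk50m is 50 MHz, divided by two in vga_controller.
PIXEL_CLOCK_HZ = 50_000_000 // 2

H_TOTAL = 793 + 1          # c_hs counts 0..793 -> clocks per line
V_TOTAL = 515 + 1          # line_count counts 0..515 -> lines per frame
HSYNC_CLOCKS = 95 + 1      # hsync is low while c_hs <= 95
H_ENABLE = 778 - 143 + 1   # h_en window: 143 <= c_hs <= 778
V_ENABLE = 515 - 35 + 1    # v_en window: 35 <= line_count <= 515
VSYNC_LINES = 2            # vsync is low while line_count < 2

line_rate_hz = PIXEL_CLOCK_HZ / H_TOTAL
frame_rate_hz = line_rate_hz / V_TOTAL

print(f"clocks per line : {H_TOTAL}")
print(f"lines per frame : {V_TOTAL}")
print(f"line rate       : {line_rate_hz / 1000:.3f} kHz")
print(f"frame rate      : {frame_rate_hz:.2f} Hz")
```

For comparison, the commonly quoted 640x480 at 60 Hz timing uses 800 clocks per line and 525 lines per frame at a 25.175 MHz pixel clock (about 31.47 kHz and 59.94 Hz). The values above are close to, but not identical with, that nominal timing, so whether a particular monitor locks to it is something to verify on hardware.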


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is

  signal addr_s : std_logic_vector(17 downto 0);

begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: restart the memory address at the top of the frame
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: show the memory bit on all three colours
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Right of the picture: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the picture: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last pixel of the picture
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;
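The wrap-around constant in the pixelgen listing is easier to interpret once it is converted to decimal. The Python sketch below does that and relates it to the read window. It assumes that h_en and v_en lag their counters by one clock, as the registered processes in sync_unit suggest, so memory is read for c_hs from 144 to 655 and for lines 35 to 514; the names used are illustrative.

```python
# Interpreting the address wrap constant used in pixelgen.
ADDR_WIDTH = 18
WRAP_ADDR = int("111011111111111111", 2)  # comparison constant in pixelgen

# Read window, assuming h_en/v_en are one clock behind their counters.
FIRST_READ_COUNT = 143 + 1   # h_en becomes visible one clock after c_hs = 143
LAST_READ_COUNT = 655        # pixelgen reads while c_hs <= 655
FIRST_LINE = 35              # v_en window starts at line 35
LAST_LINE = 514              # pixelgen reads while line_count <= 514

pixels_per_line = LAST_READ_COUNT - FIRST_READ_COUNT + 1
visible_lines = LAST_LINE - FIRST_LINE + 1
frame_pixels = pixels_per_line * visible_lines


def next_addr(addr):
    """Model of the address update: increment, wrapping after WRAP_ADDR."""
    if addr == WRAP_ADDR:
        return 0
    return (addr + 1) % (1 << ADDR_WIDTH)


print("wrap address    :", WRAP_ADDR)
print("picture size    :", pixels_per_line, "x", visible_lines)
print("pixels per frame:", frame_pixels)
print("fits in 18 bits :", frame_pixels <= 1 << ADDR_WIDTH)
```

Under these assumptions the picture is 512 pixels wide and 480 lines high, which is exactly the number of addresses the counter walks through before wrapping. A full 640x480 frame at one bit per pixel would need 307,200 locations, more than an 18-bit address can reach.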


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and in the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has bolstered development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure, consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the state where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
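Rent's rule relates the number of external connections T of a block of logic to the number of gates g it contains, T = t * g^p, where t is the average number of terminals per gate and p is the Rent exponent. The Python sketch below evaluates this relation; the parameter values in the example are purely illustrative assumptions and are not taken from any FPGA data sheet.

```python
# Rent's rule: T = t * g**p
#   g : number of gates (or logic blocks) in a region
#   t : average terminals per gate
#   p : Rent exponent, between 0 and 1; higher means more routing demand
def rent_terminals(gates, t, p):
    """Estimated number of external connections for a block of `gates` gates."""
    if gates < 1:
        raise ValueError("gates must be at least 1")
    return t * gates ** p


# Illustrative comparison only: the same amount of logic with two different
# Rent exponents (hypothetical values chosen for this example).
for label, exponent in (("regular, local wiring", 0.5), ("highly connected", 0.75)):
    estimate = rent_terminals(1024, t=4.0, p=exponent)
    print(f"{label}: about {estimate:.0f} connections")
```

The point of the estimate is the one made in the text: two designs with the same gate count can need very different amounts of routing, depending on how their wiring demand grows with size.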

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
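The behaviour of such a logic block can be sketched in a few lines of Python. This is a behavioural model for illustration only, not a description of any vendor's CLB: the class name `LogicBlock`, its methods, and the bit ordering of the LUT index are assumptions made here.

```python
# Behavioural model of a classic logic block: a 4-input LUT plus a flip-flop,
# with one output that is either registered or unregistered.
class LogicBlock:
    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 entries")
        self.truth_table = list(truth_table)  # the configuration bits
        self.registered = registered          # selects which output is used
        self.flip_flop = 0                    # state of the D flip-flop

    def lut(self, a, b, c, d):
        """Combinational LUT output: the inputs form an index into the table."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Rising clock edge: the flip-flop captures the current LUT output."""
        self.flip_flop = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Block output: registered or unregistered LUT output."""
        return self.flip_flop if self.registered else self.lut(a, b, c, d)


# Configure the LUT as a 4-input XOR (odd parity of the input bits).
xor4 = [bin(i).count("1") & 1 for i in range(16)]

comb = LogicBlock(xor4)
print(comb.output(1, 0, 0, 0), comb.output(1, 1, 0, 0))

reg = LogicBlock(xor4, registered=True)
before = reg.output(1, 0, 0, 0)  # flip-flop has not been clocked yet
reg.clock(1, 0, 0, 0)
after = reg.output(0, 0, 0, 0)   # holds the captured value
print(before, after)
```

Any 4-input Boolean function is obtained by changing only the 16 table entries, which is what makes the LUT the basic programmable element.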

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology
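The planar topology described above can be written down as a small Python model. It is a sketch under the assumption of a four-sided switch box with the same channel width W on every side; the side names and function names are illustrative.

```python
# Planar (domain-based) switch box: a wire on track k of one side can connect
# only to the wires on track k of the three other sides.
SIDES = ("north", "east", "south", "west")


def planar_switch_box(width):
    """Map each (side, track) wire to the wires it can be switched onto."""
    connections = {}
    for side in SIDES:
        for track in range(width):
            connections[(side, track)] = [
                (other, track) for other in SIDES if other != side
            ]
    return connections


def switch_count(width):
    """Programmable switches in the box: one per unordered pair of wires."""
    box = planar_switch_box(width)
    return sum(len(targets) for targets in box.values()) // 2


box = planar_switch_box(4)
print(box[("north", 1)])
print("switches for W = 4:", switch_count(4))
```

Each wire sees three switches, as stated in the text, and because every switch is shared by two wires the box needs six switches per track.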

Modern FPGA families expand upon the above capabilities to include higher level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on FPGAs for hardware verification and early software development.

Hardware verification together with early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on FPGAs has become a good method to do this.

Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows for much more time in the area of software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement to the chip features than would otherwise have been possible.

13. SIMULATION RESULTS

Fig. Gate level: VGA CONTROLLER

Fig. Gate level: SYNC UNIT

Fig. Technology schematic

14. APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated picture (pixelated) is taken from memory space, and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

0

XC3S1200E    2    14    28    516,096    504

XC3S1600E    2    18    36    663,552    648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
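The dual-port behaviour described above can be sketched in VHDL. This is only an illustration, not vendor code: the entity name dual_port_ram, its port names, and the 1K x 16 organization (one possible aspect ratio for the 16K data bits) are assumptions, and the shared-variable style follows VHDL-93 (later language revisions expect a protected type). Whether a synthesis tool maps it onto a block RAM depends on the tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical dual-port RAM: 1024 words x 16 bits = 16K data bits.
entity dual_port_ram is
  port (
    -- Port A: own clock, clock enable and write enable
    clk_a  : in  std_logic;
    en_a   : in  std_logic;
    we_a   : in  std_logic;
    addr_a : in  std_logic_vector(9 downto 0);
    din_a  : in  std_logic_vector(15 downto 0);
    dout_a : out std_logic_vector(15 downto 0);
    -- Port B: fully symmetrical to port A
    clk_b  : in  std_logic;
    en_b   : in  std_logic;
    we_b   : in  std_logic;
    addr_b : in  std_logic_vector(9 downto 0);
    din_b  : in  std_logic_vector(15 downto 0);
    dout_b : out std_logic_vector(15 downto 0)
  );
end dual_port_ram;

architecture rtl of dual_port_ram is
  type ram_t is array (0 to 1023) of std_logic_vector(15 downto 0);
  -- VHDL-93 style shared variable so that both processes can access the array.
  shared variable ram : ram_t := (others => (others => '0'));
begin
  port_a : process (clk_a)
  begin
    if rising_edge(clk_a) then
      if en_a = '1' then
        if we_a = '1' then
          ram(to_integer(unsigned(addr_a))) := din_a;
        end if;
        -- Synchronous read: needs a clock edge and the clock enable.
        dout_a <= ram(to_integer(unsigned(addr_a)));
      end if;
    end if;
  end process;

  port_b : process (clk_b)
  begin
    if rising_edge(clk_b) then
      if en_b = '1' then
        if we_b = '1' then
          ram(to_integer(unsigned(addr_b))) := din_b;
        end if;
        dout_b <= ram(to_integer(unsigned(addr_b)));
      end if;
    end if;
  end process;
end rtl;
```

Because the write is placed before the read in each process, a port that writes also returns the newly written word on the same edge. Tying one port's enable low gives the single-port use mentioned in the text.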

10. HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own, non-standard types => VHDL was becoming nonstandard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.
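As a small illustration of how these standards are used together, the sketch below adds two 8-bit values with the unsigned type from numeric_std and drives the result as std_logic_vector, using the high-impedance value 'Z' when the output is disabled. The entity name add8 and its ports are assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- std_logic, std_logic_vector (IEEE 1164)
use ieee.numeric_std.all;     -- signed, unsigned (IEEE 1076.3)

-- Hypothetical 8-bit adder with a tri-stated 9-bit result.
entity add8 is
  port (
    a, b : in  std_logic_vector(7 downto 0);
    oe   : in  std_logic;                     -- output enable
    sum  : out std_logic_vector(8 downto 0)
  );
end add8;

architecture rtl of add8 is
  signal result : unsigned(8 downto 0);
begin
  -- resize() zero-extends each operand to 9 bits so the carry is kept.
  result <= resize(unsigned(a), 9) + resize(unsigned(b), 9);

  -- 'Z' (high impedance) is one of the nine std_logic values.
  sum <= std_logic_vector(result) when oe = '1' else (others => 'Z');
end rtl;
```

The explicit conversions (unsigned(...), std_logic_vector(...)) make it visible where a bit vector is being interpreted as a number.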

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.

Defines the inputs/outputs of a component (defines pins).

A way to represent modularity in VHDL.

Similar to a symbol in a schematic.

An entity declaration describes an entity.

E.g.:

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.

Each port must have a name, a direction, and a type.

An entity may have NO port declaration.

Port directions

In: A value of a port can be read inside the component, but cannot be assigned. Multiple reads of a port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using:
– Behavior
– Structure
– Dataflow

Architectures can describe a design on many levels:
– Gate level
– RTL (Register Transfer Level)
– Behavioral level

A configuration declaration links an architecture to an entity.

E.g.:

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;
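A comparator of this kind can be checked in simulation with a small test bench. The sketch below repeats the entity and architecture (with the library clauses they need) so that it compiles on its own; the test bench name Comparator_tb, the signal names, and the stimulus values are assumptions chosen for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical test bench: an entity with no ports.
entity Comparator_tb is
end Comparator_tb;

architecture sim of Comparator_tb is
  signal a_s, b_s : std_logic_vector(7 downto 0) := (others => '0');
  signal eq_s     : std_logic;
begin
  -- Device under test, bound directly to architecture Comparator1.
  dut : entity work.Comparator(Comparator1)
    port map (A => a_s, B => b_s, EQ => eq_s);

  stimulus : process
  begin
    a_s <= x"3C"; b_s <= x"3C";          -- equal inputs
    wait for 10 ns;
    assert eq_s = '1'
      report "EQ should be 1 for equal inputs" severity error;

    a_s <= x"3C"; b_s <= x"3D";          -- inputs differ in one bit
    wait for 10 ns;
    assert eq_s = '0'
      report "EQ should be 0 for different inputs" severity error;

    wait;                                -- suspend forever: end of test
  end process;
end sim;
```

If both assertions hold, the simulation ends silently; a failing assertion prints its report message.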

Configurations

Link the entity declaration and the architecture body together.

The concept of a default configuration is a bit messy in VHDL '87:
– The last architecture analyzed links to the entity.

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate-level architecture and a behavioral architecture.

Are always optional.
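A minimal sketch of a configuration declaration follows. The second architecture (Comparator2) and the configuration name Comparator_cfg are hypothetical and exist only to show how one of several architectures is selected for an entity.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- First architecture: concurrent (dataflow) description.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Hypothetical second architecture: same function, written as a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end Comparator2;

-- Configuration declaration: selects Comparator2 for entity Comparator,
-- regardless of which architecture was analyzed last.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end Comparator_cfg;
```

Simulating Comparator_cfg instead of the entity makes the choice of architecture explicit rather than dependent on analysis order.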

Packages

Packages contain information common to many design units.

1. Package declaration
– constant declarations
– type and subtype declarations
– function and procedure declarations
– global signal declarations
– file declarations
– component declarations

2. Package body
– is not necessarily needed
– function bodies
– procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain:
– type and subtype declarations
– subprograms
– constants
– alias declarations
– global signal declarations
– file declarations
– component declarations

A package body consists of:
– subprogram declarations and bodies
– type and subtype declarations
– deferred constants
– file declarations
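The split between package declaration and package body might look as follows. The package name vga_types, its constants, and the parity function are assumptions invented for this sketch; they are not part of the controller code in this report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package declaration: the part visible to other design units.
package vga_types is
  constant DATA_WIDTH : integer := 8;
  subtype byte_t is std_logic_vector(DATA_WIDTH - 1 downto 0);

  -- Deferred constant: declared here, given its value in the package body.
  constant RESET_VALUE : byte_t;

  -- Function declaration only; the body follows in the package body.
  function parity (v : byte_t) return std_logic;
end vga_types;

package body vga_types is
  constant RESET_VALUE : byte_t := (others => '0');

  -- XOR of all bits: '1' when the number of '1' bits in v is odd.
  function parity (v : byte_t) return std_logic is
    variable p : std_logic := '0';
  begin
    for i in v'range loop
      p := p xor v(i);
    end loop;
    return p;
  end parity;
end vga_types;
```

A design unit would make these names visible with "use work.vga_types.all;", assuming the package is compiled into the work library.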

Libraries

Collection of VHDL design units (database).

1. Packages
   – package declaration
   – package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in a UNIX file system.

Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

Structural

Dataflow

Behavioral

Structural VHDL description

The circuit is described in terms of its components.

From a low-level description (e.g., transistor-level description) to a high-level description (e.g., block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style, you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:
– either a lower-level description
– or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers => you must write the behavioral description of registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).
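Since the language has no built-in register, a behavioral description such as the following is the usual starting point. This is a minimal sketch; the entity name reg8, the 8-bit width, and the active-low asynchronous reset are assumptions (the reset polarity mirrors the rstn signal used in the controller code of this report).

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register with active-low asynchronous reset.
entity reg8 is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end reg8;

architecture behavioral of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');      -- reset takes priority over the clock
    elsif rising_edge(clk) then
      q <= d;                    -- capture d on the rising clock edge
    end if;
  end process;
end behavioral;
```

Synthesis tools generally recognize this process template and map it to flip-flops.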

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

Representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g., A <= B after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

In the lower levels of abstraction (e.g., RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

Process keyword in VHDL.

A process statement is a concurrent statement.

Statements inside process statements are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
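The two permitted forms of a process can be put side by side. In this sketch (entity name and2_demo and port names are assumptions), both processes describe the same AND gate: the first uses a sensitivity list, the second an explicit wait statement at the end of the process.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical demo entity: two equivalent AND gates.
entity and2_demo is
  port (
    a, b   : in  std_logic;
    y1, y2 : out std_logic
  );
end and2_demo;

architecture demo of and2_demo is
begin
  -- Form 1: sensitivity list, no wait statement.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process;

  -- Form 2: no sensitivity list, one wait statement at the end.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;   -- suspend until a or b changes
  end process;
end demo;
```

Both processes run once at the start of simulation and then again whenever a or b changes, so y1 and y2 always carry the same value.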

11. VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
    port (
        addr  : IN  std_logic_VECTOR(17 downto 0);
        clk   : IN  std_logic;
        dout  : OUT std_logic_VECTOR(0 downto 0);
        en    : IN  std_logic;
        sinit : IN  std_logic);
END component;

component sync_unit port(
    clk        : in    std_logic;
    rstn       : in    std_logic;
    hsync      : out   std_logic;  ---horizontal synch pulse(row)
    vsync      : out   std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out   std_logic;
    h_en       : out   std_logic;
    v_en       : out   std_logic;
    line_count : inout integer;
    c_hs       : inout integer);
end component;

component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

process(clk50m, rstn)
begin
    if(rstn = '0')then
        clk <= '0';
    elsif(clk50m = '1' and clk50m'event)then
        clk <= not clk;
    end if;
    --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
    );

pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
    );

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
    );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs <= c_hs_s;
line_count <= line_count_s;
h_en <= h_en_s;
v_en <= v_en_s;

-- hsync counter
process(clk, rstn)
begin
    if(rstn = '0')then
        c_hs_s <= 0;
    else
        if(clk = '1' and clk'event)then
            if(c_hs_s = 793)then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s + 1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter
process(h_en_s, rstn)
begin
    if(rstn = '0')then
        line_count_s <= 0;
    elsif(h_en_s = '1' and h_en_s'event)then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515)then
            line_count_s <= 0;
        end if;
    end if;
    --end if;
end process;

--hsync pulse
process(clk, rstn)
begin
    if(rstn = '0')then
        hsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(c_hs_s <= 95)then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) ----Horizontal Enable (Row Information Will be Sent in this period)
begin
    if(rstn = '0')then
        h_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((c_hs_s >= 143) and (c_hs_s <= 778))then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
    if(rstn = '0')then
        v_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((line_count_s >= 35 and line_count_s <= 515))then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) --Vertical Sync, Front porch, Back Porch
begin
    if(rstn = '0')then
        vsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(line_count_s < 2)then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk)
begin
    if(clk = '1' and clk'event)then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;
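The counter limits in the synchronization block fix the line and frame rates. The self-checking sketch below works through that arithmetic with VHDL constants and assertions. It assumes that clk50m is a 50 MHz clock (suggested by the name only), so that the divide-by-two in the top level gives a 25 MHz pixel clock; the entity name sync_timing_check is hypothetical.

```vhdl
-- Hypothetical self-checking unit: no ports, only constants and assertions.
entity sync_timing_check is
end sync_timing_check;

architecture sim of sync_timing_check is
  -- Assumption: clk50m is 50 MHz, so the divided clock clk is 25 MHz.
  constant PIXEL_CLK_HZ    : integer := 50_000_000 / 2;

  constant CLKS_PER_LINE   : integer := 793 + 1;        -- c_hs_s runs 0..793
  constant LINES_PER_FRAME : integer := 515 + 1;        -- line_count_s runs 0..515
  constant HSYNC_CLKS      : integer := 95 + 1;         -- hsync low while c_hs_s <= 95
  constant VSYNC_LINES     : integer := 2;              -- vsync low while line_count_s < 2
  constant H_EN_CLKS       : integer := 778 - 143 + 1;  -- width of the h_en window
  constant V_EN_LINES      : integer := 515 - 35 + 1;   -- height of the v_en window

  -- Integer division truncates, so these are rounded-down rates in Hz.
  constant LINE_RATE_HZ    : integer := PIXEL_CLK_HZ / CLKS_PER_LINE;
  constant FRAME_RATE_HZ   : integer := LINE_RATE_HZ / LINES_PER_FRAME;
begin
  assert CLKS_PER_LINE = 794 and LINES_PER_FRAME = 516
    report "unexpected counter periods" severity failure;
  assert HSYNC_CLKS = 96 and VSYNC_LINES = 2
    report "unexpected sync pulse widths" severity failure;
  assert H_EN_CLKS = 636 and V_EN_LINES = 481
    report "unexpected enable windows" severity failure;
  assert LINE_RATE_HZ = 31486
    report "unexpected line rate" severity failure;
  assert FRAME_RATE_HZ = 61
    report "unexpected frame rate" severity failure;
end sim;
```

Under the 25 MHz assumption, one line takes 794 clocks (about 31.5 kHz line rate), the horizontal sync pulse lasts 96 clocks (3.84 us), and a frame of 516 lines repeats about 61 times per second. For comparison, the commonly used 640x480 at 60 Hz timing has 800 pixel clocks per line and 525 lines per frame at 25.175 MHz, so the values in this design are close to, but not identical with, that timing.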

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
    );
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
    if(rstn = '0')then
        rd <= '0';
        red <= "0";
        green <= "0";
        blue <= "0";
        addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event)then
        if((v_en = '1') and (v_en_reg = '0'))then
            addr_s <= (others => '0');
        end if;
        if(v_en = '1' and h_en = '1')then
            if(line_count <= 514)then
                if(c_hs <= 655)then
                    rd <= '1';
                    red <= data_in;
                    green <= data_in;
                    blue <= data_in;
                    addr_s <= addr_s + "000000000000000001";
                else
                    rd <= '0';
                    red <= "0";
                    green <= "1";
                    blue <= "0";
                end if;
            else
                rd <= '0';
                red <= "1";
                green <= "1";
                blue <= "1";
            end if;
            if(addr_s = "111011111111111111")then
                addr_s <= (others => '0');
            end if;
        else
            red <= "0";
            green <= "0";
            blue <= "0";
        end if;
    end if;
    -- end if;
    --end if;
end process;

end behv;
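The wrap-around constant "111011111111111111" in the pixel generator can be related to the picture size. The sketch below checks, by assertion, that it equals 512 x 480 - 1 and that this fits in the 18-bit address. The figure of 512 pixels per line is an inference from the enable windows (h_en is registered one clock after c_hs reaches 143, and reads stop after c_hs = 655), not a value stated in the report; the entity name framebuffer_check is hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical self-checking unit for the frame buffer address range.
entity framebuffer_check is
end framebuffer_check;

architecture sim of framebuffer_check is
  constant H_PIXELS  : integer := 512;   -- assumption: c_hs 144..655 seen with h_en = '1'
  constant V_LINES   : integer := 480;   -- line_count 35..514
  constant LAST_ADDR : integer := H_PIXELS * V_LINES - 1;

  -- Wrap-around value copied from the pixel generator.
  constant WRAP_ADDR : std_logic_vector(17 downto 0) := "111011111111111111";
begin
  assert LAST_ADDR = 245759
    report "unexpected last address" severity failure;
  assert std_logic_vector(to_unsigned(LAST_ADDR, 18)) = WRAP_ADDR
    report "wrap constant does not match 512 x 480 - 1" severity failure;
  assert LAST_ADDR < 2**18
    report "address does not fit in 18 bits" severity failure;
end sim;
```

With one bit per pixel, a 512 x 480 picture needs 245,760 memory locations, which is why the memory component uses an 18-bit address (2^18 = 262,144).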

12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

Many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has bolstered development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have been historically used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure, consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them), but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the state where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attack of cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software and the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
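A logic block of this kind can be modelled in a few lines of VHDL. The sketch below is a behavioral illustration rather than a vendor primitive: the entity name logic_block, the generic names INIT and REGISTERED, and the default truth table are assumptions. The 16-bit INIT value is the LUT's truth table, indexed by the four inputs.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: 4-input LUT plus flip-flop.
entity logic_block is
  generic (
    -- Truth table: bit n is the output for input value n.
    -- x"8000" has only bit 15 set, i.e. a 4-input AND.
    INIT       : std_logic_vector(15 downto 0) := x"8000";
    -- true selects the registered output, false the unregistered LUT output.
    REGISTERED : boolean := false
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);
    o   : out std_logic
  );
end logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic;
begin
  -- The LUT is a 16-entry memory addressed by the four inputs.
  lut_out <= INIT(to_integer(unsigned(i)));

  -- The flip-flop captures the LUT output on the rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Single output: either the registered or the unregistered LUT output.
  o <= ff_out when REGISTERED else lut_out;
end rtl;
```

Changing INIT changes the logic function without changing the structure, which is the sense in which the block is programmable; for example, x"6996" would give the XOR (odd parity) of the four inputs.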

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally the FPGA routing is unsegmented That is each wiring segment

spans only one logic block before it terminates in a switch box By turning on

60

some of the programmable switches within a switch box longer paths can be

constructed For higher speed interconnect some FPGA architectures use

longer routing lines that span multiple logic blocks

Whenever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

Switch box topology
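As a worked count for the planar topology: assume (the text does not state this explicitly) that the switch box has four sides and that each of the W tracks contributes one wire end per side. The four wire ends of a given track then form an isolated group, and each end has three switches, one to each of the other three ends. Counting each switch once gives:

```latex
\[
  S_{\text{track}} = \binom{4}{2} = \frac{4 \cdot 3}{2} = 6,
  \qquad
  S_{\text{box}} = 6W
\]
```

Here S_track is the number of programmable switches per track and S_box the total per switch box. The count grows only linearly with the channel width W because no switch ever joins two different tracks.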

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
include multipliers, generic DSP blocks, embedded processors, high-speed I/O
logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures because it's possible to just specify them
numerically rather than having to draw every piece by hand. On the other hand,
schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar license)
and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.

Basic Process Technology Types

SRAM: based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse: one-time programmable. CMOS.

EPROM: Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM: Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash: flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse: one-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Hardware verification and early software and firmware co-design have become
mainstream, and prototyping SoC and ASIC designs on an FPGA has become a
good method for doing both.

Reasons why prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct. This is compared to designers only relying on
software simulations to verify that their hardware design is sound. Simulation
speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back
from the foundry to start on software tests. FPGA prototyping allows for much
more time in the area of software development and testing at the
software-hardware integration stage. This allows many unforeseen software
bugs that appear due to today's array of operating systems, applications, and
hardware to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on
the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.

13 SIMULATION RESULTS

Fig. Gate level: VGA CONTROLLER

Fig. Gate level: SYNC UNIT

Fig. Technology schematic

14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, by implementing the VGA controller on an FPGA, a processor-generated
picture (pixelated) is taken from memory space, and the controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.



10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that
involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was
presented with a proposal to standardize VHDL.

In 1987: standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE
standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and
high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using
their own non-standard types => VHDL was becoming nonstandard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate
to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.
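A small simulation-only sketch of the types these standards provide. The entity name std_types_demo is hypothetical, and the code assumes a VHDL-93 or later simulator. It uses the resolved function from the std_logic_1164 package and to_integer from numeric_std.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- IEEE 1164: std_ulogic / std_logic
use ieee.numeric_std.all;     -- IEEE 1076.3: signed / unsigned

entity std_types_demo is
end entity std_types_demo;

architecture sim of std_types_demo is
  -- The nine values are 'U', 'X', '0', '1', 'Z', 'W', 'L', 'H' and '-'.
  signal count  : unsigned(7 downto 0) := x"FF";
  signal offset : signed(7 downto 0)   := x"FF";
begin
  process
  begin
    -- Two strong drivers in conflict resolve to 'X' (unknown).
    assert resolved("01") = 'X' report "unexpected resolution" severity error;
    -- A driven value overrides 'Z' (high impedance).
    assert resolved("Z1") = '1' report "unexpected resolution" severity error;
    -- The same bit pattern reads differently as unsigned and as signed.
    assert to_integer(count)  = 255 report "unsigned mismatch" severity error;
    assert to_integer(offset) = -1  report "signed mismatch"   severity error;
    wait;  -- suspend forever once the checks have run
  end process;
end architecture sim;
```

The unknown and high-impedance values are exactly the simulation conditions that plain IEEE 1076 types such as bit cannot express.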

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a
library.

Entities

A black box with an interface definition.
Defines the inputs/outputs of a component (defines pins).
A way to represent modularity in VHDL.
Similar to a symbol in a schematic.
The entity declaration describes the entity.

E.g.

    entity Comparator is
      port (A, B : in  std_logic_vector(7 downto 0);
            EQ   : out std_logic);
    end Comparator;

Ports

Provide channels of communication between the component and its
environment.
Each port must have a name, a direction and a type.
An entity may have NO port declaration.

Port directions

In: The value of a port can be read inside the component but cannot be
assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read.
Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple
assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment
(buffer ports are not recommended).

Architectures

Every entity has at least one architecture.
One entity can have several architectures.
Architectures can describe a design using
– Behavior
– Structure
– Dataflow
Architectures can describe a design on many levels
– Gate level
– RTL (Register Transfer Level)
– Behavioral level
A configuration declaration links an architecture to an entity.

E.g.

    architecture Comparator1 of Comparator is
    begin
      EQ <= '1' when (A = B) else '0';
    end Comparator1;
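The Comparator entity and its Comparator1 architecture can be exercised with a small self-checking test bench. This is a sketch that assumes a VHDL-93 or later simulator; the test bench name Comparator_tb, its signal names and the stimulus values are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (
    A, B : in  std_logic_vector(7 downto 0);
    EQ   : out std_logic
  );
end entity Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

library ieee;
use ieee.std_logic_1164.all;

-- A test bench is an entity with no ports.
entity Comparator_tb is
end entity Comparator_tb;

architecture sim of Comparator_tb is
  signal a_s, b_s : std_logic_vector(7 downto 0) := (others => '0');
  signal eq_s     : std_logic;
begin
  -- Direct instantiation names the entity and the architecture to bind.
  dut : entity work.Comparator(Comparator1)
    port map (A => a_s, B => b_s, EQ => eq_s);

  stimulus : process
  begin
    a_s <= x"3C"; b_s <= x"3C";
    wait for 10 ns;
    assert eq_s = '1' report "equal inputs should give EQ = '1'" severity error;

    b_s <= x"3D";
    wait for 10 ns;
    assert eq_s = '0' report "different inputs should give EQ = '0'" severity error;

    wait;  -- stop the process
  end process stimulus;
end architecture sim;
```

Naming the architecture in the instantiation plays the role a configuration declaration would otherwise play, which is convenient when an entity has several architectures.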

Configurations

Link the entity declaration and architecture body together.
The concept of a default configuration is a bit messy in VHDL '87
– the last architecture analyzed links to the entity
Can be used to change simulation behavior without re-analyzing the VHDL
source.
Complex configuration declarations are ignored in synthesis.
Some entities can have, e.g., a gate-level architecture and a behavioral
architecture.
Are always optional.

Packages

Packages contain information common to many design units.

1 Package declaration
– constant declarations
– type and subtype declarations
– function and procedure declarations
– global signal declarations
– file declarations
– component declarations

2 Package body
– is not necessarily needed
– function bodies
– procedure bodies

Packages are meant for encapsulating data which can be shared globally among
several design units.
Consists of a declaration part and an optional body part.

Package declaration can contain
– type and subtype declarations
– subprograms
– constants
– alias declarations
– global signal declarations
– file declarations
– component declarations

Package body consists of
– subprogram declarations and bodies
– type and subtype declarations
– deferred constants
– file declarations
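A short sketch of a package with a declaration part and a body, showing a subtype, an ordinary constant, a deferred constant and a function. The package name display_pkg and everything in it are hypothetical and are not taken from the controller code in this report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the part that other design units can see.
package display_pkg is
  subtype pixel_t is std_logic_vector(0 downto 0);  -- subtype declaration
  constant H_VISIBLE : natural := 640;              -- ordinary constant
  constant BLANK     : pixel_t;                     -- deferred constant (no value yet)
  function in_range (x, lo, hi : integer) return boolean;
end package display_pkg;

-- Package body: needed here because of the function and the deferred constant.
package body display_pkg is
  constant BLANK : pixel_t := "0";                  -- the deferred value is given here

  function in_range (x, lo, hi : integer) return boolean is
  begin
    return (x >= lo) and (x <= hi);
  end function in_range;
end package body display_pkg;
```

A design unit compiled into the same library would pick these up with "use work.display_pkg.all;". A package that holds only types and fully specified constants needs no body at all.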

Libraries

Collection of VHDL design units (database)

1 Packages
– package declaration
– package body

2 Entities (entity declaration)

3 Architectures (architecture body)

4 Configurations (configuration declarations)

Usually a directory in the UNIX file system.
Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ
primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways:

Structural -> Dataflow -> Behavioral (increasing level of abstraction)

Structural VHDL description

The circuit is described in terms of its components.
From a low-level description (e.g. transistor-level description) to a
high-level description (e.g. block diagram).
For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.
In the dataflow style, you describe how information flows between registers in
the system.
The combinational logic is described at a relatively high level; the placement
and operation of registers is specified quite precisely.
The behavior of the system over time is defined by registers.
There are no built-in registers in the VHDL language
– either a lower-level description
– or a behavioral description of sequential elements is needed
The lower-level register descriptions must be created or obtained.
If there are no third-party models for registers => you must write the
behavioral description of registers.
The behavioral description can be provided in the form of subprograms
(functions or procedures).
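Since the language has no built-in register, a behavioral description has to be written. A minimal sketch follows; the entity name reg8, the 8-bit width and the asynchronous active-low reset are assumptions chosen for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register with asynchronous, active-low reset.
entity reg8 is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end entity reg8;

architecture behavioral of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');      -- reset takes priority over the clock
    elsif rising_edge(clk) then
      q <= d;                    -- capture the input on the rising edge
    end if;
  end process;
end architecture behavioral;
```

Synthesis tools generally recognize this clocked-process pattern and map it onto flip-flops, which is why a dataflow description can then treat the register as a given building block.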

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.
The representation might include, e.g., state diagrams, timing diagrams and
algorithmic descriptions.
The concept of time may be expressed precisely using delays (e.g. A <= B
after 10 ns).
If no actual delay is used, the order of sequential operations is defined.
In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed
timing specifications.
The actual timing results depend on the implementation technology and the
efficiency of the synthesis tool.
There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL.
A VHDL description can always be broken up into interconnected processes.
Quite similar to a Unix process.
Process keyword in VHDL.
A process statement is a concurrent statement.
Statements inside process statements are sequential statements.
A process must contain either a sensitivity list or wait statement(s), but NOT
both.
The sensitivity list or wait statement(s) contain the signals which wake the
process up.

General format

    process [(sensitivity_list)]
      process_declarative_part
    begin
      process_statements
      [wait_statement]
    end process;
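The two permitted forms of a process can be compared side by side. In this sketch the entity name process_forms and its port names are hypothetical. Both processes compute the same AND function and wake up on the same signals.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity process_forms is
  port (
    a, b   : in  std_logic;
    y_list : out std_logic;
    y_wait : out std_logic
  );
end entity process_forms;

architecture sim of process_forms is
begin
  -- Form 1: sensitivity list. The process runs once at start-up and
  -- again whenever a or b changes.
  with_list : process (a, b)
  begin
    y_list <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list. The explicit wait statement at the end
  -- suspends the process until a or b changes.
  with_wait : process
  begin
    y_wait <= a and b;
    wait on a, b;
  end process with_wait;
end architecture sim;
```

A sensitivity list behaves like a "wait on" statement placed as the last statement of the process, which is why a process may use one form or the other but not both.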

11 VGA CONTROLLER CODE

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;
    use IEEE.numeric_std.ALL;

    ---- Uncomment the following library declaration if instantiating
    ---- any Xilinx primitives in this code.
    --library UNISIM;
    --use UNISIM.VComponents.all;

    entity vga_controller is port(
      clk50m : in  std_logic;
      rstn   : in  std_logic;
      hsync  : out std_logic;
      vsync  : out std_logic;
      red    : out std_logic_vector(0 downto 0);
      green  : out std_logic_vector(0 downto 0);
      blue   : out std_logic_vector(0 downto 0));
    end vga_controller;

    architecture Behavioral of vga_controller is

    component memory
      port (
        addr  : IN  std_logic_VECTOR(17 downto 0);
        clk   : IN  std_logic;
        dout  : OUT std_logic_VECTOR(0 downto 0);
        en    : IN  std_logic;
        sinit : IN  std_logic);
    END component;

    component sync_unit port(
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  ---horizontal synch pulse(row)
      vsync      : out   std_logic;  --vertical sync pulse(frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer);
    end component;

    component pixelgen port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
    end component;

    signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
    signal line_count_s, c_hs_s, addr : integer;
    signal data_s : std_logic_vector(0 downto 0);
    signal addr_s : std_logic_vector(17 downto 0);

    begin

    --addr_s <= std_logic_vector(to_unsigned(addr, 18));

    -- Divide the 50 MHz input clock by two to obtain the pixel clock.
    process(clk50m, rstn)
    begin
      if (rstn = '0') then
        clk <= '0';
      elsif (clk50m = '1' and clk50m'event) then
        clk <= not clk;
      end if;
      --end if;
    end process;

    hsync <= hsync_s;
    vsync <= vsync_s;

    sync : sync_unit port map(
      clk        => clk,
      rstn       => rstn,
      hsync      => hsync_s,
      vsync      => vsync_s,
      v_en_reg   => v_en_reg_s,
      line_count => line_count_s,
      h_en       => h_en_s,
      v_en       => v_en_s,
      c_hs       => c_hs_s
    );

    pixel : pixelgen port map(
      clk        => clk,
      rstn       => rstn,
      hsync      => hsync_s,
      vsync      => vsync_s,
      h_en       => h_en_s,
      v_en       => v_en_s,
      v_en_reg   => v_en_reg_s,
      data_in    => data_s,
      line_count => line_count_s,
      c_hs       => c_hs_s,
      red        => red,
      green      => green,
      blue       => blue,
      addr       => addr_s,
      rd         => rd_s
    );

    mem12 : memory port map(
      addr  => addr_s,
      clk   => clk,
      dout  => data_s,
      en    => rd_s,
      sinit => rstn
    );

    end Behavioral;

SYNCHRONIZATION BLOCK CODE

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;

    entity sync_unit is port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  ---horizontal synch pulse(row)
      vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;
      c_hs       : out integer);
    end sync_unit;

    architecture arch_sync_unit of sync_unit is
    signal h_en_s, v_en_s : std_logic;
    signal line_count_s, c_hs_s : integer;
    begin

    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- hsync counter
    process(clk, rstn)
    begin
      if (rstn = '0') then
        c_hs_s <= 0;
      else
        if (clk = '1' and clk'event) then
          if (c_hs_s = 793) then
            c_hs_s <= 0;
          else
            c_hs_s <= c_hs_s + 1;
          end if;
        end if;
      end if;
    end process;

    --vsync counter
    --vertical line counter
    process(h_en_s, rstn)
    begin
      if (rstn = '0') then
        line_count_s <= 0;
      elsif (h_en_s = '1' and h_en_s'event) then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515) then
          line_count_s <= 0;
        end if;
      end if;
      --end if;
    end process;

    --hsync pulse
    process(clk, rstn)
    begin
      if (rstn = '0') then
        hsync <= '1';
      elsif (clk = '1' and clk'event) then
        if (c_hs_s <= 95) then
          hsync <= '0';
        else
          hsync <= '1';
        end if;
      end if;
      --end if;
    end process;

    process(clk, rstn)  ----Horizontal Enable (Row Information Will be Sent in this period)
    begin
      if (rstn = '0') then
        h_en_s <= '0';
      elsif (clk = '1' and clk'event) then
        if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
          h_en_s <= '1';
        else
          h_en_s <= '0';
        end if;
      end if;
      --end if;
    end process;

    ---vsync pulse
    process(clk, rstn)  -- Vertical Enable (Data is Valid in this Region)
    begin
      if (rstn = '0') then
        v_en_s <= '0';
      elsif (clk = '1' and clk'event) then
        if ((line_count_s >= 35 and line_count_s <= 515)) then
          v_en_s <= '1';
        else
          v_en_s <= '0';
        end if;
      end if;
      --end if;
    end process;

    process(clk, rstn)  --Vertical Sync, Front porch, Back Porch
    begin
      if (rstn = '0') then
        vsync <= '1';
      elsif (clk = '1' and clk'event) then
        if (line_count_s < 2) then
          vsync <= '0';
        else
          vsync <= '1';
        end if;
      end if;
      --end if;
    end process;

    process(clk)
    begin
      if (clk = '1' and clk'event) then
        v_en_reg <= v_en_s;
      end if;
    end process;

    end arch_sync_unit;
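The counter limits in the synchronization block fix the scan rates. The worked figures below are a sketch under one assumption: the board clock clk50m is 50 MHz, as its name suggests but the report does not state. The top-level controller halves that clock before it reaches this block. The horizontal counter runs from 0 to 793, the line counter from 0 to 515, and hsync is low while the horizontal count is 95 or less.

```latex
\begin{align*}
f_{\text{pixel}} &= \frac{50\ \text{MHz}}{2} = 25\ \text{MHz} \\
N_{\text{line}}  &= 793 + 1 = 794\ \text{clocks}
  &\Rightarrow\quad f_{\text{line}}  &= \frac{25\ \text{MHz}}{794} \approx 31.49\ \text{kHz} \\
N_{\text{frame}} &= 515 + 1 = 516\ \text{lines}
  &\Rightarrow\quad f_{\text{frame}} &= \frac{f_{\text{line}}}{516} \approx 61.0\ \text{Hz} \\
t_{\text{hsync}} &= \frac{95 + 1}{25\ \text{MHz}} = 3.84\ \mu\text{s}
\end{align*}
```

For comparison, the nominal 640x480 mode at 60 Hz uses a 25.175 MHz pixel clock with 800 clocks per line and 525 lines per frame. The slightly shorter line and frame used here keep the rates close to those nominal values despite the rounder 25 MHz clock.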

PIXEL GENERATOR

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;

    entity pixelgen is port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic
    );
    end pixelgen;

    architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
    begin

    addr <= addr_s;

    process(clk, rstn)
    begin
      if (rstn = '0') then
        rd     <= '0';
        red    <= "0";
        green  <= "0";
        blue   <= "0";
        addr_s <= x"0000" & "00";
      elsif (clk = '1' and clk'event) then
        -- Restart the memory address at the start of the visible region.
        if ((v_en = '1') and (v_en_reg = '0')) then
          addr_s <= (others => '0');
        end if;
        if (v_en = '1' and h_en = '1') then
          if (line_count <= 514) then
            if (c_hs <= 655) then
              rd     <= '1';
              red    <= data_in;
              green  <= data_in;
              blue   <= data_in;
              addr_s <= addr_s + "000000000000000001";
            else
              rd    <= '0';
              red   <= "0";
              green <= "1";
              blue  <= "0";
            end if;
          else
            rd    <= '0';
            red   <= "1";
            green <= "1";
            blue  <= "1";
          end if;
          if (addr_s = "111011111111111111") then
            addr_s <= (others => '0');
          end if;
        else
          red   <= "0";
          green <= "0";
          blue  <= "0";
        end if;
      end if;
      -- end if;
      --end if;
    end process;

    end behv;

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks", and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions, or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field programmable gate array in 1985, the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication
and the volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of soft
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor, implemented on FPGAs. However,
it does not support dynamic reconfiguration at runtime, but instead adapts
itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers have largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have been historically used
for, to which the company attributes the growing number of FPGA design
starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result of this is less flexibility, with the
advantage of more predictable timing delays and a higher logic-to-interconnect
ratio. The FPGA architectures, on the other hand, are dominated by
interconnect. This makes them far more flexible (in terms of the range of
designs that are practical for implementation within them) but also far more
complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration that lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the state
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is code
breaking, in particular brute-force attack of cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing
applications where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and the
extremely long turn-around times of current design tools, where a 4-8 hour
wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC for a low-volume application. Today, new cost and
performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and I/Os
can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.
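For reference, Rent's rule is usually written as the power law below. The symbols are the conventional ones and do not come from this report.

```latex
\[
  T = t \, g^{\,p}
\]
```

Here T is the number of external connections (terminals) of a block of logic, g is the number of gates or logic cells inside the block, t is the average number of terminals per gate, and p is the Rent exponent, with 0 < p < 1. A design with a larger p needs more wiring for the same amount of logic, which is consistent with the crossbar versus systolic array comparison above.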

The FPGA is an array or island-style FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or the
width of one column, as shown below. All the routing channels have the same
width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

Typical logic block

There is only one output which can be either the registered or the unregistered

LUT output The logic block has four inputs for the LUT and a clock input

Since clock signals (and often other high-fanout signals) are normally routed

via special-purpose dedicated routing networks in commercial FPGAs they

and other signals are separately managed
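The logic block just described can be modelled in a few lines of VHDL. This is a minimal sketch, not a vendor primitive: the entity name logic_block and the generics LUT_INIT and USE_FF are hypothetical names standing in for the configuration bits of a real device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: 4-input LUT plus flip-flop.
entity logic_block is
  generic (
    LUT_INIT : std_logic_vector(15 downto 0) := x"8000";  -- truth table; x"8000" = 4-input AND
    USE_FF   : boolean := false                            -- true selects the registered output
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    y   : out std_logic                      -- the single output
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16 x 1 memory addressed by the four inputs.
  lut_out <= LUT_INIT(to_integer(unsigned(a)));

  -- The flip-flop registers the LUT output on the rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output multiplexer: registered or unregistered LUT output.
  y <= ff_out when USE_FF else lut_out;
end architecture rtl;
```

With the default LUT_INIT only bit 15 of the table is set, so y is '1' only when all four inputs are '1'. Any other 4-input function is obtained by changing the 16-bit table, not the logic.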

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology
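As a worked count for the planar topology, assuming a four-sided switch box with W tracks per channel and one programmable switch per pair of wire ends on the same track (an assumption about the figure, which is not reproduced here):

```latex
N_{\text{switches}} = \frac{4W \times 3}{2} = 6W
```

Each of the 4W wire ends has three switches, and every switch is shared by two wire ends, hence the division by two. For example, W = 8 would give 48 switches per switch box.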

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
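The first simulation stage, an RTL test bench, can be sketched as follows. The design under test (and2) and the test bench (and2_tb) are hypothetical examples written for this illustration; they are not part of the VGA controller.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design under test: a two-input AND gate.
entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end entity and2;

architecture rtl of and2 is
begin
  y <= a and b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- A test bench has no ports: it only drives and observes the design.
entity and2_tb is
end entity and2_tb;

architecture sim of and2_tb is
  signal a, b, y : std_logic := '0';
begin
  dut : entity work.and2 port map (a => a, b => b, y => y);

  stimulus : process
  begin
    a <= '0'; b <= '0'; wait for 10 ns;
    assert y = '0' report "0 and 0 should be 0" severity error;
    a <= '1'; b <= '0'; wait for 10 ns;
    assert y = '0' report "1 and 0 should be 0" severity error;
    a <= '1'; b <= '1'; wait for 10 ns;
    assert y = '1' report "1 and 1 should be 1" severity error;
    wait;  -- suspend forever: end of stimulus
  end process stimulus;
end architecture sim;
```

The same test bench can, in principle, be reused for the gate-level and back-annotated simulations by binding dut to the corresponding netlist instead of the RTL architecture.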

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why prototyping is important:

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement to the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Fig: Gate level, VGA CONTROLLER

Fig: Gate level, SYNC UNIT

Fig: Technology schematic

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated picture (pixelated) is taken from memory space, and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.


Page 33: vga

10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability

Technology independence

Design reuse

Several levels of abstraction

Readability

Standard language

Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)

Design specification language

Design entry language

Design simulation language

Design documentation language

An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.

Defines the inputs/outputs of a component (defines pins).

A way to represent modularity in VHDL.

Similar to a symbol in a schematic.

An entity declaration describes an entity.

E.g.:

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports: Provide channels of communication between the component and its environment.

Each port must have a name, a direction, and a type.

An entity may have NO port declaration.

Port directions:

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).
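Because an out port cannot be read inside the component, a common workaround is to keep an internal signal and copy it to the port. The sketch below uses a hypothetical entity named toggler; the sync_unit listing later in this document follows the same pattern with c_hs_s and line_count_s.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: an output that toggles on every rising clock edge.
entity toggler is
  port (clk : in std_logic; q : out std_logic);
end entity toggler;

architecture rtl of toggler is
  signal q_i : std_logic := '0';  -- internal copy, which may be read freely
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- "q <= not q;" would read an out port, which is illegal before VHDL-2008.
      q_i <= not q_i;
    end if;
  end process;

  q <= q_i;  -- drive the out port from the internal signal
end architecture rtl;
```

This keeps the port mode as out and avoids the buffer mode, which is not recommended.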

Architectures: Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using:
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

A configuration declaration links an architecture to an entity.

E.g.:

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations: Link the entity declaration and architecture body together.

The concept of a default configuration is a bit messy in VHDL '87:
- The last architecture analyzed links to the entity.

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate-level architecture and a behavioral architecture.

Are always optional.
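A configuration declaration can be sketched with the Comparator entity used above. The second architecture (Comparator2) and the configuration name (Comparator_cfg) are hypothetical additions for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

-- Dataflow architecture, as in the example above.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

-- Hypothetical second architecture with the same function, written as a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture Comparator2;

-- The configuration selects which architecture is bound to the entity.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end configuration Comparator_cfg;
```

Changing the name after "for" to Comparator1 and re-analyzing only the configuration switches the simulated architecture without touching the entity or architecture source.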

Packages: Packages contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
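The package declaration and body can be sketched as follows. The package name vga_pkg and its contents are hypothetical; the values 794 and 516 are taken from the counter limits (0 to 793 and 0 to 515) in the sync_unit listing of this document.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the part visible to other design units.
package vga_pkg is
  constant H_TOTAL : integer := 794;                 -- clocks per line
  constant V_TOTAL : integer;                        -- deferred constant, value set in the body
  subtype pixel_t is std_logic_vector(0 downto 0);   -- one-bit colour channel
  function in_range (x, lo, hi : integer) return boolean;
end package vga_pkg;

-- Package body: function bodies and deferred constant values.
package body vga_pkg is
  constant V_TOTAL : integer := 516;                 -- lines per frame

  function in_range (x, lo, hi : integer) return boolean is
  begin
    return (x >= lo) and (x <= hi);
  end function in_range;
end package body vga_pkg;
```

A design unit would make these names visible with "use work.vga_pkg.all;", assuming the package is compiled into the work library.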

Libraries: Collection of VHDL design units (database).

1. Packages
- package declaration
- package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in a UNIX file system.

Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways:

Structural -------

Dataflow ------- Higher level of abstraction

Behavioral -------

Structural VHDL description

The circuit is described in terms of its components.

From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style, you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers => you must write the behavioral description of the registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).
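Since VHDL has no built-in register, one has to be described. A minimal sketch of a behavioral register, written here as a clocked process rather than a subprogram, is shown below; the entity name reg8 and its port names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register with active-low asynchronous reset.
entity reg8 is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end entity reg8;

architecture behv of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');      -- reset takes priority over the clock
    elsif rising_edge(clk) then
      q <= d;                    -- capture the input on the rising edge
    end if;
  end process;
end architecture behv;
```

The reset and clocking style matches the one used in the VGA controller listings of this document, which write the edge test as clk = '1' and clk'event.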

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

Process keyword in VHDL.

A process statement is a concurrent statement.

Statements inside process statements are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
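The two legal forms of a process, with a sensitivity list or with a wait statement, can be compared side by side. The entity and_gate and its two outputs are hypothetical names used only for this sketch.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_gate is
  port (a, b : in std_logic; y1, y2 : out std_logic);
end entity and_gate;

architecture behv of and_gate is
begin
  -- Form 1: sensitivity list. The process wakes up whenever a or b changes.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, a wait statement instead.
  -- A wait on the same signals at the end of the process is equivalent to form 1.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process with_wait;
end architecture behv;
```

Both processes are concurrent with each other, while the statements inside each one execute sequentially, so y1 and y2 always carry the same value.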


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
    port (
    addr  : IN  std_logic_VECTOR(17 downto 0);
    clk   : IN  std_logic;
    dout  : OUT std_logic_VECTOR(0 downto 0);
    en    : IN  std_logic;
    sinit : IN  std_logic);
END component;

component sync_unit port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;
    ---horizontal synch pulse(row)
    vsync      : out std_logic;
    --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : inout integer;
    c_hs       : inout integer);
end component;

component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

process(clk50m, rstn)
begin
    if (rstn = '0') then
        clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
        clk <= not clk;
    end if;
    --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
    );

pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
    );

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
    );

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs       <= c_hs_s;
line_count <= line_count_s;
h_en       <= h_en_s;
v_en       <= v_en_s;

-- hsync counter
process(clk, rstn)
begin
    if (rstn = '0') then
        c_hs_s <= 0;
    else
        if (clk = '1' and clk'event) then
            if (c_hs_s = 793) then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s + 1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter
process(h_en_s, rstn)
begin
    if (rstn = '0') then
        line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515) then
            line_count_s <= 0;
        end if;
    end if;
    --end if;
end process;

--hsync pulse
process(clk, rstn)
begin
    if (rstn = '0') then
        hsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (c_hs_s <= 95) then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) ----Horizontal Enable (Row Information Will be Sent in this period)
begin
    if (rstn = '0') then
        h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
    if (rstn = '0') then
        v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((line_count_s >= 35 and line_count_s <= 515)) then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) --Vertical Sync, Front Porch, Back Porch
begin
    if (rstn = '0') then
        vsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (line_count_s < 2) then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk)
begin
    if (clk = '1' and clk'event) then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;
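The counter limits in this block fix the display timing. As a worked check, assume that clk50m is a 50 MHz clock (an inference from its name; the frequency is not stated in the listing), so that the divide-by-two process in vga_controller yields a 25 MHz pixel clock. The horizontal counter runs from 0 to 793 (794 clocks per line) and the line counter from 0 to 515 (516 lines per frame):

```latex
f_{\text{pix}} = \frac{50\ \text{MHz}}{2} = 25\ \text{MHz}, \qquad
f_{h} = \frac{25 \times 10^{6}}{794} \approx 31.49\ \text{kHz}, \qquad
f_{v} = \frac{f_{h}}{516} \approx 61.0\ \text{Hz}
```

Under that assumption the line and frame rates are close to the nominal 640x480 VGA figures of about 31.47 kHz and 59.94 Hz, which correspond to a 25.175 MHz pixel clock with 800 clocks per line and 525 lines per frame.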


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
    );
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
    if (rstn = '0') then
        rd     <= '0';
        red    <= "0";
        green  <= "0";
        blue   <= "0";
        addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
        if ((v_en = '1') and (v_en_reg = '0')) then
            addr_s <= (others => '0');
        end if;
        if (v_en = '1' and h_en = '1') then
            if (line_count <= 514) then
                if (c_hs <= 655) then
                    rd     <= '1';
                    red    <= data_in;
                    green  <= data_in;
                    blue   <= data_in;
                    addr_s <= addr_s + "000000000000000001";
                else
                    rd    <= '0';
                    red   <= "0";
                    green <= "1";
                    blue  <= "0";
                end if;
            else
                rd    <= '0';
                red   <= "1";
                green <= "1";
                blue  <= "1";
            end if;
            if (addr_s = "111011111111111111") then
                addr_s <= (others => '0');
            end if;
        else
            red   <= "0";
            green <= "0";
            blue  <= "0";
        end if;
    end if;
    -- end if;
    --end if;
end process;

end behv;


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has bolstered development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have been historically used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the state where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attack of cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software and the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC for a low-volume application. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
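For reference, Rent's rule in its usual form is sketched below. The symbols are not defined in this report and are introduced here only for illustration: T is the number of external signal connections of a block of logic, g is the number of gates (or logic blocks) inside it, t is the average number of terminals per gate, and p is the Rent exponent, an empirical value below 1.

```latex
T = t \cdot g^{p}, \qquad 0 < p < 1
```

A design with a larger exponent p needs more connections for the same amount of logic, which is why two designs with equal gate counts can need very different numbers of routing tracks.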

The FPGA considered here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they are managed separately from other signals.
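The logic block described above can be modelled in a few lines of VHDL. This is only a sketch under stated assumptions: the entity name logic_block, the generics LUT_INIT and REGISTERED, and the port names are hypothetical, and a real device holds this configuration in on-chip configuration memory rather than in generics.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical model of one logic block: a 4-input LUT followed by a
-- D flip-flop, with a selection between registered and unregistered output.
entity logic_block is
  generic (
    -- Truth table of the LUT; bit i is the output for input value i.
    -- The default x"8000" is an assumption: it implements a 4-input AND.
    LUT_INIT   : std_logic_vector(15 downto 0) := x"8000";
    -- true: output comes from the flip-flop; false: directly from the LUT.
    REGISTERED : boolean := true
  );
  port (
    clk     : in  std_logic;
    lut_in  : in  std_logic_vector(3 downto 0);
    blk_out : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16-entry table indexed by the four inputs.
  lut_out <= LUT_INIT(to_integer(unsigned(lut_in)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Configuration-time choice between the two possible outputs.
  blk_out <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

Changing LUT_INIT changes the logic function without changing the structure, which is the essence of a lookup-table based block.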

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
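The first simulation stage, an RTL test bench, might look like the sketch below. It is illustrative only: it uses an 8-bit comparator as the design under test, and the test bench name comparator_tb, the signal names, and the stimulus values are assumptions, not taken from the project sources.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Design under test: an 8-bit equality comparator.
entity comparator is
  port (
    A, B : in  std_logic_vector(7 downto 0);
    EQ   : out std_logic
  );
end entity comparator;

architecture dataflow of comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture dataflow;

library IEEE;
use IEEE.std_logic_1164.all;

-- Test bench: it has no ports; it drives the inputs and checks the output.
entity comparator_tb is
end entity comparator_tb;

architecture sim of comparator_tb is
  signal a_s, b_s : std_logic_vector(7 downto 0) := (others => '0');
  signal eq_s     : std_logic;
begin
  dut : entity work.comparator
    port map (A => a_s, B => b_s, EQ => eq_s);

  stimulus : process
  begin
    a_s <= x"3C"; b_s <= x"3C";
    wait for 10 ns;
    assert eq_s = '1'
      report "equal inputs should give EQ = '1'" severity error;

    a_s <= x"3C"; b_s <= x"3D";
    wait for 10 ns;
    assert eq_s = '0'
      report "different inputs should give EQ = '0'" severity error;

    report "comparator_tb finished";
    wait;  -- suspend forever so the simulation can end
  end process stimulus;
end architecture sim;
```

The same test bench can usually be reused for the gate-level and back-annotated simulations by binding it to the netlist instead of the RTL architecture.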

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why Prototyping is important

1 Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2 Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows for much more time in the area of software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found early.

3 Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4 Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL

Fig: VGA CONTROLLER

GATE LEVEL

Fig: SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, a processor-generated (pixelated) picture is taken from memory space and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.


Page 34: vga

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

It defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a

library

Entities

A black box with interface definition

Defines the inputs/outputs of a component (define pins)

A way to represent modularity in VHDL

Similar to a symbol in a schematic

An entity declaration describes an entity

E.g.

entity Comparator is
  port (A, B : in std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports: Provide channels of communication between the component and its environment.

Each port must have a name, a direction, and a type.

An entity may have NO port declaration.

Port directions

In: A value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

In out: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).
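The port directions can be illustrated with a small sketch. The entity mode_demo and all of its port names are hypothetical and are not part of the VGA design; the point is only to show where each mode may be read or assigned.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

entity mode_demo is
  port (
    clk   : in    std_logic;                      -- in: read only
    load  : in    std_logic;
    count : out   std_logic_vector(3 downto 0);   -- out: assign only
    data  : inout std_logic_vector(3 downto 0)    -- inout: read and assign
  );
end entity mode_demo;

architecture rtl of mode_demo is
  -- Internal copy of the counter: an out port cannot be read back
  -- (before VHDL-2008), so the value is kept in a signal instead.
  signal count_s : unsigned(3 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if load = '1' then
        count_s <= unsigned(data);   -- reading the inout port
      else
        count_s <= count_s + 1;
      end if;
    end if;
  end process;

  count <= std_logic_vector(count_s);

  -- Drive the bidirectional port only while not loading; otherwise release it.
  data <= std_logic_vector(count_s) when load = '0' else (others => 'Z');
end architecture rtl;
```

Keeping the value in an internal signal and copying it to the out port is the usual alternative to a buffer port, and it is the same pattern the sync unit listing uses for its counters.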

Architectures: Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using:

– Behavior
– Structure
– Dataflow

Architectures can describe a design on many levels:

– Gate level
– RTL (Register Transfer Level)
– Behavioral level

A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations: Link the entity declaration and the architecture body together.

The concept of a default configuration is a bit messy in VHDL '87:

– The last architecture analyzed links to the entity.

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate-level architecture and a behavioral architecture.

Are always optional.

Packages: Packages contain information common to many design units.

1 Package declaration

– constant declarations
– type and subtype declarations
– function and procedure declarations
– global signal declarations
– file declarations
– component declarations

2 Package body

– is not necessarily needed
– function bodies
– procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

Package declaration can contain

– type and subtype declarations
– subprograms
– constants
– alias declarations
– global signal declarations
– file declarations
– component declarations

Package body consists of

– subprogram declarations and bodies
– type and subtype declarations
– deferred constants
– file declarations
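A minimal package with a declaration part and a body might look like the following sketch. The package name vga_pkg, the constant H_TOTAL, the subtype pixel_t, and the function is_black are all hypothetical names chosen for illustration; the value 794 simply mirrors the horizontal counter range (0 to 793) used in the sync unit listing.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Package declaration: what other design units are allowed to see.
package vga_pkg is
  constant H_TOTAL : integer := 794;                 -- constant declaration
  subtype pixel_t is std_logic_vector(0 downto 0);   -- subtype declaration
  function is_black (p : pixel_t) return boolean;    -- function declaration
end package vga_pkg;

-- Package body: needed here only because the function requires a body.
package body vga_pkg is
  function is_black (p : pixel_t) return boolean is
  begin
    return p = "0";
  end function is_black;
end package body vga_pkg;
```

A design unit would make these names visible with a clause such as "use work.vga_pkg.all;", assuming the package was compiled into the working library.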

Libraries: Collection of VHDL design units (database)

1 Packages
   package declaration
   package body

2 Entities (entity declaration)

3 Architectures (architecture body)

4 Configurations (configuration declarations)

Usually a directory in the UNIX file system.

Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description which differ

primarily in how closely they relate to the HW

It is possible to describe a circuit in a number of ways

Structural-------

Dataflow ------- Higher level of abstraction

Behavioral -------

Structural VHDL description

Circuit is described in terms of its components

From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system.

In the dataflow style, you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:

– Either a lower-level description
– or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no 3rd-party models for registers => you must write the behavioral description of the registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).
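As a sketch of what such a behavioral register description can look like, the block below describes an 8-bit register with a clocked process. The entity name reg8, the port names, and the choice of an active-low asynchronous reset (the same reset style the VGA listings use) are assumptions for illustration.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity reg8 is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;   -- active-low asynchronous reset (assumed)
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end entity reg8;

architecture behavioral of reg8 is
begin
  -- VHDL has no register primitive; synthesis tools infer one from a
  -- process that assigns a signal only on a clock edge (or on reset).
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');
    elsif rising_edge(clk) then
      q <= d;
    end if;
  end process;
end architecture behavioral;
```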

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time.

Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

Process keyword in VHDL.

A process statement is a concurrent statement.

Statements inside process statements are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
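The two permitted forms of a process can be compared side by side in the sketch below. The entity and_gate_demo and its ports are hypothetical; both processes describe the same AND function and behave identically in simulation.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity and_gate_demo is
  port (
    a, b : in  std_logic;
    y1   : out std_logic;
    y2   : out std_logic
  );
end entity and_gate_demo;

architecture behavioral of and_gate_demo is
begin
  -- Form 1: sensitivity list. The process runs whenever a or b changes.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: explicit wait statement, and therefore no sensitivity list.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;   -- suspend until a or b changes, then loop again
  end process with_wait;
end architecture behavioral;
```

A sensitivity list is equivalent to a single "wait on" statement placed at the end of the process, which is why mixing the two in one process is not allowed.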

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
  clk50m : in  std_logic;
  rstn   : in  std_logic;
  hsync  : out std_logic;
  vsync  : out std_logic;
  red    : out std_logic_vector(0 downto 0);
  green  : out std_logic_vector(0 downto 0);
  blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
  port (
    addr  : IN  std_logic_VECTOR(17 downto 0);
    clk   : IN  std_logic;
    dout  : OUT std_logic_VECTOR(0 downto 0);
    en    : IN  std_logic;
    sinit : IN  std_logic);
END component;

component sync_unit port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : out std_logic;   ---horizontal synch pulse(row)
  vsync      : out std_logic;   --vertical sync pulse(frame) 111110100000000000
  v_en_reg   : out std_logic;
  h_en       : out std_logic;
  v_en       : out std_logic;
  line_count : inout integer;
  c_hs       : inout integer);
end component;

component pixelgen port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- Divide the 50 MHz input clock by two to obtain the pixel clock.
process(clk50m, rstn)
begin
  if (rstn = '0') then
    clk <= '0';
  elsif (clk50m = '1' and clk50m'event) then
    clk <= not clk;
  end if;
  --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
  clk        => clk,
  rstn       => rstn,
  hsync      => hsync_s,
  vsync      => vsync_s,
  v_en_reg   => v_en_reg_s,
  line_count => line_count_s,
  h_en       => h_en_s,
  v_en       => v_en_s,
  c_hs       => c_hs_s
);

pixel : pixelgen port map(
  clk        => clk,
  rstn       => rstn,
  hsync      => hsync_s,
  vsync      => vsync_s,
  h_en       => h_en_s,
  v_en       => v_en_s,
  v_en_reg   => v_en_reg_s,
  data_in    => data_s,
  line_count => line_count_s,
  c_hs       => c_hs_s,
  red        => red,
  green      => green,
  blue       => blue,
  addr       => addr_s,
  rd         => rd_s
);

mem12 : memory port map(
  addr  => addr_s,
  clk   => clk,
  dout  => data_s,
  en    => rd_s,
  sinit => rstn
);

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : out std_logic;   ---horizontal synch pulse(row)
  vsync      : out std_logic;   --vertical sync pulse(frame) 111110100000000000
  v_en_reg   : out std_logic;
  h_en       : out std_logic;
  v_en       : out std_logic;
  line_count : out integer;
  c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs       <= c_hs_s;
line_count <= line_count_s;
h_en       <= h_en_s;
v_en       <= v_en_s;

-- hsync counter
process(clk, rstn)
begin
  if (rstn = '0') then
    c_hs_s <= 0;
  else
    if (clk = '1' and clk'event) then
      if (c_hs_s = 793) then
        c_hs_s <= 0;
      else
        c_hs_s <= c_hs_s + 1;
      end if;
    end if;
  end if;
end process;

--vsync counter
--vertical line counter
process(h_en_s, rstn)
begin
  if (rstn = '0') then
    line_count_s <= 0;
  elsif (h_en_s = '1' and h_en_s'event) then
    line_count_s <= line_count_s + 1;
    if (line_count_s = 515) then
      line_count_s <= 0;
    end if;
  end if;
  --end if;
end process;

--hsync pulse
process(clk, rstn)
begin
  if (rstn = '0') then
    hsync <= '1';
  elsif (clk = '1' and clk'event) then
    if (c_hs_s <= 95) then
      hsync <= '0';
    else
      hsync <= '1';
    end if;
  end if;
  --end if;
end process;

process(clk, rstn) ----Horizontal Enable (Row Information Will be Sent in this period)
begin
  if (rstn = '0') then
    h_en_s <= '0';
  elsif (clk = '1' and clk'event) then
    if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
      h_en_s <= '1';
    else
      h_en_s <= '0';
    end if;
  end if;
  --end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
  if (rstn = '0') then
    v_en_s <= '0';
  elsif (clk = '1' and clk'event) then
    if ((line_count_s >= 35 and line_count_s <= 515)) then
      v_en_s <= '1';
    else
      v_en_s <= '0';
    end if;
  end if;
  --end if;
end process;

process(clk, rstn) --Vertical Sync, Front porch, Back Porch
begin
  if (rstn = '0') then
    vsync <= '1';
  elsif (clk = '1' and clk'event) then
    if (line_count_s < 2) then
      vsync <= '0';
    else
      vsync <= '1';
    end if;
  end if;
  --end if;
end process;

process(clk)
begin
  if (clk = '1' and clk'event) then
    v_en_reg <= v_en_s;
  end if;
end process;

end arch_sync_unit;
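The counter limits in the listing above fix the sync timing. As a rough worked example, assume the input clk50m really is 50 MHz (suggested by its name, not stated in the code), so that the divide-by-two process in the controller gives a 25 MHz pixel clock. The horizontal counter runs from 0 to 793 (794 clocks per line), the line counter from 0 to 515 (516 lines per frame), and hsync is low for counts 0 to 95 (96 clocks):

```latex
\begin{align*}
f_{\text{clk}}   &= \frac{50\ \text{MHz}}{2} = 25\ \text{MHz} \\
f_{\text{line}}  &= \frac{f_{\text{clk}}}{794} \approx 31.49\ \text{kHz} \\
f_{\text{frame}} &= \frac{f_{\text{line}}}{516} \approx 61.0\ \text{Hz} \\
t_{\text{hsync}} &= \frac{96}{f_{\text{clk}}} = 3.84\ \mu\text{s}
\end{align*}
```

These values are close to the nominal 640 x 480 at 60 Hz timing, which uses 800 clocks per line and 525 lines per frame at 25.175 MHz (about 31.47 kHz and 59.94 Hz). The enable windows in the listing cover counts 143 to 778 (636 clocks) horizontally and lines 35 to 515 (481 lines) vertically.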

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
  if (rstn = '0') then
    rd     <= '0';
    red    <= "0";
    green  <= "0";
    blue   <= "0";
    addr_s <= x"0000" & "00";
  elsif (clk = '1' and clk'event) then
    -- Restart the memory address at the start of each frame.
    if ((v_en = '1') and (v_en_reg = '0')) then
      addr_s <= (others => '0');
    end if;
    if (v_en = '1' and h_en = '1') then
      if (line_count <= 514) then
        if (c_hs <= 655) then
          -- Inside the picture area: read the next pixel from memory.
          rd     <= '1';
          red    <= data_in;
          green  <= data_in;
          blue   <= data_in;
          addr_s <= addr_s + "000000000000000001";
        else
          rd    <= '0';
          red   <= "0";
          green <= "1";
          blue  <= "0";
        end if;
      else
        rd    <= '0';
        red   <= "1";
        green <= "1";
        blue  <= "1";
      end if;
      if (addr_s = "111011111111111111") then
        addr_s <= (others => '0');
      end if;
    else
      red   <= "0";
      green <= "0";
      blue  <= "0";
    end if;
  end if;
  -- end if;
  --end if;
end process;

end behv;

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History: The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985, the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has bolstered development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have been historically used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

57

The inherent parallelism of the logic resources on an FPGA allows for

considerable computational throughput even at a low MHz clock rates The

flexibility of the FPGA allows for even higher performance by trading off

precision and range in the number format for an increased number of parallel

arithmetic units This has driven a new type of processing called

reconfigurable computing where time intensive tasks are offloaded from

software to FPGAs

The adoption of FPGAs in high performance computing is currently limited by

the complexity of FPGA design compared to conventional software and the

extremely long turn-around times of current design tools where 4-8 hours wait

is necessary after even minor changes to the source code

Traditionally FPGAs have been reserved for specific vertical applications

where the volume of production is small For these low-volume applications

the premium that companies pay in hardware costs per unit for a

programmable chip is more affordable than the development resources spent

on creating an ASIC for a low-volume application Today new cost and

performance dynamics have broadened the range of viable applications

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA considered here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they are managed separately from other signals.
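The logic block described above can be modelled behaviourally in VHDL. The following is a minimal sketch, not a vendor primitive: the entity name logic_block, the generics LUT_INIT and REGISTERED, and the port names are illustrative assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of one logic block: a 4-input LUT followed by an
-- optional flip-flop. All names here are assumptions for illustration.
entity logic_block is
  generic (
    LUT_INIT   : std_logic_vector(15 downto 0) := x"8000"; -- truth table
    REGISTERED : boolean := true                           -- output select
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    o   : out std_logic                      -- the single block output
  );
end entity logic_block;

architecture behv of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16-entry table indexed by the four inputs.
  lut_out <= LUT_INIT(to_integer(unsigned(i)));

  -- The flip-flop samples the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Only one of the two versions of the output leaves the block.
  o <= ff_out when REGISTERED else lut_out;
end architecture behv;
```

With the default LUT_INIT of x"8000", only the input pattern "1111" selects a '1', so the block behaves as a registered 4-input AND gate.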

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded in the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user validates the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, produces a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
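The first simulation stage, an RTL description exercised by a test bench, might look like the sketch below. It is only an illustration of the idea: the design under test eq8 and the test bench eq8_tb are hypothetical names, and the 10 ns waits are arbitrary.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design under test: an 8-bit equality comparator.
entity eq8 is
  port (a, b : in  std_logic_vector(7 downto 0);
        eq   : out std_logic);
end entity eq8;

architecture rtl of eq8 is
begin
  eq <= '1' when a = b else '0';
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- A test bench has no ports; it drives the inputs and checks the output.
entity eq8_tb is
end entity eq8_tb;

architecture sim of eq8_tb is
  signal a, b : std_logic_vector(7 downto 0) := (others => '0');
  signal eq   : std_logic;
begin
  dut : entity work.eq8 port map (a => a, b => b, eq => eq);

  stimulus : process
  begin
    -- Equal operands must raise eq.
    a <= x"3C"; b <= x"3C";
    wait for 10 ns;
    assert eq = '1' report "equal inputs should give eq = '1'" severity error;

    -- Operands differing in one bit must clear eq.
    b <= x"3D";
    wait for 10 ns;
    assert eq = '0' report "different inputs should give eq = '0'" severity error;

    report "eq8 test bench finished" severity note;
    wait;  -- suspend forever so the simulation can end
  end process stimulus;
end architecture sim;
```

The same test bench can in principle be reused against the gate-level and back-annotated netlists, which is what makes the repeated simulation stages comparable.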

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying the hardware design alongside early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.

Reasons why prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL

Fig: VGA CONTROLLER

GATE LEVEL

Fig: SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated picture (pixelated) from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



Segments of VHDL code that can be compiled separately and stored in a

library

Entities

A black box with interface definition

Defines the inputs/outputs of a component (define pins)

A way to represent modularity in VHDL

Similar to symbol in schematic

Entity declaration describes entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports: Provide channels of communication between the component and its environment

Each port must have a name, a direction and a type

An entity may have NO port declaration

Port directions

In: A value of a port can be read inside the component but cannot be assigned. Multiple reads of port are allowed.

Out: Assignments can be made to a port but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).
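The four port directions can be seen side by side in a small entity. This is a sketch for illustration only; the entity port_modes_demo and all of its port names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity that uses each port mode once.
entity port_modes_demo is
  port (
    clk      : in     std_logic;         -- in: may be read, never assigned
    d        : in     std_logic;
    drive_en : in     std_logic;
    q        : out    std_logic;         -- out: assigned, not read back here
    bus_io   : inout  std_logic;         -- inout: both driven and read
    toggle   : buffer std_logic := '0'   -- buffer: an output that is also read
  );
end entity port_modes_demo;

architecture rtl of port_modes_demo is
  -- An internal copy of an output is the usual alternative to a buffer port.
  signal q_int : std_logic := '0';
begin
  q <= q_int;

  -- Drive the bidirectional pin only when enabled, otherwise release it.
  bus_io <= q_int when drive_en = '1' else 'Z';

  process (clk)
  begin
    if rising_edge(clk) then
      q_int  <= d xor bus_io;  -- reading in and inout ports is allowed
      toggle <= not toggle;    -- reading a buffer port is allowed
    end if;
  end process;
end architecture rtl;
```

Keeping q_int as an internal signal and copying it to q avoids the need to read an out port, which is why buffer ports are rarely necessary.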

Architectures: Every entity has at least one architecture

One entity can have several architectures

Architectures can describe design using
- Behavior
- Structure
- Dataflow

Architectures can describe design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

Configuration declaration links architecture to entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations: Links entity declaration and architecture body together

Concept of default configuration is a bit messy in VHDL '87
- Last architecture analyzed links to entity

Can be used to change simulation behavior without re-analyzing the VHDL source

Complex configuration declarations are ignored in synthesis

Some entities can have, e.g., a gate-level architecture and a behavioral architecture

Are always optional
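A configuration declaration that picks one of several architectures could be sketched as follows. The Comparator entity follows the example given earlier; the architecture names Dataflow and Behavioral and the configuration name Comparator_cfg are assumptions made for this sketch.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

-- First architecture: a single concurrent assignment.
architecture Dataflow of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Dataflow;

-- Second architecture: the same function written as a process.
architecture Behavioral of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture Behavioral;

-- The configuration selects which architecture is bound to the entity,
-- instead of relying on "last architecture analyzed".
configuration Comparator_cfg of Comparator is
  for Behavioral
  end for;
end configuration Comparator_cfg;
```

Simulating Comparator_cfg rather than the bare entity makes the choice of architecture explicit, and switching to the other one only requires changing the name after "for".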

Packages: Packages contain information common to many design units

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units

Consists of declaration part and optional body part

Package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

Package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
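A small package with a declaration part and a body might look like the sketch below. The package name demo_pkg, the constant, the subtype and the function are hypothetical and chosen only to show where each kind of item goes.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the part visible to other design units.
package demo_pkg is
  constant DATA_WIDTH : natural := 8;
  subtype data_t is std_logic_vector(DATA_WIDTH - 1 downto 0);

  -- Only the function interface is declared here.
  function in_range(value, first, last : natural) return boolean;
end package demo_pkg;

-- Package body: holds the subprogram bodies. It could be omitted if the
-- declaration contained no subprograms or deferred constants.
package body demo_pkg is
  function in_range(value, first, last : natural) return boolean is
  begin
    return (value >= first) and (value <= last);
  end function in_range;
end package body demo_pkg;
```

A design unit would make these items visible with "use work.demo_pkg.all;", assuming the package was analyzed into the work library.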

Libraries: Collection of VHDL design units (database)

1. Packages
- package declaration
- package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in the UNIX file system

Can also be any other kind of database

Levels of Abstraction

VHDL supports many possible styles of design description which differ

primarily in how closely they relate to the HW

It is possible to describe a circuit in a number of ways

Structural -> Dataflow -> Behavioral (increasing level of abstraction)

Structural VHDL description

Circuit is described in terms of its components

From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)

For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system

In the dataflow style you describe how information flows between registers in

the system

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely

The behavior of the system over time is defined by registers

There are no built-in registers in the VHDL language
- Either a lower-level description
- or a behavioral description of sequential elements is needed

The lower-level register descriptions must be created or obtained

If there are no 3rd-party models for registers => you must write the behavioral description of registers

The behavioral description can be provided in the form of subprograms (functions or procedures)
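Since the language has no built-in register, a behavioral description of one has to be written. A minimal sketch, written as a clocked process rather than a subprogram, and assuming an 8-bit register with an active-low asynchronous reset (the name reg8 and its ports are hypothetical):

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register described behaviorally.
entity reg8 is
  port (clk  : in  std_logic;
        rstn : in  std_logic;                     -- active-low reset
        d    : in  std_logic_vector(7 downto 0);
        q    : out std_logic_vector(7 downto 0));
end entity reg8;

architecture behv of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');      -- asynchronous clear
    elsif rising_edge(clk) then
      q <= d;                    -- capture the input on the clock edge
    end if;
  end process;
end architecture behv;
```

Synthesis tools generally recognise this process template and infer flip-flops from it, which is how registers enter a dataflow description in practice.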

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time

Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)

If no actual delay is used, order of sequential operations is defined

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications

The actual timing results depend on implementation technology and efficiency of synthesis tool

There are a few tools for behavioral synthesis

Concurrent Vs Sequential

Processes

Basic simulation concept in VHDL

VHDL description can always be broken up to interconnected processes

Quite similar to Unix process

Process keyword in VHDL

Process statement is a concurrent statement

Statements inside process statements are sequential statements

Process must contain either a sensitivity list or wait statement(s), but NOT both

Sensitivity list or wait statement(s) contain the signals which wake the process up

General format

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
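The two ways of waking a process up can be compared in a short sketch. The entity and_gate_demo and its ports are hypothetical; both processes describe the same AND function.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_gate_demo is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end entity and_gate_demo;

architecture behv of and_gate_demo is
begin
  -- Form 1: sensitivity list. The process runs whenever a or b changes.
  p_sens : process (a, b)
  begin
    y1 <= a and b;
  end process p_sens;

  -- Form 2: explicit wait statement and no sensitivity list.
  -- Placing the wait at the end makes it equivalent to form 1.
  p_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process p_wait;
end architecture behv;
```

Mixing the two forms in one process is illegal, which is the "NOT both" rule stated above.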


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
-- Note: numeric_std and STD_LOGIC_ARITH both declare "unsigned"; this only
-- compiles because the one line that uses to_unsigned is commented out below.
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
port (
    addr  : IN  std_logic_VECTOR(17 downto 0);
    clk   : IN  std_logic;
    dout  : OUT std_logic_VECTOR(0 downto 0);
    en    : IN  std_logic;
    sinit : IN  std_logic);
END component;

component sync_unit port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;   ---horizontal synch pulse(row)
    vsync      : out std_logic;   --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : inout integer;
    c_hs       : inout integer);
end component;

component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- Divide the 50 MHz input clock by two to obtain the pixel clock.
process(clk50m, rstn)
begin
    if(rstn = '0')then
        clk <= '0';
    elsif(clk50m = '1' and clk50m'event)then
        clk <= not clk;
    end if;
    --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
    );

pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
    );

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
    );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;   ---horizontal synch pulse(row)
    vsync      : out std_logic;   --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is
signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;
begin

c_hs <= c_hs_s;
line_count <= line_count_s;
h_en <= h_en_s;
v_en <= v_en_s;

-- hsync counter
process(clk, rstn)
begin
    if(rstn = '0')then
        c_hs_s <= 0;
    else
        if(clk = '1' and clk'event)then
            if(c_hs_s=793)then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s +1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter
process(h_en_s, rstn)
begin
    if(rstn = '0')then
        line_count_s <= 0;
    elsif(h_en_s = '1' and h_en_s'event)then
        line_count_s <= line_count_s + 1;
        if (line_count_s=515)then
            line_count_s <= 0;
        end if;
    end if;
    --end if;
end process;

--hsync pulse
process(clk, rstn)
begin
    if(rstn = '0')then
        hsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(c_hs_s <= 95)then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) ----Horizontal Enable ( Row Information Will be Sent in this period)
begin
    if(rstn = '0')then
        h_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((c_hs_s >= 143) and (c_hs_s <= 778))then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
    if(rstn = '0')then
        v_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((line_count_s >= 35 and line_count_s <= 515))then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) --Vertical Sync Front porch Back Porch
begin
    if(rstn = '0')then
        vsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(line_count_s < 2)then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk)
begin
    if(clk = '1' and clk'event)then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;
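The horizontal counter in the listing above wraps after 793, so one line should last 794 clock cycles. A test bench along the following lines could check that. It is a sketch under stated assumptions: sync_unit from the listing is taken to be compiled into the work library, the 40 ns period assumes a 25 MHz pixel clock (the 50 MHz input divided by two), and the test bench name sync_unit_tb is hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity sync_unit_tb is
end entity sync_unit_tb;

architecture sim of sync_unit_tb is
  constant CLK_PERIOD : time := 40 ns;  -- assumed 25 MHz pixel clock
  signal clk  : std_logic := '0';
  signal rstn : std_logic := '0';
  signal hsync, vsync, v_en_reg, h_en, v_en : std_logic;
  signal line_count, c_hs : integer;
begin
  -- Free-running clock and a short active-low reset.
  clk  <= not clk after CLK_PERIOD / 2;
  rstn <= '1' after 90 ns;

  dut : entity work.sync_unit
    port map (clk => clk, rstn => rstn,
              hsync => hsync, vsync => vsync,
              v_en_reg => v_en_reg, h_en => h_en, v_en => v_en,
              line_count => line_count, c_hs => c_hs);

  -- Measure the time between two successive falling edges of hsync.
  check : process
    variable t_prev : time;
  begin
    wait until falling_edge(hsync);
    t_prev := now;
    wait until falling_edge(hsync);
    assert now - t_prev = 794 * CLK_PERIOD
      report "unexpected hsync period" severity error;
    report "hsync period check done" severity note;
    wait;  -- stop checking; the clock keeps running
  end process check;
end architecture sim;
```

Because the clock never stops, the simulation needs to be run for a fixed time, for example a little over two line periods. Under the 25 MHz assumption, 794 cycles correspond to 31.76 microseconds per line.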


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
    );
end pixelgen;

architecture behv of pixelgen is
signal addr_s : std_logic_vector(17 downto 0);
begin

addr <= addr_s;

process(clk, rstn)
begin
    if(rstn = '0')then
        rd <= '0';
        red <= "0";
        green <= "0";
        blue <= "0";
        addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event)then
        -- Restart the read address at the start of the visible frame.
        if((v_en='1') and (v_en_reg='0'))then
            addr_s <= (others =>'0');
        end if;
        if(v_en='1' and h_en='1')then
            if(line_count <= 514)then
                if(c_hs <= 655)then
                    -- Inside the image: show the memory bit on all three colours.
                    rd <= '1';
                    red <= data_in;
                    green <= data_in;
                    blue <= data_in;
                    addr_s <= addr_s + "000000000000000001";
                else
                    -- Right of the image: green border.
                    rd <= '0';
                    red <= "0";
                    green <= "1";
                    blue <= "0";
                end if;
            else
                rd <= '0';
                red <= "1";
                green <= "1";
                blue <= "1";
            end if;
            if(addr_s="111011111111111111")then
                addr_s <= (others =>'0');
            end if;
        else
            red <= "0";
            green <= "0";
            blue <= "0";
        end if;
    end if;
    -- end if;
    --end if;
end process;

end behv;
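The 18-bit wrap-around constant in the pixel generator can be cross-checked against the image size. The sketch below assumes a nominal 512 x 480 one-bit image; that size is inferred from the comparison limits in the listings (c_hs up to 655, lines 35 to 514) and is not stated in the source. The entity name addr_wrap_check and the constant names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity addr_wrap_check is
end entity addr_wrap_check;

architecture sim of addr_wrap_check is
  constant PIXELS_PER_LINE : natural := 512;  -- assumed image width
  constant VISIBLE_LINES   : natural := 480;  -- assumed image height
  -- Address of the last pixel when pixels are stored one per address.
  constant LAST_ADDR : natural := PIXELS_PER_LINE * VISIBLE_LINES - 1;
begin
  process
  begin
    -- Compare the computed last address with the literal used in pixelgen.
    assert std_logic_vector(to_unsigned(LAST_ADDR, 18)) = "111011111111111111"
      report "wrap constant does not match 512 x 480 - 1" severity failure;
    report "wrap constant equals 512 x 480 - 1" severity note;
    wait;
  end process;
end architecture sim;
```

512 x 480 is 245,760 pixels, so the last address is 245,759, which is 3BFFF in hexadecimal and fits in the 18-bit address bus of the memory component.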


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be wired together, somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field-programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

- IC costs are rising aggressively
- ASIC complexity has bolstered development time and costs
- R&D resources and headcount are decreasing
- Revenue losses for slow time-to-market are increasing
- Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have been historically used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical to implement within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs support partial reconfiguration, which lets one portion of the device be reprogrammed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoCs). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications that had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs find applications especially in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing

applications where computational kernels such as FFT or Convolution are

performed on the FPGA instead of a microprocessor

57

The inherent parallelism of the logic resources on an FPGA allows for

considerable computational throughput even at a low MHz clock rates The

flexibility of the FPGA allows for even higher performance by trading off

precision and range in the number format for an increased number of parallel

arithmetic units This has driven a new type of processing called

reconfigurable computing where time intensive tasks are offloaded from

software to FPGAs

The adoption of FPGAs in high performance computing is currently limited by

the complexity of FPGA design compared to conventional software and the

extremely long turn-around times of current design tools where 4-8 hours wait

is necessary after even minor changes to the source code

Traditionally FPGAs have been reserved for specific vertical applications

where the volume of production is small For these low-volume applications

the premium that companies pay in hardware costs per unit for a

programmable chip is more affordable than the development resources spent

on creating an ASIC for a low-volume application Today new cost and

performance dynamics have broadened the range of viable applications

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA considered here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
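The logic block described above can be sketched in VHDL as a simulation model. This is only an illustration under stated assumptions: the entity name logic_block, the generics LUT_INIT and REGISTERED, and the port names are hypothetical and do not come from any vendor library. In a real device the truth table and the output selection are set by configuration bits rather than generics.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical model of a classic logic block: 4-input LUT plus flip-flop.
entity logic_block is
  generic (
    -- Truth table of the LUT; bit i is the output for input value i.
    -- x"8000" sets only bit 15, i.e. a 4-input AND (assumed default).
    LUT_INIT   : std_logic_vector(15 downto 0) := x"8000";
    -- true selects the registered output, false the unregistered one.
    REGISTERED : boolean := false
  );
  port (
    clk    : in  std_logic;
    lut_in : in  std_logic_vector(3 downto 0);
    q      : out std_logic
  );
end logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT behaves as a 16 x 1 memory addressed by the four inputs.
  lut_out <= LUT_INIT(to_integer(unsigned(lut_in)));

  -- The flip-flop captures the LUT output on the rising clock edge.
  process(clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Single output: either the registered or the unregistered LUT value.
  q <= ff_out when REGISTERED else lut_out;
end rtl;
```

Changing LUT_INIT changes the logic function without touching the structure, which is the property that makes the LUT the basic programmable element.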

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free and released under the GPL, BSD or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
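As an illustration of the first simulation stage, the following is a minimal sketch of a self-checking RTL test bench. It assumes a small divide-by-two clock entity that repeats the logic of the divider process in the vga_controller top level. The entity names clk_div2 and clk_div2_tb, the signal done and the chosen wait times are hypothetical and exist only for this example.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Design under test: divide-by-two clock with active-low reset.
entity clk_div2 is
  port (
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    clk    : out std_logic
  );
end clk_div2;

architecture rtl of clk_div2 is
  signal clk_i : std_logic;
begin
  clk <= clk_i;

  process(clk50m, rstn)
  begin
    if rstn = '0' then
      clk_i <= '0';
    elsif rising_edge(clk50m) then
      clk_i <= not clk_i;
    end if;
  end process;
end rtl;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Test bench: it has no ports and only exists in simulation.
entity clk_div2_tb is
end clk_div2_tb;

architecture sim of clk_div2_tb is
  signal clk50m : std_logic := '0';
  signal rstn   : std_logic := '0';
  signal clk    : std_logic;
  signal done   : boolean := false;
begin
  dut : entity work.clk_div2
    port map (clk50m => clk50m, rstn => rstn, clk => clk);

  -- 50 MHz stimulus clock (20 ns period), stopped when the test is done.
  clock_gen : process
  begin
    while not done loop
      clk50m <= '0';
      wait for 10 ns;
      clk50m <= '1';
      wait for 10 ns;
    end loop;
    wait;
  end process clock_gen;

  -- Apply reset, release it, then check the output after each rising edge.
  stimulus : process
  begin
    wait for 25 ns;
    assert clk = '0' report "clk must be low during reset" severity failure;
    rstn <= '1';
    wait until rising_edge(clk50m);
    wait for 1 ns;
    assert clk = '1' report "clk must go high on first edge" severity failure;
    wait until rising_edge(clk50m);
    wait for 1 ns;
    assert clk = '0' report "clk must go low on second edge" severity failure;
    done <= true;
    wait;
  end process stimulus;
end sim;
```

The same test bench can in principle be reused for the gate-level and back-annotated simulations, since it only touches the ports of the design under test.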

Basic Process Technology Types

- SRAM: Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
- Antifuse: One-time programmable. CMOS.
- EPROM: Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
- EEPROM: Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
- Flash: Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
- Fuse: One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.

Reasons why prototyping is important:

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be caught.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together off the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL
Fig: VGA CONTROLLER

GATE LEVEL
Fig: SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated picture (pixelated) is taken from memory space and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



Port directions

In: The value of the port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to the port, but data cannot be read from it. Multiple assignments are allowed.

Inout: Bi-directional. Assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. It may have at most one assignment (buffer ports are not recommended).

Architectures

Every entity has at least one architecture, and one entity can have several architectures.

Architectures can describe a design using
- behavior
- structure
- dataflow

Architectures can describe a design on many levels:
- gate level
- RTL (Register Transfer Level)
- behavioral level

A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;
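For completeness, the architecture above can be combined with an entity declaration and an optional configuration to form a design that compiles on its own. This is a sketch: the 8-bit width of A and B and the configuration name Comparator_cfg are assumptions, since the text does not show the entity declaration.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Entity declaration. The 8-bit port width is an assumption.
entity Comparator is
  port (
    A  : in  std_logic_vector(7 downto 0);
    B  : in  std_logic_vector(7 downto 0);
    EQ : out std_logic
  );
end Comparator;

-- Dataflow architecture: EQ is '1' exactly when A and B are equal.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Optional configuration declaration selecting architecture Comparator1
-- for entity Comparator. The name Comparator_cfg is hypothetical.
configuration Comparator_cfg of Comparator is
  for Comparator1
  end for;
end Comparator_cfg;
```

If a second architecture (for example a gate-level one) were added to the same entity, only the configuration would need to change to select it.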

Configurations

A configuration links an entity declaration and an architecture body together.
- The concept of a default configuration is a bit messy in VHDL '87: the last architecture analyzed links to the entity.
- Configurations can be used to change simulation behavior without re-analyzing the VHDL source.
- Complex configuration declarations are ignored in synthesis.
- Some entities can have, e.g., a gate-level architecture and a behavioral architecture.
- Configurations are always optional.

Packages

Packages contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units. A package consists of a declaration part and an optional body part.

A package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
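A short sketch of a package with a declaration part and a body is given below. The package name vga_timing_pkg, the constant names and the helper function in_range are hypothetical. The numeric values are the counter limits that appear in the sync_unit code of this report.

```vhdl
-- Package declaration: constants and a function declaration shared
-- between design units. All names here are illustrative assumptions.
package vga_timing_pkg is
  constant H_COUNT_MAX : integer := 793;  -- last value of the horizontal counter
  constant H_SYNC_END  : integer := 95;   -- hsync is low while the counter is <= 95
  constant H_EN_START  : integer := 143;  -- first counter value with h_en active
  constant H_EN_END    : integer := 778;  -- last counter value with h_en active

  -- Returns true when value lies in the closed interval [low, high].
  function in_range(value, low, high : integer) return boolean;
end vga_timing_pkg;

-- Package body: holds the function body declared above.
package body vga_timing_pkg is
  function in_range(value, low, high : integer) return boolean is
  begin
    return (value >= low) and (value <= high);
  end in_range;
end vga_timing_pkg;
```

A design unit would make these names visible with the clause "use work.vga_timing_pkg.all;" placed before its entity declaration.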

Libraries

A library is a collection of VHDL design units (a database):

1. Packages (package declaration, package body)
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

A library is usually a directory in the UNIX file system, but it can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware. It is possible to describe a circuit in a number of ways, listed here from the lowest to the highest level of abstraction:
- Structural
- Dataflow
- Behavioral

Structural VHDL description

The circuit is described in terms of its components, from a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram). For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system. In the dataflow style, you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level, while the placement and operation of registers is specified quite precisely. The behavior of the system over time is defined by the registers.

There are no built-in registers in the VHDL language, so either a lower-level description or a behavioral description of sequential elements is needed. The lower-level register descriptions must be created or obtained. If there are no third-party models for registers, you must write the behavioral description of the registers yourself. The behavioral description can be provided in the form of subprograms (functions or procedures).
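Since registers are not built into the language, a behavioral register description has to be written by hand. A minimal sketch, here written as a clocked process rather than a subprogram, is shown below. The entity name reg8, the 8-bit width and the active-low asynchronous reset are assumptions chosen to match the reset style used in the controller code.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical 8-bit register with active-low asynchronous reset.
entity reg8 is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end reg8;

architecture behavioral of reg8 is
begin
  -- The process only reacts to clk and rstn, so q holds its value
  -- between clock edges; this is what makes it a register.
  process(clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');
    elsif rising_edge(clk) then
      q <= d;
    end if;
  end process;
end behavioral;
```

A dataflow description would then instantiate such registers and describe the combinational logic between them with concurrent signal assignments.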

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time. The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns). If no actual delay is used, the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications. The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

- The process is the basic simulation concept in VHDL.
- A VHDL description can always be broken up into interconnected processes.
- A process is quite similar to a Unix process.
- Processes are written with the process keyword.
- A process statement is a concurrent statement.
- Statements inside a process statement are sequential statements.
- A process must contain either a sensitivity list or wait statement(s), but NOT both.
- The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
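The two allowed forms of the general format can be compared side by side. The sketch below describes the same AND function twice, once with a sensitivity list and once with a wait statement. The entity name and the port names are hypothetical.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity and_gate_processes is
  port (
    a, b   : in  std_logic;
    y_list : out std_logic;  -- driven by the process with a sensitivity list
    y_wait : out std_logic   -- driven by the process with a wait statement
  );
end and_gate_processes;

architecture behavioral of and_gate_processes is
begin
  -- Form 1: sensitivity list. The process runs whenever a or b changes.
  with_list : process(a, b)
  begin
    y_list <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list. The wait statement suspends the
  -- process until a or b changes, then the body runs again.
  with_wait : process
  begin
    y_wait <= a and b;
    wait on a, b;
  end process with_wait;
end behavioral;
```

Both processes are concurrent with each other, while the statements inside each one execute sequentially.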

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
-- numeric_std is only needed for the commented-out to_unsigned conversion below
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0)
  );
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic
    );
  end component;

  component sync_unit
    port (
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  -- horizontal sync pulse (row)
      vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer
    );
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic
    );
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the 50 MHz board clock by two to obtain the pixel clock.
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer
  );
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks from 0 to 793 within each line
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical Enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse (vertical sync, front porch, back porch)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en delayed by one clock, used to detect the start of the active region
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- rising edge of v_en: restart the memory address at the top of the frame
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- inside the picture: read memory and drive all three colors
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- wrap the address after the last stored pixel
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field programmable gate array in 1985, the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and in the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacture their final version so it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:
- IC costs are rising aggressively
- ASIC complexity has bolstered development time and costs
- R&D resources and headcount are decreasing
- Revenue losses for slow time-to-market are increasing
- Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have been historically used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical to implement within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs support partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

123 Applications

Applications of FPGAs include digital signal processing software-defined

radio aerospace and defense systems ASIC prototyping medical imaging

computer vision speech recognition cryptography bioinformatics computer

hardware emulation radio astronomy and a growing range of other areas

FPGAs originally began as competitors to CPLDs and competed in a similar

space that of glue logic for PCBs As their size capabilities and speed

increased they began to take over larger and larger functions to the state where

some are now marketed as full systems on chips (SoC) Particularly with the

introduction of dedicated multipliers into FPGA architectures in the late 1990s

applications which had traditionally been the sole reserve of DSPs began to

incorporate FPGAs instead

FPGAs find applications especially in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4-8
hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware cost per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.
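The cost argument can be made concrete with a small break-even calculation.
The sketch below is illustrative only: the NRE and per-unit figures are
hypothetical placeholders rather than vendor data, and the helper names
`total_cost` and `break_even_volume` are introduced here for the example.

```python
import math


def total_cost(nre: float, unit_cost: float, volume: int) -> float:
    """Total cost of a production run: one-off NRE plus per-unit cost."""
    return nre + unit_cost * volume


def break_even_volume(asic_nre: float, asic_unit: float,
                      fpga_nre: float, fpga_unit: float) -> int:
    """Smallest volume at which the ASIC is no more expensive than the FPGA.

    Assumes the ASIC has the higher NRE and the lower unit cost.
    """
    if asic_unit >= fpga_unit or asic_nre <= fpga_nre:
        raise ValueError("expected ASIC: higher NRE, lower unit cost")
    return math.ceil((asic_nre - fpga_nre) / (fpga_unit - asic_unit))


# Hypothetical figures, chosen only to show the shape of the trade-off.
ASIC_NRE, ASIC_UNIT = 1_000_000.0, 5.0
FPGA_NRE, FPGA_UNIT = 50_000.0, 30.0

n = break_even_volume(ASIC_NRE, ASIC_UNIT, FPGA_NRE, FPGA_UNIT)
print("break-even volume:", n)
print("FPGA cheaper at 1,000 units:",
      total_cost(FPGA_NRE, FPGA_UNIT, 1_000)
      < total_cost(ASIC_NRE, ASIC_UNIT, 1_000))
```

Below the break-even volume the FPGA's higher unit price is outweighed by the
ASIC's development cost; above it the ASIC wins. Lower FPGA unit prices or
higher ASIC NRE push the break-even point towards larger volumes.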

Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.
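Rent's rule is the empirical relation T = t * g^p between the number of
external connections T of a region of logic and the number of blocks g it
contains, with t the average number of terminals per block and p the Rent
exponent. The sketch below only evaluates the formula; the parameter values
are hypothetical, and how a manufacturer turns such an estimate into a
channel width is not described in the text.

```python
def rent_terminals(blocks: int, t: float, p: float) -> float:
    """Rent's rule: T = t * g**p for a region containing g logic blocks."""
    if blocks < 1:
        raise ValueError("blocks must be >= 1")
    return t * blocks ** p


# Hypothetical parameters: t = 4 terminals per block, Rent exponent p = 0.5.
for g in (1, 16, 64, 256):
    print(g, rent_terminals(g, t=4.0, p=0.5))
```

With an exponent below 1, the number of external connections grows more
slowly than the amount of logic, which is why a fixed number of routing
tracks can serve a wide range of designs.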

The FPGA is an array or island-style FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or
the width of one column, as shown below. All the routing channels have the
same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

Typical logic block

There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are managed separately.
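The behaviour of such a logic block can be sketched in a few lines of Python.
This is a software model for illustration, not vendor code: the class name
`LogicBlock`, the helper `table_for`, and the bit ordering of the truth table
are assumptions made for the example.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by a D flip-flop."""

    def __init__(self, truth_table: int, registered: bool = False):
        if not 0 <= truth_table < 1 << 16:
            raise ValueError("a 4-input LUT holds exactly 16 configuration bits")
        self.truth_table = truth_table  # bit i is the output for input pattern i
        self.registered = registered    # selects which value drives the output
        self.ff = 0                     # flip-flop state

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Look up the configured output for one input pattern."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Rising clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """Single block output: registered or unregistered LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


def table_for(func) -> int:
    """Build the 16-bit LUT contents for any 4-input Boolean function."""
    bits = 0
    for i in range(16):
        a, b, c, d = i & 1, (i >> 1) & 1, (i >> 2) & 1, (i >> 3) & 1
        if func(a, b, c, d):
            bits |= 1 << i
    return bits


AND4 = table_for(lambda a, b, c, d: a & b & c & d)
XOR4 = table_for(lambda a, b, c, d: a ^ b ^ c ^ d)

comb = LogicBlock(XOR4)                  # unregistered output
reg = LogicBlock(AND4, registered=True)  # output taken from the flip-flop
reg.clock(1, 1, 1, 1)
print(hex(AND4), hex(XOR4), comb.output(1, 0, 0, 0), reg.output(0, 0, 0, 0))
```

The same 16 configuration bits implement any function of four inputs, which
is why a LUT can stand in for an AND gate, an XOR gate, or anything more
complex without changing the hardware.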

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect only
to other wires in track number 2, and so on. The figure below illustrates the
connections in a switch box.
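The planar topology is easy to enumerate in software. The sketch below is a
model for illustration only; the side names, the `(side, track)` wire
representation, and the channel width of 4 are assumptions made for the
example.

```python
SIDES = ("north", "east", "south", "west")


def planar_switches(channel_width: int) -> set:
    """Return the programmable switches of a planar switch box.

    A wire end is identified as (side, track). Each switch is stored as a
    frozenset of the two wire ends it can join; only ends on the same track
    are ever paired.
    """
    switches = set()
    for track in range(channel_width):
        for i, side_a in enumerate(SIDES):
            for side_b in SIDES[i + 1:]:
                switches.add(frozenset({(side_a, track), (side_b, track)}))
    return switches


def neighbours(wire, switches) -> list:
    """Wire ends reachable from `wire` through exactly one switch."""
    return sorted(next(iter(s - {wire})) for s in switches if wire in s)


W = 4  # hypothetical channel width
box = planar_switches(W)
print(len(box))
print(neighbours(("west", 1), box))
```

Each wire end has exactly three neighbours, all on its own track, so a signal
that enters the routing fabric on one track stays on that track as it passes
through successive switch boxes.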

Switch box topology

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
include multipliers, generic DSP blocks, embedded processors, high-speed I/O
logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware
description language (HDL) or as a schematic. The HDL form might be easier to
work with when handling large structures, because it is possible to specify
them numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to the actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, produces a configuration file. This file is then transferred
to the FPGA/CPLD via a serial interface (JTAG) or to an external memory
device such as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices can
be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased even in plastic packages.
Some, but not all, flash devices can be in-system programmed. Usually, a
flash cell is smaller than an equivalent EEPROM cell and is therefore less
expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has
become mainstream, and prototyping SoC and ASIC designs on FPGAs has become a
good method for doing this.

Reasons why prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct, compared with designers relying only on software
simulation to verify that their hardware design is sound. Simulation speed
and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows
much more time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear
due to today's array of operating systems, applications, and hardware, to be
found early.

3. Prototyping also allows the developer to ensure that all IP technologies
on the system work well together, off the simulation stage and in actual
form.

4. Prototypes have the added advantage of serving as demo platforms for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL
Fig. VGA CONTROLLER

GATE LEVEL
Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated
picture (pixelated) from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in
order to correctly display the picture on a connected monitor.



Some entities can have, e.g., a gate-level architecture and a behavioral
architecture

Are always optional

Packages

Packages contain information common to many design units

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among
several design units

A package consists of a declaration part and an optional body part

Package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

Package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

Collection of VHDL design units (database)

1. Packages
- package declaration
- package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in the UNIX file system

Can also be any other kind of database

Levels of Abstraction

VHDL supports many possible styles of design description, which differ
primarily in how closely they relate to the hardware

It is possible to describe a circuit in a number of ways, listed here in
order of increasing level of abstraction:

Structural
Dataflow
Behavioral

Structural VHDL description

Circuit is described in terms of its components

From a low-level description (e.g., transistor-level description) to a
high-level description (e.g., block diagram)

For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system

In the dataflow style, you describe how information flows between registers
in the system

The combinational logic is described at a relatively high level; the
placement and operation of registers is specified quite precisely

The behavior of the system over time is defined by registers

There are no built-in registers in the VHDL language
- either a lower-level description
- or a behavioral description of sequential elements is needed

The lower-level register descriptions must be created or obtained

If there are no third-party models for registers, you must write the
behavioral description of the registers

The behavioral description can be provided in the form of subprograms
(functions or procedures)

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time

Representation might include, e.g., state diagrams, timing diagrams, and
algorithmic descriptions

The concept of time may be expressed precisely using delays (e.g., A <= B
after 10 ns)

If no actual delay is used, the order of sequential operations is defined

In the lower levels of abstraction (e.g., RTL), synthesis tools ignore
detailed timing specifications

The actual timing results depend on the implementation technology and the
efficiency of the synthesis tool

There are a few tools for behavioral synthesis

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL

A VHDL description can always be broken up into interconnected processes

Quite similar to a Unix process

Process keyword in VHDL

A process statement is a concurrent statement

Statements inside a process statement are sequential statements

A process must contain either a sensitivity list or wait statement(s), but
NOT both

The sensitivity list or wait statement(s) contain the signals which wake the
process up

General format

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

    component memory
        port (
            addr  : IN  std_logic_VECTOR(17 downto 0);
            clk   : IN  std_logic;
            dout  : OUT std_logic_VECTOR(0 downto 0);
            en    : IN  std_logic;
            sinit : IN  std_logic);
    END component;

    component sync_unit port(
        clk        : in    std_logic;
        rstn       : in    std_logic;
        hsync      : out   std_logic;  -- horizontal sync pulse (row)
        vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
        v_en_reg   : out   std_logic;
        h_en       : out   std_logic;
        v_en       : out   std_logic;
        line_count : inout integer;
        c_hs       : inout integer);
    end component;

    component pixelgen port(
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : in  std_logic;
        vsync      : in  std_logic;
        h_en       : in  std_logic;
        v_en       : in  std_logic;
        v_en_reg   : in  std_logic;
        data_in    : in  std_logic_vector(0 downto 0);
        line_count : in  integer;
        c_hs       : in  integer;
        red        : out std_logic_vector(0 downto 0);
        green      : out std_logic_vector(0 downto 0);
        blue       : out std_logic_vector(0 downto 0);
        addr       : out std_logic_vector(17 downto 0);
        rd         : out std_logic);
    end component;

    signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
    signal line_count_s, c_hs_s, addr : integer;
    signal data_s : std_logic_vector(0 downto 0);
    signal addr_s : std_logic_vector(17 downto 0);

begin

    --addr_s <= std_logic_vector(to_unsigned(addr, 18));

    -- Divide the 50 MHz input clock by two to obtain the pixel clock.
    process(clk50m, rstn)
    begin
        if (rstn = '0') then
            clk <= '0';
        elsif (clk50m = '1' and clk50m'event) then
            clk <= not clk;
        end if;
        --end if;
    end process;

    hsync <= hsync_s;
    vsync <= vsync_s;

    sync : sync_unit port map(
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
    );

    pixel : pixelgen port map(
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
    );

    mem12 : memory port map(
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
    );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
    signal h_en_s, v_en_s : std_logic;
    signal line_count_s, c_hs_s : integer;
begin

    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- hsync counter
    process(clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        else
            if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                    c_hs_s <= 0;
                else
                    c_hs_s <= c_hs_s + 1;
                end if;
            end if;
        end if;
    end process;

    --vsync counter
    --vertical line counter
    process(h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            if (line_count_s = 515) then
                line_count_s <= 0;
            end if;
        end if;
        --end if;
    end process;

    -- hsync pulse
    process(clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
        --end if;
    end process;

    -- Horizontal Enable (row information will be sent in this period)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
        --end if;
    end process;

    -- vsync pulse
    -- Vertical Enable (data is valid in this region)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35 and line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
        --end if;
    end process;

    -- Vertical Sync, Front Porch, Back Porch
    process(clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
        --end if;
    end process;

    process(clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;
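The counter limits in the synchronization block fix the video timing. The
Python sketch below derives the figures from the constants in the listing. It
assumes that the input clock is 50 MHz (suggested by the port name clk50m but
not stated elsewhere) and that the comparisons read `<=`, `>=` and `<` where
the extracted text shows `lt=`, `gt=` and `lt`. For comparison, the widely
used 640x480 at 60 Hz mode has 800 clocks per line and 525 lines per frame at
a 25.175 MHz pixel clock.

```python
PIXEL_CLOCK_HZ = 50_000_000 // 2   # clk50m divided by two in vga_controller

H_COUNT_MAX = 793   # c_hs_s wraps to 0 after reaching this value
V_COUNT_MAX = 515   # line_count_s wraps to 0 after reaching this value

clocks_per_line = H_COUNT_MAX + 1
lines_per_frame = V_COUNT_MAX + 1

# Count the counter values for which each comparison in the listing is true.
hsync_clocks = sum(1 for c in range(clocks_per_line) if c <= 95)
h_enable_clocks = sum(1 for c in range(clocks_per_line) if 143 <= c <= 778)
vsync_lines = sum(1 for n in range(lines_per_frame) if n < 2)
v_enable_lines = sum(1 for n in range(lines_per_frame) if 35 <= n <= 515)

line_rate_hz = PIXEL_CLOCK_HZ / clocks_per_line
frame_rate_hz = line_rate_hz / lines_per_frame

print("clocks per line :", clocks_per_line)
print("lines per frame :", lines_per_frame)
print("hsync low       :", hsync_clocks, "clocks")
print("h_en high       :", h_enable_clocks, "clocks")
print("vsync low       :", vsync_lines, "lines")
print("v_en high       :", v_enable_lines, "lines")
print("line rate       : %.1f Hz" % line_rate_hz)
print("frame rate      : %.2f Hz" % frame_rate_hz)
```

Under these assumptions the design refreshes at roughly 61 Hz, close enough
to the nominal 60 Hz for most monitors to lock onto the signal.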


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin

    addr <= addr_s;

    process(clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- Restart the frame address at the start of the active region.
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- Wrap the address after the last pixel in memory.
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
        -- end if;
        --end if;
    end process;

end behv;
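The wrap-around constant in the pixel generator determines how many 1-bit
pixels are read from memory per frame. The Python sketch below decodes it.
The constant is taken from the listing; the observation that the result
equals 640 x 384 is arithmetic only, since the intended image size is not
stated in the text.

```python
ADDR_BITS = 18                              # width of addr_s in the listing
WRAP_VALUE = int("111011111111111111", 2)   # last address before wrapping to 0

frame_pixels = WRAP_VALUE + 1               # addresses 0 .. WRAP_VALUE inclusive
frame_bytes = frame_pixels // 8             # one bit per pixel

print("last address   :", WRAP_VALUE)
print("pixels/frame   :", frame_pixels)
print("fits in 18 bits:", frame_pixels <= 2 ** ADDR_BITS)
print("equals 640x384 :", frame_pixels == 640 * 384)
print("memory needed  :", frame_bytes, "bytes")
```

Because the wrap assignment comes after the increment inside the same
process, it takes precedence on the clock edge where both apply, so the
address sequence runs from 0 to the wrap value and then restarts at 0.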

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be
configured by the customer or designer after manufacturing, hence the name
"field-programmable". FPGAs are programmed using a logic circuit diagram or
source code in a hardware description language (HDL) to specify how the chip
will work. They can be used to implement any logical function that an
application-specific integrated circuit (ASIC) could perform, but the ability
to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks can
be configured to perform complex combinational functions or merely simple
logic gates like AND and XOR. In most FPGAs, the logic blocks also include
memory elements, which may be simple flip-flops or more complete blocks of
memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first
commercially viable field-programmable gate array in 1985: the XC2064. The
XC2064 had programmable gates and programmable interconnects between gates,
the beginnings of a new technology and market. The XC2064 boasted a mere 64
configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More
than 20 years later, Freeman was entered into the National Inventors Hall of
Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s,
when competitors sprouted up, eroding significant market share. By 1993,
Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in
volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs had found
their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a
step further by combining the logic blocks and interconnects of traditional
FPGAs with embedded microprocessors and related peripherals to form a
complete "system on a programmable chip". This work mirrors the architecture
by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which
combined a reconfigurable CPU architecture on a single chip called the SB24.
That work was done in 1982. Examples of such hybrid technologies can be found
in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more
PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC
is another such device, which uses an AVR processor in combination with
Atmel's programmable logic architecture.

An alternative to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to suit
the task at hand. The Mitrion Virtual Processor from Mitrionics is an example
of a reconfigurable soft processor implemented on FPGAs. However, it does not
support dynamic reconfiguration at runtime, but instead adapts itself to a
specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers have largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can
also take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

- IC costs are rising aggressively
- ASIC complexity has bolstered development time and costs
- R&D resources and headcount are decreasing
- Revenue losses for slow time-to-market are increasing
- Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have historically been used for, to
which the company attributes the growing number of FPGA design starts (see
History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has
a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
to implement within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most

FPGAs of higher-level embedded functions (such as adders and multipliers)

and embedded memories as well as to have logic blocks implement decoders

or mathematical functions

Some FPGAs have the capability of partial re-configuration that lets one

portion of the device be re-programmed while other portions continue running

123 Applications

Applications of FPGAs include digital signal processing software-defined

radio aerospace and defense systems ASIC prototyping medical imaging

computer vision speech recognition cryptography bioinformatics computer

hardware emulation radio astronomy and a growing range of other areas

FPGAs originally began as competitors to CPLDs and competed in a similar

space that of glue logic for PCBs As their size capabilities and speed

increased they began to take over larger and larger functions to the state where

some are now marketed as full systems on chips (SoC) Particularly with the

introduction of dedicated multipliers into FPGA architectures in the late 1990s

applications which had traditionally been the sole reserve of DSPs began to

incorporate FPGAs instead

FPGAs especially find applications in any area or algorithm that can make use

of the massive parallelism offered by their architecture One such area is code

breaking in particular brute-force attack of cryptographic algorithms

FPGAs are increasingly used in conventional high performance computing

applications where computational kernels such as FFT or Convolution are

performed on the FPGA instead of a microprocessor

57

The inherent parallelism of the logic resources on an FPGA allows for

considerable computational throughput even at a low MHz clock rates The

flexibility of the FPGA allows for even higher performance by trading off

precision and range in the number format for an increased number of parallel

arithmetic units This has driven a new type of processing called

reconfigurable computing where time intensive tasks are offloaded from

software to FPGAs

The adoption of FPGAs in high performance computing is currently limited by

the complexity of FPGA design compared to conventional software and the

extremely long turn-around times of current design tools where 4-8 hours wait

is necessary after even minor changes to the source code

Traditionally FPGAs have been reserved for specific vertical applications

where the volume of production is small For these low-volume applications

the premium that companies pay in hardware costs per unit for a

programmable chip is more affordable than the development resources spent

on creating an ASIC for a low-volume application Today new cost and

performance dynamics have broadened the range of viable applications

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA considered here is an array-based, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they are managed separately from other signals.
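The logic block described above (a 4-input LUT feeding a flip-flop, with a single output that is either registered or unregistered) can be sketched in VHDL as follows. This is only an illustrative model, not vendor code: the entity name logic_block and the generics INIT and REGISTERED are assumptions introduced here to stand in for the block's configuration bits.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: 4-input LUT plus flip-flop.
-- INIT and REGISTERED play the role of configuration bits (illustrative names).
entity logic_block is
  generic (
    INIT       : std_logic_vector(15 downto 0) := x"8000";  -- truth table; x"8000" is a 4-input AND
    REGISTERED : boolean := false                           -- true: flip-flop output, false: direct LUT output
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    y   : out std_logic                      -- the single block output
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT behaves as a 16 x 1 memory addressed by the four inputs.
  lut_out <= INIT(to_integer(unsigned(a)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output selection is fixed when the block is configured.
  y <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

Changing INIT changes the implemented function without touching the structure, which is the sense in which a LUT-based block is programmable.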

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right of and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) description or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user validates the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, produces a configuration file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog. In an attempt to reduce the complexity of designing in HDLs, which have been compared to assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
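As a minimal sketch of the first simulation stage (RTL simulation with a test bench), the example below pairs a small design with a self-checking test bench. The design is a divide-by-two clock stage with an active-low reset; the names clk_div2 and clk_div2_tb, the 20 ns clock period, and the 1 ns settling delay are assumptions chosen for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Design under test: divide-by-two clock stage with active-low reset
-- (illustrative; entity and port names are assumptions).
entity clk_div2 is
  port (
    clk_in  : in  std_logic;
    rstn    : in  std_logic;
    clk_out : out std_logic
  );
end entity clk_div2;

architecture rtl of clk_div2 is
  signal q : std_logic := '0';
begin
  process (clk_in, rstn)
  begin
    if rstn = '0' then
      q <= '0';
    elsif rising_edge(clk_in) then
      q <= not q;
    end if;
  end process;
  clk_out <= q;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Self-checking test bench: generates clock and reset, then checks the output.
entity clk_div2_tb is
end entity clk_div2_tb;

architecture sim of clk_div2_tb is
  constant PERIOD : time := 20 ns;  -- assumed 50 MHz input clock
  signal clk_in  : std_logic := '0';
  signal rstn    : std_logic := '0';
  signal clk_out : std_logic;
  signal done    : boolean := false;
begin
  dut : entity work.clk_div2
    port map (clk_in => clk_in, rstn => rstn, clk_out => clk_out);

  -- Clock generator; stops once the stimulus process sets done.
  clock : process
  begin
    while not done loop
      clk_in <= '0'; wait for PERIOD / 2;
      clk_in <= '1'; wait for PERIOD / 2;
    end loop;
    wait;
  end process clock;

  stimulus : process
  begin
    wait for 2 * PERIOD;  -- hold reset for two clock periods
    assert clk_out = '0' report "output should be low during reset" severity error;
    rstn <= '1';
    wait until rising_edge(clk_in);
    wait for 1 ns;        -- let the output settle after the edge
    assert clk_out = '1' report "output should go high on first edge" severity error;
    wait until rising_edge(clk_in);
    wait for 1 ns;
    assert clk_out = '0' report "output should go low on second edge" severity error;
    report "clk_div2_tb finished" severity note;
    done <= true;
    wait;
  end process stimulus;
end architecture sim;
```

The same test bench can in principle be rerun against the post-synthesis and post-layout netlists, which is what makes the staged simulation described above practical.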

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder such verification.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and for testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies in the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL
Fig. VGA CONTROLLER

GATE LEVEL
Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated picture (pixelated) is taken from memory space, and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



Package body consists of:

- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries: collection of VHDL design units (database)

1. Packages
   - package declaration
   - package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

A library is usually a directory in a UNIX file system, but it can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

Structural
Dataflow
Behavioral

Structural VHDL description

The circuit is described in terms of its components.

The description can range from low level (e.g. transistor-level description) to high level (e.g. block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style, you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level, while the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:

- either a lower-level description
- or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers, you must write the behavioral description of the registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).
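Because VHDL has no built-in register, a register is usually written behaviorally as a clocked process. The following is one common way to do so, given here as a sketch; the entity name reg8, the 8-bit width, and the asynchronous active-low reset are assumptions made for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Behavioral description of an 8-bit register (illustrative names and width).
entity reg8 is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;                     -- asynchronous reset, active low
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end entity reg8;

architecture behavioral of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');    -- reset takes priority over the clock
    elsif rising_edge(clk) then
      q <= d;                  -- store the input on each rising clock edge
    end if;
  end process;
end architecture behavioral;
```

A dataflow description can then instantiate such registers and describe only the combinational logic between them.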

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

The process is the basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

A process is quite similar to a Unix process.

Process keyword in VHDL

A process statement is a concurrent statement.

Statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

    process [(sensitivity_list)]
        process_declarative_part
    begin
        process_statements
        [wait_statement]
    end process;
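To illustrate the two forms allowed by the general format, the sketch below describes the same AND function twice: once with a sensitivity list and once with an explicit wait statement. The entity name and the port names are hypothetical and exist only for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Two equivalent processes (illustrative names).
entity and_gate_demo is
  port (
    a, b   : in  std_logic;
    y_list : out std_logic;  -- driven by the sensitivity-list form
    y_wait : out std_logic   -- driven by the wait-statement form
  );
end entity and_gate_demo;

architecture behavioral of and_gate_demo is
begin
  -- Form 1: sensitivity list; the process wakes whenever a or b changes.
  with_list : process (a, b)
  begin
    y_list <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list; a wait statement at the end suspends
  -- the process until a or b changes.
  with_wait : process
  begin
    y_wait <= a and b;
    wait on a, b;
  end process with_wait;
end architecture behavioral;
```

Both processes run once at the start of simulation and then again on every change of a or b, so the two outputs always agree.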


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
    port (
        clk50m : in  std_logic;
        rstn   : in  std_logic;
        hsync  : out std_logic;
        vsync  : out std_logic;
        red    : out std_logic_vector(0 downto 0);
        green  : out std_logic_vector(0 downto 0);
        blue   : out std_logic_vector(0 downto 0)
    );
end vga_controller;

architecture Behavioral of vga_controller is

    component memory
        port (
            addr  : in  std_logic_vector(17 downto 0);
            clk   : in  std_logic;
            dout  : out std_logic_vector(0 downto 0);
            en    : in  std_logic;
            sinit : in  std_logic
        );
    end component;

    component sync_unit
        port (
            clk        : in    std_logic;
            rstn       : in    std_logic;
            hsync      : out   std_logic;  -- horizontal sync pulse (row)
            vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
            v_en_reg   : out   std_logic;
            h_en       : out   std_logic;
            v_en       : out   std_logic;
            line_count : inout integer;
            c_hs       : inout integer
        );
    end component;

    component pixelgen
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : in  std_logic;
            vsync      : in  std_logic;
            h_en       : in  std_logic;
            v_en       : in  std_logic;
            v_en_reg   : in  std_logic;
            data_in    : in  std_logic_vector(0 downto 0);
            line_count : in  integer;
            c_hs       : in  integer;
            red        : out std_logic_vector(0 downto 0);
            green      : out std_logic_vector(0 downto 0);
            blue       : out std_logic_vector(0 downto 0);
            addr       : out std_logic_vector(17 downto 0);
            rd         : out std_logic
        );
    end component;

    signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
    signal line_count_s, c_hs_s, addr : integer;
    signal data_s : std_logic_vector(0 downto 0);
    signal addr_s : std_logic_vector(17 downto 0);

begin

    --addr_s <= std_logic_vector(to_unsigned(addr, 18));

    -- Divide the input clock by two to obtain the pixel clock.
    process (clk50m, rstn)
    begin
        if (rstn = '0') then
            clk <= '0';
        elsif (clk50m = '1' and clk50m'event) then
            clk <= not clk;
        end if;
    end process;

    hsync <= hsync_s;
    vsync <= vsync_s;

    sync : sync_unit port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
    );

    pixel : pixelgen port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
    );

    mem12 : memory port map (
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
    );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
    port (
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : out std_logic;  -- horizontal sync pulse (row)
        vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
        v_en_reg   : out std_logic;
        h_en       : out std_logic;
        v_en       : out std_logic;
        line_count : out integer;
        c_hs       : out integer
    );
end sync_unit;

architecture arch_sync_unit of sync_unit is
    signal h_en_s, v_en_s : std_logic;
    signal line_count_s, c_hs_s : integer;
begin

    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- hsync counter: counts pixel clocks within one line (0 to 793)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        else
            if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                    c_hs_s <= 0;
                else
                    c_hs_s <= c_hs_s + 1;
                end if;
            end if;
        end if;
    end process;

    -- vsync counter: vertical line counter, advanced on each rising edge of h_en_s
    process (h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            if (line_count_s = 515) then
                line_count_s <= 0;
            end if;
        end if;
    end process;

    -- hsync pulse
    process (clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
    end process;

    -- Horizontal enable (row information will be sent in this period)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
    end process;

    -- Vertical enable (data is valid in this region)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35 and line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
    end process;

    -- vsync pulse (vertical sync, front porch, back porch)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
    end process;

    -- Registered copy of the vertical enable, delayed by one clock
    process (clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;
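The counter limits in the synchronization block fix the line and frame timing. As a rough worked example, assume that clk50m is a 50 MHz board clock (an assumption based only on the port name; the frequency is not stated in the code). The controller divides this clock by two, the horizontal counter runs from 0 to 793 (794 clocks per line), hsync is low for counts 0 to 95 (96 clocks), and the line counter runs from 0 to 515 (516 lines per frame):

```latex
\begin{align*}
f_{\text{clk}}   &= \frac{50\ \text{MHz}}{2} = 25\ \text{MHz} \\
T_{\text{line}}  &= \frac{794}{25\ \text{MHz}} = 31.76\ \mu\text{s} \\
T_{\text{hsync}} &= \frac{96}{25\ \text{MHz}} = 3.84\ \mu\text{s} \\
f_{\text{frame}} &= \frac{25\ \text{MHz}}{794 \times 516} \approx 61.0\ \text{Hz}
\end{align*}
```

For comparison, the commonly quoted 640x480 at 60 Hz timing uses 800 pixel clocks per line and 525 lines per frame at 25.175 MHz, so under the stated assumption this design runs slightly faster than the nominal refresh rate.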

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
    port (
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : in  std_logic;
        vsync      : in  std_logic;
        h_en       : in  std_logic;
        v_en       : in  std_logic;
        v_en_reg   : in  std_logic;
        data_in    : in  std_logic_vector(0 downto 0);
        line_count : in  integer;
        c_hs       : in  integer;
        red        : out std_logic_vector(0 downto 0);
        green      : out std_logic_vector(0 downto 0);
        blue       : out std_logic_vector(0 downto 0);
        addr       : out std_logic_vector(17 downto 0);
        rd         : out std_logic
    );
end pixelgen;

architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin

    addr <= addr_s;

    process (clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- Restart the memory address at the start of the vertical enable
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- Inside the picture: read memory and drive all three colours
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- Wrap the address at the end of the stored picture
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
    end process;

end behv;

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

- IC costs are rising aggressively
- ASIC complexity has increased development time and costs
- R&D resources and headcount are decreasing
- Revenue losses for slow time-to-market are increasing
- Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts.

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use

of the massive parallelism offered by their architecture One such area is code

breaking in particular brute-force attack of cryptographic algorithms

FPGAs are increasingly used in conventional high performance computing

applications where computational kernels such as FFT or Convolution are

performed on the FPGA instead of a microprocessor

57

The inherent parallelism of the logic resources on an FPGA allows for

considerable computational throughput even at a low MHz clock rates The

flexibility of the FPGA allows for even higher performance by trading off

precision and range in the number format for an increased number of parallel

arithmetic units This has driven a new type of processing called

reconfigurable computing where time intensive tasks are offloaded from

software to FPGAs

The adoption of FPGAs in high performance computing is currently limited by

the complexity of FPGA design compared to conventional software and the

extremely long turn-around times of current design tools where 4-8 hours wait

is necessary after even minor changes to the source code

Traditionally FPGAs have been reserved for specific vertical applications

where the volume of production is small For these low-volume applications

the premium that companies pay in hardware costs per unit for a

programmable chip is more affordable than the development resources spent

on creating an ASIC for a low-volume application Today new cost and

performance dynamics have broadened the range of viable applications

Architecture

The most common FPGA architecture consists of an array of configurable

logic blocks (CLBs) IO pads and routing channels Generally all the routing

channels have the same width (number of wires) Multiple IO pads may fit

into the height of one row or the width of one column in the array

An application circuit must be mapped into an FPGA with adequate resources

While the number of CLBs and IOs required is easily determined from the

design the number of routing tracks needed may vary considerably even

among designs with the same amount of logic (For example a crossbar switch

requires much more routing than a systolic array with the same gate count)

Since unused routing tracks increase the cost (and decrease the performance)

of the part without providing any benefit FPGA manufacturers try to provide

58

just enough tracks so that most designs that will fit in terms of LUTs and IOs

can be routed This is determined by estimates such as those derived from

Rents rule or by experiments with existing designs

The FPGA is an array or island-style FPGA It consists of an array of logic

blocks and routing channels Two IO pads fit into the height of one row or the

width of one column as shown below All the routing channels have the same

width (number of wires)

121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a

flip-flop as shown below In recent years manufacturers have started moving

to 6-input LUTs in their high performance parts claiming increased

performance

Typical logic block

59

There is only one output which can be either the registered or the unregistered

LUT output The logic block has four inputs for the LUT and a clock input

Since clock signals (and often other high-fanout signals) are normally routed

via special-purpose dedicated routing networks in commercial FPGAs they

and other signals are separately managed

For this example architecture the locations of the FPGA logic block pins are

shown below

Logic Block Pin Locations

Each input is accessible from one side of the logic block while the output pin

can connect to routing wires in both the channel to the right and the channel

below the logic block

Each logic block output pin can connect to any of the wiring segments in the

channels adjacent to it

Similarly an IO pad can connect to any one of the wiring segments in the

channel adjacent to it For example an IO pad at the top of the chip can

connect to any of the W wires (where W is the channel width) in the horizontal

channel immediately below it

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In
this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in adjacent
channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high-speed I/O
logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows chip
companies to validate their design before the chip is produced in the factory,
reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures because it's possible to just specify them
numerically rather than having to draw every piece by hand. On the other hand,
schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction level
through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar license), and
other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
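The first stage of that flow, RTL simulation with a test bench, might look like the sketch below. It is a minimal illustration under stated assumptions: the device under test is a divide-by-two clock stage similar to the one in the controller code, and the names clk_div2 and tb_clk_div2, the 20 ns period, and the four-edge check are all hypothetical choices made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Device under test (assumed example): divide-by-two clock stage.
entity clk_div2 is
    port (
        clk_in  : in  std_logic;
        rstn    : in  std_logic;   -- active-low asynchronous reset
        clk_out : out std_logic
    );
end entity clk_div2;

architecture rtl of clk_div2 is
    signal q : std_logic := '0';
begin
    process (clk_in, rstn)
    begin
        if rstn = '0' then
            q <= '0';
        elsif rising_edge(clk_in) then
            q <= not q;
        end if;
    end process;
    clk_out <= q;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Self-checking test bench: drives the inputs and asserts on the output.
entity tb_clk_div2 is
end entity tb_clk_div2;

architecture sim of tb_clk_div2 is
    constant T : time := 20 ns;          -- assumed 50 MHz input clock
    signal clk_in  : std_logic := '0';
    signal rstn    : std_logic := '0';
    signal clk_out : std_logic;
    signal done    : boolean := false;
begin
    dut : entity work.clk_div2
        port map (clk_in => clk_in, rstn => rstn, clk_out => clk_out);

    -- Clock generator; stops once the stimulus process sets done.
    clock : process
    begin
        while not done loop
            clk_in <= '0'; wait for T / 2;
            clk_in <= '1'; wait for T / 2;
        end loop;
        wait;
    end process clock;

    stimulus : process
    begin
        rstn <= '0';
        wait for 2 * T;
        assert clk_out = '0' report "output not cleared by reset" severity error;
        rstn <= '1';
        -- The output should toggle on every rising edge of the input clock.
        for n in 1 to 4 loop
            wait until rising_edge(clk_in);
            wait for 1 ns;               -- let the output settle
            if n mod 2 = 1 then
                assert clk_out = '1' report "expected high after odd edge" severity error;
            else
                assert clk_out = '0' report "expected low after even edge" severity error;
            end if;
        end loop;
        done <= true;
        wait;
    end process stimulus;
end architecture sim;
```

The same test bench can in principle be rerun against the gate-level and back-annotated netlists, which is what makes the repeated simulation stages comparable.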

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verifying the hardware design alongside early software and firmware
co-design has become mainstream, and prototyping SoC and ASIC
designs on FPGAs has become a good method to do this.

Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is
functionally correct, compared to designers relying only on software
simulations to verify that their hardware design is sound. Simulation speed and
modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from
the foundry to start on software tests. FPGA prototyping allows for much more
time in the area of software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs that appear due
to today's array of operating systems, applications, and hardware to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the
system work well together off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients,
bringing in interest early. This speeds up the overall development cycle and
allows for more enhancement or improvement to the chip features than would
otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in order
to correctly display the picture on a connected monitor.



Circuit is described in terms of how data moves through the system.

In the dataflow style, you describe how information flows between registers in
the system.

The combinational logic is described at a relatively high level; the placement
and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:

- Either a lower-level description
- or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers, you must write the behavioral
description of registers.

The behavioral description can be provided in the form of
subprograms (functions or procedures).
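Since the language has no built-in register, a behavioral description has to be written by hand. The sketch below shows one common way to do it with a clocked process rather than a subprogram; the entity name reg8, the 8-bit width, and the active-low reset are assumptions chosen for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Behavioral description of an 8-bit register (assumed name and width).
entity reg8 is
    port (
        clk  : in  std_logic;
        rstn : in  std_logic;                      -- active-low asynchronous reset
        d    : in  std_logic_vector(7 downto 0);
        q    : out std_logic_vector(7 downto 0)
    );
end entity reg8;

architecture behavioral of reg8 is
begin
    process (clk, rstn)
    begin
        if rstn = '0' then
            q <= (others => '0');                  -- clear on reset
        elsif rising_edge(clk) then
            q <= d;                                -- store the input on each rising edge
        end if;
    end process;
end architecture behavioral;
```

A dataflow description can then instantiate this register wherever state is needed and describe only the combinational logic between instances.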

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time.

Representation might include, e.g., state diagrams, timing diagrams, and
algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B
after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed
timing specifications.

The actual timing results depend on the implementation technology and the
efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent Vs Sequential

Processes

Basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

Process keyword in VHDL.

A process statement is a concurrent statement.

Statements inside process statements are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

    process [(sensitivity_list)]
        process_declarative_part
    begin
        process_statements
        [wait_statement]
    end process;
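The two forms of the general format can be compared side by side. The sketch below describes the same AND function twice, once with a sensitivity list and once with an explicit wait statement; the entity and signal names are hypothetical and exist only for this comparison.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_two_ways is
    port (
        a, b   : in  std_logic;
        y_list : out std_logic;
        y_wait : out std_logic
    );
end entity and_two_ways;

architecture behavioral of and_two_ways is
begin
    -- Form 1: sensitivity list. The process wakes whenever a or b changes.
    with_list : process (a, b)
    begin
        y_list <= a and b;
    end process with_list;

    -- Form 2: no sensitivity list. The wait statement at the end suspends
    -- the process until a or b changes, which gives the same behavior.
    with_wait : process
    begin
        y_wait <= a and b;
        wait on a, b;
    end process with_wait;
end architecture behavioral;
```

Both processes are concurrent with each other, while the statements inside each one execute in order.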

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
    clk50m : in  std_logic;                      -- 50 MHz input clock
    rstn   : in  std_logic;                      -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
    port (
        addr  : IN  std_logic_VECTOR(17 downto 0);
        clk   : IN  std_logic;
        dout  : OUT std_logic_VECTOR(0 downto 0);
        en    : IN  std_logic;
        sinit : IN  std_logic);
END component;

component sync_unit port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;    -- horizontal sync pulse (row)
    vsync      : out std_logic;    -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;      -- mode matches the sync_unit entity
    c_hs       : out integer);
end component;

component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr, 18));

-- Divide the 50 MHz input clock by two to obtain the pixel clock.
process(clk50m, rstn)
begin
    if (rstn = '0') then
        clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
        clk <= not clk;
    end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
);

pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;    -- horizontal sync pulse (row)
    vsync      : out std_logic;    -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs       <= c_hs_s;
line_count <= line_count_s;
h_en       <= h_en_s;
v_en       <= v_en_s;

-- hsync counter: counts pixel clocks 0 to 793 within a line
process(clk, rstn)
begin
    if (rstn = '0') then
        c_hs_s <= 0;
    else
        if (clk = '1' and clk'event) then
            if (c_hs_s = 793) then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s + 1;
            end if;
        end if;
    end if;
end process;

-- vsync counter
-- vertical line counter: advances on each rising edge of h_en_s, 0 to 515
process(h_en_s, rstn)
begin
    if (rstn = '0') then
        line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515) then
            line_count_s <= 0;
        end if;
    end if;
end process;

-- hsync pulse
process(clk, rstn)
begin
    if (rstn = '0') then
        hsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (c_hs_s <= 95) then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
end process;

-- Horizontal enable (row information will be sent in this period)
process(clk, rstn)
begin
    if (rstn = '0') then
        h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
end process;

-- Vertical enable (data is valid in this region)
process(clk, rstn)
begin
    if (rstn = '0') then
        v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((line_count_s >= 35 and line_count_s <= 515)) then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
end process;

-- Vertical sync, front porch, back porch
process(clk, rstn)
begin
    if (rstn = '0') then
        vsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (line_count_s < 2) then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
end process;

-- One-clock delayed copy of the vertical enable
process(clk)
begin
    if (clk = '1' and clk'event) then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;
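The literal counts used in sync_unit can be collected in one place so that the derived totals are easy to read. The package below is a hypothetical helper, not part of the original design: every name in it is invented for illustration, and only the numeric values are taken from the code above.

```vhdl
-- Hypothetical package: names are illustrative, values are the literals
-- that appear in sync_unit.
package vga_timing_pkg is
    constant H_MAX          : integer := 793;  -- last pixel-clock count in a line
    constant H_SYNC_END     : integer := 95;   -- hsync is driven low while c_hs <= 95
    constant H_ACTIVE_FIRST : integer := 143;  -- first count with h_en asserted
    constant H_ACTIVE_LAST  : integer := 778;  -- last count with h_en asserted
    constant V_MAX          : integer := 515;  -- last line count in a frame
    constant V_SYNC_LINES   : integer := 2;    -- vsync is driven low while line_count < 2
    constant V_ACTIVE_FIRST : integer := 35;   -- first line with v_en asserted
    constant V_ACTIVE_LAST  : integer := 515;  -- last line with v_en asserted

    -- Derived totals
    constant H_TOTAL  : integer := H_MAX + 1;                           -- 794 clocks per line
    constant H_ACTIVE : integer := H_ACTIVE_LAST - H_ACTIVE_FIRST + 1;  -- 636 clocks
    constant V_TOTAL  : integer := V_MAX + 1;                           -- 516 lines per frame
    constant V_ACTIVE : integer := V_ACTIVE_LAST - V_ACTIVE_FIRST + 1;  -- 481 lines
end package vga_timing_pkg;
```

Assuming the pixel clock is the 25 MHz signal produced by halving clk50m in vga_controller, one frame takes 794 x 516 = 409,704 clocks, which corresponds to a refresh rate of roughly 61 Hz.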

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
    if (rstn = '0') then
        rd     <= '0';
        red    <= "0";
        green  <= "0";
        blue   <= "0";
        addr_s <= (others => '0');
    elsif (clk = '1' and clk'event) then
        -- Restart the read address when the vertical enable rises.
        if ((v_en = '1') and (v_en_reg = '0')) then
            addr_s <= (others => '0');
        end if;
        if (v_en = '1' and h_en = '1') then
            if (line_count <= 514) then
                if (c_hs <= 655) then
                    -- Inside the picture: show the memory bit on all three colors.
                    rd     <= '1';
                    red    <= data_in;
                    green  <= data_in;
                    blue   <= data_in;
                    addr_s <= addr_s + "000000000000000001";
                else
                    rd    <= '0';
                    red   <= "0";
                    green <= "1";
                    blue  <= "0";
                end if;
            else
                rd    <= '0';
                red   <= "1";
                green <= "1";
                blue  <= "1";
            end if;
            -- Wrap the address at the last stored pixel.
            if (addr_s = "111011111111111111") then
                addr_s <= (others => '0');
            end if;
        else
            -- Blanking interval: drive black.
            red   <= "0";
            green <= "0";
            blue  <= "0";
        end if;
    end if;
end process;

end behv;

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or a source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks", and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions, or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History: The FPGA industry sprouted from programmable read-only
memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs
both had the option of being programmed in batches in a factory or in the field
(field programmable); however, programmable logic was hard-wired between
logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985, the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication
and the volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at runtime, but instead adapts itself
to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers have largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, ability to re-program in the field
to fix bugs, and lower non-recurring engineering costs. Vendors can also take a
middle road by developing their hardware on ordinary FPGAs, but
manufacture their final version so it can no longer be modified after the design
has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

- IC costs are rising aggressively
- ASIC complexity has bolstered development time and costs
- R&D resources and headcount are decreasing
- Revenue losses for slow time-to-market are increasing
- Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have been historically used
for, to which the company attributes the growing number of FPGA design starts
(see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result of this is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio. The
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration that lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the state where
some are now marketed as full systems on chips (SoC). Particularly with the
introduction of dedicated multipliers into FPGA architectures in the late 1990s,
applications which had traditionally been the sole reserve of DSPs began to
incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is code
breaking, in particular brute-force attack of cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and the
extremely long turn-around times of current design tools, where a 4-8 hour wait
is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC for a low-volume application. Today, new cost and
performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and I/Os
can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA It consists of an array of logic

blocks and routing channels Two IO pads fit into the height of one row or the

width of one column as shown below All the routing channels have the same

width (number of wires)

121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a

flip-flop as shown below In recent years manufacturers have started moving

to 6-input LUTs in their high performance parts claiming increased

performance

Typical logic block

59

There is only one output which can be either the registered or the unregistered

LUT output The logic block has four inputs for the LUT and a clock input

Since clock signals (and often other high-fanout signals) are normally routed

via special-purpose dedicated routing networks in commercial FPGAs they

and other signals are separately managed

For this example architecture the locations of the FPGA logic block pins are

shown below

Logic Block Pin Locations

Each input is accessible from one side of the logic block while the output pin

can connect to routing wires in both the channel to the right and the channel

below the logic block

Each logic block output pin can connect to any of the wiring segments in the

channels adjacent to it

Similarly an IO pad can connect to any one of the wiring segments in the

channel adjacent to it For example an IO pad at the top of the chip can

connect to any of the W wires (where W is the channel width) in the horizontal

channel immediately below it

Generally the FPGA routing is unsegmented That is each wiring segment

spans only one logic block before it terminates in a switch box By turning on

60

some of the programmable switches within a switch box longer paths can be

constructed For higher speed interconnect some FPGA architectures use

longer routing lines that span multiple logic blocks

Whenever a vertical and a horizontal channel intersect there is a switch box In

this architecture when a wire enters a switch box there are three

programmable switches that allow it to connect to three other wires in adjacent

channel segments The pattern or topology of switches used in this

architecture is the planar or domain-based switch box topology In this switch

box topology a wire in track number one connects only to wires in track

number one in adjacent channel segments wires in track number 2 connect

only to other wires in track number 2 and so on The figure below illustrates

the connections in a switch box

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher

level functionality fixed into the silicon Having these common functions

embedded into the silicon reduces the area required and gives those functions

increased speed compared to building them from primitives Examples of these

include multipliers generic DSP blocks embedded processors high speed IO

logic and embedded memories

61

FPGAs are also widely used for systems validation including pre-silicon

validation post-silicon validation and firmware development This allows chip

companies to validate their design before the chip is produced in the factory

reducing the time to market

124 FPGA Design and Programming

To define the behavior of the FPGA the user provides a hardware description

language (HDL) or a schematic design The HDL form might be easier to work

with when handling large structures because its possible to just specify them

numerically rather than having to draw every piece by hand On the other hand

schematic entry can allow for easier visualisation of a design

Then using an electronic design automation tool a technology-mapped netlist

is generated The netlist can then be fitted to the actual FPGA architecture

using a process called place-and-route usually performed by the FPGA

companys proprietary place-and-route software The user will validate the

map place and route results via timing analysis simulation and other

verification methodologies Once the design and validation process is

complete the binary file generated (also using the FPGA companys

proprietary software) is used to (re)configure the FPGA

Going from schematicHDL source files to actual configuration The source

files are fed to a software suite from the FPGACPLD vendor that through

different steps will produce a file This file is then transferred to the

FPGACPLD via a serial interface (JTAG) or to an external memory device

like an EEPROM

The most common HDLs are VHDL and Verilog although in an attempt to

reduce the complexity of designing in HDLs which have been compared to the

equivalent of assembly languages there are moves to raise the abstraction level

through the introduction of alternative languages

62

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
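As a minimal sketch of the first simulation stage described above, the following self-checking VHDL test bench drives a small design under test and uses assert statements to observe the results. The `and2` entity and the `tb_and2` test bench are hypothetical names introduced only for illustration; they are not part of the VGA controller.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design under test: a 2-input AND gate.
entity and2 is
    port (
        a, b : in  std_logic;
        y    : out std_logic
    );
end entity and2;

architecture rtl of and2 is
begin
    y <= a and b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: has no ports, instantiates the DUT and checks all input pairs.
entity tb_and2 is
end entity tb_and2;

architecture sim of tb_and2 is
    signal a, b, y : std_logic := '0';
begin
    dut : entity work.and2
        port map (a => a, b => b, y => y);

    stimulus : process
    begin
        a <= '0'; b <= '0'; wait for 10 ns;
        assert y = '0' report "0 and 0 should give 0" severity error;
        a <= '0'; b <= '1'; wait for 10 ns;
        assert y = '0' report "0 and 1 should give 0" severity error;
        a <= '1'; b <= '0'; wait for 10 ns;
        assert y = '0' report "1 and 0 should give 0" severity error;
        a <= '1'; b <= '1'; wait for 10 ns;
        assert y = '1' report "1 and 1 should give 1" severity error;
        wait;  -- suspend forever so the simulation can stop
    end process stimulus;
end architecture sim;
```

The same test bench can in principle be reused against the gate-level and back-annotated netlists, since it only depends on the ports of the design under test.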

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on FPGAs for hardware verification and early software development.

Verifying the hardware design alongside early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on FPGAs has become a good method of doing this.

Reasons why prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. The alternative is for designers to rely only on software simulations to verify that their hardware design is sound, where simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Figure: Gate level, VGA CONTROLLER

Figure: Gate level, SYNC UNIT

Figure: Technology schematic

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, with the Controller implemented on an FPGA, a processor-generated picture (pixelated) is taken from memory space and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

Process keyword in VHDL

A process statement is a concurrent statement.

Statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format:

    process [(sensitivity_list)]
        process_declarative_part
    begin
        process_statements
        [wait_statement]
    end process;
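To illustrate the two forms allowed by the general format above, the sketch below describes the same D flip-flop twice: once with a sensitivity list and once with a wait statement. The entity name `dff_forms` and its ports are hypothetical and chosen only for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity dff_forms is
    port (
        clk  : in  std_logic;
        d    : in  std_logic;
        q_sl : out std_logic;  -- output of the sensitivity-list form
        q_wt : out std_logic   -- output of the wait-statement form
    );
end entity dff_forms;

architecture rtl of dff_forms is
begin
    -- Form 1: sensitivity list and no wait statement.
    -- The process wakes up on every event on clk.
    with_list : process (clk)
    begin
        if rising_edge(clk) then
            q_sl <= d;
        end if;
    end process with_list;

    -- Form 2: wait statement and no sensitivity list.
    -- The process suspends at the wait until the next rising edge.
    with_wait : process
    begin
        wait until rising_edge(clk);
        q_wt <= d;
    end process with_wait;
end architecture rtl;
```

Both processes run concurrently with each other, while the statements inside each one execute in sequence, which is the distinction the notes above draw.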

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- Only std_logic_1164 is needed at this level: the top level contains
-- no arithmetic, so no arithmetic packages are used here.

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
    port (
        clk50m : in  std_logic;                      -- 50 MHz board clock
        rstn   : in  std_logic;                      -- active-low reset
        hsync  : out std_logic;
        vsync  : out std_logic;
        red    : out std_logic_vector(0 downto 0);
        green  : out std_logic_vector(0 downto 0);
        blue   : out std_logic_vector(0 downto 0)
    );
end vga_controller;

architecture Behavioral of vga_controller is

    -- Picture memory: 18-bit address, one bit per pixel
    component memory
        port (
            addr  : in  std_logic_vector(17 downto 0);
            clk   : in  std_logic;
            dout  : out std_logic_vector(0 downto 0);
            en    : in  std_logic;
            sinit : in  std_logic
        );
    end component;

    -- Port modes must match the sync_unit entity declaration
    component sync_unit
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : out std_logic;   -- horizontal sync pulse (row)
            vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
            v_en_reg   : out std_logic;
            h_en       : out std_logic;
            v_en       : out std_logic;
            line_count : out integer;
            c_hs       : out integer
        );
    end component;

    component pixelgen
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : in  std_logic;
            vsync      : in  std_logic;
            h_en       : in  std_logic;
            v_en       : in  std_logic;
            v_en_reg   : in  std_logic;
            data_in    : in  std_logic_vector(0 downto 0);
            line_count : in  integer;
            c_hs       : in  integer;
            red        : out std_logic_vector(0 downto 0);
            green      : out std_logic_vector(0 downto 0);
            blue       : out std_logic_vector(0 downto 0);
            addr       : out std_logic_vector(17 downto 0);
            rd         : out std_logic
        );
    end component;

    signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
    signal line_count_s, c_hs_s, addr : integer;
    signal data_s : std_logic_vector(0 downto 0);
    signal addr_s : std_logic_vector(17 downto 0);

begin

    --addr_s <= std_logic_vector(to_unsigned(addr, 18));

    -- Divide the 50 MHz input clock by two to obtain the pixel clock
    process (clk50m, rstn)
    begin
        if rstn = '0' then
            clk <= '0';
        elsif clk50m = '1' and clk50m'event then
            clk <= not clk;
        end if;
    end process;

    hsync <= hsync_s;
    vsync <= vsync_s;

    sync : sync_unit port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
    );

    pixel : pixelgen port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
    );

    mem12 : memory port map (
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
    );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- The counters below are plain integers, so no arithmetic package is needed.

entity sync_unit is
    port (
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : out std_logic;   -- horizontal sync pulse (row)
        vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
        v_en_reg   : out std_logic;
        h_en       : out std_logic;
        v_en       : out std_logic;
        line_count : out integer;
        c_hs       : out integer
    );
end sync_unit;

architecture arch_sync_unit of sync_unit is

    signal h_en_s, v_en_s : std_logic;
    signal line_count_s, c_hs_s : integer;

begin

    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- hsync counter: counts pixel clocks 0 to 793 within a line
    process (clk, rstn)
    begin
        if rstn = '0' then
            c_hs_s <= 0;
        elsif clk = '1' and clk'event then
            if c_hs_s = 793 then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s + 1;
            end if;
        end if;
    end process;

    -- vsync counter (vertical line counter): advances once per line,
    -- on the rising edge of h_en_s, and wraps after line 515
    process (h_en_s, rstn)
    begin
        if rstn = '0' then
            line_count_s <= 0;
        elsif h_en_s = '1' and h_en_s'event then
            line_count_s <= line_count_s + 1;
            if line_count_s = 515 then
                line_count_s <= 0;
            end if;
        end if;
    end process;

    -- hsync pulse: low while the horizontal counter is 0 to 95
    process (clk, rstn)
    begin
        if rstn = '0' then
            hsync <= '1';
        elsif clk = '1' and clk'event then
            if c_hs_s <= 95 then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
    end process;

    -- Horizontal enable (row information is sent in this period)
    process (clk, rstn)
    begin
        if rstn = '0' then
            h_en_s <= '0';
        elsif clk = '1' and clk'event then
            if (c_hs_s >= 143) and (c_hs_s <= 778) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
    end process;

    -- Vertical enable (data is valid in this region)
    process (clk, rstn)
    begin
        if rstn = '0' then
            v_en_s <= '0';
        elsif clk = '1' and clk'event then
            if (line_count_s >= 35) and (line_count_s <= 515) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
    end process;

    -- Vertical sync, front porch, back porch:
    -- vsync is low during the first two lines of the frame
    process (clk, rstn)
    begin
        if rstn = '0' then
            vsync <= '1';
        elsif clk = '1' and clk'event then
            if line_count_s < 2 then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
    end process;

    -- v_en delayed by one clock, used downstream to detect the start of a frame
    process (clk)
    begin
        if clk = '1' and clk'event then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;   -- standard package used for the address increment

entity pixelgen is
    port (
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : in  std_logic;
        vsync      : in  std_logic;
        h_en       : in  std_logic;
        v_en       : in  std_logic;
        v_en_reg   : in  std_logic;
        data_in    : in  std_logic_vector(0 downto 0);
        line_count : in  integer;
        c_hs       : in  integer;
        red        : out std_logic_vector(0 downto 0);
        green      : out std_logic_vector(0 downto 0);
        blue       : out std_logic_vector(0 downto 0);
        addr       : out std_logic_vector(17 downto 0);
        rd         : out std_logic
    );
end pixelgen;

architecture behv of pixelgen is

    signal addr_s : std_logic_vector(17 downto 0);

begin

    addr <= addr_s;

    process (clk, rstn)
    begin
        if rstn = '0' then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= (others => '0');
        elsif clk = '1' and clk'event then
            -- Rising edge of v_en: a new frame starts, restart the address
            if (v_en = '1') and (v_en_reg = '0') then
                addr_s <= (others => '0');
            end if;
            if v_en = '1' and h_en = '1' then
                if line_count <= 514 then
                    if c_hs <= 655 then
                        -- Picture area: read the next pixel from memory
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= std_logic_vector(unsigned(addr_s) + 1);
                    else
                        -- Right of the picture: green
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    -- Below the picture: white
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- Wrap the address at the last memory location used
                if addr_s = "111011111111111111" then
                    addr_s <= (others => '0');
                end if;
            else
                -- Blanking interval: drive black
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
    end process;

end behv;
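The counter limits in the synchronization and pixel generator listings appear as bare numbers. As a sketch, assuming one wanted to name them, they could be collected in a package like the one below. The package name `vga_timing_pkg` and all constant names are hypothetical; the values are copied from the listings and are not taken from a VGA timing standard.

```vhdl
package vga_timing_pkg is
    -- Horizontal counter (c_hs) limits, in pixel clocks
    constant H_COUNT_MAX    : integer := 793;  -- counter wraps to 0 after this value
    constant H_SYNC_LAST    : integer := 95;   -- hsync is low while c_hs <= 95
    constant H_ENABLE_FIRST : integer := 143;  -- h_en is high from this count ...
    constant H_ENABLE_LAST  : integer := 778;  -- ... up to and including this count

    -- Vertical counter (line_count) limits, in lines
    constant V_COUNT_MAX    : integer := 515;  -- counter wraps to 0 after this value
    constant V_SYNC_LINES   : integer := 2;    -- vsync is low while line_count < 2
    constant V_ENABLE_FIRST : integer := 35;   -- v_en is high from this line ...
    constant V_ENABLE_LAST  : integer := 515;  -- ... up to and including this line

    -- Derived values
    constant H_TOTAL   : integer := H_COUNT_MAX + 1;                     -- 794 clocks per line
    constant H_ENABLED : integer := H_ENABLE_LAST - H_ENABLE_FIRST + 1;  -- 636 clocks with h_en high
end package vga_timing_pkg;
```

With `use work.vga_timing_pkg.all;` in the sync unit, a comparison such as `c_hs_s = 793` could then be written as `c_hs_s = H_COUNT_MAX`, which keeps the timing choices in one place.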

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

Many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor, implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has bolstered development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing number of higher-volume applications than they have historically been used for, and the company attributes the growing number of FPGA design starts to them (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as the ability to have logic blocks implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the state where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attack of cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software, and by the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA considered here is an array-style, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Figure: Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
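As a rough behavioural sketch of the logic block just described, the following VHDL models a 4-input LUT followed by a flip-flop, with a select input choosing between the registered and the unregistered LUT output. The entity name `logic_block`, the generic `INIT` that holds the truth table, and the `use_ff` select input are assumptions made for illustration; in a real FPGA the truth table and the output select are set by configuration bits rather than by a generic or a port.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity logic_block is
    generic (
        -- Truth table: bit i is the LUT output for input value i.
        -- The default x"8000" sets only bit 15, i.e. a 4-input AND.
        INIT : std_logic_vector(15 downto 0) := x"8000"
    );
    port (
        clk    : in  std_logic;
        lut_in : in  std_logic_vector(3 downto 0);
        use_ff : in  std_logic;  -- '1': registered output, '0': unregistered
        y      : out std_logic
    );
end entity logic_block;

architecture behavioural of logic_block is
    signal lut_out : std_logic;
    signal ff_q    : std_logic := '0';
begin
    -- The LUT acts as a 16-entry memory addressed by the four inputs.
    lut_out <= INIT(to_integer(unsigned(lut_in)));

    -- The flip-flop captures the LUT output on each rising clock edge.
    process (clk)
    begin
        if rising_edge(clk) then
            ff_q <= lut_out;
        end if;
    end process;

    -- Single output: either the registered or the unregistered LUT value.
    y <= ff_q when use_ff = '1' else lut_out;
end architecture behavioural;
```

Changing `INIT` changes the logic function without changing the structure, which is the sense in which a LUT is "programmable": for example, x"6996" would give a 4-input XOR under the same bit-ordering assumption.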

For this example architecture, the locations of the FPGA logic block pins are shown below.

Figure: Logic block pin locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Figure: Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.
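As an illustration of how a design might make use of such embedded functions, the sketch below describes a registered 18 x 18 signed multiplier in plain VHDL. Synthesis tools commonly map a description like this onto a dedicated multiplier or DSP block when the target device has one, but whether that happens depends on the device and on tool settings, so this is an assumption rather than a guarantee. The entity name `mult18` and the 18-bit operand width are chosen only for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity mult18 is
    port (
        clk : in  std_logic;
        a   : in  signed(17 downto 0);
        b   : in  signed(17 downto 0);
        p   : out signed(35 downto 0)   -- full-width product, registered
    );
end entity mult18;

architecture rtl of mult18 is
begin
    process (clk)
    begin
        if rising_edge(clk) then
            -- numeric_std "*" returns a result of a'length + b'length bits
            p <= a * b;
        end if;
    end process;
end architecture rtl;
```

If no embedded multiplier is available, the same description can still be built from ordinary logic blocks, at the cost of more area and lower speed, which is the trade-off described above.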

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

Page 41: vga

Process keyword in VHDL

A process statement is a concurrent statement.

Statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) name the signals that wake the process up.

General format:

    process [(sensitivity_list)]
        process_declarative_part
    begin
        process_statements
        [wait_statement]
    end process;
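The two forms of a process described above can be sketched side by side. This is a minimal illustration, not part of the VGA design: the entity name process_demo and its ports are hypothetical, chosen only to show that a sensitivity list and a trailing wait statement describe the same behaviour.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical example entity; none of these names come from the VGA design.
entity process_demo is
  port (
    a, b   : in  std_logic;
    y_list : out std_logic;  -- driven by the sensitivity-list process
    y_wait : out std_logic   -- driven by the wait-statement process
  );
end process_demo;

architecture demo of process_demo is
begin

  -- Form 1: sensitivity list. The process runs once at start-up and
  -- again whenever a or b changes.
  with_list : process (a, b)
  begin
    y_list <= a and b;
  end process with_list;

  -- Form 2: wait statement and no sensitivity list. The body runs once
  -- at start-up, then suspends until a or b changes.
  with_wait : process
  begin
    y_wait <= a and b;
    wait on a, b;
  end process with_wait;

end demo;
```

Both outputs should follow a and b identically; adding a sensitivity list to the second process would be rejected by the compiler, which is the "NOT both" rule in practice.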

11. VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0)
  );
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic
    );
  end component;

  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;    -- mode out, matching the sync_unit entity
      c_hs       : out integer     -- mode out, matching the sync_unit entity
    );
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic
    );
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Clock divider: clk toggles on every rising edge of clk50m,
  -- so it runs at half the input clock frequency.
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer
  );
end sync_unit;

architecture arch_sync_unit of sync_unit is

  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;

begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts clk cycles 0 .. 793 within one line
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    elsif (clk = '1' and clk'event) then
      if (c_hs_s = 793) then
        c_hs_s <= 0;
      else
        c_hs_s <= c_hs_s + 1;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s, 0 .. 515
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse
  -- Vertical Enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical Sync, Front Porch, Back Porch
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- One-clock delayed copy of v_en, used to detect its rising edge
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
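The numeric limits in the synchronization block are easier to audit when they are named. The package below is only a sketch: the package name and constant names are hypothetical, while the values are the literals that appear in the sync_unit listing.

```vhdl
-- Hypothetical package: the names are illustrative; the values are the
-- literals compared against c_hs_s and line_count_s in sync_unit.
package vga_timing_pkg is

  -- Horizontal timing, in clk cycles (c_hs_s counts 0 .. H_LAST)
  constant H_LAST         : integer := 793;  -- counter wraps after this value
  constant H_SYNC_LAST    : integer := 95;   -- hsync driven low for counts 0 .. 95
  constant H_ACTIVE_FIRST : integer := 143;  -- h_en driven high from this count
  constant H_ACTIVE_LAST  : integer := 778;  -- ... through this count

  -- Vertical timing, in lines (line_count_s counts 0 .. V_LAST)
  constant V_LAST         : integer := 515;  -- counter wraps after this value
  constant V_SYNC_LINES   : integer := 2;    -- vsync low while line_count_s < 2
  constant V_ACTIVE_FIRST : integer := 35;   -- v_en driven high from this line
  constant V_ACTIVE_LAST  : integer := 515;  -- ... through this line

  -- Derived totals
  constant H_TOTAL : integer := H_LAST + 1;  -- 794 clk cycles per line
  constant V_TOTAL : integer := V_LAST + 1;  -- 516 lines per frame

end package vga_timing_pkg;
```

With such a package made visible through a use clause, a comparison like c_hs_s = 793 could be written as c_hs_s = H_LAST. Note that hsync, h_en and v_en are registered, so each output changes one clk cycle after the counter value it is derived from.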

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is

  signal addr_s : std_logic_vector(17 downto 0);

begin

  addr <= addr_s;

  process (clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en (start of the active region): restart the address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Read the next pixel from memory and drive all three colours
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address at the end of the image memory
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Outside the visible region: blank the outputs
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;

12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be wired together, somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

Many modern FPGAs can be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: first commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has bolstered development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, and the company credits them with the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Fig. Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
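The logic block just described can be modelled behaviourally in a few lines. This is a simplified sketch rather than any vendor's actual primitive: the entity name logic_block, the generics LUT_INIT and REGISTERED, and the port names are all hypothetical, and the generics stand in for the configuration bits that would be loaded into a real device.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical behavioural model of a 4-input LUT followed by a flip-flop.
entity logic_block is
  generic (
    -- Truth table of the LUT: bit i is the output for input value i.
    -- x"8000" sets only bit 15, i.e. a 4-input AND (assumed example).
    LUT_INIT   : std_logic_vector(15 downto 0) := x"8000";
    -- Selects the registered (true) or unregistered (false) output.
    REGISTERED : boolean := true
  );
  port (
    clk    : in  std_logic;
    lut_in : in  std_logic_vector(3 downto 0);
    q      : out std_logic
  );
end logic_block;

architecture model of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin

  -- The LUT is a 16-entry table addressed by the four inputs.
  lut_out <= LUT_INIT(to_integer(unsigned(lut_in)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Single output: either the registered or the unregistered LUT value.
  q <= ff_out when REGISTERED else lut_out;

end model;
```

Changing LUT_INIT changes the logic function without touching the structure, which is the sense in which the block is programmable.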

For this example architecture, the locations of the FPGA logic block pins are shown below.

Fig. Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Fig. Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although, in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
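As an illustration of the RTL simulation step, the sketch below is a small self-checking test bench for the sync_unit entity listed in section 11. It is an assumption-laden example, not the test bench used for this project: the test bench name, the 40 ns clock period (a 25 MHz clock, assumed from halving the 50 MHz input), and the single check it performs are all choices made here for illustration.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity sync_unit_tb is
end sync_unit_tb;

architecture sim of sync_unit_tb is

  constant CLK_PERIOD : time := 40 ns;  -- assumed 25 MHz clock

  signal clk  : std_logic := '0';
  signal rstn : std_logic := '0';
  signal hsync, vsync, v_en_reg, h_en, v_en : std_logic;
  signal line_count, c_hs : integer;
  signal done : boolean := false;       -- stops the clock at the end

begin

  -- Device under test: the sync_unit entity from the listing in section 11.
  dut : entity work.sync_unit
    port map (
      clk => clk, rstn => rstn,
      hsync => hsync, vsync => vsync,
      v_en_reg => v_en_reg, h_en => h_en, v_en => v_en,
      line_count => line_count, c_hs => c_hs
    );

  -- Free-running clock until the stimulus process sets done.
  clock_gen : process
  begin
    while not done loop
      clk <= '0';
      wait for CLK_PERIOD / 2;
      clk <= '1';
      wait for CLK_PERIOD / 2;
    end loop;
    wait;
  end process clock_gen;

  stimulus : process
    variable t_start : time;
  begin
    -- Hold reset low for a few cycles, then release it.
    wait for 5 * CLK_PERIOD;
    rstn <= '1';

    -- Measure one full line: the counter runs 0 .. 793, so two successive
    -- arrivals at the same count should be 794 clock periods apart.
    wait until c_hs = 1;
    t_start := now;
    wait until c_hs = 1;
    assert now - t_start = 794 * CLK_PERIOD
      report "unexpected horizontal line period"
      severity error;

    report "sync_unit_tb finished" severity note;
    done <= true;
    wait;
  end process stimulus;

end sim;
```

The same test bench can be reused unchanged for the gate-level and back-annotated simulations, since it only touches the ports of the entity.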

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying the hardware design while co-designing early software and firmware has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why prototyping is important:

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. The alternative is for designers to rely only on software simulations to verify that their hardware design is sound, where simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This exposes many unforeseen software bugs that appear due to today's array of operating systems, applications, and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13. SIMULATION RESULTS

Fig. Gate level: VGA CONTROLLER

Fig. Gate level: SYNC UNIT

Fig. Technology schematic

14. APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, a processor-generated picture (pixelated) is taken from memory space, and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

Page 42: vga

11 VGA CONTROLLER CODE

library IEEE

use IEEESTD_LOGIC_1164ALL

use IEEESTD_LOGIC_ARITHALL

use IEEESTD_LOGIC_UNSIGNEDALL

use IEEEnumeric_stdALL

---- Uncomment the following library declaration if instantiating

---- any Xilinx primitives in this code

--library UNISIM

--use UNISIMVComponentsall

entity vga_controller is port(clk50min std_logic

rstnin std_logic

hsyncout std_logic

vsyncout std_logic

redout std_logic_vector(0 downto 0)

greenout std_logic_vector(0 downto 0)

blueout std_logic_vector(0 downto 0))

end vga_controller

architecture Behavioral of vga_controller is

component memory

port (

addr IN std_logic_VECTOR(17 downto 0)

clk IN std_logic

dout OUT std_logic_VECTOR(0 downto 0)

en IN std_logic

sinit IN std_logic)

END component

42

component sync_unit port(

clkin std_logic

rstnin std_logic

hsyncout std_logic

---horizontal synch pulse(row)

vsyncout std_logic

--vertical sync pulse(frame) 111110100000000000

v_en_regout std_logic

h_enout std_logic

v_enout std_logic

line_countinout integer

c_hsinout integer)

end component

component pixelgen port(

clkin std_logic

rstnin std_logic

hsyncin std_logic

vsyncin std_logic

h_enin std_logic

v_enin std_logic

v_en_reg in std_logic

data_inin std_logic_vector(0 downto 0)

line_countin integer

c_hsin integer

red out std_logic_vector(0 downto 0)

greenout std_logic_vector(0 downto 0)

blueout std_logic_vector(0 downto 0)

addrout std_logic_vector(17 downto 0)

rdout std_logic)

end component

43

signal clkhsync_svsync_sh_en_sv_en_sv_en_reg_srd_sstd_logic

signal line_count_sc_hs_saddrinteger

signal data_sstd_logic_vector(0 downto 0)

signal addr_sstd_logic_vector(17 downto 0)

begin

--addr_s lt= std_logic_vector(to_unsigned(addr18))

process(clk50mrstn)

begin

if(rstn = 0)then

clk lt= 0

elsif(clk50m = 1 and clk50mevent)then

clk lt= not clk

end if

--end if

end process

hsync lt= hsync_s

vsync lt= vsync_s

sync sync_unit port map(

clk =gt clk

rstn =gt rstn

hsync =gt hsync_s

vsync =gt vsync_s

v_en_reg =gt v_en_reg_s

line_count =gt line_count_s

h_en =gt h_en_s

v_en =gt v_en_s

c_hs =gt c_hs_s

)

44

pixel pixelgen port map(

clk =gt clk

rstn =gt rstn

hsync =gt hsync_s

vsync =gt vsync_s

h_en =gt h_en_s

v_en =gt v_en_s

v_en_reg =gt v_en_reg_s

data_in =gt data_s

line_count=gt line_count_s

c_hs =gt c_hs_s

red =gt red

green =gt green

blue =gt blue

addr =gt addr_s

rd =gt rd_s

)

mem12 memory port map(

addr =gt addr_s

clk =gt clk

dout =gt data_s

en =gt rd_s

sinit=gt rstn

)

end Behavioral

45

SYNCHRONIZATION BLOCK CODE

library IEEE

use IEEESTD_LOGIC_1164ALL

use IEEESTD_LOGIC_ARITHALL

use IEEESTD_LOGIC_UNSIGNEDALL

entity sync_unit is port(

clkin std_logic

rstnin std_logic

hsyncout std_logic ---horizontal synch pulse(row)

vsyncout std_logic --vertical sync

pulse(frame) 111110100000000000

v_en_regout std_logic

h_enout std_logic

v_enout std_logic

line_countout integer

c_hsout integer )

end sync_unit

architecture arch_sync_unit of sync_unit is

signal h_en_sv_en_sstd_logic

signal line_count_sc_hs_sinteger

begin

c_hs lt= c_hs_s

line_count lt= line_count_s

h_en lt= h_en_s

v_en lt= v_en_s

-- hsync counter

process(clkrstn)

begin

if(rstn = 0)then

46

c_hs_s lt= 0

else

if(clk = 1 and clkevent)then

if(c_hs_s=793)then

c_hs_s lt= 0

else

c_hs_s lt= c_hs_s +1

end if

end if

end if

end process

--vsync counter

--vertical line counter

process(h_en_srstn)

begin

if(rstn = 0)then

line_count_s lt= 0

elsif(h_en_s = 1 and h_en_sevent)then

line_count_s lt= line_count_s + 1

if (line_count_s=515)then

line_count_s lt= 0

end if

end if

--end if

end process

--hysnc pulse

process(clkrstn)

begin

if(rstn = 0)then

47

hsync lt= 1

elsif(clk = 1 and clkevent)then

if(c_hs_s lt=95)then

hsync lt= 0

else

hsync lt= 1

end if

end if

--end if

end process

process(clkrstn) ----Horizontal Enable ( Row Information Will be Sent in

this period)

begin

if(rstn = 0)then

h_en_s lt= 0

elsif(clk = 1 and clkevent)then

if((c_hs_s gt=143) and (c_hs_slt=778))then

h_en_s lt= 1

else

h_en_s lt= 0

end if

end if

--end if

end process

---vysnc pulse

process(clkrstn) -- Vertical Enable (Data is Valid in this Region)

begin

if(rstn = 0)then

48

v_en_s lt= 0

elsif(clk = 1 and clkevent)then

if((line_count_sgt=35 and line_count_slt=515))then

v_en_s lt= 1

else

v_en_s lt= 0

end if

end if

--end if

end process

process(clkrstn) --Vertical Sync Front porch Back Porch

begin

if(rstn = 0)then

vsync lt= 1

elsif(clk = 1 and clkevent)then

if(line_count_s lt2)then

vsync lt= 0

else

vsync lt= 1

end if

end if

--end if

end process

process(clk)

begin

if(clk = 1 and clkevent)then

v_en_reglt=v_en_s

end if

end process

end arch_sync_unit

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin
  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- rising edge of v_en (start of a frame): restart the read address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- inside the picture: read memory, drive all three colours from the 1-bit pixel
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- right of the picture: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- below the picture: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- wrap the address at the last memory location
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- blanking: all colours off
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;
end behv;
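The address arithmetic in the pixel generator can be worked through with a few constants. The sketch below is illustrative only and not part of the original design: the entity name framebuf_addr_check and the constant names are hypothetical, and the 512 x 480 geometry is only one picture size that matches the wrap value, since the listing does not state the image dimensions. Each concurrent assertion is evaluated once when the simulation starts.

```vhdl
entity framebuf_addr_check is
end framebuf_addr_check;

architecture sim of framebuf_addr_check is
  constant ADDR_BITS  : natural := 18;              -- width of addr in pixelgen
  constant ADDR_SPACE : natural := 2 ** ADDR_BITS;  -- 262144 locations
  constant WRAP_AT    : natural := 245759;          -- "111011111111111111"
  constant VGA_PIXELS : natural := 640 * 480;       -- 307200 pixels
begin
  -- The wrap value fits in the 18-bit address.
  assert WRAP_AT < ADDR_SPACE
    report "wrap value does not fit in the address" severity failure;

  -- Counting 0 .. WRAP_AT covers 245760 locations, for example 512 x 480.
  assert WRAP_AT + 1 = 512 * 480
    report "unexpected frame-buffer size" severity failure;

  -- A full 640 x 480 one-bit frame would need more than 18 address bits.
  assert VGA_PIXELS > ADDR_SPACE
    report "640 x 480 would fit in 18 bits" severity failure;
end sim;
```

Under these assumptions the memory holds fewer pixels than a full 640 x 480 frame, which is consistent with the listing painting the area to the right of the picture in a fixed colour.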

12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternative to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

Many modern FPGAs can be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has increased development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA considered here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they are managed separately from other signals.
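The logic block described above can be modelled behaviourally in a few lines. This is a minimal sketch for illustration, not a vendor primitive: the entity name logic_block, the generics LUT_INIT and REGISTERED, and the port names are all hypothetical. The 16-bit LUT_INIT generic holds the truth table, the four inputs select one of its bits, and REGISTERED chooses between the flip-flop output and the raw LUT output.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity logic_block is
  generic (
    -- Truth table of the 4-input LUT; x"8000" sets only entry 15,
    -- so the default function is a 4-input AND.
    LUT_INIT   : std_logic_vector(15 downto 0) := x"8000";
    -- true selects the registered output, false the unregistered one.
    REGISTERED : boolean := true
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);
    o   : out std_logic
  );
end logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_q    : std_logic := '0';
begin
  -- The LUT is a 16-entry memory addressed by the four inputs.
  lut_out <= LUT_INIT(to_integer(unsigned(i)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process(clk)
  begin
    if rising_edge(clk) then
      ff_q <= lut_out;
    end if;
  end process;

  -- A single output: either the registered or the unregistered LUT value.
  o <= ff_q when REGISTERED else lut_out;
end rtl;
```

Changing LUT_INIT changes the logic function without touching the structure, which is the sense in which an FPGA is configured rather than rewired.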

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.
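As an illustration of specifying a large structure numerically, the sketch below describes a chain of flip-flops whose length is set by a single number. It is a hypothetical example, not taken from the VGA design: the entity name delay_line, the generic STAGES and the signal taps are all assumptions. In a schematic, each of the 64 stages would have to be drawn and wired individually.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity delay_line is
  generic (
    -- Number of flip-flop stages; change the number, not the drawing.
    STAGES : integer range 2 to 1024 := 64
  );
  port (
    clk : in  std_logic;
    d   : in  std_logic;
    q   : out std_logic
  );
end delay_line;

architecture rtl of delay_line is
  signal taps : std_logic_vector(STAGES - 1 downto 0) := (others => '0');
begin
  process(clk)
  begin
    if rising_edge(clk) then
      -- Shift every stage along by one and take in the new bit.
      taps <= taps(STAGES - 2 downto 0) & d;
    end if;
  end process;

  -- The output is the input delayed by STAGES clock cycles.
  q <= taps(STAGES - 1);
end rtl;
```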

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs, together with early software and firmware co-design, has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.

Reasons why prototyping is important:

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This compares with designers relying only on software simulations to verify that their hardware design is sound; simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13. SIMULATION RESULTS

[Figure: Gate level, VGA controller]

[Figure: Gate level, sync unit]

[Figure: Technology schematic]

14. APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to display the picture correctly on a connected monitor.


SYNCHRONIZATION BLOCK CODE

library IEEE

use IEEESTD_LOGIC_1164ALL

use IEEESTD_LOGIC_ARITHALL

use IEEESTD_LOGIC_UNSIGNEDALL

entity sync_unit is port(

clkin std_logic

rstnin std_logic

hsyncout std_logic ---horizontal synch pulse(row)

vsyncout std_logic --vertical sync

pulse(frame) 111110100000000000

v_en_regout std_logic

h_enout std_logic

v_enout std_logic

line_countout integer

c_hsout integer )

end sync_unit

architecture arch_sync_unit of sync_unit is

signal h_en_sv_en_sstd_logic

signal line_count_sc_hs_sinteger

begin

c_hs lt= c_hs_s

line_count lt= line_count_s

h_en lt= h_en_s

v_en lt= v_en_s

-- hsync counter

process(clkrstn)

begin

if(rstn = 0)then

46

c_hs_s lt= 0

else

if(clk = 1 and clkevent)then

if(c_hs_s=793)then

c_hs_s lt= 0

else

c_hs_s lt= c_hs_s +1

end if

end if

end if

end process

--vsync counter

--vertical line counter

process(h_en_srstn)

begin

if(rstn = 0)then

line_count_s lt= 0

elsif(h_en_s = 1 and h_en_sevent)then

line_count_s lt= line_count_s + 1

if (line_count_s=515)then

line_count_s lt= 0

end if

end if

--end if

end process

--hysnc pulse

process(clkrstn)

begin

if(rstn = 0)then

47

hsync lt= 1

elsif(clk = 1 and clkevent)then

if(c_hs_s lt=95)then

hsync lt= 0

else

hsync lt= 1

end if

end if

--end if

end process

process(clkrstn) ----Horizontal Enable ( Row Information Will be Sent in

this period)

begin

if(rstn = 0)then

h_en_s lt= 0

elsif(clk = 1 and clkevent)then

if((c_hs_s gt=143) and (c_hs_slt=778))then

h_en_s lt= 1

else

h_en_s lt= 0

end if

end if

--end if

end process

---vysnc pulse

process(clkrstn) -- Vertical Enable (Data is Valid in this Region)

begin

if(rstn = 0)then

48

v_en_s lt= 0

elsif(clk = 1 and clkevent)then

if((line_count_sgt=35 and line_count_slt=515))then

v_en_s lt= 1

else

v_en_s lt= 0

end if

end if

--end if

end process

process(clkrstn) --Vertical Sync Front porch Back Porch

begin

if(rstn = 0)then

vsync lt= 1

elsif(clk = 1 and clkevent)then

if(line_count_s lt2)then

vsync lt= 0

else

vsync lt= 1

end if

end if

--end if

end process

process(clk)

begin

if(clk = 1 and clkevent)then

v_en_reglt=v_en_s

end if

end process

end arch_sync_unit

49

PIXEL GENERATOR

library IEEE

use IEEESTD_LOGIC_1164ALL

use IEEESTD_LOGIC_ARITHALL

use IEEESTD_LOGIC_UNSIGNEDALL

entity pixelgen is port(

clkin std_logic

rstnin std_logic

hsyncin std_logic

vsyncin std_logic

h_enin std_logic

v_enin std_logic

v_en_reg in std_logic

data_inin std_logic_vector(0 downto 0)

line_countin integer

c_hsin integer

red out std_logic_vector(0 downto 0)

greenout std_logic_vector(0 downto 0)

blueout std_logic_vector(0 downto 0)

addrout std_logic_vector(17 downto 0)

rdout std_logic

)

end pixelgen

architecture behv of pixelgen is

signal addr_sstd_logic_vector(17 downto 0)

begin

addr lt= addr_s

process(clkrstn)

begin

50

if(rstn = 0)then

rd lt= 0

red lt= 0

green lt= 0

blue lt= 0

addr_s lt= x0000 amp 00

elsif (clk = 1 and clkevent)then

if((v_en=1) and (v_en_reg=0))then

addr_slt=(others =gt0)

end if

if(v_en=1 and h_en=1)then

if(line_countlt=514)then

if(c_hs lt= 655)then

rd lt= 1

red lt= data_in

green lt= data_in

blue lt= data_in

addr_s lt=addr_s +000000000000000001

else

rd lt= 0

red lt= 0

green lt= 1

blue lt= 0

end if

else

rd lt= 0

red lt= 1

green lt= 1

blue lt= 1

51

end if

if(addr_s=111011111111111111)then

addr_slt= (others =gt0)

end if

else

red lt= 0

green lt= 0

blue lt= 0

end if

end if

-- end if

--end if

end process

end behv

52

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can

be configured by the customer or designer after manufacturingmdashhence the

name field-programmable FPGAs are programmed using a logic circuit

diagram or a source code in a hardware description language (HDL) to specify

how the chip will work They can be used to implement any logical function

that an application-specific integrated circuit (ASIC) could perform but the

ability to update the functionality after shipping offers advantages for many

applications

FPGAs contain programmable logic components called logic blocks and a

hierarchy of reconfigurable interconnects that allow the blocks to be wired

togethermdashsomewhat like a one-chip programmable breadboard Logic blocks

can be configured to perform complex combinational functions or merely

simple logic gates like AND and XOR In most FPGAs the logic blocks also

include memory elements which may be simple flip-flops or more complete

blocks of memory

History The FPGA industry sprouted from programmable read only

memory (PROM) and programmable logic devices (PLDs) PROMs and PLDs

both had the option of being programmed in batches in a factory or in the field

(field programmable) however programmable logic was hard-wired between

logic gates

Xilinx Co-Founders Ross Freeman and Bernard Vonderschmitt invented the

first commercially viable field programmable gate array in 1985 ndash the

XC2064 The XC2064 had programmable gates and programmable

interconnects between gates the beginnings of a new technology and market

The XC2064 boasted a mere 64 configurable logic blocks (CLBs) with two 3-

input lookup tables (LUTs) More than 20 years later Freeman was entered

into the National Inventors Hall of Fame for his invention

53

Some of the industryrsquos foundational concepts and technologies for

programmable logic arrays gates and logic blocks are founded in patents

awarded to David W Page and LuVerne R Peterson in 1985

In the late 1980s the Naval Surface Warfare Department funded an experiment

proposed by Steve Casselman to develop a computer that would implement

600000 reprogrammable gates Casselman was successful and the system was

awarded a patent in 1992

Xilinx continued unchallenged and quickly growing from 1985 to the mid-

1990s when competitors sprouted up eroding significant market-share By

1993 Actel was serving about 18 percent of the market

The 1990s were an explosive period of time for FPGAs both in sophistication

and the volume of production In the early 1990s FPGAs were primarily used

in telecommunications and networking By the end of the decade FPGAs

found their way into consumer automotive and industrial applications

FPGAs got a glimpse of fame in 1997 when Adrian Thompson merged genetic

algorithm technology and FPGAs to create a sound recognition device

Thomsonrsquos algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip

to decide the configuration needed to accomplish a sound recognition task

121 Modern developments

A recent trend has been to take the coarse-grained architectural

approach a step further by combining the logic blocks and interconnects of

traditional FPGAs with embedded microprocessors and related peripherals to

form a complete system on a programmable chip This work mirrors the

architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems

Group which combined a reconfigurable CPU architecture on a single chip

called the SB24 That work was done in 1982 Examples of such hybrid

technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices

which include one or more PowerPC processors embedded within the FPGAs

54

logic fabric The Atmel FPSLIC is another such device which uses an AVR

processor in combination with Atmels programmable logic architecture

An alternate approach to using hard-macro processors is to make use of soft

processor cores that are implemented within the FPGA logic (See Soft

processors below)

As previously mentioned many modern FPGAs have the ability to be

reprogrammed at run time and this is leading to the idea of reconfigurable

computing or reconfigurable systems mdash CPUs that reconfigure themselves to

suit the task at hand The Mitrion Virtual Processor from Mitrionics is an

example of a reconfigurable soft processor implemented on FPGAs However

it does not support dynamic reconfiguration at runtime but instead adapts itself

to a specific program

Additionally new non-FPGA architectures are beginning to emerge

Software-configurable microprocessors such as the Stretch S5000 adopt a

hybrid approach by providing an array of processor cores and FPGA-like

programmable cores on the same chip

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have historically been used for, which
the company credits for the growing number of FPGA design starts (see
History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has
a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result of this is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. The FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers) and
embedded memories, as well as logic blocks that can implement decoders or
mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the state
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attack of cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing
applications where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4 to
8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.
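Rent's rule, mentioned above, is usually written as a power law relating the
number of external connections of a block of logic to the amount of logic
inside it. A rough illustration follows; the numeric values are assumed for
the example and are not taken from this report.

```latex
% Rent's rule
%   T = number of external terminals (pins) of a block of logic
%   g = number of gates or logic blocks inside that block
%   t = average number of terminals per gate
%   p = Rent exponent, 0 < p < 1
T = t \, g^{p}

% Worked example with assumed values t = 4, p = 0.6, g = 1000:
T = 4 \times 1000^{0.6} \approx 4 \times 63.1 \approx 252
```

A larger exponent p means more signals cross block boundaries, so more
routing tracks are needed for the same amount of logic.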

The example FPGA considered here is an array or island-style FPGA. It consists
of an array of logic blocks and routing channels. Two I/O pads fit into the
height of one row or the width of one column. All the routing channels have
the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high performance parts, claiming increased
performance.

Typical logic block

There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
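The logic block described above can be sketched behaviourally in VHDL. This is
a minimal illustrative model, not a vendor primitive: the entity name
logic_block, the generics INIT and REGISTERED, and the port names are all
hypothetical. In a real device the truth table and the output select are
configuration bits loaded from the bitstream; generics stand in for them here.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical model of one logic block: a 4-input LUT followed by a
-- flip-flop, with a select choosing the registered or unregistered output.
entity logic_block is
    generic (
        -- Truth table: bit i is the LUT output when the inputs encode i.
        -- x"8000" (assumed example) sets only bit 15, i.e. a 4-input AND.
        INIT       : std_logic_vector(15 downto 0) := x"8000";
        -- true: output taken from the flip-flop; false: directly from the LUT
        REGISTERED : boolean := false
    );
    port (
        clk     : in  std_logic;
        lut_in  : in  std_logic_vector(3 downto 0);
        lut_out : out std_logic
    );
end logic_block;

architecture rtl of logic_block is
    signal comb : std_logic;          -- unregistered LUT output
    signal reg  : std_logic := '0';   -- registered LUT output
begin
    -- The LUT behaves as a 16 x 1 memory addressed by the four inputs.
    comb <= INIT(to_integer(unsigned(lut_in)));

    -- The flip-flop samples the LUT output on every rising clock edge.
    process (clk)
    begin
        if rising_edge(clk) then
            reg <= comb;
        end if;
    end process;

    -- Single output: either the registered or the unregistered value.
    lut_out <= reg when REGISTERED else comb;
end rtl;
```

Changing INIT changes the function without changing the structure, which is
why any 4-input combinational function fits in one such block.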

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans
only one logic block before it terminates in a switch box. By turning on some
of the programmable switches within a switch box, longer paths can be
constructed. For higher speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number one connects only to wires in
track number one in adjacent channel segments, wires in track number 2
connect only to other wires in track number 2, and so on. The figure below
illustrates the connections in a switch box.

Switch box topology
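The cost of this topology can be estimated with a short count. The sketch
below assumes one wire end per track on each of the four sides of the switch
box; the fully flexible box is a hypothetical comparison and is not part of
the architecture described here.

```latex
% Planar (domain-based) switch box, channel width W.
% Each track has four wire ends (one per side). Each end connects to the
% three others, so every pair is joined: \binom{4}{2} = 6 switches per track.
N_{\text{planar}} = 6W

% Hypothetical fully flexible box: any wire end may reach any of the W wires
% on each of the other three sides.
N_{\text{full}} = \binom{4}{2} W^{2} = 6W^{2}

% Example, W = 10: N_planar = 60 switches, N_full = 600 switches.
```

The planar pattern therefore grows linearly with the channel width, at the
price that a signal cannot change track number inside a switch box.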

Modern FPGA families expand upon the above capabilities to include higher
level functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high speed I/O
logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware
description language (HDL) or as a schematic. The HDL form might be easier to
work with when handling large structures because it's possible to just specify
them numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device like
an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate level
description where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
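The first step, RTL simulation with a test bench, might look like the sketch
below. It is illustrative only: wrap_counter and wrap_counter_tb are
hypothetical names, the counter merely resembles a horizontal pixel counter,
and the 40 ns clock period is an assumed value.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical device under test: counts 0..MAX_COUNT, then wraps to 0.
entity wrap_counter is
    generic (MAX_COUNT : natural := 793);
    port (
        clk   : in  std_logic;
        rstn  : in  std_logic;   -- active-low asynchronous reset
        count : out natural range 0 to MAX_COUNT
    );
end wrap_counter;

architecture rtl of wrap_counter is
    signal count_s : natural range 0 to MAX_COUNT := 0;
begin
    count <= count_s;

    process (clk, rstn)
    begin
        if rstn = '0' then
            count_s <= 0;
        elsif rising_edge(clk) then
            if count_s = MAX_COUNT then
                count_s <= 0;
            else
                count_s <= count_s + 1;
            end if;
        end if;
    end process;
end rtl;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Self-checking test bench: no ports, drives the DUT and asserts on results.
entity wrap_counter_tb is
end wrap_counter_tb;

architecture sim of wrap_counter_tb is
    constant CLK_PERIOD : time    := 40 ns;   -- assumed 25 MHz clock
    constant MAX_COUNT  : natural := 793;
    signal clk   : std_logic := '0';
    signal rstn  : std_logic := '0';
    signal count : natural range 0 to MAX_COUNT;
    signal done  : boolean := false;
begin
    dut : entity work.wrap_counter
        generic map (MAX_COUNT => MAX_COUNT)
        port map (clk => clk, rstn => rstn, count => count);

    -- Free-running clock, held low once the checks have finished.
    clk <= not clk after CLK_PERIOD / 2 when not done else '0';

    stimulus : process
    begin
        wait for 2 * CLK_PERIOD;
        rstn <= '1';                          -- release reset
        -- After MAX_COUNT rising edges the counter should hold MAX_COUNT.
        for i in 1 to MAX_COUNT loop
            wait until rising_edge(clk);
        end loop;
        wait for CLK_PERIOD / 4;              -- let the new value settle
        assert count = MAX_COUNT
            report "counter did not reach MAX_COUNT" severity failure;
        -- One more rising edge must wrap the counter back to zero.
        wait until rising_edge(clk);
        wait for CLK_PERIOD / 4;
        assert count = 0
            report "counter did not wrap to zero" severity failure;
        report "wrap_counter checks completed" severity note;
        done <= true;
        wait;
    end process;
end sim;
```

The same test bench can in principle be reused for the gate level and
back-annotated runs by binding dut to the post-synthesis or post-layout
netlist instead of the RTL architecture.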

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices can
be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased, even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused
CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on FPGAs for
hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has
become mainstream, and prototyping SoC and ASIC designs on FPGAs has become a
good method to do this.

Reasons why prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct. This is compared to designers relying only on
software simulations to verify that their hardware design is sound.
Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows for
much more time in the area of software development and testing at the
software-hardware integration stage. This allows many unforeseen software
bugs, which appear due to today's array of operating systems, applications,
and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on
the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement to the chip features
than would otherwise have been possible.

13 SIMULATION RESULTS

Fig. Gate level: VGA controller

Fig. Gate level: sync unit

Fig. Technology schematic

14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, a processor-generated
picture (pixelated) is taken from memory space, and the Controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.



    signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
    signal line_count_s, c_hs_s, addr : integer;
    signal data_s : std_logic_vector(0 downto 0);
    signal addr_s : std_logic_vector(17 downto 0);

begin

    --addr_s <= std_logic_vector(to_unsigned(addr,18));

    -- Divide the clk50m input by two to obtain the pixel clock clk
    process(clk50m, rstn)
    begin
        if (rstn = '0') then
            clk <= '0';
        elsif (clk50m = '1' and clk50m'event) then
            clk <= not clk;
        end if;
        --end if;
    end process;

    hsync <= hsync_s;
    vsync <= vsync_s;

    -- Synchronization unit: generates the sync pulses, enables and counters
    sync : sync_unit port map(
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
    );

    -- Pixel generator: drives the RGB outputs and the memory address
    pixel : pixelgen port map(
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
    );

    -- Picture memory: one bit per pixel, read when rd_s is high
    mem12 : memory port map(
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
    );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;   ---horizontal synch pulse(row)
    vsync      : out std_logic;   --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is
    signal h_en_s, v_en_s : std_logic;
    signal line_count_s, c_hs_s : integer;
begin

    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- hsync counter: counts pixel clocks 0..793 within a line
    process(clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        else
            if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                    c_hs_s <= 0;
                else
                    c_hs_s <= c_hs_s + 1;
                end if;
            end if;
        end if;
    end process;

    --vsync counter
    --vertical line counter: advances on each rising edge of h_en_s, 0..515
    process(h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            if (line_count_s = 515) then
                line_count_s <= 0;
            end if;
        end if;
        --end if;
    end process;

    --hsync pulse: driven low while the horizontal count is 0..95
    process(clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
        --end if;
    end process;

    ----Horizontal Enable (Row Information Will be Sent in this period)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
        --end if;
    end process;

    ---vsync pulse
    -- Vertical Enable (Data is Valid in this Region)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35 and line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
        --end if;
    end process;

    --Vertical Sync, Front porch, Back Porch: vsync is low for lines 0 and 1
    process(clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
        --end if;
    end process;

    -- v_en delayed by one clock, used to detect the start of the active region
    process(clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;
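The counter limits in the synchronization block fix the sync frequencies. The
worked estimate below assumes that clk50m is a 50 MHz oscillator, so that the
divided pixel clock clk runs at 25 MHz; the signal name suggests this, but the
listing does not state it.

```latex
% Horizontal counter c_hs_s runs 0..793: 794 pixel clocks per line.
f_{\text{line}} = \frac{25\,\text{MHz}}{794} \approx 31.49\,\text{kHz}

% Line counter line_count_s runs 0..515: 516 lines per frame.
f_{\text{frame}} = \frac{f_{\text{line}}}{516} \approx 61.0\,\text{Hz}

% hsync is low for counts 0..95: 96 pixel clocks.
t_{\text{hsync}} = \frac{96}{25\,\text{MHz}} = 3.84\,\mu\text{s}
```

Under that assumption the results are close to the nominal 31.47 kHz line rate
and 60 Hz refresh rate of standard 640 x 480 VGA timing.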


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin

    addr <= addr_s;

    process(clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- Rising edge of v_en: restart the memory address for a new frame
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- Picture area: read memory, same bit on all colours
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        -- Right of the picture area: green
                        rd     <= '0';
                        red    <= "0";
                        green  <= "1";
                        blue   <= "0";
                    end if;
                else
                    -- Below line 514: white
                    rd     <= '0';
                    red    <= "1";
                    green  <= "1";
                    blue   <= "1";
                end if;
                -- Wrap the address at the end of the picture memory
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                -- Outside the enabled region: black
                red    <= "0";
                green  <= "0";
                blue   <= "0";
            end if;
        end if;
        -- end if;
        --end if;
    end process;

end behv;

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be
configured by the customer or designer after manufacturing, hence the name
"field-programmable". FPGAs are programmed using a logic circuit diagram or
source code in a hardware description language (HDL) to specify how the chip
will work. They can be used to implement any logical function that an
application-specific integrated circuit (ASIC) could perform, but the ability
to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks", and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks can
be configured to perform complex combinational functions, or merely simple
logic gates like AND and XOR. In most FPGAs, the logic blocks also include
memory elements, which may be simple flip-flops or more complete blocks of
memory.

History

The FPGA industry sprouted from programmable read only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of being
programmed in batches in a factory or in the field (field programmable);
however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first
commercially viable field programmable gate array in 1985, the XC2064. The
XC2064 had programmable gates and programmable interconnects between gates,
the beginnings of a new technology and market. The XC2064 boasted a mere 64
configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More
than 20 years later, Freeman was entered into the National Inventors Hall of
Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s,
when competitors sprouted up, eroding significant market share. By 1993, Actel
was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication
and the volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs found
their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.


FPGAs are also widely used for systems validation including pre-silicon

validation post-silicon validation and firmware development This allows chip

companies to validate their design before the chip is produced in the factory

reducing the time to market

124 FPGA Design and Programming

To define the behavior of the FPGA the user provides a hardware description

language (HDL) or a schematic design The HDL form might be easier to work

with when handling large structures because its possible to just specify them

numerically rather than having to draw every piece by hand On the other hand

schematic entry can allow for easier visualisation of a design

Then using an electronic design automation tool a technology-mapped netlist

is generated The netlist can then be fitted to the actual FPGA architecture

using a process called place-and-route usually performed by the FPGA

companys proprietary place-and-route software The user will validate the

map place and route results via timing analysis simulation and other

verification methodologies Once the design and validation process is

complete the binary file generated (also using the FPGA companys

proprietary software) is used to (re)configure the FPGA

Going from schematicHDL source files to actual configuration The source

files are fed to a software suite from the FPGACPLD vendor that through

different steps will produce a file This file is then transferred to the

FPGACPLD via a serial interface (JTAG) or to an external memory device

like an EEPROM

The most common HDLs are VHDL and Verilog although in an attempt to

reduce the complexity of designing in HDLs which have been compared to the

equivalent of assembly languages there are moves to raise the abstraction level

through the introduction of alternative languages

62

To simplify the design of complex systems in FPGAs there exist libraries of

predefined complex functions and circuits that have been tested and optimized

to speed up the design process These predefined circuits are commonly called

IP cores and are available from FPGA vendors and third-party IP suppliers

(rarely free and typically released under proprietary licenses) Other

predefined circuits are available from developer communities such as Open

Cores (typically free and released under the GPL BSD or similar license) and

other sources

In a typical design flow an FPGA application developer will simulate the

design at multiple stages throughout the design process Initially the RTL

description in VHDL or Verilog is simulated by creating test benches to

simulate the system and observe results Then after the synthesis engine has

mapped the design to a netlist the netlist is translated to a gate level

description where simulation is repeated to confirm the synthesis proceeded

without errors Finally the design is laid out in the FPGA at which point

propagation delays can be added and the simulation run again with these

values back-annotated onto the netlist

Basic Process Technology Types

SRAM - based on static memory technology In-system programmable and re-

programmable Requires external boot devices CMOS

Antifuse - One-time programmable CMOS

EPROM - Erasable Programmable Read-Only Memory technology Usually

one-time programmable in production because of plastic packaging

Windowed devices can be erased with ultraviolet (UV) light CMOS

EEPROM - Electrically Erasable Programmable Read-Only Memory

technology Can be erased even in plastic packages Some but not all

EEPROM devices can be in-system programmed CMOS

Flash - Flash-erase EPROM technology Can be erased even in plastic

packages Some but not all flash devices can be in-system programmed

63

Usually a flash cell is smaller than an equivalent EEPROM cell and is

therefore less expensive to manufacture CMOS

Fuse - One-time programmable Bipolar

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry

rivals Together they control over 80 percent of the market with Xilinx alone

representing over 50 percent

Xilinx also provides free Windows and Linux design software while Altera

provides free Windows tools the Solaris and Linux tools are only available via

a rental scheme

Other competitors include Lattice Semiconductor (flash SRAM) Actel

(antifuse flash-based mixed-signal) SiliconBlue Technologies (low power)

Achronix (RAM based 15GHz fabric speed) and QuickLogic (handheld

focused CSSP no general purpose FPGAs)

125 FPGA prototype

FPGA prototyping sometimes also referred to as ASIC prototyping or SoC

prototyping is the method to prototype SoC and ASIC design on FPGA for

hardware verification and early software development

Main stream verification methods for hardware design and early software and

firmware co-design has become mainstream Prototyping SoC and ASIC

design on FPGA has become a good method to do this

64

Reasons why Prototyping is important

1 Running a SoC design on FPGA prototype is a reliable way to ensure that it is

functionally correct This is compared to designers only relying on software

simulations to verify that their hardware design is sound Simulation speed and

modeling accuracy limitations hinder this development

2 Due to time constrains many projects cannot wait until the silicon is back from

the foundry to start on software tests FPGA prototyping allows for much more

time in area of software development and testing at the software-hardware

integration stage This allows many unforeseen software bugs that appear due

to todays array of operating systems applications and hardware

3 Prototyping also allows the developer to ensure that all IP technologies on his

system work well together off the simulation stage and in actual form

4 Prototyping has the added advantage as demo platforms to SoC clients

bringing in interest early This speeds up the overall development cycle and

allows for more enhancement or improvement to the chip features as it would

otherwise have been

65

13 SIMULATION RESULTS

66

GATE LEVEL

FigVGA CONTROLLER

67

GATE LEVEL

Fig SYNC UNIT

68

TECHNOLOGY SCHEMATIC

69

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

141 CONCLUTION

Hence by implementing FPGA taking a processor-generated picture

(pixilated) from memory space the Controller provides digital RGB values for

each pixel as well as horizontal and vertical synchronization signals in order

to correctly display the picture on a connected monitor

70

REFERENCES

1 httpenwikipediaorgwikiVideo_Graphics_Array

2 A VGA display controller by Eduardo Sanchez

3 HinnerrdquoVGA Interface and video signal documentsrdquo

71

  • Pixel
    • 6 SIGNALS IN VGA CONTROLLER
    • Gates
    • Market size
    • 122 FPGA Comparisons
Page 45: vga

pixel: pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

mem12: memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is
    signal h_en_s, v_en_s : std_logic;
    signal line_count_s, c_hs_s : integer;
begin
    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- hsync counter
    process(clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        else
            if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                    c_hs_s <= 0;
                else
                    c_hs_s <= c_hs_s + 1;
                end if;
            end if;
        end if;
    end process;

    -- vsync counter
    -- vertical line counter
    process(h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            if (line_count_s = 515) then
                line_count_s <= 0;
            end if;
        end if;
    end process;

    -- hsync pulse
    process(clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
    end process;

    -- Horizontal Enable (Row Information Will be Sent in this period)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
    end process;

    -- Vertical Enable (Data is Valid in this Region)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35 and line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
    end process;

    -- Vertical Sync, Front Porch, Back Porch
    process(clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
    end process;

    -- v_en delayed by one clock; the pixel generator uses it to detect the rising edge of v_en
    process(clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;
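The counter limits in the synchronization block fix the line and frame rates. The short calculation below is a sketch: the counts 794 and 516 follow from the wrap values 793 and 515 in the code, but the pixel clock of 25.175 MHz is an assumption (the usual 640x480 VGA pixel clock), since the report does not state the clock frequency.

```latex
\begin{aligned}
N_h &= 793 + 1 = 794 \quad \text{clock cycles per line} \\
N_v &= 515 + 1 = 516 \quad \text{lines per frame} \\
f_{\text{line}}  &= \frac{f_{\text{clk}}}{N_h} = \frac{25.175\ \text{MHz}}{794} \approx 31.7\ \text{kHz} \\
f_{\text{frame}} &= \frac{f_{\text{line}}}{N_v} \approx \frac{31.7\ \text{kHz}}{516} \approx 61.4\ \text{Hz}
\end{aligned}
```

By the same counting, h_en_s is asserted for counter values 143 to 778 (636 clock cycles per line) and v_en_s for line counts 35 to 515 (481 values).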

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin
    addr <= addr_s;

    process(clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- restart the read address on the rising edge of v_en (start of frame)
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- inside the picture: read memory and drive all three colours
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- wrap the address after 245759 (2**18 - 2**14 - 1)
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
    end process;
end behv;

12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array, the XC2064, in 1985. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of soft processor cores that are implemented within the FPGA logic.

Many modern FPGAs can be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, and the company attributes the growing number of FPGA design starts to them.

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low clock rates (in the MHz range). The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
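Rent's rule, mentioned above, is usually written as a power law. The form below is the standard textbook statement; the symbols T, t, g and p are the conventional names and do not come from this report.

```latex
T = t \, g^{p}
```

Here T is the number of external connections (terminals) of a region of logic, g is the number of logic blocks inside that region, t is the average number of terminals per logic block, and p is the Rent exponent, typically between 0 and 1. A larger p means more wiring for the same amount of logic, which is consistent with the contrast drawn above between a crossbar switch and a systolic array.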

The example FPGA described here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
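The logic block described above can be sketched in VHDL. This is a minimal behavioral model, not a vendor primitive: the entity name logic_block, the generics LUT_INIT and USE_REG (which stand in for configuration bits), and the port names are all assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Behavioral sketch of a classic logic block: 4-input LUT plus D flip-flop.
-- LUT_INIT and USE_REG are hypothetical stand-ins for configuration bits.
entity logic_block is
  generic (
    LUT_INIT : std_logic_vector(15 downto 0) := x"8000";  -- 4-input AND
    USE_REG  : boolean := true                             -- registered output?
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    y   : out std_logic                      -- the single block output
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16 x 1 memory addressed by the four inputs.
  lut_out <= LUT_INIT(to_integer(unsigned(a)));

  -- The flip-flop captures the LUT output on every rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output multiplexer: registered or unregistered LUT output.
  y <= ff_out when USE_REG else lut_out;
end architecture rtl;
```

With the default LUT_INIT of x"8000", only the input pattern "1111" reads a '1', so the block behaves as a 4-input AND gate. Any other 16-bit constant gives a different function of the same four inputs.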

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology
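The number of programmable switches implied by this topology can be worked out directly. The derivation below is a sketch under the assumptions of the description above: a switch box with four sides, W tracks per channel, and each wire end joined by switches only to the three wire ends with the same track number on the other sides. The symbol S for the total switch count is introduced here for illustration.

```latex
% 4W wire ends enter the box, each has 3 switches,
% and every switch is shared by the 2 wire ends it joins
S = \frac{4W \cdot 3}{2} = 6W
```

For example, a channel width of W = 10 gives S = 60 switches per switch box. Because the count grows only linearly with W, the planar topology is cheap in area, at the price that a signal cannot change track number when it passes through a switch box.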

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.
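As a small illustration of specifying a structure numerically, the sketch below describes a shift register whose length is set by a single generic instead of by drawing each flip-flop. The entity name shift_reg and the generic DEPTH are hypothetical and are not part of the VGA controller design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- A shift register whose length is one number (DEPTH), not a drawing.
-- Entity and generic names are illustrative assumptions.
entity shift_reg is
  generic (DEPTH : integer range 2 to 1024 := 8);
  port (
    clk  : in  std_logic;
    din  : in  std_logic;
    dout : out std_logic
  );
end entity shift_reg;

architecture rtl of shift_reg is
  signal stages : std_logic_vector(DEPTH - 1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- shift towards the MSB; the new bit enters at index 0
      stages <= stages(DEPTH - 2 downto 0) & din;
    end if;
  end process;

  -- the oldest bit leaves at the MSB, DEPTH clock cycles after it entered
  dout <= stages(DEPTH - 1);
end architecture rtl;
```

Changing DEPTH from 8 to 512 is a one-character edit in HDL, whereas the equivalent schematic would need hundreds of flip-flop symbols to be placed and wired.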

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
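The first stage of this flow, RTL simulation with a test bench, might look like the following sketch. The counter mod_counter, its generic MAX_COUNT, and the 40 ns clock period are assumptions chosen for illustration. The test bench checks itself with an assert statement rather than relying on waveform inspection.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Device under test: a small wrap-around counter (illustrative only).
entity mod_counter is
  generic (MAX_COUNT : natural := 4);
  port (
    clk   : in  std_logic;
    rstn  : in  std_logic;                     -- active-low asynchronous reset
    count : out natural range 0 to MAX_COUNT
  );
end entity mod_counter;

architecture rtl of mod_counter is
  signal count_s : natural range 0 to MAX_COUNT := 0;
begin
  count <= count_s;

  process (clk, rstn)
  begin
    if rstn = '0' then
      count_s <= 0;
    elsif rising_edge(clk) then
      if count_s = MAX_COUNT then
        count_s <= 0;
      else
        count_s <= count_s + 1;
      end if;
    end if;
  end process;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

entity mod_counter_tb is
end entity mod_counter_tb;

architecture sim of mod_counter_tb is
  constant CLK_PERIOD : time := 40 ns;  -- assumed 25 MHz clock
  signal clk   : std_logic := '0';
  signal rstn  : std_logic := '0';
  signal count : natural range 0 to 4;
  signal done  : boolean := false;
begin
  dut : entity work.mod_counter
    generic map (MAX_COUNT => 4)
    port map (clk => clk, rstn => rstn, count => count);

  -- Free-running clock that stops once the stimulus process has finished.
  clk <= not clk after CLK_PERIOD / 2 when not done else '0';

  stimulus : process
  begin
    wait for 2 * CLK_PERIOD;  -- hold reset for two clock periods
    rstn <= '1';
    -- After reset is released, expect 1, 2, 3, 4 and then the wrap to 0.
    for expected in 1 to 5 loop
      wait until rising_edge(clk);
      wait for 1 ns;  -- let the new counter value settle
      assert count = expected mod 5
        report "unexpected count value" severity error;
    end loop;
    done <= true;
    wait;
  end process;
end architecture sim;
```

The same test bench can in principle be reused for the gate-level and back-annotated simulations described above, since it only touches the ports of the design.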

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
Antifuse - One-time programmable. CMOS.
EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying hardware designs together with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. The alternative is for designers to rely only on software simulations to verify that their hardware design is sound, and simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be caught at this stage.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

66

GATE LEVEL

FigVGA CONTROLLER

67

GATE LEVEL

Fig SYNC UNIT

68

TECHNOLOGY SCHEMATIC

69

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

141 CONCLUTION

Hence by implementing FPGA taking a processor-generated picture

(pixilated) from memory space the Controller provides digital RGB values for

each pixel as well as horizontal and vertical synchronization signals in order

to correctly display the picture on a connected monitor

70

REFERENCES

1 httpenwikipediaorgwikiVideo_Graphics_Array

2 A VGA display controller by Eduardo Sanchez

3 HinnerrdquoVGA Interface and video signal documentsrdquo

71

  • Pixel
    • 6 SIGNALS IN VGA CONTROLLER
    • Gates
    • Market size
    • 122 FPGA Comparisons
Page 46: vga

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;   -- active-low asynchronous reset
    hsync      : out std_logic;   -- horizontal sync pulse (row)
    vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;   -- v_en delayed by one clock
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is
    signal h_en_s, v_en_s       : std_logic;
    signal line_count_s, c_hs_s : integer;
begin
    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- hsync counter: counts clocks 0..793 within one line
    process(clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        else
            if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                    c_hs_s <= 0;
                else
                    c_hs_s <= c_hs_s + 1;
                end if;
            end if;
        end if;
    end process;

    -- vsync counter
    -- vertical line counter: advances on each rising edge of h_en_s, 0..515
    process(h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            if (line_count_s = 515) then
                line_count_s <= 0;
            end if;
        end if;
    end process;

    -- hsync pulse: low while the horizontal count is 0..95
    process(clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
    end process;

    -- Horizontal Enable (row information will be sent in this period)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
    end process;

    -- vsync pulse
    -- Vertical Enable (data is valid in this region)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35 and line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
    end process;

    -- Vertical Sync, Front Porch, Back Porch: vsync is low for lines 0 and 1
    process(clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
    end process;

    -- registered copy of v_en, used to detect the start of the visible frame
    process(clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;
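The counter limits in the listing above appear as bare literals. As a reading aid, the sketch below collects them as named constants in a small VHDL package. The package name vga_timing_pkg and all constant names are hypothetical and do not appear in the original design; only the numeric values are taken from the listing.

```vhdl
-- Hypothetical package: names are illustrative, values are copied from sync_unit.
package vga_timing_pkg is
    -- Horizontal timing, in clock counts of c_hs_s
    constant H_LAST       : integer := 793;  -- counter wraps after 793, so 794 clocks per line
    constant H_SYNC_LAST  : integer := 95;   -- hsync is driven low for counts 0..95
    constant H_EN_FIRST   : integer := 143;  -- first count with h_en asserted
    constant H_EN_LAST    : integer := 778;  -- last count with h_en asserted

    -- Vertical timing, in lines of line_count_s
    constant V_LAST       : integer := 515;  -- counter wraps after 515, so 516 lines per frame
    constant V_SYNC_LINES : integer := 2;    -- vsync is driven low while line_count_s < 2
    constant V_EN_FIRST   : integer := 35;   -- first line with v_en asserted
    constant V_EN_LAST    : integer := 515;  -- last line with v_en asserted
end package vga_timing_pkg;
```

With such a package compiled into the work library, a comparison such as c_hs_s = 793 could be written as c_hs_s = H_LAST after adding "use work.vga_timing_pkg.all;" ahead of the entity.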

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin
    addr <= addr_s;

    process(clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- rising edge of v_en: restart the memory address for a new frame
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- picture area: read one pixel and drive it on all three colours
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        -- right of the picture area: green
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    -- below the picture area: white
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- wrap after address 245759
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                -- blanking: all colours off
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
    end process;

end behv;
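The two listings above are separate entities, and the report does not show how they are joined. The following is a minimal structural sketch of one way to wire them together, assuming both entities are compiled into the work library. The entity name vga_top, the architecture name and the internal signal names are hypothetical; the port names are those of sync_unit and pixelgen.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical top level: connects sync_unit outputs to pixelgen inputs.
entity vga_top is
    port (
        clk     : in  std_logic;
        rstn    : in  std_logic;
        data_in : in  std_logic_vector(0 downto 0);   -- pixel read from memory
        hsync   : out std_logic;
        vsync   : out std_logic;
        red     : out std_logic_vector(0 downto 0);
        green   : out std_logic_vector(0 downto 0);
        blue    : out std_logic_vector(0 downto 0);
        addr    : out std_logic_vector(17 downto 0);  -- memory read address
        rd      : out std_logic
    );
end vga_top;

architecture structural of vga_top is
    -- Internal copies are needed because "out" ports cannot be read back in VHDL-93
    signal hsync_i, vsync_i          : std_logic;
    signal h_en_i, v_en_i, v_en_reg_i : std_logic;
    signal line_count_i, c_hs_i      : integer;
begin
    hsync <= hsync_i;
    vsync <= vsync_i;

    u_sync : entity work.sync_unit
        port map (
            clk => clk, rstn => rstn,
            hsync => hsync_i, vsync => vsync_i,
            v_en_reg => v_en_reg_i, h_en => h_en_i, v_en => v_en_i,
            line_count => line_count_i, c_hs => c_hs_i
        );

    u_pixel : entity work.pixelgen
        port map (
            clk => clk, rstn => rstn,
            hsync => hsync_i, vsync => vsync_i,
            h_en => h_en_i, v_en => v_en_i, v_en_reg => v_en_reg_i,
            data_in => data_in,
            line_count => line_count_i, c_hs => c_hs_i,
            red => red, green => green, blue => blue,
            addr => addr, rd => rd
        );
end structural;
```

In this arrangement the external memory is expected to present data_in for the address on addr whenever rd is high; the memory itself is outside the sketch.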

12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field-programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, which the company credits for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attack of cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software, and by the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA considered here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
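The logic block just described can be modelled behaviourally in a few lines of VHDL. The sketch below is an illustration only, not vendor primitive code: the entity name logic_block, the generics INIT and REGISTERED, and the port names are all hypothetical. INIT holds the 16-entry truth table of the 4-input LUT, and REGISTERED selects between the flip-flop output and the raw LUT output.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical behavioural model of a classic logic block: 4-input LUT plus flip-flop.
entity logic_block is
    generic (
        -- Truth table: bit i is the output for input value i.
        -- x"8000" sets only bit 15, i.e. a 4-input AND; x"6996" would give a 4-input XOR.
        INIT       : std_logic_vector(15 downto 0) := x"8000";
        REGISTERED : boolean := false  -- true selects the flip-flop output
    );
    port (
        clk    : in  std_logic;
        lut_in : in  std_logic_vector(3 downto 0);
        q      : out std_logic
    );
end logic_block;

architecture rtl of logic_block is
    signal lut_out : std_logic;
    signal lut_reg : std_logic := '0';
begin
    -- The LUT is a 16-to-1 selection from the configured truth table
    lut_out <= INIT(to_integer(unsigned(lut_in)));

    -- Flip-flop capturing the LUT output on each rising clock edge
    process (clk)
    begin
        if rising_edge(clk) then
            lut_reg <= lut_out;
        end if;
    end process;

    -- Single output: either the registered or the unregistered LUT value
    q <= lut_reg when REGISTERED else lut_out;
end rtl;
```

In a real device the truth table and the output selection are set by configuration bits loaded from the bitstream rather than by generics; generics are used here only to keep the model self-contained.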

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.
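To illustrate what specifying a large structure numerically can look like, the minimal VHDL sketch below builds a chain of flip-flops whose length is set by one number. The entity name shift_chain, the generic DEPTH and its default of 64 are assumptions chosen for illustration and are not part of the VGA design in this report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: a chain of DEPTH flip-flops described by a single generic.
entity shift_chain is
    generic (DEPTH : positive := 64);
    port (
        clk : in  std_logic;
        d   : in  std_logic;
        q   : out std_logic
    );
end shift_chain;

architecture rtl of shift_chain is
    -- stage(0) is the input, stage(i) is the output of the i-th flip-flop
    signal stage : std_logic_vector(0 to DEPTH);
begin
    stage(0) <= d;

    -- One flip-flop per iteration; changing DEPTH changes the hardware generated
    gen_stage : for i in 1 to DEPTH generate
        process (clk)
        begin
            if rising_edge(clk) then
                stage(i) <= stage(i - 1);
            end if;
        end process;
    end generate;

    q <= stage(DEPTH);
end rtl;
```

A schematic of the same circuit would need 64 flip-flop symbols drawn and wired by hand, whereas here only the value of DEPTH changes.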

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate level description where simulation is repeated to confirm the synthesis proceeded without errors. Finally the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
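As an example of the first, RTL-level simulation step, the sketch below is a small self-checking test bench for the sync_unit entity listed in this report. It assumes sync_unit has been compiled into the work library with its punctuation intact. The test bench name sync_unit_tb, the internal signal names and the 40 ns clock period (a 25 MHz clock) are assumptions; the report does not state the clock frequency.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical test bench: checks that hsync repeats every 794 clocks.
entity sync_unit_tb is
end sync_unit_tb;

architecture sim of sync_unit_tb is
    constant CLK_PERIOD : time := 40 ns;  -- assumed 25 MHz clock
    signal clk  : std_logic := '0';
    signal rstn : std_logic := '0';
    signal hsync, vsync, v_en_reg, h_en, v_en : std_logic;
    signal line_count, c_hs : integer;
    signal done : boolean := false;
begin
    dut : entity work.sync_unit
        port map (
            clk => clk, rstn => rstn,
            hsync => hsync, vsync => vsync,
            v_en_reg => v_en_reg, h_en => h_en, v_en => v_en,
            line_count => line_count, c_hs => c_hs
        );

    -- Free-running clock that stops once the check has finished
    clk_gen : process
    begin
        while not done loop
            clk <= '0';
            wait for CLK_PERIOD / 2;
            clk <= '1';
            wait for CLK_PERIOD / 2;
        end loop;
        wait;
    end process;

    -- Release reset, then measure the time between two falling edges of hsync
    check : process
        variable t_start : time;
    begin
        wait for 5 * CLK_PERIOD;
        rstn <= '1';
        wait until falling_edge(hsync);
        t_start := now;
        wait until falling_edge(hsync);
        assert (now - t_start) = 794 * CLK_PERIOD
            report "unexpected hsync period" severity failure;
        report "hsync period is 794 clocks" severity note;
        done <= true;
        wait;
    end process;
end sim;
```

The expected period of 794 clocks follows from the horizontal counter in sync_unit, which counts from 0 to 793 before wrapping. The same test bench can be rerun against the post-synthesis netlist to repeat the check at gate level.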

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs alongside early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method of doing this.

Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. The alternative is for designers to rely only on software simulations to verify that their hardware design is sound, and simulation speed and modeling accuracy limitations hinder that approach.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This helps expose many unforeseen software bugs that appear due to today's array of operating systems, applications, and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13. SIMULATION RESULTS

GATE LEVEL
Fig: VGA CONTROLLER

GATE LEVEL
Fig: SYNC UNIT

TECHNOLOGY SCHEMATIC

14. APPLICATIONS

Motion Tracking Systems
Image processing systems
Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated picture (pixelated) is taken from memory space, and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.


  • Pixel
    • 6 SIGNALS IN VGA CONTROLLER
    • Gates
    • Market size
    • 122 FPGA Comparisons
Page 47: vga

c_hs_s lt= 0

else

if(clk = 1 and clkevent)then

if(c_hs_s=793)then

c_hs_s lt= 0

else

c_hs_s lt= c_hs_s +1

end if

end if

end if

end process

--vsync counter

--vertical line counter

process(h_en_srstn)

begin

if(rstn = 0)then

line_count_s lt= 0

elsif(h_en_s = 1 and h_en_sevent)then

line_count_s lt= line_count_s + 1

if (line_count_s=515)then

line_count_s lt= 0

end if

end if

--end if

end process

--hysnc pulse

process(clkrstn)

begin

if(rstn = 0)then

47

hsync lt= 1

elsif(clk = 1 and clkevent)then

if(c_hs_s lt=95)then

hsync lt= 0

else

hsync lt= 1

end if

end if

--end if

end process

process(clkrstn) ----Horizontal Enable ( Row Information Will be Sent in

this period)

begin

if(rstn = 0)then

h_en_s lt= 0

elsif(clk = 1 and clkevent)then

if((c_hs_s gt=143) and (c_hs_slt=778))then

h_en_s lt= 1

else

h_en_s lt= 0

end if

end if

--end if

end process

---vysnc pulse

process(clkrstn) -- Vertical Enable (Data is Valid in this Region)

begin

if(rstn = 0)then

48

v_en_s lt= 0

elsif(clk = 1 and clkevent)then

if((line_count_sgt=35 and line_count_slt=515))then

v_en_s lt= 1

else

v_en_s lt= 0

end if

end if

--end if

end process

process(clkrstn) --Vertical Sync Front porch Back Porch

begin

if(rstn = 0)then

vsync lt= 1

elsif(clk = 1 and clkevent)then

if(line_count_s lt2)then

vsync lt= 0

else

vsync lt= 1

end if

end if

--end if

end process

process(clk)

begin

if(clk = 1 and clkevent)then

v_en_reglt=v_en_s

end if

end process

end arch_sync_unit

49

PIXEL GENERATOR

library IEEE

use IEEESTD_LOGIC_1164ALL

use IEEESTD_LOGIC_ARITHALL

use IEEESTD_LOGIC_UNSIGNEDALL

entity pixelgen is port(

clkin std_logic

rstnin std_logic

hsyncin std_logic

vsyncin std_logic

h_enin std_logic

v_enin std_logic

v_en_reg in std_logic

data_inin std_logic_vector(0 downto 0)

line_countin integer

c_hsin integer

red out std_logic_vector(0 downto 0)

greenout std_logic_vector(0 downto 0)

blueout std_logic_vector(0 downto 0)

addrout std_logic_vector(17 downto 0)

rdout std_logic

)

end pixelgen

architecture behv of pixelgen is

signal addr_sstd_logic_vector(17 downto 0)

begin

addr lt= addr_s

process(clkrstn)

begin

50

if(rstn = 0)then

rd lt= 0

red lt= 0

green lt= 0

blue lt= 0

addr_s lt= x0000 amp 00

elsif (clk = 1 and clkevent)then

if((v_en=1) and (v_en_reg=0))then

addr_slt=(others =gt0)

end if

if(v_en=1 and h_en=1)then

if(line_countlt=514)then

if(c_hs lt= 655)then

rd lt= 1

red lt= data_in

green lt= data_in

blue lt= data_in

addr_s lt=addr_s +000000000000000001

else

rd lt= 0

red lt= 0

green lt= 1

blue lt= 0

end if

else

rd lt= 0

red lt= 1

green lt= 1

blue lt= 1

51

end if

if(addr_s=111011111111111111)then

addr_slt= (others =gt0)

end if

else

red lt= 0

green lt= 0

blue lt= 0

end if

end if

-- end if

--end if

end process

end behv

52

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can

be configured by the customer or designer after manufacturingmdashhence the

name field-programmable FPGAs are programmed using a logic circuit

diagram or a source code in a hardware description language (HDL) to specify

how the chip will work They can be used to implement any logical function

that an application-specific integrated circuit (ASIC) could perform but the

ability to update the functionality after shipping offers advantages for many

applications

FPGAs contain programmable logic components called logic blocks and a

hierarchy of reconfigurable interconnects that allow the blocks to be wired

togethermdashsomewhat like a one-chip programmable breadboard Logic blocks

can be configured to perform complex combinational functions or merely

simple logic gates like AND and XOR In most FPGAs the logic blocks also

include memory elements which may be simple flip-flops or more complete

blocks of memory

History The FPGA industry sprouted from programmable read only

memory (PROM) and programmable logic devices (PLDs) PROMs and PLDs

both had the option of being programmed in batches in a factory or in the field

(field programmable) however programmable logic was hard-wired between

logic gates

Xilinx Co-Founders Ross Freeman and Bernard Vonderschmitt invented the

first commercially viable field programmable gate array in 1985 ndash the

XC2064 The XC2064 had programmable gates and programmable

interconnects between gates the beginnings of a new technology and market

The XC2064 boasted a mere 64 configurable logic blocks (CLBs) with two 3-

input lookup tables (LUTs) More than 20 years later Freeman was entered

into the National Inventors Hall of Fame for his invention

53

Some of the industryrsquos foundational concepts and technologies for

programmable logic arrays gates and logic blocks are founded in patents

awarded to David W Page and LuVerne R Peterson in 1985

In the late 1980s the Naval Surface Warfare Department funded an experiment

proposed by Steve Casselman to develop a computer that would implement

600000 reprogrammable gates Casselman was successful and the system was

awarded a patent in 1992

Xilinx continued unchallenged and quickly growing from 1985 to the mid-

1990s when competitors sprouted up eroding significant market-share By

1993 Actel was serving about 18 percent of the market

The 1990s were an explosive period of time for FPGAs both in sophistication

and the volume of production In the early 1990s FPGAs were primarily used

in telecommunications and networking By the end of the decade FPGAs

found their way into consumer automotive and industrial applications

FPGAs got a glimpse of fame in 1997 when Adrian Thompson merged genetic

algorithm technology and FPGAs to create a sound recognition device

Thomsonrsquos algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip

to decide the configuration needed to accomplish a sound recognition task

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a
step further by combining the logic blocks and interconnects of traditional
FPGAs with embedded microprocessors and related peripherals to form a
complete "system on a programmable chip". This work mirrors the architecture
by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which
combined a reconfigurable CPU architecture on a single chip called the SB24.
That work was done in 1982. Examples of such hybrid technologies can be found
in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more
PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC
is another such device, which uses an AVR processor in combination with
Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic.

As previously mentioned, many modern FPGAs can be reprogrammed at run time,
and this is leading to the idea of reconfigurable computing or reconfigurable
systems: CPUs that reconfigure themselves to suit the task at hand. The
Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable
soft processor implemented on FPGAs. However, it does not support dynamic
reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000 gates, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages of FPGAs include a shorter time to market, the ability to
re-program in the field to fix bugs, and lower non-recurring engineering
costs. Vendors can also take a middle road by developing their hardware on
ordinary FPGAs but manufacturing their final version so that it can no longer
be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have historically been used for, and
the company credits them for the growing number of FPGA design starts (see
History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has
a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
to implement within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers) and
embedded memories, as well as logic blocks that can implement decoders or
mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications that had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4
to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware cost per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.
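The low-volume argument above can be made concrete with a simple break-even
estimate. This is only a sketch: the symbols and the dollar figures are
illustrative assumptions, not values from any vendor. Let N_A and N_F be the
one-time (non-recurring) costs of the ASIC and FPGA routes, and u_A and u_F
the per-unit costs, with N_A > N_F and u_F > u_A.

```latex
\begin{aligned}
C_{\mathrm{ASIC}}(n) &= N_A + n\,u_A, \qquad C_{\mathrm{FPGA}}(n) = N_F + n\,u_F \\
C_{\mathrm{FPGA}}(n) < C_{\mathrm{ASIC}}(n) \;&\Longleftrightarrow\; n < n^{*} = \frac{N_A - N_F}{u_F - u_A} \\
\text{e.g. } N_A = \$1{,}000{,}000,\; N_F = \$50{,}000,\; u_F = \$40,\; u_A = \$10
  \;&\Longrightarrow\; n^{*} = \frac{950{,}000}{30} \approx 31{,}667 \text{ units}
\end{aligned}
```

Under these hypothetical numbers the FPGA is the cheaper choice below roughly
31,667 units. Lower FPGA unit prices or higher ASIC development costs push
the break-even volume upward, which is the sense in which the range of viable
applications broadens.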

Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.
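For reference, Rent's rule is an empirical power law relating the number of
external connections of a region of logic to the number of blocks inside it.
The numeric values below are illustrative assumptions chosen only to show the
arithmetic.

```latex
\begin{aligned}
T &= t \cdot g^{\,p} \\
&T:\ \text{external terminals of the region}, \quad
 g:\ \text{number of logic blocks in the region}, \\
&t:\ \text{average terminals per block}, \quad
 p:\ \text{Rent exponent } (0 < p < 1) \\
\text{e.g. } t = 4,\; g = 1000,\; p = 0.6
 \;&\Longrightarrow\; T = 4 \cdot 1000^{0.6} \approx 4 \cdot 63.1 \approx 252
\end{aligned}
```

A design with a larger exponent p needs more wires crossing any boundary drawn
through it, which is why two designs with the same amount of logic can place
very different demands on the routing channels.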

The example FPGA considered here is an array, or island-style, FPGA. It
consists of an array of logic blocks and routing channels. Two I/O pads fit
into the height of one row or the width of one column, as shown below. All
the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

Typical logic block

There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
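The logic block described above can be sketched behaviourally in VHDL. This is
a minimal illustration, not a vendor primitive: the entity name `logic_block`
and the generics `INIT` and `REGISTERED` are hypothetical names introduced
here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 4-input LUT plus flip-flop logic block (illustrative names).
entity logic_block is
  generic (
    -- Truth table of the LUT; x"8000" sets only entry 15, i.e. a 4-input AND.
    INIT       : std_logic_vector(15 downto 0) := x"8000";
    -- Selects whether the block drives out the registered LUT output.
    REGISTERED : boolean := false
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    y   : out std_logic                      -- the single block output
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16-entry table indexed by the input vector.
  lut_out <= INIT(to_integer(unsigned(a)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output multiplexer: registered or unregistered LUT output.
  y <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

Changing `INIT` changes the logic function without changing the structure,
which is the sense in which a LUT is "programmed".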

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number one connects only to wires in
track number one in adjacent channel segments, wires in track number 2
connect only to other wires in track number 2, and so on. The figure below
illustrates the connections in a switch box.

Switch box topology
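As a quick count for the planar topology described above, assume one wire
end per track arrives at the switch box from each of its four sides. Within a
track, each end can reach the other three, and every switch is shared by the
two ends it joins.

```latex
\begin{aligned}
\text{switches per track} &= \binom{4}{2} = \frac{4 \cdot 3}{2} = 6 \\
\text{switches per switch box} &= 6W \\
\text{e.g. } W = 10 \;&\Longrightarrow\; 60 \text{ programmable switches}
\end{aligned}
```

The count grows only linearly with the channel width W because wires in
different tracks never connect inside the box. The channel width of 10 is an
arbitrary example value.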

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
include multipliers, generic DSP blocks, embedded processors, high-speed I/O
logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it's possible to just specify
them numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device like
an EEPROM.

The most common HDLs are VHDL and Verilog, although, in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally, the design is laid out in the FPGA, at
which point propagation delays can be added and the simulation run again with
these values back-annotated onto the netlist.
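The first simulation stage mentioned above, an RTL test bench, might look like
the following sketch. The design under test (`and_reg`, a registered AND gate)
and the test bench `and_reg_tb` are hypothetical examples made up for this
illustration; they are not part of the VGA controller.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design under test: a registered AND gate.
entity and_reg is
  port (clk, a, b : in std_logic; q : out std_logic);
end entity and_reg;

architecture rtl of and_reg is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      q <= a and b;
    end if;
  end process;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Self-checking test bench: drives stimulus and asserts on the response.
entity and_reg_tb is
end entity and_reg_tb;

architecture sim of and_reg_tb is
  signal clk  : std_logic := '0';
  signal a, b : std_logic := '0';
  signal q    : std_logic;
  signal done : boolean := false;
begin
  dut : entity work.and_reg port map (clk => clk, a => a, b => b, q => q);

  -- 10 ns clock that stops once the stimulus process has finished.
  clk <= not clk after 5 ns when not done else '0';

  stimulus : process
  begin
    a <= '1'; b <= '0';
    wait until rising_edge(clk);
    wait for 1 ns;  -- let the registered output settle
    assert q = '0' report "expected q = 0 for a = 1, b = 0" severity error;

    a <= '1'; b <= '1';
    wait until rising_edge(clk);
    wait for 1 ns;
    assert q = '1' report "expected q = 1 for a = 1, b = 1" severity error;

    done <= true;
    wait;  -- suspend forever; the simulation ends when no events remain
  end process;
end architecture sim;
```

The same test bench can in principle be reused for the gate-level and
back-annotated simulations by swapping the architecture bound to the `dut`
instance, which is one reason to keep the checks inside the bench.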

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.
Antifuse - One-time programmable. CMOS.
EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices can
be erased with ultraviolet (UV) light. CMOS.
EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased, even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.
Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.
Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has
become mainstream, and prototyping SoC and ASIC designs on FPGAs has become a
good way to do this.

Reasons why prototyping is important:

1. Running an SoC design on an FPGA prototype is a reliable way to ensure
that it is functionally correct. The alternative is for designers to rely
only on software simulations to verify that their hardware design is sound,
where simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows
much more time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear
due to today's array of operating systems, applications, and hardware, to be
found.

3. Prototyping also allows the developer to ensure that all IP technologies
on the system work well together, off the simulation stage and in actual
form.

4. Prototypes have the added advantage of serving as demo platforms for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL
Fig: VGA CONTROLLER

GATE LEVEL
Fig: SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion tracking systems
Image processing systems
Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in
order to correctly display the picture on a connected monitor.



    -- hsync process, continued: reset value, then an active-low pulse
    -- while the horizontal counter is in the range 0 to 95
    hsync <= '1';
  elsif (clk = '1' and clk'event) then
    if (c_hs_s <= 95) then
      hsync <= '0';
    else
      hsync <= '1';
    end if;
  end if;
end process;

-- Horizontal enable (row information is sent in this period)
process(clk, rstn)
begin
  if (rstn = '0') then
    h_en_s <= '0';
  elsif (clk = '1' and clk'event) then
    if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
      h_en_s <= '1';
    else
      h_en_s <= '0';
    end if;
  end if;
end process;

-- Vertical enable (data is valid in this region)
process(clk, rstn)
begin
  if (rstn = '0') then
    v_en_s <= '0';
  elsif (clk = '1' and clk'event) then
    if ((line_count_s >= 35) and (line_count_s <= 515)) then
      v_en_s <= '1';
    else
      v_en_s <= '0';
    end if;
  end if;
end process;

-- Vertical sync pulse (front porch and back porch lie outside it)
process(clk, rstn)
begin
  if (rstn = '0') then
    vsync <= '1';
  elsif (clk = '1' and clk'event) then
    if (line_count_s < 2) then
      vsync <= '0';
    else
      vsync <= '1';
    end if;
  end if;
end process;

-- One-clock delayed copy of the vertical enable
process(clk)
begin
  if (clk = '1' and clk'event) then
    v_en_reg <= v_en_s;
  end if;
end process;

end arch_sync_unit;
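The comparison constants in the processes above fix how long each signal is
asserted. As a worked check, assume that `c_hs_s` counts pixel clocks and
`line_count_s` counts lines, both starting from 0 (their counting logic is not
shown in this fragment, so this is an assumption). An inclusive range lo..hi
then covers hi - lo + 1 counts:

```latex
\begin{aligned}
N_{\text{hsync low}} &= 95 - 0 + 1 = 96 \ \text{clocks} \\
N_{\text{h\_en high}} &= 778 - 143 + 1 = 636 \ \text{clocks} \\
N_{\text{v\_en high}} &= 515 - 35 + 1 = 481 \ \text{lines} \\
N_{\text{vsync low}} &= 1 - 0 + 1 = 2 \ \text{lines}
\end{aligned}
```

The 96-clock horizontal pulse and the 2-line vertical pulse agree with the
commonly published 640 x 480, 60 Hz VGA timing. Because every output is
registered, each signal changes one clock after its counter comparison
becomes true, which shifts the windows by one clock without altering their
widths.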

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en (start of a frame): restart the read address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: read the next pixel and show it on
            -- all three colour outputs
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- To the right of the picture: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the picture: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last location (overrides the
        -- increment above, since the last assignment wins)
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: drive black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;
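The wrap-around constant in the listing above determines how many memory
locations the address counter cycles through. A short worked conversion:

```latex
\begin{aligned}
111011111111111111_2 &= 2^{18} - 2^{14} - 1 = 262144 - 16384 - 1 = 245759 \\
\text{addresses } 0 \ldots 245759 &\;\Rightarrow\; 245760 \text{ locations} = 512 \times 480
\end{aligned}
```

So the counter spans 245,760 one-bit locations, which would match a picture
of 512 x 480 one-bit pixels. The listing does not state the picture size, so
that reading is an inference from the constant rather than a documented
design parameter.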

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be
configured by the customer or designer after manufacturing, hence the name
"field-programmable". FPGAs are programmed using a logic circuit diagram or
source code in a hardware description language (HDL) to specify how the chip
will work. They can be used to implement any logical function that an
application-specific integrated circuit (ASIC) could perform, but the ability
to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks can
be configured to perform complex combinational functions, or merely simple
logic gates like AND and XOR. In most FPGAs, the logic blocks also include
memory elements, which may be simple flip-flops or more complete blocks of
memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first
commercially viable field-programmable gate array in 1985: the XC2064. The
XC2064 had programmable gates and programmable interconnects between gates,
the beginnings of a new technology and market. The XC2064 boasted a mere 64
configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More
than 20 years later, Freeman was inducted into the National Inventors Hall of
Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.


Going from schematicHDL source files to actual configuration The source

files are fed to a software suite from the FPGACPLD vendor that through

different steps will produce a file This file is then transferred to the

FPGACPLD via a serial interface (JTAG) or to an external memory device

like an EEPROM

The most common HDLs are VHDL and Verilog although in an attempt to

reduce the complexity of designing in HDLs which have been compared to the

equivalent of assembly languages there are moves to raise the abstraction level

through the introduction of alternative languages

62

To simplify the design of complex systems in FPGAs there exist libraries of

predefined complex functions and circuits that have been tested and optimized

to speed up the design process These predefined circuits are commonly called

IP cores and are available from FPGA vendors and third-party IP suppliers

(rarely free and typically released under proprietary licenses) Other

predefined circuits are available from developer communities such as Open

Cores (typically free and released under the GPL BSD or similar license) and

other sources

In a typical design flow an FPGA application developer will simulate the

design at multiple stages throughout the design process Initially the RTL

description in VHDL or Verilog is simulated by creating test benches to

simulate the system and observe results Then after the synthesis engine has

mapped the design to a netlist the netlist is translated to a gate level

description where simulation is repeated to confirm the synthesis proceeded

without errors Finally the design is laid out in the FPGA at which point

propagation delays can be added and the simulation run again with these

values back-annotated onto the netlist

Basic Process Technology Types

SRAM - based on static memory technology In-system programmable and re-

programmable Requires external boot devices CMOS

Antifuse - One-time programmable CMOS

EPROM - Erasable Programmable Read-Only Memory technology Usually

one-time programmable in production because of plastic packaging

Windowed devices can be erased with ultraviolet (UV) light CMOS

EEPROM - Electrically Erasable Programmable Read-Only Memory

technology Can be erased even in plastic packages Some but not all

EEPROM devices can be in-system programmed CMOS

Flash - Flash-erase EPROM technology Can be erased even in plastic

packages Some but not all flash devices can be in-system programmed

63

Usually a flash cell is smaller than an equivalent EEPROM cell and is

therefore less expensive to manufacture CMOS

Fuse - One-time programmable Bipolar

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry

rivals Together they control over 80 percent of the market with Xilinx alone

representing over 50 percent

Xilinx also provides free Windows and Linux design software while Altera

provides free Windows tools the Solaris and Linux tools are only available via

a rental scheme

Other competitors include Lattice Semiconductor (flash SRAM) Actel

(antifuse flash-based mixed-signal) SiliconBlue Technologies (low power)

Achronix (RAM based 15GHz fabric speed) and QuickLogic (handheld

focused CSSP no general purpose FPGAs)

125 FPGA prototype

FPGA prototyping sometimes also referred to as ASIC prototyping or SoC

prototyping is the method to prototype SoC and ASIC design on FPGA for

hardware verification and early software development

Main stream verification methods for hardware design and early software and

firmware co-design has become mainstream Prototyping SoC and ASIC

design on FPGA has become a good method to do this

64

Reasons why Prototyping is important

1 Running a SoC design on FPGA prototype is a reliable way to ensure that it is

functionally correct This is compared to designers only relying on software

simulations to verify that their hardware design is sound Simulation speed and

modeling accuracy limitations hinder this development

2 Due to time constrains many projects cannot wait until the silicon is back from

the foundry to start on software tests FPGA prototyping allows for much more

time in area of software development and testing at the software-hardware

integration stage This allows many unforeseen software bugs that appear due

to todays array of operating systems applications and hardware

3 Prototyping also allows the developer to ensure that all IP technologies on his

system work well together off the simulation stage and in actual form

4 Prototyping has the added advantage as demo platforms to SoC clients

bringing in interest early This speeds up the overall development cycle and

allows for more enhancement or improvement to the chip features as it would

otherwise have been

65

13 SIMULATION RESULTS

66

GATE LEVEL

FigVGA CONTROLLER

67

GATE LEVEL

Fig SYNC UNIT

68

TECHNOLOGY SCHEMATIC

69

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

141 CONCLUTION

Hence by implementing FPGA taking a processor-generated picture

(pixilated) from memory space the Controller provides digital RGB values for

each pixel as well as horizontal and vertical synchronization signals in order

to correctly display the picture on a connected monitor

70

REFERENCES

1 httpenwikipediaorgwikiVideo_Graphics_Array

2 A VGA display controller by Eduardo Sanchez

3 HinnerrdquoVGA Interface and video signal documentsrdquo

71

  • Pixel
    • 6 SIGNALS IN VGA CONTROLLER
    • Gates
    • Market size
    • 122 FPGA Comparisons
Page 49: vga

            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35 and line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
        --end if;
    end process;

    process(clk, rstn) -- Vertical Sync, Front porch, Back Porch
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
        --end if;
    end process;

    process(clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin
    addr <= addr_s;

    process(clk, rstn)
    begin
        -- Active-low asynchronous reset: outputs off, read address cleared
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- v_en high while its registered copy is still low:
            -- restart the read address
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- Pass the 1-bit memory pixel to all three colours
                        -- and advance the read address
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- Wrap the address after the last location
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                -- Outside the enabled region: drive black
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
        -- end if;
        --end if;
    end process;
end behv;

12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions, or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field
(field-programmable); however, programmable logic was hard-wired between
logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share.
By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in
volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternative approach to using hard-macro processors is to make use of
"soft" processor cores that are implemented within the FPGA logic.

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at run time, but instead adapts
itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers have largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have been historically used
for, to which the company attributes the growing number of FPGA design
starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result of this is less flexibility, with the
advantage of more predictable timing delays and a higher
logic-to-interconnect ratio. FPGA architectures, on the other hand, are
dominated by interconnect. This makes them far more flexible (in terms of the
range of designs that are practical for implementation within them) but also
far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4
to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have
broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
routing channels have the same width (number of wires). Multiple I/O pads may
fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.

The example FPGA is an array, or island-style, FPGA. It consists of an array
of logic blocks and routing channels. Two I/O pads fit into the height of one
row or the width of one column, as shown below. All the routing channels have
the same width (number of wires).

Fig. 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

Typical logic block

There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
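The logic block described above can be sketched behaviourally in VHDL. This is only an illustrative model, not a vendor primitive: the entity name logic_block, the generics LUT_INIT and REGISTERED, and the port names are assumptions made for this example. The 16-bit LUT_INIT generic plays the role of the LUT's configuration memory, and REGISTERED selects between the flip-flop output and the raw LUT output.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: one 4-input LUT
-- followed by a flip-flop, with a single selectable output.
entity logic_block is
    generic (
        -- LUT contents: bit i is the output for input value i.
        -- x"8000" sets only bit 15, i.e. a 4-input AND.
        LUT_INIT   : std_logic_vector(15 downto 0) := x"8000";
        -- true selects the registered output, false the raw LUT output
        REGISTERED : boolean := true
    );
    port (
        clk : in  std_logic;
        a   : in  std_logic_vector(3 downto 0);
        y   : out std_logic
    );
end logic_block;

architecture rtl of logic_block is
    signal lut_out : std_logic;
    signal ff_out  : std_logic := '0';
begin
    -- The LUT is a 16-entry table addressed by the four inputs
    lut_out <= LUT_INIT(to_integer(unsigned(a)));

    -- The flip-flop captures the LUT output on each rising clock edge
    process (clk)
    begin
        if rising_edge(clk) then
            ff_out <= lut_out;
        end if;
    end process;

    -- Only one output leaves the block: registered or unregistered
    y <= ff_out when REGISTERED else lut_out;
end rtl;
```

Changing LUT_INIT changes the implemented function without touching the structure; for instance x"6996" (1 wherever the input has an odd number of 1s) would turn the same block into a 4-input XOR.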

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number one connects only to wires in track
number one in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

Switch box topology
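The planar rule described above can also be written as a small predicate. The following simulation-only sketch is an assumption-laden illustration: the package switch_box_pkg, the type side_t, the function has_switch, and the check entity are all hypothetical names introduced here, and the model ignores the electrical behaviour of the switches. It only records which pairs of wire ends the topology allows to be joined.

```vhdl
package switch_box_pkg is
    -- The four sides on which channel segments meet a switch box
    type side_t is (north, east, south, west);

    -- True when the planar topology provides a programmable switch
    -- between the wire at (side_a, track_a) and the one at (side_b, track_b)
    function has_switch (side_a : side_t; track_a : natural;
                         side_b : side_t; track_b : natural)
        return boolean;
end package switch_box_pkg;

package body switch_box_pkg is
    function has_switch (side_a : side_t; track_a : natural;
                         side_b : side_t; track_b : natural)
        return boolean is
    begin
        -- Same track number, different side of the box
        return (side_a /= side_b) and (track_a = track_b);
    end function has_switch;
end package body switch_box_pkg;

use work.switch_box_pkg.all;

entity switch_box_check is
end switch_box_check;

architecture sim of switch_box_check is
begin
    process
        variable fanout : natural := 0;
    begin
        -- A wire entering on track 1 from the west should reach
        -- track 1 on each of the other three sides
        for s in side_t loop
            if has_switch(west, 1, s, 1) then
                fanout := fanout + 1;
            end if;
        end loop;
        assert fanout = 3
            report "expected three switches per incoming wire"
            severity error;

        -- Tracks with different numbers are never joined
        assert not has_switch(west, 1, east, 2)
            report "planar topology must not cross tracks"
            severity error;
        wait;
    end process;
end sim;
```

Because tracks never cross, a net that starts on a given track stays on that track through every switch box it passes, which keeps the switch count low at the cost of routing flexibility.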

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to specify them
numerically rather than having to draw every piece by hand. On the other hand,
schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
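As an illustration of the first stage (RTL simulation with a test bench), the sketch below exercises the pixelgen entity listed earlier in this report. It assumes that pixelgen has been compiled into library work; the test bench name pixelgen_tb, the 10 ns clock period, and the stimulus values are arbitrary choices for this example, not part of the original design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity pixelgen_tb is
end pixelgen_tb;

architecture sim of pixelgen_tb is
    signal clk        : std_logic := '0';
    signal rstn       : std_logic := '0';
    signal h_en       : std_logic := '0';
    signal v_en       : std_logic := '0';
    signal v_en_reg   : std_logic := '0';
    signal data_in    : std_logic_vector(0 downto 0) := "0";
    signal line_count : integer := 0;
    signal c_hs       : integer := 0;
    signal red, green, blue : std_logic_vector(0 downto 0);
    signal addr       : std_logic_vector(17 downto 0);
    signal rd         : std_logic;
    signal done       : boolean := false;
begin
    -- Free-running clock (10 ns period assumed), stopped when done
    clk <= not clk after 5 ns when not done else '0';

    -- Device under test: the pixel generator from this report
    dut : entity work.pixelgen
        port map (
            clk => clk, rstn => rstn,
            hsync => '1', vsync => '1',
            h_en => h_en, v_en => v_en, v_en_reg => v_en_reg,
            data_in => data_in,
            line_count => line_count, c_hs => c_hs,
            red => red, green => green, blue => blue,
            addr => addr, rd => rd
        );

    stimulus : process
    begin
        -- Hold the active-low reset across a few clock edges
        wait for 22 ns;
        assert rd = '0' and unsigned(addr) = 0
            report "unexpected values during reset" severity error;

        -- Release reset and present one white pixel in the active region
        rstn       <= '1';
        v_en       <= '1';
        v_en_reg   <= '1';
        h_en       <= '1';
        line_count <= 100;
        c_hs       <= 100;
        data_in    <= "1";

        wait until rising_edge(clk);
        wait for 1 ns;
        assert rd = '1' and red = "1" and green = "1" and blue = "1"
            report "pixel was not passed to the RGB outputs" severity error;
        assert unsigned(addr) = 1
            report "read address did not advance" severity error;

        done <= true;
        wait;
    end process;
end sim;
```

The same test bench can in principle be rerun against the post-synthesis gate-level netlist and the back-annotated post-layout netlist, which is what makes the staged simulation described above practical.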

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM-based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verification of hardware designs together with early software and firmware
co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA
has become a good method to do this.

Reasons why prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure
that it is functionally correct. The alternative is for designers to rely
only on software simulations to verify that their hardware design is sound,
and simulation speed and modeling accuracy limitations hinder this.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows
much more time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear
due to today's array of operating systems, applications, and hardware, to be
caught early.

3. Prototyping also allows the developer to ensure that all IP technologies
on the system work well together, off the simulation stage and in actual
form.

4. Prototypes have the added advantage of serving as demo platforms for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.

13. SIMULATION RESULTS

Fig: Gate level, VGA controller

Fig: Gate level, sync unit

Fig: Technology schematic

14. APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, a processor-generated
picture (pixelated) is taken from memory space, and the Controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.


  • Pixel
    • 6 SIGNALS IN VGA CONTROLLER
    • Gates
    • Market size
    • 122 FPGA Comparisons
Page 50: vga

PIXEL GENERATOR

library IEEE

use IEEESTD_LOGIC_1164ALL

use IEEESTD_LOGIC_ARITHALL

use IEEESTD_LOGIC_UNSIGNEDALL

entity pixelgen is port(

clkin std_logic

rstnin std_logic

hsyncin std_logic

vsyncin std_logic

h_enin std_logic

v_enin std_logic

v_en_reg in std_logic

data_inin std_logic_vector(0 downto 0)

line_countin integer

c_hsin integer

red out std_logic_vector(0 downto 0)

greenout std_logic_vector(0 downto 0)

blueout std_logic_vector(0 downto 0)

addrout std_logic_vector(17 downto 0)

rdout std_logic

)

end pixelgen

architecture behv of pixelgen is

signal addr_sstd_logic_vector(17 downto 0)

begin

addr lt= addr_s

process(clkrstn)

begin

50

if(rstn = 0)then

rd lt= 0

red lt= 0

green lt= 0

blue lt= 0

addr_s lt= x0000 amp 00

elsif (clk = 1 and clkevent)then

if((v_en=1) and (v_en_reg=0))then

addr_slt=(others =gt0)

end if

if(v_en=1 and h_en=1)then

if(line_countlt=514)then

if(c_hs lt= 655)then

rd lt= 1

red lt= data_in

green lt= data_in

blue lt= data_in

addr_s lt=addr_s +000000000000000001

else

rd lt= 0

red lt= 0

green lt= 1

blue lt= 0

end if

else

rd lt= 0

red lt= 1

green lt= 1

blue lt= 1

51

end if

if(addr_s=111011111111111111)then

addr_slt= (others =gt0)

end if

else

red lt= 0

green lt= 0

blue lt= 0

end if

end if

-- end if

--end if

end process

end behv

52

12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array, the XC2064, in 1985. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, developed in 1982, which combined a reconfigurable CPU architecture on a single chip called the SB24. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternative to using hard-macro processors is to use "soft" processor cores that are implemented within the FPGA logic.

Many modern FPGAs can be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software, and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
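For reference, Rent's rule mentioned above is commonly written in the following form. The symbols are the conventional ones from the general literature and are not taken from this report, so treat this as a reminder rather than the exact estimate any vendor uses.

```latex
T = t \cdot g^{p}
```

Here T is the number of external signal connections (terminals) of a block of logic, g is the number of gates or logic cells inside the block, t is the average number of terminals per gate, and p is the Rent exponent, often quoted as roughly 0.5 to 0.8 for typical logic. A larger p means the design needs more wiring per unit of logic, which is why two designs with the same gate count can demand very different numbers of routing tracks.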

The FPGA considered here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

Fig 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Fig: Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
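The logic block described above (a 4-input LUT feeding a flip-flop, with a single output that is either registered or unregistered) could be modelled in VHDL roughly as follows. This is only a behavioural sketch: the entity name logic_block, the generics INIT and USE_FF, and the port names are assumptions made for illustration and do not correspond to any vendor primitive.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: 4-input LUT plus flip-flop.
entity logic_block is
  generic (
    -- 16-entry truth table; x"8000" sets only entry 15, i.e. a 4-input AND
    INIT   : std_logic_vector(15 downto 0) := x"8000";
    -- true selects the registered output, false the unregistered LUT output
    USE_FF : boolean := false
  );
  port (
    clk    : in  std_logic;
    lut_in : in  std_logic_vector(3 downto 0);
    q      : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT behaves like a 16x1 memory addressed by the four inputs.
  lut_out <= INIT(to_integer(unsigned(lut_in)));

  -- Flip-flop capturing the LUT output on the rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output selection; in a real device this is fixed by a configuration bit.
  q <= ff_out when USE_FF else lut_out;
end architecture rtl;
```

Changing INIT changes the function without changing the structure: for example, x"6996" would give a 4-input XOR (odd parity), since exactly the entries whose index has an odd number of ones are set.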

For this example architecture, the locations of the FPGA logic block pins are shown below.

Fig: Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Fig: Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, produces a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog, although, in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
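The first simulation stage described above, an RTL test bench, might look like the following minimal sketch. The design under test (counter4) and the test bench (counter4_tb) are hypothetical names invented for this illustration, and the 40 ns clock period is an arbitrary choice; none of them come from the VGA controller sources.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical design under test: 4-bit counter, asynchronous active-low reset.
entity counter4 is
  port (clk, rstn : in std_logic;
        count     : out std_logic_vector(3 downto 0));
end entity counter4;

architecture rtl of counter4 is
  signal cnt : unsigned(3 downto 0) := (others => '0');
begin
  count <= std_logic_vector(cnt);
  process (clk, rstn)
  begin
    if rstn = '0' then
      cnt <= (others => '0');
    elsif rising_edge(clk) then
      cnt <= cnt + 1;
    end if;
  end process;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Test bench: no ports, it only drives and observes the design under test.
entity counter4_tb is
end entity counter4_tb;

architecture sim of counter4_tb is
  signal clk   : std_logic := '0';
  signal rstn  : std_logic := '0';
  signal count : std_logic_vector(3 downto 0);
  signal done  : boolean := false;
begin
  dut : entity work.counter4
    port map (clk => clk, rstn => rstn, count => count);

  -- Free-running clock with a 40 ns period, stopped once the stimulus is done.
  clock : process
  begin
    while not done loop
      clk <= '0'; wait for 20 ns;
      clk <= '1'; wait for 20 ns;
    end loop;
    wait;
  end process;

  stimulus : process
  begin
    wait for 50 ns;                       -- hold reset across one clock edge
    assert count = "0000" report "reset failed" severity error;
    rstn <= '1';                          -- release reset
    for i in 1 to 5 loop
      wait until rising_edge(clk);        -- let five clock edges pass
    end loop;
    wait for 1 ns;                        -- allow the last update to settle
    assert unsigned(count) = 5 report "count mismatch" severity error;
    done <= true;
    wait;
  end process;
end architecture sim;
```

The same test bench can in principle be reused for the later stages by binding dut to the gate-level or back-annotated netlist instead of the RTL architecture, which is what makes the repeated simulation comparable from stage to stage.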

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying hardware designs alongside early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on FPGAs has become a good method to do this.

Reasons why prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This compares favourably with designers relying only on software simulations to verify that their hardware design is sound, since simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13. SIMULATION RESULTS

GATE LEVEL

Fig: VGA CONTROLLER

GATE LEVEL

Fig: SYNC UNIT

TECHNOLOGY SCHEMATIC

14. APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, a processor-generated picture (pixelated) is taken from memory space, and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.


  • Pixel
    • 6 SIGNALS IN VGA CONTROLLER
    • Gates
    • Market size
    • 122 FPGA Comparisons
Page 51: vga

if(rstn = 0)then

rd lt= 0

red lt= 0

green lt= 0

blue lt= 0

addr_s lt= x0000 amp 00

elsif (clk = 1 and clkevent)then

if((v_en=1) and (v_en_reg=0))then

addr_slt=(others =gt0)

end if

if(v_en=1 and h_en=1)then

if(line_countlt=514)then

if(c_hs lt= 655)then

rd lt= 1

red lt= data_in

green lt= data_in

blue lt= data_in

addr_s lt=addr_s +000000000000000001

else

rd lt= 0

red lt= 0

green lt= 1

blue lt= 0

end if

else

rd lt= 0

red lt= 1

green lt= 1

blue lt= 1

51

end if

if(addr_s=111011111111111111)then

addr_slt= (others =gt0)

end if

else

red lt= 0

green lt= 0

blue lt= 0

end if

end if

-- end if

--end if

end process

end behv

52

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can

be configured by the customer or designer after manufacturingmdashhence the

name field-programmable FPGAs are programmed using a logic circuit

diagram or a source code in a hardware description language (HDL) to specify

how the chip will work They can be used to implement any logical function

that an application-specific integrated circuit (ASIC) could perform but the

ability to update the functionality after shipping offers advantages for many

applications

FPGAs contain programmable logic components called logic blocks and a

hierarchy of reconfigurable interconnects that allow the blocks to be wired

togethermdashsomewhat like a one-chip programmable breadboard Logic blocks

can be configured to perform complex combinational functions or merely

simple logic gates like AND and XOR In most FPGAs the logic blocks also

include memory elements which may be simple flip-flops or more complete

blocks of memory

History The FPGA industry sprouted from programmable read only

memory (PROM) and programmable logic devices (PLDs) PROMs and PLDs

both had the option of being programmed in batches in a factory or in the field

(field programmable) however programmable logic was hard-wired between

logic gates

Xilinx Co-Founders Ross Freeman and Bernard Vonderschmitt invented the

first commercially viable field programmable gate array in 1985 ndash the

XC2064 The XC2064 had programmable gates and programmable

interconnects between gates the beginnings of a new technology and market

The XC2064 boasted a mere 64 configurable logic blocks (CLBs) with two 3-

input lookup tables (LUTs) More than 20 years later Freeman was entered

into the National Inventors Hall of Fame for his invention

53

Some of the industryrsquos foundational concepts and technologies for

programmable logic arrays gates and logic blocks are founded in patents

awarded to David W Page and LuVerne R Peterson in 1985

In the late 1980s the Naval Surface Warfare Department funded an experiment

proposed by Steve Casselman to develop a computer that would implement

600000 reprogrammable gates Casselman was successful and the system was

awarded a patent in 1992

Xilinx continued unchallenged and quickly growing from 1985 to the mid-

1990s when competitors sprouted up eroding significant market-share By

1993 Actel was serving about 18 percent of the market

The 1990s were an explosive period of time for FPGAs both in sophistication

and the volume of production In the early 1990s FPGAs were primarily used

in telecommunications and networking By the end of the decade FPGAs

found their way into consumer automotive and industrial applications

FPGAs got a glimpse of fame in 1997 when Adrian Thompson merged genetic

algorithm technology and FPGAs to create a sound recognition device

Thomsonrsquos algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip

to decide the configuration needed to accomplish a sound recognition task

121 Modern developments

A recent trend has been to take the coarse-grained architectural

approach a step further by combining the logic blocks and interconnects of

traditional FPGAs with embedded microprocessors and related peripherals to

form a complete system on a programmable chip This work mirrors the

architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems

Group which combined a reconfigurable CPU architecture on a single chip

called the SB24 That work was done in 1982 Examples of such hybrid

technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices

which include one or more PowerPC processors embedded within the FPGAs

54

logic fabric The Atmel FPSLIC is another such device which uses an AVR

processor in combination with Atmels programmable logic architecture

An alternate approach to using hard-macro processors is to make use of soft

processor cores that are implemented within the FPGA logic (See Soft

processors below)

As previously mentioned many modern FPGAs have the ability to be

reprogrammed at run time and this is leading to the idea of reconfigurable

computing or reconfigurable systems mdash CPUs that reconfigure themselves to

suit the task at hand The Mitrion Virtual Processor from Mitrionics is an

example of a reconfigurable soft processor implemented on FPGAs However

it does not support dynamic reconfiguration at runtime but instead adapts itself

to a specific program

Additionally new non-FPGA architectures are beginning to emerge

Software-configurable microprocessors such as the Stretch S5000 adopt a

hybrid approach by providing an array of processor cores and FPGA-like

programmable cores on the same chip

Gates

1987 9000 gates Xilinx

1992 600000 Naval Surface Warfare Department

Early 2000s Millions

Market size

1985 First commercial FPGA technology invented by Xilinx

1987 $14 million

~1993 gt$385 million

2005 $19 billion

2010 estimates $275 billion

55

122 FPGA Comparisons

Historically FPGAs have been slower less energy efficient and generally

achieved less functionality than their fixed ASIC counterparts A combination

of volume fabrication improvements research and development and the IO

capabilities of new supercomputers have largely closed the performance gap

between ASICs and FPGAs

Advantages include a shorter time to market ability to re-program in the field

to fix bugs and lower non-recurring engineering costs Vendors can also take a

middle road by developing their hardware on ordinary FPGAs but

manufacture their final version so it can no longer be modified after the design

has been committed

Xilinx claims that several market and technology dynamics are changing the

ASICFPGA paradigm

IC costs are rising aggressively

ASIC complexity has bolstered development time and costs

RampD resources and headcount is decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing

number of higher-volume applications than they have been historically used

for which the company blames for the growing number of FPGA design starts

(see History)

The primary differences between CPLDs and FPGAs are architectural A

CPLD has a somewhat restrictive structure consisting of one or more

programmable sum-of-products logic arrays feeding a relatively small number

of clocked registers The result of this is less flexibility with the advantage of

more predictable timing delays and a higher logic-to-interconnect ratio The

FPGA architectures on the other hand are dominated by interconnect This

56

makes them far more flexible (in terms of the range of designs that are

practical for implementation within them) but also far more complex to design

for

Another notable difference between CPLDs and FPGAs is the presence in most

FPGAs of higher-level embedded functions (such as adders and multipliers)

and embedded memories as well as to have logic blocks implement decoders

or mathematical functions

Some FPGAs have the capability of partial re-configuration that lets one

portion of the device be re-programmed while other portions continue running

123 Applications

Applications of FPGAs include digital signal processing software-defined

radio aerospace and defense systems ASIC prototyping medical imaging

computer vision speech recognition cryptography bioinformatics computer

hardware emulation radio astronomy and a growing range of other areas

FPGAs originally began as competitors to CPLDs and competed in a similar

space that of glue logic for PCBs As their size capabilities and speed

increased they began to take over larger and larger functions to the state where

some are now marketed as full systems on chips (SoC) Particularly with the

introduction of dedicated multipliers into FPGA architectures in the late 1990s

applications which had traditionally been the sole reserve of DSPs began to

incorporate FPGAs instead

FPGAs especially find applications in any area or algorithm that can make use

of the massive parallelism offered by their architecture One such area is code

breaking in particular brute-force attack of cryptographic algorithms

FPGAs are increasingly used in conventional high performance computing

applications where computational kernels such as FFT or Convolution are

performed on the FPGA instead of a microprocessor

57

The inherent parallelism of the logic resources on an FPGA allows for

considerable computational throughput even at a low MHz clock rates The

flexibility of the FPGA allows for even higher performance by trading off

precision and range in the number format for an increased number of parallel

arithmetic units This has driven a new type of processing called

reconfigurable computing where time intensive tasks are offloaded from

software to FPGAs

The adoption of FPGAs in high performance computing is currently limited by

the complexity of FPGA design compared to conventional software and the

extremely long turn-around times of current design tools where 4-8 hours wait

is necessary after even minor changes to the source code

Traditionally FPGAs have been reserved for specific vertical applications

where the volume of production is small For these low-volume applications

the premium that companies pay in hardware costs per unit for a

programmable chip is more affordable than the development resources spent

on creating an ASIC for a low-volume application Today new cost and

performance dynamics have broadened the range of viable applications

Architecture

The most common FPGA architecture consists of an array of configurable

logic blocks (CLBs) IO pads and routing channels Generally all the routing

channels have the same width (number of wires) Multiple IO pads may fit

into the height of one row or the width of one column in the array

An application circuit must be mapped into an FPGA with adequate resources

While the number of CLBs and IOs required is easily determined from the

design the number of routing tracks needed may vary considerably even

among designs with the same amount of logic (For example a crossbar switch

requires much more routing than a systolic array with the same gate count)

Since unused routing tracks increase the cost (and decrease the performance)

of the part without providing any benefit FPGA manufacturers try to provide

58

just enough tracks so that most designs that will fit in terms of LUTs and IOs

can be routed This is determined by estimates such as those derived from

Rents rule or by experiments with existing designs

The FPGA is an array or island-style FPGA It consists of an array of logic

blocks and routing channels Two IO pads fit into the height of one row or the

width of one column as shown below All the routing channels have the same

width (number of wires)

121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a

flip-flop as shown below In recent years manufacturers have started moving

to 6-input LUTs in their high performance parts claiming increased

performance

Typical logic block

59

There is only one output which can be either the registered or the unregistered

LUT output The logic block has four inputs for the LUT and a clock input

Since clock signals (and often other high-fanout signals) are normally routed

via special-purpose dedicated routing networks in commercial FPGAs they

and other signals are separately managed

For this example architecture the locations of the FPGA logic block pins are

shown below

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number one connects only to wires in
track number one in adjacent channel segments, wires in track number 2
connect only to other wires in track number 2, and so on. The figure below
illustrates the connections in a switch box.

Switch box topology
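The planar connection rule can also be written down as a small VHDL package. This is a sketch for reasoning about the topology, not a model of any vendor's switch box; the package name switch_box_pkg and both function names are hypothetical.

```vhdl
-- Sketch of the planar (domain-based) switch box rule described above.
package switch_box_pkg is
  -- The four sides on which channel segments meet a switch box.
  type side_t is (north, east, south, west);

  -- True when a programmable switch exists between the two wire ends.
  function can_connect(side_a : side_t; track_a : natural;
                       side_b : side_t; track_b : natural) return boolean;

  -- Number of programmable switches in one box for a given channel width.
  function switch_count(channel_width : natural) return natural;
end package switch_box_pkg;

package body switch_box_pkg is
  function can_connect(side_a : side_t; track_a : natural;
                       side_b : side_t; track_b : natural) return boolean is
  begin
    -- Same track number, different side: a wire reaches three other wires.
    return (side_a /= side_b) and (track_a = track_b);
  end function can_connect;

  function switch_count(channel_width : natural) return natural is
  begin
    -- Each track has four wire ends, giving six pairwise switches per track.
    return 6 * channel_width;
  end function switch_count;
end package body switch_box_pkg;
```

Because a signal can never change track number inside a planar switch box, the router has to pick a track that is free along the whole path, which is one reason the channel width matters so much for routability.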

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware
description language (HDL) or as a schematic. The HDL form might be easier to
work with when handling large structures, because it is possible to just
specify them numerically rather than having to draw every piece by hand. On
the other hand, schematic entry can allow for easier visualisation of a
design.
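As a small illustration of specifying a large structure numerically, the sketch below builds a chain of flip-flops whose length is a single generic. The entity name delay_line and the generic STAGES are hypothetical; drawing the same 64 stages in a schematic would mean placing every flip-flop by hand.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch: a delay line whose size is given as a number, not drawn by hand.
entity delay_line is
  generic (
    STAGES : positive := 64  -- number of flip-flops in the chain (assumed)
  );
  port (
    clk : in  std_logic;
    d   : in  std_logic;
    q   : out std_logic
  );
end entity delay_line;

architecture rtl of delay_line is
  -- taps(0) is the input, taps(STAGES) the output of the last flip-flop.
  signal taps : std_logic_vector(0 to STAGES) := (others => '0');
begin
  taps(0) <= d;

  -- One flip-flop is generated per stage.
  gen_stage : for i in 1 to STAGES generate
    process (clk)
    begin
      if rising_edge(clk) then
        taps(i) <= taps(i - 1);
      end if;
    end process;
  end generate gen_stage;

  q <= taps(STAGES);
end architecture rtl;
```

Changing STAGES from 64 to 1024 is a one-character edit here, whereas the schematic equivalent would have to be redrawn.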

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar
license), and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
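The first of these stages, RTL simulation with a test bench, might look like the sketch below. The design under test (and_reg) and the test bench (and_reg_tb) are hypothetical stand-ins kept deliberately tiny; they are not part of the VGA controller, and the 10 ns clock period is an arbitrary assumption.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design under test: a registered 2-input AND gate.
entity and_reg is
  port (clk, a, b : in std_logic; q : out std_logic);
end entity and_reg;

architecture rtl of and_reg is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      q <= a and b;
    end if;
  end process;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: no ports, it only drives the DUT and checks the results.
entity and_reg_tb is
end entity and_reg_tb;

architecture sim of and_reg_tb is
  constant CLK_PERIOD : time := 10 ns;  -- assumed clock period
  signal clk  : std_logic := '0';
  signal a, b : std_logic := '0';
  signal q    : std_logic;
  signal done : boolean := false;
begin
  dut : entity work.and_reg
    port map (clk => clk, a => a, b => b, q => q);

  -- Free-running clock that stops once the stimulus has finished.
  clk <= not clk after CLK_PERIOD / 2 when not done else '0';

  stimulus : process
  begin
    a <= '1'; b <= '0';
    wait until rising_edge(clk);
    wait for 1 ns;  -- let the registered output settle
    assert q = '0' report "expected q = 0 for 1 and 0" severity error;

    a <= '1'; b <= '1';
    wait until rising_edge(clk);
    wait for 1 ns;
    assert q = '1' report "expected q = 1 for 1 and 1" severity error;

    done <= true;
    wait;  -- end of stimulus
  end process stimulus;
end architecture sim;
```

The same test bench can in principle be reused for the gate-level and back-annotated runs by binding dut to the netlist produced by the tools, so the three simulations check the same behaviour.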

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices can
be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased, even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Hardware verification combined with early software and firmware co-design
has become mainstream practice, and prototyping SoC and ASIC designs on an
FPGA has become a good method for doing this.

Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure
that it is functionally correct, compared with designers relying only on
software simulations to verify that their hardware design is sound.
Simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows
much more time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear
due to today's array of operating systems, applications and hardware, to be
found early.

3. Prototyping also allows the developer to ensure that all IP technologies
on the system work well together, off the simulation stage and in actual
form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.

13. SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14. APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated
(pixelated) picture is taken from memory space, and the Controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.
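The synchronization part of that summary can be sketched as two counters. The block below is a minimal illustration that assumes the common 640x480 at 60 Hz timing (800 by 525 total counts, a pixel clock of about 25.175 MHz, active-low sync pulses); the entity name vga_sync and its port names are hypothetical, and this is not the sync unit implemented in this report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch of a 640x480 @ 60 Hz sync generator (assumed timing, see lead-in).
entity vga_sync is
  port (
    pix_clk : in  std_logic;  -- assumed pixel clock of about 25.175 MHz
    hsync   : out std_logic;  -- active-low horizontal sync
    vsync   : out std_logic;  -- active-low vertical sync
    active  : out std_logic   -- '1' inside the visible 640x480 area
  );
end entity vga_sync;

architecture rtl of vga_sync is
  -- Horizontal timing in pixels.
  constant H_ACTIVE : natural := 640;
  constant H_FRONT  : natural := 16;
  constant H_SYNC   : natural := 96;
  constant H_BACK   : natural := 48;
  constant H_TOTAL  : natural := H_ACTIVE + H_FRONT + H_SYNC + H_BACK; -- 800
  -- Vertical timing in lines.
  constant V_ACTIVE : natural := 480;
  constant V_FRONT  : natural := 10;
  constant V_SYNC   : natural := 2;
  constant V_BACK   : natural := 33;
  constant V_TOTAL  : natural := V_ACTIVE + V_FRONT + V_SYNC + V_BACK; -- 525

  signal h_cnt : natural range 0 to H_TOTAL - 1 := 0;
  signal v_cnt : natural range 0 to V_TOTAL - 1 := 0;
begin
  -- Pixel counter wraps every line; line counter wraps every frame.
  process (pix_clk)
  begin
    if rising_edge(pix_clk) then
      if h_cnt = H_TOTAL - 1 then
        h_cnt <= 0;
        if v_cnt = V_TOTAL - 1 then
          v_cnt <= 0;
        else
          v_cnt <= v_cnt + 1;
        end if;
      else
        h_cnt <= h_cnt + 1;
      end if;
    end if;
  end process;

  -- Sync pulses follow the front porch of each line and frame.
  hsync <= '0' when (h_cnt >= H_ACTIVE + H_FRONT) and
                    (h_cnt < H_ACTIVE + H_FRONT + H_SYNC) else '1';
  vsync <= '0' when (v_cnt >= V_ACTIVE + V_FRONT) and
                    (v_cnt < V_ACTIVE + V_FRONT + V_SYNC) else '1';

  -- RGB values should only be driven while active is '1'.
  active <= '1' when (h_cnt < H_ACTIVE) and (v_cnt < V_ACTIVE) else '0';
end architecture rtl;
```

With these totals the frame rate works out to 25.175 MHz / (800 x 525), which is approximately 59.94 Hz.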

Page 52: vga

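      end if;
      -- Reset the address counter once it reaches its final value.
      if (addr_s = "111011111111111111") then
        addr_s <= (others => '0');
      end if;
    else
      -- Otherwise drive all colour outputs low.
      red   <= '0';
      green <= '0';
      blue  <= '0';
    end if;
  end if;
  -- end if;
  --end if;
end process;
end behv;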

12. FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be
configured by the customer or designer after manufacturing, hence the name
"field-programmable". FPGAs are programmed using a logic circuit diagram or
source code in a hardware description language (HDL) to specify how the chip
will work. They can be used to implement any logical function that an
application-specific integrated circuit (ASIC) could perform, but the ability
to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks", and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks can
be configured to perform complex combinational functions, or merely simple
logic gates like AND and XOR. In most FPGAs, the logic blocks also include
memory elements, which may be simple flip-flops or more complete blocks of
memory.

History: The FPGA industry sprouted from programmable read-only memory (PROM)
and programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first
commercially viable field-programmable gate array in 1985: the XC2064. The
XC2064 had programmable gates and programmable interconnects between gates,
the beginnings of a new technology and market. The XC2064 boasted a mere 64
configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More
than 20 years later, Freeman was entered into the National Inventors Hall of
Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s,
when competitors sprouted up, eroding significant market share. By 1993,
Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication
and the volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs found
their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a
step further by combining the logic blocks and interconnects of traditional
FPGAs with embedded microprocessors and related peripherals to form a
complete "system on a programmable chip". This work mirrors the architecture
by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which
combined a reconfigurable CPU architecture on a single chip called the SB24.
That work was done in 1982. Examples of such hybrid technologies can be found
in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more
PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC
is another such device, which uses an AVR processor in combination with
Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic.

Many modern FPGAs have the ability to be reprogrammed at "run time", and this
is leading to the idea of reconfigurable computing or reconfigurable systems:
CPUs that reconfigure themselves to suit the task at hand. The Mitrion
Virtual Processor from Mitrionics is an example of a reconfigurable soft
processor implemented on FPGAs. However, it does not support dynamic
reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can
also take a middle road by developing their hardware on ordinary FPGAs, but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have been historically used for, to
which the company attributes the growing number of FPGA design starts (see
History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has
a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result of this is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. The FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers) and
embedded memories, as well as logic blocks that implement decoders or
mathematical functions.

Some FPGAs have the capability of partial re-configuration that lets one
portion of the device be re-programmed while other portions continue running.
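To make the CPLD side of the comparison concrete, the sketch below shows the shape of logic a CPLD is built around: a sum-of-products term feeding a clocked register. The entity name sop_macrocell and the particular product terms are hypothetical and chosen only for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch of a CPLD-style macrocell: sum of products into one register.
entity sop_macrocell is
  port (
    clk        : in  std_logic;
    a, b, c, d : in  std_logic;
    q          : out std_logic
  );
end entity sop_macrocell;

architecture rtl of sop_macrocell is
  signal sum_of_products : std_logic;
begin
  -- Three product terms (ANDs) combined by one sum (OR).
  sum_of_products <= (a and b) or ((not c) and d) or (a and (not d));

  -- The sum feeds a single clocked register.
  process (clk)
  begin
    if rising_edge(clk) then
      q <= sum_of_products;
    end if;
  end process;
end architecture rtl;
```

Because every signal passes through the same AND-OR-register path, the delay is easy to predict; an FPGA would instead spread such logic across LUTs joined by routing whose delay depends on placement.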

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the state
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attack, of cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at a low MHz clock rate. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
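The precision-for-parallelism trade-off can be illustrated with a small sketch in which four multiplications happen in the same clock cycle and the operand width is a generic. The entity name dot4 and the generic WIDTH are hypothetical; narrower operands make each multiplier cheaper, so more of them fit in the same device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Sketch: four parallel multipliers forming a dot product every clock cycle.
entity dot4 is
  generic (
    WIDTH : positive := 8  -- operand precision in bits (assumed default)
  );
  port (
    clk            : in  std_logic;
    x0, x1, x2, x3 : in  signed(WIDTH - 1 downto 0);
    c0, c1, c2, c3 : in  signed(WIDTH - 1 downto 0);
    -- Two extra bits so the sum of four products cannot overflow.
    y              : out signed(2 * WIDTH + 1 downto 0)
  );
end entity dot4;

architecture rtl of dot4 is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- All four products are computed concurrently in hardware.
      y <= resize(x0 * c0, y'length) + resize(x1 * c1, y'length)
         + resize(x2 * c2, y'length) + resize(x3 * c3, y'length);
    end if;
  end process;
end architecture rtl;
```

A microprocessor would typically issue these multiply-accumulate steps one after another, so even at a much lower clock rate the parallel version can keep up on this kind of kernel.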

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and the
extremely long turn-around times of current design tools, where a 4-8 hour
wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC for a low-volume application. Today, new cost and performance dynamics
have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

Page 53: vga

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can

be configured by the customer or designer after manufacturingmdashhence the

name field-programmable FPGAs are programmed using a logic circuit

diagram or a source code in a hardware description language (HDL) to specify

how the chip will work They can be used to implement any logical function

that an application-specific integrated circuit (ASIC) could perform but the

ability to update the functionality after shipping offers advantages for many

applications

FPGAs contain programmable logic components called logic blocks and a

hierarchy of reconfigurable interconnects that allow the blocks to be wired

togethermdashsomewhat like a one-chip programmable breadboard Logic blocks

can be configured to perform complex combinational functions or merely

simple logic gates like AND and XOR In most FPGAs the logic blocks also

include memory elements which may be simple flip-flops or more complete

blocks of memory

History The FPGA industry sprouted from programmable read only

memory (PROM) and programmable logic devices (PLDs) PROMs and PLDs

both had the option of being programmed in batches in a factory or in the field

(field programmable) however programmable logic was hard-wired between

logic gates

Xilinx Co-Founders Ross Freeman and Bernard Vonderschmitt invented the

first commercially viable field programmable gate array in 1985 ndash the

XC2064 The XC2064 had programmable gates and programmable

interconnects between gates the beginnings of a new technology and market

The XC2064 boasted a mere 64 configurable logic blocks (CLBs) with two 3-

input lookup tables (LUTs) More than 20 years later Freeman was entered

into the National Inventors Hall of Fame for his invention

53

Some of the industryrsquos foundational concepts and technologies for

programmable logic arrays gates and logic blocks are founded in patents

awarded to David W Page and LuVerne R Peterson in 1985

In the late 1980s the Naval Surface Warfare Department funded an experiment

proposed by Steve Casselman to develop a computer that would implement

600000 reprogrammable gates Casselman was successful and the system was

awarded a patent in 1992

Xilinx continued unchallenged and quickly growing from 1985 to the mid-

1990s when competitors sprouted up eroding significant market-share By

1993 Actel was serving about 18 percent of the market

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternative to using hard-macro processors is to use soft processor cores that are implemented within the FPGA logic.

As previously mentioned, many modern FPGAs can be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages of FPGAs include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has bolstered development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, and the company credits them for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications that had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low clock rates (in the MHz range). The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.
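The precision-for-parallelism trade-off can be illustrated with a little arithmetic. The sketch below is only a rough model: the logic budget, the clock rate, and the assumption that a multiplier's area grows with the square of its bit width are hypothetical figures chosen for illustration, not measurements from any device.

```python
def parallel_units(logic_budget: int, bit_width: int) -> int:
    """Number of multipliers that fit in the budget.

    Assumes (hypothetically) that one multiplier costs bit_width**2 logic cells.
    """
    return logic_budget // (bit_width * bit_width)


def throughput_ops_per_s(units: int, clock_hz: float) -> float:
    """Peak throughput if every unit completes one operation per clock cycle."""
    return units * clock_hz


BUDGET = 100_000   # hypothetical number of logic cells available
CLOCK_HZ = 100e6   # hypothetical 100 MHz fabric clock

for width in (32, 16, 8):
    units = parallel_units(BUDGET, width)
    print(width, units, throughput_ops_per_s(units, CLOCK_HZ))
```

Under these assumptions, halving the word width roughly quadruples the number of units that fit, and the peak throughput rises with it even though the clock rate is unchanged.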

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.
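The volume argument above amounts to a break-even calculation between a one-off development cost and a per-unit cost. The following is a minimal sketch of that calculation; the function names and all cost figures are hypothetical and are not taken from any vendor data.

```python
import math


def total_cost(nre: float, unit_cost: float, volume: int) -> float:
    """Total cost = one-off engineering cost (NRE) + per-unit cost * volume."""
    return nre + unit_cost * volume


def break_even_volume(asic_nre: float, asic_unit: float,
                      fpga_nre: float, fpga_unit: float) -> int:
    """Smallest volume at which the ASIC's total cost is no higher than the FPGA's."""
    if fpga_unit <= asic_unit:
        raise ValueError("no crossover: FPGA unit cost must exceed ASIC unit cost")
    volume = math.ceil((asic_nre - fpga_nre) / (fpga_unit - asic_unit))
    return max(volume, 0)


# Hypothetical figures: the ASIC has a high NRE but a cheap unit cost.
n = break_even_volume(asic_nre=1_000_000, asic_unit=5,
                      fpga_nre=50_000, fpga_unit=50)
print(n, total_cost(1_000_000, 5, n), total_cost(50_000, 50, n))
```

Below the computed volume the FPGA is the cheaper option in this model; above it the ASIC's lower unit cost outweighs its development cost.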

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.)

Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
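Rent's rule, in its usual form, relates the number of external terminals T of a block of logic to the number of logic blocks g it contains: T = t * g^p, where t is the average number of terminals per block and p is the Rent exponent. The sketch below only evaluates that formula; the parameter values are hypothetical, and how a given manufacturer turns such an estimate into a channel width is not described here.

```python
def rent_terminals(blocks: int, terminals_per_block: float,
                   rent_exponent: float) -> float:
    """Estimate external terminals with Rent's rule: T = t * g**p."""
    if blocks < 1:
        raise ValueError("blocks must be at least 1")
    return terminals_per_block * blocks ** rent_exponent


# Hypothetical comparison: the same amount of logic, different Rent exponents.
# A higher exponent models a design with more global wiring.
for p in (0.5, 0.75):
    print(p, rent_terminals(blocks=64, terminals_per_block=4, rent_exponent=p))
```

Two designs with the same block count but different exponents give quite different terminal estimates, which is consistent with the observation that routing demand varies even among designs with the same amount of logic.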

The FPGA considered here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block


There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they are managed separately from other signals.
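The behaviour of such a logic block can be sketched in software. The model below is a simplified illustration, not a description of any vendor's cell: the class name, the bit ordering of the truth table, and the method names are assumptions made for the example.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table: int, registered: bool = False):
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("a 4-input LUT holds exactly 16 configuration bits")
        self.truth_table = truth_table  # bit i = output for input pattern i
        self.registered = registered    # selects the registered or direct output
        self.ff_state = 0               # value currently held by the flip-flop

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Look up the configured output for the four input bits."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.truth_table >> index) & 1

    def tick(self, a: int, b: int, c: int, d: int) -> None:
        """Rising clock edge: the flip-flop captures the current LUT output."""
        self.ff_state = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """The single block output: flip-flop value if registered, else LUT value."""
        return self.ff_state if self.registered else self.lut(a, b, c, d)


and4 = LogicBlock(0x8000)                  # only pattern 1111 gives 1
xor_ab = LogicBlock(0x6666)                # a XOR b, ignoring c and d
reg = LogicBlock(0x8000, registered=True)  # same AND, but through the flip-flop
```

The same block implements any function of four inputs simply by changing the 16 configuration bits, and the registered variant only changes its output after a clock edge.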

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology
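The planar topology is simple enough to express as a small function. The sketch below assumes a switch box with four sides and tracks numbered from 1 to W; the side names and the function name are hypothetical choices made for the example.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_targets(side: str, track: int, channel_width: int):
    """Wires reachable from (side, track) in a planar (domain-based) switch box.

    A wire can connect only to the wire with the same track number on each of
    the three other sides, so there are always exactly three candidates.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= channel_width:
        raise ValueError("track must be between 1 and the channel width")
    return [(other, track) for other in SIDES if other != side]


print(planar_switch_targets("west", 1, channel_width=4))
```

Because the track number never changes as a signal passes through a switch box, a signal that starts on track 1 stays in that routing domain for its whole path in this model.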

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware description language (HDL) or as a schematic. The HDL form might be easier to work with when handling large structures, because it is possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM-based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.

Reasons why prototyping is important:

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows for much more time in the area of software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, a processor-generated picture (pixelated) is taken from memory space, and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.


  • Pixel
    • 6 SIGNALS IN VGA CONTROLLER
    • Gates
    • Market size
    • 122 FPGA Comparisons
Page 54: vga

Some of the industryrsquos foundational concepts and technologies for

programmable logic arrays gates and logic blocks are founded in patents

awarded to David W Page and LuVerne R Peterson in 1985

In the late 1980s the Naval Surface Warfare Department funded an experiment

proposed by Steve Casselman to develop a computer that would implement

600000 reprogrammable gates Casselman was successful and the system was

awarded a patent in 1992

Xilinx continued unchallenged and quickly growing from 1985 to the mid-

1990s when competitors sprouted up eroding significant market-share By

1993 Actel was serving about 18 percent of the market

The 1990s were an explosive period of time for FPGAs both in sophistication

and the volume of production In the early 1990s FPGAs were primarily used

in telecommunications and networking By the end of the decade FPGAs

found their way into consumer automotive and industrial applications

FPGAs got a glimpse of fame in 1997 when Adrian Thompson merged genetic

algorithm technology and FPGAs to create a sound recognition device

Thomsonrsquos algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip

to decide the configuration needed to accomplish a sound recognition task

121 Modern developments

A recent trend has been to take the coarse-grained architectural

approach a step further by combining the logic blocks and interconnects of

traditional FPGAs with embedded microprocessors and related peripherals to

form a complete system on a programmable chip This work mirrors the

architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems

Group which combined a reconfigurable CPU architecture on a single chip

called the SB24 That work was done in 1982 Examples of such hybrid

technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices

which include one or more PowerPC processors embedded within the FPGAs

54

logic fabric The Atmel FPSLIC is another such device which uses an AVR

processor in combination with Atmels programmable logic architecture

An alternate approach to using hard-macro processors is to make use of soft

processor cores that are implemented within the FPGA logic (See Soft

processors below)

As previously mentioned many modern FPGAs have the ability to be

reprogrammed at run time and this is leading to the idea of reconfigurable

computing or reconfigurable systems mdash CPUs that reconfigure themselves to

suit the task at hand The Mitrion Virtual Processor from Mitrionics is an

example of a reconfigurable soft processor implemented on FPGAs However

it does not support dynamic reconfiguration at runtime but instead adapts itself

to a specific program

Additionally new non-FPGA architectures are beginning to emerge

Software-configurable microprocessors such as the Stretch S5000 adopt a

hybrid approach by providing an array of processor cores and FPGA-like

programmable cores on the same chip

Gates

1987 9000 gates Xilinx

1992 600000 Naval Surface Warfare Department

Early 2000s Millions

Market size

1985 First commercial FPGA technology invented by Xilinx

1987 $14 million

~1993 gt$385 million

2005 $19 billion

2010 estimates $275 billion

55

122 FPGA Comparisons

Historically FPGAs have been slower less energy efficient and generally

achieved less functionality than their fixed ASIC counterparts A combination

of volume fabrication improvements research and development and the IO

capabilities of new supercomputers have largely closed the performance gap

between ASICs and FPGAs

Advantages include a shorter time to market ability to re-program in the field

to fix bugs and lower non-recurring engineering costs Vendors can also take a

middle road by developing their hardware on ordinary FPGAs but

manufacture their final version so it can no longer be modified after the design

has been committed

Xilinx claims that several market and technology dynamics are changing the

ASICFPGA paradigm

IC costs are rising aggressively

ASIC complexity has bolstered development time and costs

RampD resources and headcount is decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing

number of higher-volume applications than they have been historically used

for which the company blames for the growing number of FPGA design starts

(see History)

The primary differences between CPLDs and FPGAs are architectural A

CPLD has a somewhat restrictive structure consisting of one or more

programmable sum-of-products logic arrays feeding a relatively small number

of clocked registers The result of this is less flexibility with the advantage of

more predictable timing delays and a higher logic-to-interconnect ratio The

FPGA architectures on the other hand are dominated by interconnect This

56

makes them far more flexible (in terms of the range of designs that are

practical for implementation within them) but also far more complex to design

for

Another notable difference between CPLDs and FPGAs is the presence in most

FPGAs of higher-level embedded functions (such as adders and multipliers)

and embedded memories as well as to have logic blocks implement decoders

or mathematical functions

Some FPGAs have the capability of partial re-configuration that lets one

portion of the device be re-programmed while other portions continue running

123 Applications

Applications of FPGAs include digital signal processing software-defined

radio aerospace and defense systems ASIC prototyping medical imaging

computer vision speech recognition cryptography bioinformatics computer

hardware emulation radio astronomy and a growing range of other areas

FPGAs originally began as competitors to CPLDs and competed in a similar

space that of glue logic for PCBs As their size capabilities and speed

increased they began to take over larger and larger functions to the state where

some are now marketed as full systems on chips (SoC) Particularly with the

introduction of dedicated multipliers into FPGA architectures in the late 1990s

applications which had traditionally been the sole reserve of DSPs began to

incorporate FPGAs instead

FPGAs especially find applications in any area or algorithm that can make use

of the massive parallelism offered by their architecture One such area is code

breaking in particular brute-force attack of cryptographic algorithms

FPGAs are increasingly used in conventional high performance computing

applications where computational kernels such as FFT or Convolution are

performed on the FPGA instead of a microprocessor

57

The inherent parallelism of the logic resources on an FPGA allows for

considerable computational throughput even at a low MHz clock rates The

flexibility of the FPGA allows for even higher performance by trading off

precision and range in the number format for an increased number of parallel

arithmetic units This has driven a new type of processing called

reconfigurable computing where time intensive tasks are offloaded from

software to FPGAs

The adoption of FPGAs in high performance computing is currently limited by

the complexity of FPGA design compared to conventional software and the

extremely long turn-around times of current design tools where 4-8 hours wait

is necessary after even minor changes to the source code

Traditionally FPGAs have been reserved for specific vertical applications

where the volume of production is small For these low-volume applications

the premium that companies pay in hardware costs per unit for a

programmable chip is more affordable than the development resources spent

on creating an ASIC for a low-volume application Today new cost and

performance dynamics have broadened the range of viable applications

Architecture

The most common FPGA architecture consists of an array of configurable

logic blocks (CLBs) IO pads and routing channels Generally all the routing

channels have the same width (number of wires) Multiple IO pads may fit

into the height of one row or the width of one column in the array

An application circuit must be mapped into an FPGA with adequate resources

While the number of CLBs and IOs required is easily determined from the

design the number of routing tracks needed may vary considerably even

among designs with the same amount of logic (For example a crossbar switch

requires much more routing than a systolic array with the same gate count)

Since unused routing tracks increase the cost (and decrease the performance)

of the part without providing any benefit FPGA manufacturers try to provide

58

just enough tracks so that most designs that will fit in terms of LUTs and IOs

can be routed This is determined by estimates such as those derived from

Rents rule or by experiments with existing designs

The FPGA is an array or island-style FPGA It consists of an array of logic

blocks and routing channels Two IO pads fit into the height of one row or the

width of one column as shown below All the routing channels have the same

width (number of wires)

121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a

flip-flop as shown below In recent years manufacturers have started moving

to 6-input LUTs in their high performance parts claiming increased

performance

Typical logic block

59

There is only one output which can be either the registered or the unregistered

LUT output The logic block has four inputs for the LUT and a clock input

Since clock signals (and often other high-fanout signals) are normally routed

via special-purpose dedicated routing networks in commercial FPGAs they

and other signals are separately managed

For this example architecture the locations of the FPGA logic block pins are

shown below

Logic Block Pin Locations

Each input is accessible from one side of the logic block while the output pin

can connect to routing wires in both the channel to the right and the channel

below the logic block

Each logic block output pin can connect to any of the wiring segments in the

channels adjacent to it

Similarly an IO pad can connect to any one of the wiring segments in the

channel adjacent to it For example an IO pad at the top of the chip can

connect to any of the W wires (where W is the channel width) in the horizontal

channel immediately below it

Generally the FPGA routing is unsegmented That is each wiring segment

spans only one logic block before it terminates in a switch box By turning on

60

some of the programmable switches within a switch box longer paths can be

constructed For higher speed interconnect some FPGA architectures use

longer routing lines that span multiple logic blocks

Whenever a vertical and a horizontal channel intersect there is a switch box In

this architecture when a wire enters a switch box there are three

programmable switches that allow it to connect to three other wires in adjacent

channel segments The pattern or topology of switches used in this

architecture is the planar or domain-based switch box topology In this switch

box topology a wire in track number one connects only to wires in track

number one in adjacent channel segments wires in track number 2 connect

only to other wires in track number 2 and so on The figure below illustrates

the connections in a switch box

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher

level functionality fixed into the silicon Having these common functions

embedded into the silicon reduces the area required and gives those functions

increased speed compared to building them from primitives Examples of these

include multipliers generic DSP blocks embedded processors high speed IO

logic and embedded memories

61

FPGAs are also widely used for systems validation including pre-silicon

validation post-silicon validation and firmware development This allows chip

companies to validate their design before the chip is produced in the factory

reducing the time to market

124 FPGA Design and Programming

To define the behavior of the FPGA the user provides a hardware description

language (HDL) or a schematic design The HDL form might be easier to work

with when handling large structures because its possible to just specify them

numerically rather than having to draw every piece by hand On the other hand

schematic entry can allow for easier visualisation of a design

Then using an electronic design automation tool a technology-mapped netlist

is generated The netlist can then be fitted to the actual FPGA architecture

using a process called place-and-route usually performed by the FPGA

companys proprietary place-and-route software The user will validate the

map place and route results via timing analysis simulation and other

verification methodologies Once the design and validation process is

complete the binary file generated (also using the FPGA companys

proprietary software) is used to (re)configure the FPGA

Going from schematicHDL source files to actual configuration The source

files are fed to a software suite from the FPGACPLD vendor that through

different steps will produce a file This file is then transferred to the

FPGACPLD via a serial interface (JTAG) or to an external memory device

like an EEPROM

The most common HDLs are VHDL and Verilog although in an attempt to

reduce the complexity of designing in HDLs which have been compared to the

equivalent of assembly languages there are moves to raise the abstraction level

through the introduction of alternative languages

62

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or a similar
license) and other sources.

In a typical design flow, an FPGA application developer simulates the design
at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches that
drive the system and observe the results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
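The idea of repeating the same simulation on the RTL description and on the
gate-level description can be sketched in a few lines of Python. This is only
an illustration under stated assumptions: the one-bit full adder, the function
names adder_rtl and adder_gates, and the exhaustive test bench are all
hypothetical and are not taken from the VGA controller design. A real flow
would run VHDL or Verilog test benches in an HDL simulator.

```python
from itertools import product
from typing import List, Tuple


def adder_rtl(a: int, b: int, cin: int) -> Tuple[int, int]:
    """Behavioural (RTL-style) model: states what the adder computes."""
    total = a + b + cin
    return total & 1, total >> 1  # (sum bit, carry out)


def adder_gates(a: int, b: int, cin: int) -> Tuple[int, int]:
    """Gate-level model built only from XOR/AND/OR primitives.

    Stands in for the netlist a synthesis tool might produce (assumption).
    """
    p = a ^ b                    # propagate term
    s = p ^ cin                  # sum bit
    cout = (a & b) | (p & cin)   # carry out
    return s, cout


def run_testbench() -> List[Tuple[int, int, int]]:
    """Apply every input vector to both models and collect any mismatches."""
    mismatches = []
    for a, b, cin in product((0, 1), repeat=3):
        if adder_rtl(a, b, cin) != adder_gates(a, b, cin):
            mismatches.append((a, b, cin))
    return mismatches


if __name__ == "__main__":
    bad = run_testbench()
    print("synthesis check passed" if not bad else f"mismatches: {bad}")
```

An empty mismatch list plays the role of "synthesis proceeded without errors".
Timing effects such as back-annotated propagation delays are not modelled in
this sketch.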

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices can
be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased even in plastic packages.
Some, but not all, flash devices can be in-system programmed. Usually, a flash
cell is smaller than an equivalent EEPROM cell and is therefore less
expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; its Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld-focused
CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verifying the hardware design while co-designing software and firmware early
has become mainstream practice, and prototyping SoC and ASIC designs on an
FPGA has become a good way to do this.

Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct. The alternative is for designers to rely only on
software simulation to verify that their hardware design is sound, and
simulation speed and modeling accuracy limitations hinder that approach.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows much
more time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear
because of today's array of operating systems, applications and hardware, to
be found early.

3. Prototyping also allows the developer to ensure that all IP technologies in
the system work well together, off the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.

13. SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14. APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated
picture (pixelated) is taken from memory space and the Controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.
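How the horizontal and vertical synchronization signals relate to the pixel
position can be sketched as follows. This is a minimal Python model, not the
report's HDL. It assumes the standard 640x480 at 60 Hz timing (800 pixel
clocks per line, 525 lines per frame, 25.175 MHz pixel clock, active-low sync
pulses), and these numbers are not stated in this section. The function and
constant names are hypothetical.

```python
# Standard 640x480 @ 60 Hz timing (assumed; not taken from this report).
# Horizontal values are in pixel clocks, vertical values in lines.
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33

H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK  # clocks per line
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK  # lines per frame
PIXEL_CLOCK_HZ = 25_175_000


def sync_signals(x: int, y: int):
    """Return (hsync, vsync, visible) for pixel counter x and line counter y.

    Both sync pulses are active low in this mode: 0 means "pulse asserted".
    RGB data is only driven while `visible` is True.
    """
    h_start = H_VISIBLE + H_FRONT
    v_start = V_VISIBLE + V_FRONT
    hsync = 0 if h_start <= x < h_start + H_SYNC else 1
    vsync = 0 if v_start <= y < v_start + V_SYNC else 1
    visible = x < H_VISIBLE and y < V_VISIBLE
    return hsync, vsync, visible


def refresh_rate_hz() -> float:
    """Frames per second implied by the pixel clock and the frame size."""
    return PIXEL_CLOCK_HZ / (H_TOTAL * V_TOTAL)


if __name__ == "__main__":
    print(H_TOTAL, V_TOTAL, round(refresh_rate_hz(), 2))
```

In hardware the same comparisons would be made against two free-running
counters clocked by the pixel clock. The Python version only makes the
arithmetic explicit.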

REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. Eduardo Sanchez, "A VGA display controller"

3. Hinner, "VGA Interface and video signal documents"


... logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of soft
processor cores that are implemented within the FPGA logic (see Soft
processors below).

As previously mentioned, many modern FPGAs can be reprogrammed at run time,
and this is leading to the idea of reconfigurable computing or reconfigurable
systems: CPUs that reconfigure themselves to suit the task at hand. The
Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable
soft processor implemented on FPGAs. However, it does not support dynamic
reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages of FPGAs include a shorter time to market, the ability to
re-program in the field to fix bugs, and lower non-recurring engineering
costs. Vendors can also take a middle road by developing their hardware on
ordinary FPGAs but manufacturing the final version so that it can no longer
be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has increased development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have historically been used for, and
the company credits them for the growing number of FPGA design starts (see
History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has
a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
to implement within them) but also far more complex to design for.
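The CPLD structure described above, a sum-of-products array feeding a clocked
register, can be illustrated with a small Python model. This is a sketch
under stated assumptions: the Macrocell class, the encoding of product terms
as dictionaries and the XOR example are hypothetical and do not describe any
particular vendor's device.

```python
from typing import Dict, List, Sequence

# A product term maps an input index to the value that input must have:
# 1 for the true input, 0 for the complemented input. Inputs that do not
# appear in the dictionary are "don't care" for that term.
ProductTerm = Dict[int, int]


def sum_of_products(inputs: Sequence[int], terms: List[ProductTerm]) -> int:
    """OR together the AND of each product term (two-level logic)."""
    return int(any(
        all(inputs[i] == wanted for i, wanted in term.items())
        for term in terms
    ))


class Macrocell:
    """One sum-of-products array output feeding one clocked register."""

    def __init__(self, terms: List[ProductTerm]) -> None:
        self.terms = terms
        self.q = 0  # register contents, cleared at power-up (assumption)

    def clock(self, inputs: Sequence[int]) -> int:
        """On a clock edge the register captures the array output."""
        self.q = sum_of_products(inputs, self.terms)
        return self.q


# XOR of inputs 0 and 1 written as two product terms: a AND NOT b, NOT a AND b.
XOR_TERMS: List[ProductTerm] = [{0: 1, 1: 0}, {0: 0, 1: 1}]
```

Because every output goes through the same two-level array and one register,
the delay from input to output is nearly the same for any function, which is
where the predictable timing mentioned above comes from.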

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers) and
embedded memories, as well as logic blocks that can implement decoders or
mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4
to 8 hours can be necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably, even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.
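Rent's rule is commonly written T = t * g^p, where T is the number of external
terminals of a block of logic, g is the number of gates (or logic blocks)
inside it, t is the average number of terminals per gate and p is the Rent
exponent. The short Python sketch below only evaluates that formula. The
parameter values in the example are illustrative assumptions, and the text
does not say how a manufacturer converts such an estimate into a channel
width.

```python
def rent_terminals(blocks: int, terminals_per_block: float,
                   rent_exponent: float) -> float:
    """Estimate external terminals T = t * g**p for a region of g blocks."""
    if blocks < 1:
        raise ValueError("blocks must be at least 1")
    return terminals_per_block * blocks ** rent_exponent


if __name__ == "__main__":
    # Two hypothetical designs with the same amount of logic (100 blocks,
    # 4 terminals per block) but different Rent exponents: the one with the
    # higher exponent needs more wires crossing its boundary.
    regular = rent_terminals(100, 4.0, 0.5)
    irregular = rent_terminals(100, 4.0, 0.75)
    print(regular, irregular)
```

The comparison mirrors the point made above: designs with the same gate count
can have very different routing demand, which the exponent p captures.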

The FPGA considered here is an array, or island-style, FPGA. It consists of an
array of logic blocks and routing channels. Two I/O pads fit into the height
of one row or the width of one column, as shown below. All the routing
channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high performance parts, claiming increased
performance.

Typical logic block

There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they are managed separately from other signals.
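The behaviour of such a logic block can be sketched in Python: a 16-entry
truth table stands in for the 4-input LUT, a single stored bit stands in for
the flip-flop, and a configuration flag selects the registered or
unregistered output. The class name, the method names and the input bit
ordering are assumptions made for illustration only.

```python
from typing import Sequence


class LogicBlock:
    """4-input LUT plus flip-flop with a selectable output."""

    def __init__(self, truth_table: Sequence[int],
                 registered: bool = False) -> None:
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 entries")
        self.truth_table = list(truth_table)  # configuration bits
        self.registered = registered          # output select bit
        self.ff = 0                           # flip-flop state

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Look up the output; input a is the least significant index bit."""
        return self.truth_table[a | (b << 1) | (c << 2) | (d << 3)]

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Clock edge: the flip-flop captures the current LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """Single block output: registered or unregistered LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


# Configuration for a 4-input AND: only index 15 (all inputs high) is 1.
AND4_TABLE = [0] * 15 + [1]
```

Any 4-input Boolean function can be loaded by changing the 16 table entries,
which is what makes the block configurable.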

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans
only one logic block before it terminates in a switch box. By turning on some
of the programmable switches within a switch box, longer paths can be
constructed. For higher speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect only
to other wires in track number 2, and so on. The figure below illustrates the
connections in a switch box.

Switch box topology
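The planar (domain-based) rule can be written down directly: a wire entering
on one side in track k may be switched only to track k on each of the other
three sides. The Python sketch below enumerates those three candidate
connections. The side names and the function name are assumptions, and tracks
are numbered from 1 to match the text.

```python
from typing import List, Tuple

SIDES = ("north", "east", "south", "west")


def planar_connections(side: str, track: int,
                       channel_width: int) -> List[Tuple[str, int]]:
    """List the (side, track) wires reachable from one incoming wire.

    In the planar topology the track number never changes, so each wire
    has exactly three programmable switches, one per other side.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= channel_width:
        raise ValueError("track must be between 1 and the channel width")
    return [(other, track) for other in SIDES if other != side]


if __name__ == "__main__":
    for target in planar_connections("west", 1, channel_width=4):
        print("west track 1 ->", target)
```

Because a signal can never change track number inside a switch box, a route
that starts in one track stays in that track's domain unless it passes
through a logic block.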

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors, high
speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.



122 FPGA Comparisons

Historically FPGAs have been slower less energy efficient and generally

achieved less functionality than their fixed ASIC counterparts A combination

of volume fabrication improvements research and development and the IO

capabilities of new supercomputers have largely closed the performance gap

between ASICs and FPGAs

Advantages include a shorter time to market ability to re-program in the field

to fix bugs and lower non-recurring engineering costs Vendors can also take a

middle road by developing their hardware on ordinary FPGAs but

manufacture their final version so it can no longer be modified after the design

has been committed

Xilinx claims that several market and technology dynamics are changing the

ASICFPGA paradigm

IC costs are rising aggressively

ASIC complexity has bolstered development time and costs

RampD resources and headcount is decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing

number of higher-volume applications than they have been historically used

for which the company blames for the growing number of FPGA design starts

(see History)

The primary differences between CPLDs and FPGAs are architectural A

CPLD has a somewhat restrictive structure consisting of one or more

programmable sum-of-products logic arrays feeding a relatively small number

of clocked registers The result of this is less flexibility with the advantage of

more predictable timing delays and a higher logic-to-interconnect ratio The

FPGA architectures on the other hand are dominated by interconnect This

56

makes them far more flexible (in terms of the range of designs that are

practical for implementation within them) but also far more complex to design

for

Another notable difference between CPLDs and FPGAs is the presence in most

FPGAs of higher-level embedded functions (such as adders and multipliers)

and embedded memories as well as to have logic blocks implement decoders

or mathematical functions

Some FPGAs have the capability of partial re-configuration that lets one

portion of the device be re-programmed while other portions continue running

123 Applications

Applications of FPGAs include digital signal processing software-defined

radio aerospace and defense systems ASIC prototyping medical imaging

computer vision speech recognition cryptography bioinformatics computer

hardware emulation radio astronomy and a growing range of other areas

FPGAs originally began as competitors to CPLDs and competed in a similar

space that of glue logic for PCBs As their size capabilities and speed

increased they began to take over larger and larger functions to the state where

some are now marketed as full systems on chips (SoC) Particularly with the

introduction of dedicated multipliers into FPGA architectures in the late 1990s

applications which had traditionally been the sole reserve of DSPs began to

incorporate FPGAs instead

FPGAs especially find applications in any area or algorithm that can make use

of the massive parallelism offered by their architecture One such area is code

breaking in particular brute-force attack of cryptographic algorithms

FPGAs are increasingly used in conventional high performance computing

applications where computational kernels such as FFT or Convolution are

performed on the FPGA instead of a microprocessor

57

The inherent parallelism of the logic resources on an FPGA allows for

considerable computational throughput even at a low MHz clock rates The

flexibility of the FPGA allows for even higher performance by trading off

precision and range in the number format for an increased number of parallel

arithmetic units This has driven a new type of processing called

reconfigurable computing where time intensive tasks are offloaded from

software to FPGAs

The adoption of FPGAs in high performance computing is currently limited by

the complexity of FPGA design compared to conventional software and the

extremely long turn-around times of current design tools where 4-8 hours wait

is necessary after even minor changes to the source code

Traditionally FPGAs have been reserved for specific vertical applications

where the volume of production is small For these low-volume applications

the premium that companies pay in hardware costs per unit for a

programmable chip is more affordable than the development resources spent

on creating an ASIC for a low-volume application Today new cost and

performance dynamics have broadened the range of viable applications

Architecture

The most common FPGA architecture consists of an array of configurable

logic blocks (CLBs) IO pads and routing channels Generally all the routing

channels have the same width (number of wires) Multiple IO pads may fit

into the height of one row or the width of one column in the array

An application circuit must be mapped into an FPGA with adequate resources

While the number of CLBs and IOs required is easily determined from the

design the number of routing tracks needed may vary considerably even

among designs with the same amount of logic (For example a crossbar switch

requires much more routing than a systolic array with the same gate count)

Since unused routing tracks increase the cost (and decrease the performance)

of the part without providing any benefit FPGA manufacturers try to provide

58

just enough tracks so that most designs that will fit in terms of LUTs and IOs

can be routed This is determined by estimates such as those derived from

Rents rule or by experiments with existing designs

The FPGA is an array or island-style FPGA It consists of an array of logic

blocks and routing channels Two IO pads fit into the height of one row or the

width of one column as shown below All the routing channels have the same

width (number of wires)

121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a

flip-flop as shown below In recent years manufacturers have started moving

to 6-input LUTs in their high performance parts claiming increased

performance

Typical logic block

59

There is only one output which can be either the registered or the unregistered

LUT output The logic block has four inputs for the LUT and a clock input

Since clock signals (and often other high-fanout signals) are normally routed

via special-purpose dedicated routing networks in commercial FPGAs they

and other signals are separately managed

For this example architecture the locations of the FPGA logic block pins are

shown below

Logic Block Pin Locations

Each input is accessible from one side of the logic block while the output pin

can connect to routing wires in both the channel to the right and the channel

below the logic block

Each logic block output pin can connect to any of the wiring segments in the

channels adjacent to it

Similarly an IO pad can connect to any one of the wiring segments in the

channel adjacent to it For example an IO pad at the top of the chip can

connect to any of the W wires (where W is the channel width) in the horizontal

channel immediately below it

Generally the FPGA routing is unsegmented That is each wiring segment

spans only one logic block before it terminates in a switch box By turning on

60

some of the programmable switches within a switch box longer paths can be

constructed For higher speed interconnect some FPGA architectures use

longer routing lines that span multiple logic blocks

Whenever a vertical and a horizontal channel intersect there is a switch box In

this architecture when a wire enters a switch box there are three

programmable switches that allow it to connect to three other wires in adjacent

channel segments The pattern or topology of switches used in this

architecture is the planar or domain-based switch box topology In this switch

box topology a wire in track number one connects only to wires in track

number one in adjacent channel segments wires in track number 2 connect

only to other wires in track number 2 and so on The figure below illustrates

the connections in a switch box

Switch box topology

Modern FPGA families expand upon the above capabilities to include higher

level functionality fixed into the silicon Having these common functions

embedded into the silicon reduces the area required and gives those functions

increased speed compared to building them from primitives Examples of these

include multipliers generic DSP blocks embedded processors high speed IO

logic and embedded memories

61

FPGAs are also widely used for systems validation including pre-silicon

validation post-silicon validation and firmware development This allows chip

companies to validate their design before the chip is produced in the factory

reducing the time to market

124 FPGA Design and Programming

To define the behavior of the FPGA the user provides a hardware description

language (HDL) or a schematic design The HDL form might be easier to work

with when handling large structures because its possible to just specify them

numerically rather than having to draw every piece by hand On the other hand

schematic entry can allow for easier visualisation of a design

Then using an electronic design automation tool a technology-mapped netlist

is generated The netlist can then be fitted to the actual FPGA architecture

using a process called place-and-route usually performed by the FPGA

companys proprietary place-and-route software The user will validate the

map place and route results via timing analysis simulation and other

verification methodologies Once the design and validation process is

complete the binary file generated (also using the FPGA companys

proprietary software) is used to (re)configure the FPGA

Going from schematicHDL source files to actual configuration The source

files are fed to a software suite from the FPGACPLD vendor that through

different steps will produce a file This file is then transferred to the

FPGACPLD via a serial interface (JTAG) or to an external memory device

like an EEPROM

The most common HDLs are VHDL and Verilog although in an attempt to

reduce the complexity of designing in HDLs which have been compared to the

equivalent of assembly languages there are moves to raise the abstraction level

through the introduction of alternative languages

62

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
exercise the system and observe the results. Then, after the synthesis engine
has mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
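The idea of re-running the same test bench against the RTL description and the gate-level netlist can be sketched in software. The following is only an illustrative Python model, not HDL and not the flow of any particular vendor tool: a 1-bit full adder is chosen here as an assumed example circuit, and the function names are hypothetical.

```python
from itertools import product


def full_adder_rtl(a, b, cin):
    """Behavioural ("RTL") model: states what the block computes."""
    total = a + b + cin
    return total & 1, total >> 1  # (sum, carry_out)


def full_adder_gates(a, b, cin):
    """Gate-level model: the same function built from XOR/AND/OR gates."""
    a_xor_b = a ^ b
    s = a_xor_b ^ cin
    cout = (a & b) | (a_xor_b & cin)
    return s, cout


def run_testbench(reference, candidate, n_inputs):
    """Apply every input vector to both models and return the mismatches."""
    mismatches = []
    for vector in product((0, 1), repeat=n_inputs):
        if reference(*vector) != candidate(*vector):
            mismatches.append(vector)
    return mismatches


if __name__ == "__main__":
    # An empty list means the two descriptions agree on all 8 input vectors.
    print("mismatching vectors:", run_testbench(full_adder_rtl, full_adder_gates, 3))
```

Exhaustive comparison is only practical for very small blocks; for larger designs a test bench would normally apply a chosen set of stimulus vectors instead.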

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; its Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verifying the hardware design alongside early software and firmware co-design
has become mainstream practice. Prototyping SoC and ASIC designs on an FPGA
has become a good method to do this.


Reasons why prototyping is important:

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct, compared with designers relying only on software
simulations to verify that their hardware design is sound. Limits on
simulation speed and modeling accuracy hinder simulation-only verification.

2. Due to time constraints, many projects cannot wait until the silicon is back
from the foundry to start on software tests. FPGA prototyping allows much more
time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear due
to today's array of operating systems, applications, and hardware, to be
uncovered.

3. Prototyping also allows the developer to ensure that all IP technologies on
the system work well together, off the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features than
would otherwise have been possible.


13. SIMULATION RESULTS

Fig. Gate level: VGA controller

Fig. Gate level: sync unit

Fig. Technology schematic

14. APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the VGA Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in
order to correctly display the picture on a connected monitor.
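As a rough illustration of how the synchronization signals relate to the pixel position, the sketch below models the horizontal and vertical counters in Python. It assumes the widely used 640x480 at 60 Hz timing (800 pixel clocks per line, 525 lines per frame, 25.175 MHz pixel clock, active-low sync pulses); these figures and all names in the code are assumptions made for the example and are not taken from the controller described in this report.

```python
# Assumed industry-standard 640x480 @ 60 Hz timing (not from this report).
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48   # pixel clocks per line
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33    # lines per frame
H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK        # 800
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK        # 525
PIXEL_CLOCK_HZ = 25_175_000


def vga_signals(h_count, v_count):
    """Return (hsync, vsync, video_on) for one pixel-clock tick.

    Sync pulses are modelled as active-low: 0 during the pulse, 1 otherwise.
    """
    h_start = H_VISIBLE + H_FRONT
    v_start = V_VISIBLE + V_FRONT
    hsync = 0 if h_start <= h_count < h_start + H_SYNC else 1
    vsync = 0 if v_start <= v_count < v_start + V_SYNC else 1
    video_on = h_count < H_VISIBLE and v_count < V_VISIBLE
    return hsync, vsync, video_on


def next_counts(h_count, v_count):
    """Advance the counters by one pixel clock, wrapping at line and frame end."""
    h_count += 1
    if h_count == H_TOTAL:
        h_count = 0
        v_count = (v_count + 1) % V_TOTAL
    return h_count, v_count


def refresh_rate_hz():
    """Frame rate implied by the pixel clock and the total frame size."""
    return PIXEL_CLOCK_HZ / (H_TOTAL * V_TOTAL)


if __name__ == "__main__":
    print(vga_signals(0, 0), vga_signals(656, 0), vga_signals(0, 490))
    print(round(refresh_rate_hz(), 2), "Hz")
```

RGB data would only be driven while `video_on` is true; during the porches and sync pulses the colour outputs are normally held at black.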


makes them far more flexible (in terms of the range of designs that are
practical for implementation within them), but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a 4-8 hour
wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have broadened
the range of viable applications.
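The low-volume argument can be made concrete with a simple break-even calculation. The sketch below is a simplified model under stated assumptions: the ASIC carries a one-time development (NRE) cost, the FPGA carries none, and each has a fixed per-unit cost. All figures in the usage example are hypothetical and chosen only to show the arithmetic.

```python
def break_even_volume(asic_nre, asic_unit_cost, fpga_unit_cost):
    """Production volume above which the ASIC becomes cheaper overall.

    Total ASIC cost:  asic_nre + volume * asic_unit_cost
    Total FPGA cost:  volume * fpga_unit_cost   (FPGA NRE assumed to be zero)
    """
    premium = fpga_unit_cost - asic_unit_cost
    if premium <= 0:
        raise ValueError("the FPGA must cost more per unit for a break-even to exist")
    return asic_nre / premium


if __name__ == "__main__":
    # Hypothetical figures: 1,000,000 NRE, 5 per ASIC, 25 per FPGA.
    print(break_even_volume(1_000_000, 5, 25))  # 50000.0 units
```

Below the break-even volume the per-unit premium of the programmable chip is the cheaper option; above it, the ASIC development cost is amortised over enough units to pay off.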

Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
routing channels have the same width (number of wires). Multiple I/O pads may
fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably, even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.
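For reference, Rent's rule is the empirical relation T = t * g^p, where g is the number of logic blocks in a region, t is the average number of terminals per block, p is the Rent exponent, and T is the estimated number of external connections of that region. The Python sketch below only evaluates this formula; the parameter values in the usage example are hypothetical, and how a vendor turns such an estimate into a channel width is not covered here.

```python
def rent_terminals(blocks, terminals_per_block, rent_exponent):
    """Estimate external terminals of a region using Rent's rule: T = t * g**p."""
    if blocks < 1:
        raise ValueError("blocks must be at least 1")
    return terminals_per_block * blocks ** rent_exponent


if __name__ == "__main__":
    # Hypothetical parameters: 1024 blocks, 4 terminals per block, p = 0.6.
    print(rent_terminals(1024, 4, 0.6))
```

A larger Rent exponent means the number of connections grows faster with the amount of logic, which is consistent with the observation above that circuits with the same gate count can need very different amounts of routing.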

The FPGA considered here is an array, or island-style, FPGA. It consists of an
array of logic blocks and routing channels. Two I/O pads fit into the height
of one row or the width of one column, as shown below. All the routing
channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high performance parts, claiming increased
performance.

Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they are managed separately from other signals.
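The behaviour of such a logic block can be modelled in a few lines. The following Python sketch is a software illustration only, under the assumption that the LUT is a 16-entry truth table indexed by the four inputs and that a configuration bit selects the registered or unregistered output; the class and method names are hypothetical.

```python
class LogicBlock:
    """Software model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 truth-table entries")
        self.truth_table = list(truth_table)
        self.registered = registered  # configuration bit: which output is used
        self.ff = 0                   # flip-flop state

    def lut(self, a, b, c, d):
        """Combinational LUT output: the inputs form the truth-table address."""
        return self.truth_table[a | (b << 1) | (c << 2) | (d << 3)]

    def clock(self, a, b, c, d):
        """Rising clock edge: the flip-flop captures the current LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Single block output: registered or unregistered LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


if __name__ == "__main__":
    # Configure the LUT as a 4-input XOR (parity of the address bits).
    xor4 = [bin(i).count("1") & 1 for i in range(16)]
    block = LogicBlock(xor4, registered=True)
    block.clock(1, 0, 0, 0)
    print(block.output(0, 0, 0, 0))  # registered value from the last clock edge
```

Changing the truth table changes the logic function without changing the structure, which is what makes the block configurable.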

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin
can connect to routing wires in both the channel to the right of and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in adjacent
channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

Switch box topology
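The planar (domain-based) rule described above can be written down directly: a wire entering on one side in track k may be switched to track k on each of the other three sides, and to nothing else. The Python sketch below is a minimal model of that rule; the side names and the function name are assumptions made for the example.

```python
SIDES = ("north", "east", "south", "west")


def planar_connections(side, track, channel_width):
    """List the (side, track) wires reachable from one wire in a planar switch box.

    Tracks are numbered from 1 to channel_width, matching the text above.
    """
    if side not in SIDES:
        raise ValueError("unknown side: " + repr(side))
    if not 1 <= track <= channel_width:
        raise ValueError("track must be between 1 and the channel width")
    # Same track number on each of the three other sides.
    return [(other, track) for other in SIDES if other != side]


if __name__ == "__main__":
    print(planar_connections("west", 2, channel_width=4))
```

Because a signal can never change track number inside a switch box, each track forms its own routing domain across the chip, which is where the name "domain-based" comes from.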

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic, and embedded memories.

Page 59: vga

just enough tracks so that most designs that will fit in terms of LUTs and IOs

can be routed This is determined by estimates such as those derived from

Rents rule or by experiments with existing designs

The FPGA is an array or island-style FPGA It consists of an array of logic

blocks and routing channels Two IO pads fit into the height of one row or the

width of one column as shown below All the routing channels have the same

width (number of wires)

121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a

flip-flop as shown below In recent years manufacturers have started moving

to 6-input LUTs in their high performance parts claiming increased

performance

Typical logic block

59

There is only one output which can be either the registered or the unregistered

LUT output The logic block has four inputs for the LUT and a clock input

Since clock signals (and often other high-fanout signals) are normally routed

via special-purpose dedicated routing networks in commercial FPGAs they

and other signals are separately managed

For this example architecture the locations of the FPGA logic block pins are

shown below

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Switch box topology
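The planar rule can be written down compactly. The following is a minimal sketch, assuming the four sides of a switch box are named north, east, south and west and that tracks are numbered from 1 as in the text; the function name `planar_switch_targets` is hypothetical.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_targets(side: str, track: int, channel_width: int):
    """Return the (side, track) wires reachable from a wire entering a planar switch box.

    In the planar (domain-based) topology a wire keeps its track number, so the
    only choice is which of the three other sides it leaves on.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= channel_width:
        raise ValueError(f"track {track} outside channel of width {channel_width}")
    # One programmable switch per other side, always on the same track number.
    return [(other, track) for other in SIDES if other != side]
```

For example, `planar_switch_targets("west", 1, 4)` lists track 1 on the north, east and south sides, which corresponds to the three programmable switches described above.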

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware description language (HDL) or as a schematic. The HDL form might be easier to work with when handling large structures because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.


Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers only relying on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows for much more time in the area of software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs that appear due to today's array of operating systems, applications, and hardware to be uncovered.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement to the chip features than would otherwise have been possible.


13 SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, taking a processor-generated picture (pixelated) from memory space, the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.
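The relationship between pixel position, RGB output and the two synchronization signals can be illustrated with a small behavioural sketch. It is not the report's VHDL or Verilog design: the names `AxisTiming` and `vga_signals` are hypothetical, and the numbers are the widely published 640x480 at 60 Hz timing values, assumed here rather than taken from this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AxisTiming:
    """Timing of one axis, in pixels (horizontal) or lines (vertical)."""

    visible: int
    front_porch: int
    sync_pulse: int
    back_porch: int

    @property
    def total(self) -> int:
        return self.visible + self.front_porch + self.sync_pulse + self.back_porch

    def sync_active(self, count: int) -> bool:
        """True while the counter is inside the sync pulse window."""
        start = self.visible + self.front_porch
        return start <= count < start + self.sync_pulse


# Assumed 640x480 @ 60 Hz timing (nominal 25.175 MHz pixel clock).
H = AxisTiming(visible=640, front_porch=16, sync_pulse=96, back_porch=48)
V = AxisTiming(visible=480, front_porch=10, sync_pulse=2, back_porch=33)


def vga_signals(h_count: int, v_count: int, pixel_rgb):
    """Return (hsync, vsync, rgb) for one pixel clock; sync signals are active-low."""
    hsync = 0 if H.sync_active(h_count) else 1
    vsync = 0 if V.sync_active(v_count) else 1
    in_view = h_count < H.visible and v_count < V.visible
    # Outside the visible area the RGB outputs are blanked.
    rgb = pixel_rgb(h_count, v_count) if in_view else (0, 0, 0)
    return hsync, vsync, rgb
```

Here `pixel_rgb` stands in for the read from picture memory: any function mapping a visible (x, y) position to an RGB triple can be passed in, and the sketch supplies the blanking and sync behaviour around it.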

Page 62: vga

FPGAs are also widely used for systems validation including pre-silicon

validation post-silicon validation and firmware development This allows chip

companies to validate their design before the chip is produced in the factory

reducing the time to market

124 FPGA Design and Programming

To define the behavior of the FPGA the user provides a hardware description

language (HDL) or a schematic design The HDL form might be easier to work

with when handling large structures because its possible to just specify them

numerically rather than having to draw every piece by hand On the other hand

schematic entry can allow for easier visualisation of a design

Then using an electronic design automation tool a technology-mapped netlist

is generated The netlist can then be fitted to the actual FPGA architecture

using a process called place-and-route usually performed by the FPGA

companys proprietary place-and-route software The user will validate the

map place and route results via timing analysis simulation and other

verification methodologies Once the design and validation process is

complete the binary file generated (also using the FPGA companys

proprietary software) is used to (re)configure the FPGA

Going from schematicHDL source files to actual configuration The source

files are fed to a software suite from the FPGACPLD vendor that through

different steps will produce a file This file is then transferred to the

FPGACPLD via a serial interface (JTAG) or to an external memory device

like an EEPROM

The most common HDLs are VHDL and Verilog although in an attempt to

reduce the complexity of designing in HDLs which have been compared to the

equivalent of assembly languages there are moves to raise the abstraction level

through the introduction of alternative languages

62

To simplify the design of complex systems in FPGAs there exist libraries of

predefined complex functions and circuits that have been tested and optimized

to speed up the design process These predefined circuits are commonly called

IP cores and are available from FPGA vendors and third-party IP suppliers

(rarely free and typically released under proprietary licenses) Other

predefined circuits are available from developer communities such as Open

Cores (typically free and released under the GPL BSD or similar license) and

other sources

In a typical design flow an FPGA application developer will simulate the

design at multiple stages throughout the design process Initially the RTL

description in VHDL or Verilog is simulated by creating test benches to

simulate the system and observe results Then after the synthesis engine has

mapped the design to a netlist the netlist is translated to a gate level

description where simulation is repeated to confirm the synthesis proceeded

without errors Finally the design is laid out in the FPGA at which point

propagation delays can be added and the simulation run again with these

values back-annotated onto the netlist

Basic Process Technology Types

SRAM - based on static memory technology In-system programmable and re-

programmable Requires external boot devices CMOS

Antifuse - One-time programmable CMOS

EPROM - Erasable Programmable Read-Only Memory technology Usually

one-time programmable in production because of plastic packaging

Windowed devices can be erased with ultraviolet (UV) light CMOS

EEPROM - Electrically Erasable Programmable Read-Only Memory

technology Can be erased even in plastic packages Some but not all

EEPROM devices can be in-system programmed CMOS

Flash - Flash-erase EPROM technology Can be erased even in plastic

packages Some but not all flash devices can be in-system programmed

63

Usually a flash cell is smaller than an equivalent EEPROM cell and is

therefore less expensive to manufacture CMOS

Fuse - One-time programmable Bipolar

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry

rivals Together they control over 80 percent of the market with Xilinx alone

representing over 50 percent

Xilinx also provides free Windows and Linux design software while Altera

provides free Windows tools the Solaris and Linux tools are only available via

a rental scheme

Other competitors include Lattice Semiconductor (flash SRAM) Actel

(antifuse flash-based mixed-signal) SiliconBlue Technologies (low power)

Achronix (RAM based 15GHz fabric speed) and QuickLogic (handheld

focused CSSP no general purpose FPGAs)

125 FPGA prototype

FPGA prototyping sometimes also referred to as ASIC prototyping or SoC

prototyping is the method to prototype SoC and ASIC design on FPGA for

hardware verification and early software development

Main stream verification methods for hardware design and early software and

firmware co-design has become mainstream Prototyping SoC and ASIC

design on FPGA has become a good method to do this

64

Reasons why Prototyping is important

1 Running a SoC design on FPGA prototype is a reliable way to ensure that it is

functionally correct This is compared to designers only relying on software

simulations to verify that their hardware design is sound Simulation speed and

modeling accuracy limitations hinder this development

2 Due to time constrains many projects cannot wait until the silicon is back from

the foundry to start on software tests FPGA prototyping allows for much more

time in area of software development and testing at the software-hardware

integration stage This allows many unforeseen software bugs that appear due

to todays array of operating systems applications and hardware

3 Prototyping also allows the developer to ensure that all IP technologies on his

system work well together off the simulation stage and in actual form

4 Prototyping has the added advantage as demo platforms to SoC clients

bringing in interest early This speeds up the overall development cycle and

allows for more enhancement or improvement to the chip features as it would

otherwise have been

65

13 SIMULATION RESULTS

66

GATE LEVEL

FigVGA CONTROLLER

67

GATE LEVEL

Fig SYNC UNIT

68

TECHNOLOGY SCHEMATIC

69

14 APPLICATIONS

Motion Tracking Systems

Image processing systems

Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated picture (pixelated) is taken from memory space and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.
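As a rough illustration of what the Controller does for each pixel, the following Python sketch maps horizontal and vertical counter values to a frame buffer address and a blanking decision, and works out the resulting refresh rate. It assumes 640x480 at 60 Hz timing with a 25.175 MHz pixel clock and a linear frame buffer holding one entry per pixel; none of these are specified in this section, and the function names are hypothetical.

```python
"""Per-pixel view of a VGA controller: addressing and frame timing.

Assumptions (not from the original text): 640x480 @ 60 Hz timing,
a 25.175 MHz pixel clock, and a linear frame buffer with one entry
per pixel. Function names are illustrative.
"""

H_VISIBLE, H_TOTAL = 640, 800   # pixels per line: visible / including blanking
V_VISIBLE, V_TOTAL = 480, 525   # lines per frame: visible / including blanking
PIXEL_CLOCK_HZ = 25_175_000


def pixel_address(h_count, v_count):
    """Return the frame buffer index for a counter position, or None
    when the position lies in the blanking interval."""
    if h_count < H_VISIBLE and v_count < V_VISIBLE:
        return v_count * H_VISIBLE + h_count
    return None


def rgb_out(frame_buffer, h_count, v_count):
    """Return the (r, g, b) value to drive; black during blanking."""
    addr = pixel_address(h_count, v_count)
    return (0, 0, 0) if addr is None else frame_buffer[addr]


def refresh_rate_hz():
    """Frames per second = pixel clock / pixel clocks per frame."""
    return PIXEL_CLOCK_HZ / (H_TOTAL * V_TOTAL)


if __name__ == "__main__":
    fb = [(0, 0, 0)] * (H_VISIBLE * V_VISIBLE)
    fb[pixel_address(5, 2)] = (255, 0, 0)   # one red pixel at x=5, y=2
    print(rgb_out(fb, 5, 2))                # the red pixel
    print(rgb_out(fb, 700, 2))              # blanking interval: black
    print(round(refresh_rate_hz(), 2))
```

Driving black during blanking is a design choice in this sketch; the point is only that the RGB outputs depend on both the stored picture and the position of the sync counters.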

Page 63: vga

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
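The first stage of this flow, simulating a design under a test bench and observing the results, can be illustrated in software. The sketch below is not VHDL or Verilog; it is a minimal Python behavioral model of a horizontal sync generator with a small test bench around it. It assumes the standard 640x480 at 60 Hz horizontal timing (640 visible pixels, 16 front porch, 96 sync, 48 back porch) and an active-low sync pulse, which this section does not state. The names `HSyncModel` and `run_testbench` are hypothetical.

```python
"""Behavioral model of a horizontal sync generator plus a small test bench.

Assumption: standard 640x480 @ 60 Hz horizontal timing, in pixel clocks,
with an active-low sync pulse. All names are illustrative and are not
taken from the original design.
"""

H_VISIBLE = 640      # visible pixels per line
H_FRONT_PORCH = 16   # blanking before the sync pulse
H_SYNC = 96          # sync pulse width
H_BACK_PORCH = 48    # blanking after the sync pulse
H_TOTAL = H_VISIBLE + H_FRONT_PORCH + H_SYNC + H_BACK_PORCH  # 800


class HSyncModel:
    """Counts pixel clocks and derives hsync and video_on from the count."""

    def __init__(self):
        self.h_count = 0

    @property
    def hsync(self):
        start = H_VISIBLE + H_FRONT_PORCH
        in_pulse = start <= self.h_count < start + H_SYNC
        return 0 if in_pulse else 1  # low only during the sync pulse

    @property
    def video_on(self):
        return self.h_count < H_VISIBLE

    def tick(self):
        """Advance one pixel clock, wrapping at the end of the line."""
        self.h_count = (self.h_count + 1) % H_TOTAL


def run_testbench(lines=2):
    """Clock the model for a number of lines and record its outputs."""
    dut = HSyncModel()
    low_cycles = 0
    visible_cycles = 0
    falling_edges = []
    prev = dut.hsync
    for cycle in range(lines * H_TOTAL):
        low_cycles += dut.hsync == 0
        visible_cycles += dut.video_on
        if prev == 1 and dut.hsync == 0:
            falling_edges.append(cycle)  # cycle where the pulse begins
        prev = dut.hsync
        dut.tick()
    return low_cycles, visible_cycles, falling_edges


if __name__ == "__main__":
    low, visible, edges = run_testbench(lines=2)
    print("hsync low cycles:", low)
    print("visible cycles:", visible)
    print("falling edges at cycles:", edges)
```

A real test bench would be written in the same HDL as the design and rerun against the gate-level and back-annotated netlists; the checks would stay the same while the model under test changes.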

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.
