Annapolis Wildstar FPGA Board Charles Ross Monica Chawathe

Feb 02, 2016

Transcript
Page 1: Annapolis Wildstar FPGA Board

Annapolis Wildstar FPGA Board

Charles Ross, Monica Chawathe

Page 2: Annapolis Wildstar FPGA Board

Wildstar Board

Page 3: Annapolis Wildstar FPGA Board

Starfire Board

Page 4: Annapolis Wildstar FPGA Board

WildStar Board (Simplified)

3 Virtex 2000E FPGAs, 12 Memories (20 MB)

[Diagram: Virtex 2000E chips “0”, “1”, and “2”; four 1M and eight 2M memories; Host; LAD Bus]

Page 5: Annapolis Wildstar FPGA Board

[Diagram labels: Host; LAD Bus]

Page 6: Annapolis Wildstar FPGA Board

StarFire Board (Simplified)

1 Virtex 1000 FPGA, 6 Memories (6 MB)

[Diagram: Virtex 1000 chip “1”; six 1M memories; Host; LAD Bus]

Page 7: Annapolis Wildstar FPGA Board

Memory Layout
  Local
    Always 32-bit words
    Two on PE 1
    Two on PE 2
  Mezzanine
    32 or 64, depending on source (PEx / PE0)
      Both address and word size
    4 between PE 1 & 0
    4 between PE 2 & 0
  Latency: 4 cycles

Page 8: Annapolis Wildstar FPGA Board

Mezzanine Memory
  32 vs 64 (Same memory)
  Switch Modes
    00 Straight
    01 Crossed
    10 Lo Thru
    11 Hi Thru

[Diagram labels: Mem, Mem; PEx, PE0; 64, 32]

Page 9: Annapolis Wildstar FPGA Board

PEx (1 and 2)

[Diagram labels: Right, Left; RightLocal, LeftLocal; RightMezz, LeftMezz; LAD; STUFF]

Page 10: Annapolis Wildstar FPGA Board

PE0

[Diagram labels: Right, Left; PE1RightMezz, PE1LeftMezz; PE2RightMezz, PE2LeftMezz; LAD; STUFF]

Page 11: Annapolis Wildstar FPGA Board

Clocks – 4 of them!?
  K, M, P, U
    KClock: LAD Transactions (K?)
    MClock: Memory Transactions
    PClock: Processing Clock
    UClock: User Clock
  Okay, but why? What are they?

Page 12: Annapolis Wildstar FPGA Board

KClock – LAD
  PE to Host
  33MHz or 66MHz
    33MHz – Easy to Place and Route
    66MHz – 2X Host Bandwidth
    Host and Chip must agree!!
      Set in VHDL and Host Code
  Clock is actually based on PCI Clock
    Varies per host
    Ours is approx. 33.23MHz / 66.46MHz
  Asynchronous to all other clocks

Page 13: Annapolis Wildstar FPGA Board

MClock – Memory
  Speed of Memory IO
    Both Local & Mezzanine
  User Selectable
    25MHz – 133MHz Wildstar
    25MHz – 100MHz Starfire

Page 14: Annapolis Wildstar FPGA Board

PClock – Processing
  Based on MClock
    Divisor between 1-16
    Slower than MClock (Or Equal)
  Can “Speed up” Memory I/O
    Decoupling may allow different Speeds
    Increase M, increase Divisor
    Ex: Slow Component in Application (30MHz)
      M=30MHz & Divisor = 1 → P=30MHz
      M=60MHz & Divisor = 2 → P=30MHz
        2 Memory Accesses per Clock
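
The PClock relationship on this slide is just P = M / divisor. A minimal Python sketch of that arithmetic follows; the function name `pclock_mhz` is made up for illustration and is not part of any Annapolis API.

```python
def pclock_mhz(mclock_mhz, divisor):
    """Return the PClock frequency for a given MClock and divisor (1-16)."""
    if not 1 <= divisor <= 16:
        raise ValueError("PClock divisor must be between 1 and 16")
    return mclock_mhz / divisor


# The slide's example: a 30 MHz component works with either setting, but the
# second one gives two MClock (memory) cycles for every PClock cycle.
for m, d in [(30, 1), (60, 2)]:
    print(f"M={m} MHz, divisor={d}: P={pclock_mhz(m, d):g} MHz, "
          f"{d} memory cycle(s) per PClock cycle")
```

Raising M and the divisor together keeps the slow component at its 30 MHz limit while the memory side runs faster.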

Page 15: Annapolis Wildstar FPGA Board

PClock – Processing (More)
  Optional
    We normally don’t use it, for ease
    MClock is used Directly
      Less Logic than “P=M/1”
      No need to jump Clock Boundaries
  Chip must either
    Not care what the ratio is
    Know at compile time what the ratio will be

Page 16: Annapolis Wildstar FPGA Board

UClock – User Clock
  User Selectable
    0.32MHz – 133MHz Wildstar
    0.32MHz – 100MHz Starfire
  We have never used it
    3 is plenty, isn't it?
  Asynchronous to all other clocks

Page 17: Annapolis Wildstar FPGA Board

Hardware Components
  Roll your own
    Manual LAD addressing (33/66 Differ)
    Manual Memory use
      Contention
    Manual EVERYTHING!
    CAN be very fast ~140 MHz
  Annapolis Supplied Components
    MUCH Easier
    Slower (Approx. 40-60 MHz)

Page 18: Annapolis Wildstar FPGA Board

LAD Bus
  33MHz / 66MHz Selectable
    Changes the communication protocol
      Amount of Latency, etc.
  Component Addressing scheme
    0x0000-0x7FFF – Component Within PE
    Higher Bits Address Board and PE
      Ignore them unless you “roll your own” LAD code

Page 19: Annapolis Wildstar FPGA Board

LAD Bus (More)
  The Addressing of the LAD bus
    A lot like subnet masks in IP Networking
  MASK
    Which bits address the component
    Which bits are intra-component
  BASE
    Where does this component begin
  ADDR&MASK==BASE “Are you talkin’ to ME?”
  ADDR&(~MASK) = “What address in me?”

  Examples:
    B: 0x4800 M: 0x7F00 → 0x4800 ~ 0x48FF
    B: 0x3000 M: 0x7C00 → 0x3000 ~ 0x33FF
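
The base/mask test can be tried out in a few lines of Python. This is only a sketch of the arithmetic on the slide: the function names are made up, the 15-bit width comes from the 0x0000-0x7FFF component space, and `address_range` assumes the mask covers the high bits only (like a subnet mask).

```python
def is_selected(addr, base, mask):
    """ADDR & MASK == BASE: is this address talking to the component?"""
    return (addr & mask) == base


def offset_in_component(addr, mask, width=15):
    """ADDR & ~MASK: the intra-component address."""
    return addr & ~mask & ((1 << width) - 1)


def address_range(base, mask, width=15):
    """First and last address the component answers to."""
    if base & ~mask:
        # A base bit outside the mask means ADDR & MASK can never equal BASE.
        raise ValueError("base has bits outside the mask")
    free_bits = ~mask & ((1 << width) - 1)
    return base, base | free_bits


# First pair is the slide's example; the second pair is made up.
for base, mask in [(0x4800, 0x7F00), (0x2000, 0x7000)]:
    first, last = address_range(base, mask)
    print(f"B: 0x{base:04X} M: 0x{mask:04X} covers 0x{first:04X} ~ 0x{last:04X}")
```

The check in `address_range` is worth keeping in mind when picking a base: it has to sit on a boundary that matches the size implied by the mask.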

Page 20: Annapolis Wildstar FPGA Board

Inside the Chips

[Block diagram. Boxes: LAD, LAD Mux, Reset, Clocks, RegFile, LAD-Mem Bridge (x2), Mem Mux (x2), Some Memory (x2), Your Application. Legend: Annapolis Provided, User Provided.]

Page 21: Annapolis Wildstar FPGA Board

LAD-MUX


Page 22: Annapolis Wildstar FPGA Board

LAD-MUX
  Gives LAD access to components
    Bridges gap between IO Pins and “Logical” LAD
  Handles Protocols for you
    66 and 33
  ONE per chip

Page 23: Annapolis Wildstar FPGA Board

Reset


Page 24: Annapolis Wildstar FPGA Board

Reset
  Allows Host to RESET the Chip
    Causes clocks to destabilize momentarily
    Causes chip to return to known init state
      (If you write your VHDL right)
      All Annapolis components are written right

Page 25: Annapolis Wildstar FPGA Board

Clocks


Page 26: Annapolis Wildstar FPGA Board

Clocks
  Provides user access to
    All 4 Clocks (or Clock x2)
    When clocks are stable
      “DLL locked” Signals
  Clocks on a Virtex use DLLs
    Delay-Locked Loop, not Dynamic Link Library
      Shame on you Windows users!

Page 27: Annapolis Wildstar FPGA Board

Register File


Page 28: Annapolis Wildstar FPGA Board

Register File
  Provides host access to 1-D array of 32-bit registers
    Size must be a power of 2
  Can be used for:
    Ready – “The host says I can go now”
    Done – “Hey Host, I am done!”
    Small 32-bit IO – “The answer is 42!”
    Run time parameters – “Threshold is 63”
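
The Ready / Done handshake is easy to picture with a toy model. The Python below is a simulation only, not the Annapolis API: the `RegFile` class, the register numbers, and the "PE" that doubles its parameter are all invented for the sketch.

```python
class RegFile:
    """Toy model of a 1-D array of 32-bit registers shared by host and PE."""

    def __init__(self, size):
        if size < 1 or size & (size - 1):
            raise ValueError("size must be a power of 2")
        self.regs = [0] * size

    def write(self, index, value):
        self.regs[index] = value & 0xFFFFFFFF  # registers are 32 bits wide

    def read(self, index):
        return self.regs[index]


# Hypothetical register assignments for this sketch.
READY, DONE, PARAM, RESULT = 0, 1, 2, 3


def fake_pe_step(rf):
    """Stand-in for the chip: once READY is set, compute and raise DONE."""
    if rf.read(READY) and not rf.read(DONE):
        rf.write(RESULT, rf.read(PARAM) * 2)
        rf.write(DONE, 1)


def run_host(rf, param):
    rf.write(PARAM, param)      # run time parameter
    rf.write(READY, 1)          # "The host says I can go now"
    while not rf.read(DONE):    # poll for "Hey Host, I am done!"
        fake_pe_step(rf)
    return rf.read(RESULT)      # small 32-bit IO


print(run_host(RegFile(4), 21))
```

On the real board the polling loop would read the register over the LAD bus instead of calling `fake_pe_step`.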

Page 29: Annapolis Wildstar FPGA Board

LAD to Mem Bridge


Page 30: Annapolis Wildstar FPGA Board

LAD to Mem Bridge
  Provides host with access to the memories
    Mezzanine or Local Memories
    2 Kinds, 32 and 64
  Transfers happen in bursts
    256 DWORDS for 32 bit memories
    512 DWORDS for 64 bit memories
    (it's all transparent to the user though)
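
Since the bursts are transparent, you never write this yourself, but it helps to see what "in bursts" means. A small Python sketch, assuming a transfer is simply cut into burst-sized pieces with a short final piece (the function name and that splitting rule are assumptions, not taken from the Annapolis code):

```python
BURST_DWORDS = {32: 256, 64: 512}  # burst sizes quoted on the slide


def split_into_bursts(total_dwords, mem_width):
    """Return (start_dword, length) pairs that cover total_dwords."""
    burst = BURST_DWORDS[mem_width]
    return [(start, min(burst, total_dwords - start))
            for start in range(0, total_dwords, burst)]


# A 600-DWORD transfer to a 32-bit memory: two full bursts and a short one.
print(split_into_bursts(600, 32))
```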

Page 31: Annapolis Wildstar FPGA Board

Memory-Mux


Page 32: Annapolis Wildstar FPGA Board

Memory-Mux
  Provide multiple clients with access to the memories
  Arbitrates between clients
    Priority
      Number of the client decides priority
      Maximum utilization
      Might starve some clients
    Fair Round Robin
      Wastes some cycles
      Each Client gets 1/n
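
The trade-off between the two arbitration schemes shows up in a tiny simulation. This Python sketch makes two assumptions the slide does not spell out: that the lowest client number wins under Priority, and that Round Robin hands out fixed time slots (which is one way it could "waste some cycles").

```python
def priority_grant(requests):
    """Grant the lowest-numbered requesting client (None if nobody asks)."""
    for client, wants in enumerate(requests):
        if wants:
            return client
    return None


def round_robin_grant(requests, turn):
    """Grant only the client whose turn it is; otherwise the slot is wasted."""
    return turn if requests[turn] else None


def simulate(requests_per_cycle, policy):
    """Return the client granted on each cycle (None = idle cycle)."""
    n = len(requests_per_cycle[0])
    grants = []
    for cycle, requests in enumerate(requests_per_cycle):
        if policy == "priority":
            grants.append(priority_grant(requests))
        else:
            grants.append(round_robin_grant(requests, cycle % n))
    return grants


# Clients 0 and 1 always want memory, client 2 never does.
trace = [[1, 1, 0]] * 6
print("priority:   ", simulate(trace, "priority"))
print("round robin:", simulate(trace, "round_robin"))
```

With this trace, Priority keeps the memory busy every cycle but client 1 never gets in, while Round Robin serves both requesters and idles on client 2's slot.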

Page 33: Annapolis Wildstar FPGA Board

Memory Access
  Address – of DWORD or QWORD
  Data_Out – To Memory
  Data_In – From Memory
  Write – Direction of Request
  Request – “I want memory”
  Acknowledge – “Okay!”
  Data_Valid – 4/5 Cycle Delayed Ack (See Bugs Later)
    32 bit Memories Only
  Low/High Enable – “This half is useful”
    64 bit Memories Only
  High/Low_Data_Valid – 4/5 Cycle Delayed (Ack & Low/High Enable)
    64 bit Memories Only

Page 34: Annapolis Wildstar FPGA Board

32-bit Memory Read

Page 35: Annapolis Wildstar FPGA Board

64-bit Memory Read

Page 36: Annapolis Wildstar FPGA Board

32-bit Memory Write

Page 37: Annapolis Wildstar FPGA Board

64-bit Memory Write

Page 38: Annapolis Wildstar FPGA Board

Others - Useful
  RAM Blocks
    Host and Client Access to on-chip memories
    256 32-bit words
  Interrupts to host
  Systolic Buses
    2 36-bit buses between PE1 and PE2
      top and bottom
    Bi-directional
    Tri-state
  PE0 Standard Buses
    2 2-bit buses between PE0 and PEx
    Bi-directional
    Tri-state

Page 39: Annapolis Wildstar FPGA Board

Others – Useless
  LED (there are 2 LEDs per Chip)
    Red and Green
    Can't see them…
  IO Card
    114 bit IO
    We don’t have one
  Test Pins
    18 bits
    No testing our board, please! =)

Page 40: Annapolis Wildstar FPGA Board

Software API
  Annapolis Supplied Driver Functions
    Open, Close, Set Clocks, DMA, Read, Write, Download Configurations, Interrupt, Readback, etc.
  Convenience Functions
    Interface code to the “LAD to Memory Bridges”

Page 41: Annapolis Wildstar FPGA Board

Open/Close
  Grabs the board exclusively
    Uses kernel mutex
    CAN do it in shared mode, but DON'T
  Can set LAD Speed as well
    See “Bugs” Later

Page 42: Annapolis Wildstar FPGA Board

Chip Configuration
  Programs a PE from a memory array containing the bitstream
    .x86 files
  Can de-program as well
    Why bother?
      As long as everyone “Plays nice”
  BE CAREFUL WHAT YOU PROGRAM!
    If you program a PE with a bitstream that is corrupted, or not for the correct chip, or mangled in some way, you can release the magic smoke from the chips!
      $40,000 board!

Page 43: Annapolis Wildstar FPGA Board

Set Clock Speeds
  UClock speed
  MClock speed and PClock divisor

Page 44: Annapolis Wildstar FPGA Board

Register IO
  Reads/Writes to the LAD Address space to communicate with anything plugged into a LAD MUX
    Reset
    Register Files
    Etc.

Page 45: Annapolis Wildstar FPGA Board

Memory IO for LAD to MEM Bridges
  Abstracts the IO
    Bursts, addressing, etc.
  Create Memory Objects
  Read/Write/Copy/Set
  Release

Page 46: Annapolis Wildstar FPGA Board

Others You Won't Need
  Display (4 Char LCD on the board)
  Interrupts
  Temperature / Power
  Readback / Singleshot
  DMA
  Versions / Hardware Config
  Etc.

Page 47: Annapolis Wildstar FPGA Board

Tools
  You write Host code (in C)
    Compile with gcc, etc.
    Link in the libraries and such
  You write Chip code (in VHDL)
    Simulate and Verify with ModelSim
    Synthesize with Synplify
      Linux / Solaris / WinNT
    Place and Route with Xilinx Foundation tools
      WinNT / Linux (with wine)

Page 48: Annapolis Wildstar FPGA Board

ModelSim
  VHDL Simulation tool
  Annapolis provides
    Host simulation components
    VHDL Description of the WHOLE board
      LAD
      Memories (Local & Mezzanine)
      Buses
      Etc.
  You provide
    VHDL to run inside the chip
      (May contain Annapolis components as well)
  Talk to me if you want to use ModelSim to debug!

Page 49: Annapolis Wildstar FPGA Board

Synplify
  Synplicity Inc.
  Converts VHDL (or Verilog) into an EDIF
    EDIF = description of your program in terms of Virtex parts (4 input LUTs, FlipFlops, Ramblocks, Etc)
  Fast
    1-30 minutes

Page 50: Annapolis Wildstar FPGA Board

Place and Route
  Maps to lower level components
  Lays them out
  Routes between them
  Slow
    10 minutes – 2 days
  Provides a bitstream (.bit file)
    directly converted to .x86 for config

Page 51: Annapolis Wildstar FPGA Board

Paths & Environment
  Need environment variables and path additions
    Add this to the end of your .cshrc:
      source ~cs670/WildExamples/cshrc_additions
  If you use bash, sh, zsh, etc.
    You’re on your own!
    Look at the file, figure it out!
    OR
    Use csh or tcsh!

Page 52: Annapolis Wildstar FPGA Board

Examples
  ~cs670/WildExamples/csu_example
    Basic CSU-made example using only PE1
    Copies 1Mb from Left to Right Local Mem
  ~cs670/WildExamples/annap_example
    All the Annapolis supplied examples
    May need path adjusting, etc.
    Not meant to work as is
    Useful to get a feel for other stuff

Page 53: Annapolis Wildstar FPGA Board

Hints
  Timing
    Count MClock, and put it in a RegFile
    Cycles / Freq = Time
    Host timing is too coarse
  “Start / Stop” and “Working / Done”
    Use a RegFile – Easier than Interrupts
      (Haven’t gotten them to work with LAD Mux)
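
The timing hint boils down to one division. Here it is as a short Python sketch, with one extra caveat: a count stored in a single 32-bit register wraps eventually. Both function names are made up, and the cycle count in the example is invented.

```python
def elapsed_seconds(cycles, mclock_hz):
    """Cycles / Freq = Time."""
    return cycles / mclock_hz


def seconds_until_wrap(mclock_hz, counter_bits=32):
    """How long a counter of this width can run before wrapping to zero."""
    return (1 << counter_bits) / mclock_hz


# Example: the chip reported 1,000,000 MClock cycles at MClock = 50 MHz.
print(elapsed_seconds(1_000_000, 50e6), "seconds")
print(seconds_until_wrap(100e6), "seconds before a 32-bit count wraps at 100 MHz")
```

For runs longer than the wrap time, the count would need a second register for the upper bits.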

Page 54: Annapolis Wildstar FPGA Board

Manuals
  Ask Sanjay! =)
  1 copy of our HUGE Starfire / Wildstar manuals
    I have the original…
    You may use it near my desk…
    If it wanders from my cube
      Broken Legs

Page 55: Annapolis Wildstar FPGA Board

HELP!
  Bugs? - “99% correct is 100% Wrong”
    1 – Reread your VHDL and host code
      Silly bugs are easy to make, and spot
    2 – Simulate it
      You can see the signals.
      It almost always agrees with the actual hardware
    3 – Simulate again
      No Really… Simulate it!
    4 – Look in the manuals
      Helpful sometimes…
    5 – [email protected]

Page 56: Annapolis Wildstar FPGA Board

BUGS!!!!
  Querying the LAD bus speed in host code will return 66MHz if the LAD Bus was *EVER* at 66MHz since last reboot… even if it is *CURRENTLY* at 33MHz!
    DON’T USE IT, EVER!
  The Data_Valid Signals are WRONG! They appear to be delayed 5 cycles instead of 4 in the real code. They are correct in simulation.
    Use a 4 cycle delay on (Req and Ack) Instead!
    Use the simulation to ensure your delayed signal matches
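
The Data_Valid workaround is a 4-stage shift register fed with (Req and Ack). On the chip this would be a few lines of VHDL; the Python below only models the behaviour cycle by cycle so the expected delay is easy to check against the simulator. The class and function names are invented for the sketch.

```python
from collections import deque


class DelayLine:
    """N-stage shift register: the output is the input from N clocks ago."""

    def __init__(self, stages=4):
        self.stages = deque([0] * stages, maxlen=stages)

    def clock(self, value):
        oldest = self.stages[0]
        self.stages.append(value)  # maxlen pushes the oldest entry out
        return oldest


def delayed_valid(req_trace, ack_trace, stages=4):
    """Home-made Data_Valid: (Req and Ack), delayed by `stages` cycles."""
    line = DelayLine(stages)
    return [line.clock(r & a) for r, a in zip(req_trace, ack_trace)]


req = [1, 1, 0, 0, 0, 0, 0, 0]
ack = [0, 1, 0, 0, 0, 0, 0, 0]  # the request is acknowledged on cycle 1
print(delayed_valid(req, ack))   # valid shows up 4 cycles later, on cycle 5
```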

Page 57: Annapolis Wildstar FPGA Board

Let's Look at it!
  Lemme open emacs…
    VHDL
    Host Code
    Execution
    Simulation
      Little Wiggly Green Wires!