BEE3 Update Chuck Thacker John Davis Microsoft Research 10 June, 2007
June 2007 RAMP Tutorial BEE3 Update Chuck Thacker John Davis Microsoft Research 10 June, 2007.

BEE3 Update

Chuck Thacker, John Davis

Microsoft Research

10 June, 2007

Outline

• What is BEE3?

• BEE3 board properties

• BEE3 gateware

• BEE3 schedule

What is BEE3?

• Follow-on to BEE2 (BWRC, 2004)
• Board with several highly-connected FPGAs
• Vehicle for computer architecture research
  – Microsoft’s primary interest
• Potential platform for high performance DSP applications
  – Astronomers, and perhaps others.
• Allows large scale architectural experiments
  – Although perhaps not as large as originally hoped
  – And certainly not at the speed of a real implementation

• Can scale smoothly from a single board to 64 boards (256 FPGAs)

BEE2

BEE2 – BEE3 Differences

• 4 Xilinx Virtex 5 vs 5 Virtex 2 Pro FPGAs
  – We use XC5VLX110T-ff1136
  – V2Pro is now obsolete (130nm)
  – V5 is a major improvement (65nm)
    • 6-input LUT (64 bit DP RAM)
    • Better Block RAMs
    • Improved interconnect
    • Better signal integrity
• 8 Infiniband/CX4 channels vs 18
• 4 x8 PCI Express Low Profile slots

BEE2 – BEE3 Differences (2)

• 2 Banks DDR2 x 2 vs 4 Banks DDR2 x 1
  – 64 GB capacity with 4 GB DIMMs
  – Lower total bandwidth, but higher per-channel rate
    • 500 MT/s vs 400
  – Mandated by fewer signal pins on V5
• 4 10/100/1000 Ethernet channels
• No PowerPCs
  – The V5 version with PowerPCs has not yet been released by Xilinx
    • When it is, we can use it
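The capacity and bandwidth claims on this slide follow from simple arithmetic. The sketch below is a minimal check, assuming that "2 Banks DDR2 x 2" means two DDR2 channels per FPGA with two DIMMs each, that "4 Banks DDR2 x 1" means four channels with one DIMM each on BEE2, and that channel width is the same on both boards. The function and variable names are illustrative and do not come from the slides.

```python
# Assumed reading of the slide: "N Banks DDR2 x M" = N channels per FPGA,
# M DIMMs per channel. All names here are illustrative only.

def board_capacity_gb(fpgas, channels_per_fpga, dimms_per_channel, dimm_gb):
    """Total DRAM on one board, in GB."""
    return fpgas * channels_per_fpga * dimms_per_channel * dimm_gb


def transfers_per_fpga(channels_per_fpga, mt_per_s):
    """Aggregate memory transfer rate per FPGA in MT/s (equal-width channels assumed)."""
    return channels_per_fpga * mt_per_s


# BEE3: 4 FPGAs, 2 channels x 2 DIMMs each, 4 GB DIMMs, 500 MT/s per channel.
bee3_capacity_gb = board_capacity_gb(fpgas=4, channels_per_fpga=2,
                                     dimms_per_channel=2, dimm_gb=4)
bee3_rate = transfers_per_fpga(channels_per_fpga=2, mt_per_s=500)

# BEE2 (per FPGA): 4 channels x 1 DIMM each, 400 MT/s per channel.
bee2_rate = transfers_per_fpga(channels_per_fpga=4, mt_per_s=400)

print(f"BEE3 board capacity: {bee3_capacity_gb} GB")
print(f"Per-FPGA aggregate: BEE3 {bee3_rate} MT/s vs BEE2 {bee2_rate} MT/s")
```

Under these assumptions the board holds 64 GB. Each BEE3 channel runs 25% faster than a BEE2 channel, while the per-FPGA aggregate drops from 1600 to 1000 MT/s, which is consistent with "lower total bandwidth, but higher per-channel rate".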

BEE2 – BEE3 Differences (3)

• Divided the system into two boards, Main and Control
  – Main board has FPGAs, all high speed logic
  – Control board handles downloading, monitoring
    • Being designed at BWRC
  – Simplifies main board engineering; can design control board in parallel
  – Initially, will have a simplified control board. System bring-up uses JTAG.
• Smaller main board
  – 211 vs 374 in²
  – Fewer layers for lower cost
• Much more “PC-like”
• Fewer on-board peripheral interfaces
  – Those that are there will work
• Uses PC power supplies, peripherals
• Schematic is complete, layout is in progress
  – Fits in 2U enclosure
  – Much more attention is being given to thermal design
  – Must pass UL, FCC

BEE3 Main Board

[Figure: main board block diagram. Four user FPGAs (User1 to User4, each a 5VLXT). Labels per FPGA: DDR2 DIMM0/DIMM1 and DDR2 DIMM2/DIMM3, a 40x2 QSH-DP-040 connector, two CX4 ports, and one PCI-E 8X slot. Link labels in the drawing: 72* and 133.]

BEE3 Board Layout

[Figure: board layout drawing. Labeled parts: four 5VLXT FF1136 FPGAs, sixteen 4 GB DDR2-667 DRAM DIMMs, two Fujitsu 2x2 CX4 connectors, four PCI-Express 8x slots, four QSH-DP-040 connectors, two RJ45 jacks, JTAG, a 50 pin 2mm header, 24 pin ATX power, 12V 8-pin and 12V 4-pin power connectors, and 1.0V, 1.8V and 2.5V supplies. The drawing also carries dimension annotations (for example 305.00 and 380.00) that are not recoverable as a layout here.]

BEE3 Package

BEE3 Package Front View

Bandwidths (per-FPGA)

• Memory
  – 500 MT/s * 9 B/T * 2 channels: 9.0 GB/s
• Ring
  – 500 MT/s * 9 B/T * 2 channels: 9.0 GB/s
• QSH
  – 400 MT/s * 10 B/T: 4 GB/s
• Ethernet
  – 125 MB/s
• CX4
  – 1.25 GB/s * 2 directions * 2 channels: 5 GB/s
• PCI Express
  – Same as CX4
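The per-FPGA figures above are products of transfer rate, bytes per transfer, and channel count. The following sketch reproduces them as a sanity check. It uses only the numbers on the slide; the helper name `bandwidth_gb_s` and the dictionary keys are illustrative, and the Ethernet figure assumes the 1000 Mb/s mode with 8 bits per byte.

```python
# Reproduces the slide's per-FPGA bandwidth arithmetic.
# Helper and key names are illustrative, not taken from the slides.

def bandwidth_gb_s(mt_per_s, bytes_per_transfer, channels=1):
    """MT/s * bytes per transfer * channels gives MB/s; divide by 1000 for GB/s."""
    return mt_per_s * bytes_per_transfer * channels / 1000


# CX4: 1.25 GB/s per direction, 2 directions, 2 channels.
cx4_gb_s = 1.25 * 2 * 2

per_fpga_gb_s = {
    "memory": bandwidth_gb_s(500, 9, channels=2),
    "ring": bandwidth_gb_s(500, 9, channels=2),
    "qsh": bandwidth_gb_s(400, 10),
    "ethernet": 1000 / 8 / 1000,  # assumed: 1000 Mb/s -> 125 MB/s -> 0.125 GB/s
    "cx4": cx4_gb_s,
    "pci_express": cx4_gb_s,      # the slide states "Same as CX4"
}

for name, gb_s in per_fpga_gb_s.items():
    print(f"{name:12s} {gb_s:6.3f} GB/s")
```

The 9 B/T figure for memory and ring corresponds to a 72-bit transfer, which matches the "72*" labels in the block diagram; whether the ninth byte carries ECC or data is not stated in the slides.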

Initial Gateware

• Mostly things required for production testing and board characterization:
• Signal connectivity checks
  – Some are AC coupled, must test at speed
  – QSH tested with a crossover card
• Temperature and power supply monitoring
• DDR2 Controller
  – Useful in other designs
• CX4 and PCI Express for at-speed tests
• Xilinx and others have lots of IP

Project Participants and Roles

• Microsoft Research (Silicon Valley)
  – Funds, manages system engineering, does some gateware
• Celestica (Ottawa and Shanghai)
  – Does main board engineering, prototype fabrication
  – Microsoft has a very deep relationship with Celestica
• TBD (Maybe Celestica, maybe ???)
  – Builds and delivers functioning systems
• Function Engineering (Palo Alto)
  – Does thermal and mechanical engineering
• Xilinx (San Jose)
  – Provides FPGAs for academic machines (slowest grade)
  – Provides FPGA application expertise
• RAMP Group (BWRC)
  – Control board, basic software
• RAMP Community
  – Uses the systems for research
  – Expanding to industrial users (e.g., us)

Schedule

• Generate Specification – Done
• Schematic Entry – Done
• Board Layout – Started
• Thermal modeling, heat sink design – Started
• Chassis design – Started
• Signal Integrity – Imminent
• Prototypes: Late Summer – Bring-up starts
• Production: Start winter ’07

Why is Microsoft interested?

• We believe the overall RAMP effort can have significant impact, and want to support it in the most effective way we can.
  – Simply paying for grad students seems suboptimal
• We observe that universities aren’t very good at this sort of system engineering and production.
  – Grad students are great for many things, but things like board layout aren’t among them.
  – Requires deep understanding of tools and production processes. Pros have this.
  – We can open doors that academia can’t.
  – We have experience in managing this sort of program.
• We want the systems ourselves
  – As infrastructure for our new effort in computer architecture (yes, this is a recruiting pitch).
• We also want systems to be available to other industrial users
  – This might be more difficult if the systems came from academia.
  – But we don’t want to be in the hardware business.