FPGA Defect Tolerance: Impact of Granularity
Anthony Yu, Guy Lemieux
December 14, 2005

Transcript
Page 1: FPGA Defect Tolerance: Impact of Granularity

Anthony Yu, Guy Lemieux
December 14, 2005

Page 2: Outline
Field-Programmable Technology (FPT) '05

• Introduction and motivation
• Previous works
• New architectures
  • Coarse-grain redundancy (CGR)
  • Fine-grain redundancy (FGR)
• Experimentation results
• Conclusions

Page 3: Introduction and Motivation

• Scaling introduces new types of defects
• Smaller feature sizes susceptible to smaller defects
• Expected results
  • Defects per chip increases
  • Chip yield declines
• FPGAs are mostly interconnect
• FPGAs must tolerate multiple interconnect defects to improve yield (and $$$)
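
The trend on this slide (more defects per chip, lower yield) can be illustrated with the textbook Poisson yield model. This model is an assumption made here for illustration and is not taken from the talk: if the defect count per chip is Poisson-distributed with mean `mean_defects`, a chip with no redundancy is good only when it has zero defects, while an idealized chip that can repair up to `tolerated` defects is good whenever the count does not exceed that number. The function name and parameters are hypothetical.

```python
import math


def poisson_yield(mean_defects: float, tolerated: int = 0) -> float:
    """Probability that a chip has at most `tolerated` defects.

    Assumes the defect count is Poisson-distributed with mean
    `mean_defects` (a textbook model, not the talk's yield model).
    """
    return sum(
        math.exp(-mean_defects) * mean_defects ** k / math.factorial(k)
        for k in range(tolerated + 1)
    )


if __name__ == "__main__":
    # Yield falls as the mean defect count grows; tolerating a few
    # defects recovers part of the loss.
    for mean in (0.5, 1.0, 2.0, 4.0):
        no_repair = poisson_yield(mean)
        repair_two = poisson_yield(mean, tolerated=2)
        print(f"mean={mean}: no repair {no_repair:.3f}, tolerate 2 {repair_two:.3f}")
```

Under this model, tolerating even two defects raises yield noticeably once the mean defect count approaches one per chip, which is the motivation for tolerating multiple defects.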

Page 4: General Defect-Tolerant Techniques

• Defect-tolerant techniques minimize impact (cost) of manufacturing defects
• FPGA defect-tolerance can be loosely categorized into three classes:
  • Software Redundancy – use CAD tools to map around the defects
  • Hardware Redundancy – incorporate spare resources to assist in defect correction (e.g. spare row/column)
  • Run-time Redundancy – protection against transient faults such as SEUs (e.g. TMR)

Page 5: Previous work – 1 – Xilinx

• Xilinx’s Defect-Tolerant Approach
  • Customer (knowingly) purchases “less than perfect” parts
  • Customer gives Xilinx configuration bitstream
  • Xilinx tests FPGA devices against bitstream
  • Sells FPGA parts that “appear” perfect
  • Defects lie only in resources the bitstream does not use
• Limitation:
  • Chips work only with given bitstream – no changes!

Page 6: Previous work – 2 – Altera

• Altera’s Defect-Tolerant Approach
  • Customer purchases “seemingly perfect” parts
  • Make defective resources inaccessible to user
• Coarse-grain architecture
  • Spare row and column in array (like memories)
  • Defective row/column must be bypassed
  • Use the spare row/column instead
• Limitation:
  • Does not scale well (multiple defects)

Page 7: Objective

• Problem
  • FPGA yield is declining because of aggressive technology scaling
• Proposed Solutions
  • Defect-tolerance through redundancy
• Important Objectives
  • Interconnect defects important (interconnect dominates area)
  • Tolerate multiple defects (future trend)
  • Preserve timing (no timing re-verification)
  • Fast correction time (production use)
  • Understand the factors that influence yield

Page 8: Background

Page 9: Island-style FPGA

Page 10: Directional Switch Block

Page 11: Directional Switch Block

Page 12: Coarse-grain Redundancy (CGR)

Page 13: Coarse-grain Redundancy (CGR)

[Figure: two arrays, each with a Row Decoder. Fault Free: Spare Row and Wire Extensions shown. Faulty: Defect and Bypassed Row shown.]

F. Hatori et al., “Introducing Redundancy in Field Programmable Gate Arrays,” presented at Custom Integrated Circuits Conference, 1993.
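
The row bypass in this figure can be sketched as a remapping from logical rows (what the user's design sees) to physical rows (what exists on the die). This is a minimal sketch under two assumptions not stated on the slide: the spare row sits at the far end of the array, and every row at or past the faulty one shifts by one position. The names `physical_row` and `row_map` are hypothetical.

```python
from typing import List, Optional


def physical_row(logical_row: int, faulty_row: Optional[int]) -> int:
    """Physical row that implements `logical_row`.

    Assumes one spare row at the end of the array. Rows before the
    faulty row are unchanged; rows at or after it move one position
    toward the spare, so the faulty row is never used.
    """
    if faulty_row is None or logical_row < faulty_row:
        return logical_row
    return logical_row + 1


def row_map(num_rows: int, faulty_row: Optional[int]) -> List[int]:
    """Mapping for all `num_rows` logical rows of the array."""
    return [physical_row(r, faulty_row) for r in range(num_rows)]


if __name__ == "__main__":
    print(row_map(4, None))  # fault free: spare row 4 stays unused
    print(row_map(4, 1))     # physical row 1 is bypassed
```

Because only one row can be skipped, a second defective row cannot be handled by this scheme.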

Page 14: So…what’s wrong with it?

[Figure: Spare Row and Column. Yield versus Number of Defects for array sizes 32x32, 64x64, 128x128 and 256x256.]
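
A rough Monte Carlo sketch shows why yield collapses for this scheme once there are more than two defects. It assumes each defect is a single point placed uniformly at random on an n x n array, and that a chip is repairable when one spare row plus one spare column can cover every defect. Defects in the spares themselves are ignored. This is an illustrative model, not the yield model used to produce the plot; `repairable` and `estimate_yield` are hypothetical names.

```python
import random


def repairable(defects):
    """True if one spare row plus one spare column covers all defects."""
    if not defects:
        return True
    # Try replacing each row that holds a defect; whatever remains
    # must then fit in a single column.
    for row in {r for r, _ in defects}:
        remaining_cols = {c for r, c in defects if r != row}
        if len(remaining_cols) <= 1:
            return True
    return False


def estimate_yield(n, num_defects, trials=2000, seed=0):
    """Fraction of random defect placements that are repairable."""
    rng = random.Random(seed)
    good = 0
    for _ in range(trials):
        defects = [(rng.randrange(n), rng.randrange(n)) for _ in range(num_defects)]
        good += repairable(defects)
    return good / trials


if __name__ == "__main__":
    for n in (32, 64):
        for d in (1, 2, 3, 4):
            print(f"{n}x{n}, {d} defects: {estimate_yield(n, d):.3f}")
```

In this model one or two defects are always repairable, while three or more are repairable only when defects happen to share a row or column, which becomes less likely as the array grows.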

Page 15: Improving yield for CGR – Adding Multiple Global Spares

• Add multiple global spares to traditional CGR
• Global spares can be used to repair any defective row/column in the array
• Wire extensions are now longer

Page 16: Yield Impact of Multiple Global Spares

[Figure: Global Spare Rows+Columns (32x32). Yield versus Number of Defects; series: Baseline, 2 Global, 4 Global.]
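
With several global spares, deciding whether a given set of defects is repairable becomes a small covering problem: choose at most `spare_rows` rows and `spare_cols` columns so that every defect lies in a chosen row or column. The sketch below uses a simple branching search and again assumes point defects at array coordinates. It is one way to check repairability, not necessarily the method used in the study, and `can_repair` is a hypothetical name.

```python
def can_repair(defects, spare_rows, spare_cols):
    """True if the defects fit in `spare_rows` rows plus `spare_cols` columns.

    Branching search: the first uncovered defect must be fixed either by
    replacing its row or by replacing its column, so try both options.
    """
    if not defects:
        return True
    row, col = defects[0]
    if spare_rows > 0:
        rest = [d for d in defects if d[0] != row]
        if can_repair(rest, spare_rows - 1, spare_cols):
            return True
    if spare_cols > 0:
        rest = [d for d in defects if d[1] != col]
        if can_repair(rest, spare_rows, spare_cols - 1):
            return True
    return False


if __name__ == "__main__":
    diagonal = [(i, i) for i in range(4)]
    print(can_repair(diagonal, 1, 1))  # four scattered defects, one spare each
    print(can_repair(diagonal, 2, 2))  # same defects, two spares each
```

The recursion depth is bounded by the total number of spares, so the search stays cheap for the handful of spares considered on these slides.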

Page 17: Increasing Area+Delay Overhead

[Figure: switch element with no spares, 1 global spare, 2 global spares, and 4 global spares]

• More spares, more mux overhead in every switch element
• 4 global spares may be impractical!

Page 18: Improving yield for CGR – Adding Multiple Local Spares

• Divide FPGA into subdivisions
• Each subdivision has local spare(s)
• Distributes spares across chip
  • Reduces mux area overhead (of Global scheme)
• Limitation:
  • Spare(s) can only repair defect within the subdivision
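
The limitation of local spares can be seen in a small sketch that considers rows only. It assumes the array's rows are split into equal bands, each band owning its own spare rows, and compares that against a global pool with the same total number of spares. The rows-only simplification and the names `local_repairable` and `global_repairable` are assumptions made for illustration.

```python
def local_repairable(defect_rows, num_rows, subdivisions, spares_each=1):
    """True if every subdivision has enough local spare rows.

    Rows are split into `subdivisions` equal bands; a band's spares can
    only replace defective rows inside that band.
    """
    if num_rows % subdivisions:
        raise ValueError("num_rows must divide evenly into subdivisions")
    band_size = num_rows // subdivisions
    bad_rows_per_band = {}
    for row in set(defect_rows):
        bad_rows_per_band.setdefault(row // band_size, set()).add(row)
    return all(len(rows) <= spares_each for rows in bad_rows_per_band.values())


def global_repairable(defect_rows, total_spares):
    """True if a global pool of spare rows covers all defective rows."""
    return len(set(defect_rows)) <= total_spares


if __name__ == "__main__":
    # Two defective rows in different halves: both schemes succeed.
    print(local_repairable([3, 20], 32, 2), global_repairable([3, 20], 2))
    # Two defective rows in the same half: only the global scheme succeeds.
    print(local_repairable([3, 9], 32, 2), global_repairable([3, 9], 2))
```

The second case is the trade-off named on the slide: local spares cost less mux area, but two defects that fall in the same subdivision exhaust its single spare.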

Page 19: Yield Impact of Multiple Local Spares (not as good as Global with same # spares)

[Figure: Local Spare Rows+Columns (32x32). Yield versus Number of Defects; series: Baseline, 2 Global, 4 Global, 2 Sub 1 Spare, 4 Sub 1 Spare.]

Page 20: Fine-grain Redundancy (FGR)

Page 21: Fine-grain Redundancy (FGR) – Defect Avoidance by Shifting

[Figure: a) Original, b) Corrected. Labels: Defect, Spare; shift amounts +1 and -1.]
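
One plausible reading of this figure, sketched below, is that a signal whose track lies between the defect and the spare moves one track toward the spare for the defective wire segment only, then returns to its original track. The +1 and -1 labels would then be the shifts applied at the switch blocks on either side of the defect. The figure itself did not survive in the transcript, so the track numbering (spare at the highest index) and the names `shifted_route` and `boundary_shifts` are assumptions.

```python
def shifted_route(track, num_segments, defect_track, defect_segment):
    """Physical track used by a signal in each wire segment of its route.

    Assumes the spare is the highest-numbered track. In the defective
    segment, signals on tracks at or above the defect move up by one;
    everywhere else they stay on their original track.
    """
    route = []
    for segment in range(num_segments):
        shift = 1 if (segment == defect_segment and track >= defect_track) else 0
        route.append(track + shift)
    return route


def boundary_shifts(route):
    """Track change applied at each switch block between segments."""
    return [after - before for before, after in zip(route, route[1:])]


if __name__ == "__main__":
    route = shifted_route(track=2, num_segments=4, defect_track=1, defect_segment=2)
    print(route)                   # track used in each segment
    print(boundary_shifts(route))  # shift applied at each switch block
```

Because the signal returns to its original track right after the defect, the correction stays local to one segment, which fits the stated objective of preserving timing.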

Page 22: Defect-tolerant Switch Block

[Figure: a) Original, b) Defect-tolerant. Labels: omux, imux; shift positions -2, -1, 0, +1, +2.]

Page 23: Switch Implementation Options

• Several detailed implementations are possible
• Trade off area / delay / yield (repairability)

Page 24: Minimum Fault-free Radius (MFFR)

Page 25: Experimentation Results

• Switch implementation
• Array size
• Wire length
• Area
• Summary

Page 26: Switch Implementation

* Assumes all bridging defects

Page 27: Fixed Array Size (32x32) – Global Sparing

Page 28: Fixed Array Size (32x32) – Local Sparing

Page 29: Increasing Array Size

Page 30: Yield for Varying Wire Length

Page 31: Estimated Area Overhead at Equal Yield (80%)

* CGR-G1 can only tolerate 1-2 defects

Page 32: Limitations of Study & Architectures

• Logic and power/ground shorts were not considered
• Assumed that all defects are randomly distributed
• Assumed that all defects can be corrected with a single row/column
• Switch area was not accounted for in our yield model
• Area results for CGR are approximated

Page 33: Conclusions

• CGR is effective for 1 or 2 defects
• FGR meets desired objectives:
  • Tolerates multiple randomly distributed defects
  • Defect correction does not perturb timing
  • Tolerates an increasing number of defects as array size increases
  • Correction can be applied quickly

Page 34: Thank you!

[email protected]@ece.ubc.ca

Page 35: Summary

• As the density of FPGAs increases, they become increasingly susceptible to manufacturing defects
• Fault-redundant techniques alleviate this growing problem
• Depending on the desired level of protection, we can apply different techniques
• At low defect rates, the spare row and column approach has lower overhead than the fine-grain approach
• At large array sizes, the spare row and column approach requires more area overhead to tolerate the same number of defects as the fine-grain approach