On using BS to improve the reliability and availability of reconfigurable hardware
J. M. Martins Ferreira - University of Porto (FEUP / DEEC)
FEUP / DEEC - Rua Dr. Roberto Frias, 4200-537 Porto - PORTUGAL
Tallinn Technical University :: May 4th 2009 / May 5th 2009
This presentation is available at http://www.slideshare.net/josemmf
M. G. Gericota, G. R. Alves, M. Silva, J. M. Ferreira, “Reliability and Availability in Reconfigurable Computing: A Basis for a Common Solution,” IEEE Transactions on VLSI Systems, Vol. 16, No. 11, pp. 1545-1558, Nov. 2008.
Outline of this talk
1. Introduction
2. Concurrent replication of active CLBs
3. On-line structural concurrent test (better reliability)
4. Defragmentation (better availability)
5. Conclusion
Introduction
• Motivation
• Causes of failure in FPGAs
Motivation: An old problem becomes more important
• Dynamically reconfigurable FPGAs:
– Production tests cannot guarantee fault-free operation
– Application areas include mission-critical systems
– The cost / benefit of spatial redundancy is different from static implementations
Causes of failure in FPGAs
• Post-production failure modes may be permanent or temporary. Examples:
– Electromigration phenomena may lead to permanent physical damage
– Single-event upsets (SEUs) may cause permanent malfunction if not mitigated (modification of SRAM contents changes design and data information)
Concurrent replication of active CLBs
• The principle
• How it works
• Resources required (time, space)
Concurrent replication of CLBs: The principle
• The basic idea underlying release-to-test strategies consists of replicating a given functional block in another area (non-intrusively), and making the original resources available for test
[Figure: Relocation / Test / Rotation cycle; panels: Replication of functionality, Resources under test, Rotation of free resources]
Concurrent replication of CLBs: The principle
• Concurrent fault detection based on release-to-test approaches must provide functional and state replication
• Replication at CLB-level
– Facilitates state transfer and requires a minimal amount of spare resources
– The relative position of the replicated CLB and its replica has an impact on propagation delay
[Figure labels: CLB, IOB]
Concurrent replication of CLBs: How it works
• General replication principle – phase one:
– Copy the internal configuration of the replicated CLB into the replica CLB and place the inputs of both CLBs in parallel
[Figure: replicated CLB and CLB replica, each with In and Out]
Concurrent replication of CLBs: How it works
• General replication principle – phase two:
– Place the outputs of both CLBs in parallel (the replicated CLB may then be disconnected and made available for testing)
[Figure: replicated CLB and CLB replica, each with In and Out, shown in two stages]
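The two phases above can be illustrated with a small behavioural model. This is only a sketch under simplifying assumptions: a CLB is reduced to a purely combinational 4-input lookup table, a net driven by several CLBs is modelled as the set of values its drivers produce, and the names `Clb`, `net_value` and `relocate` are hypothetical rather than taken from the presentation.

```python
from itertools import product


class Clb:
    """Toy CLB: a 4-input lookup table (hypothetical model, no flip-flops)."""

    def __init__(self, lut=None):
        self.lut = lut  # dict mapping input tuples to 0/1, None if unconfigured

    def out(self, inputs):
        return self.lut[inputs]


def net_value(drivers, inputs):
    """Value on a net driven in parallel; raises if the drivers disagree."""
    values = {clb.out(inputs) for clb in drivers}
    if len(values) != 1:
        raise RuntimeError("contention: parallel drivers disagree")
    return values.pop()


def relocate(original, replica, inputs):
    """Return the net value seen by the circuit after each replication step."""
    seen = [net_value([original], inputs)]
    # Phase one: copy the configuration; inputs of both CLBs are in parallel.
    replica.lut = dict(original.lut)
    seen.append(net_value([original], inputs))
    # Phase two: place the outputs of both CLBs in parallel.
    seen.append(net_value([original, replica], inputs))
    # The original CLB is disconnected and becomes available for testing.
    seen.append(net_value([replica], inputs))
    return seen


# Example function: 4-input parity, checked for every input combination.
parity = {bits: sum(bits) % 2 for bits in product((0, 1), repeat=4)}
traces = [relocate(Clb(parity), Clb(), bits) for bits in parity]
glitch_free = all(len(set(trace)) == 1 for trace in traces)
print(glitch_free)
```

In this toy model the circuit sees the same output value at every step, because both drivers hold identical configurations while they are in parallel. Sequential CLBs also need their state transferred, which this sketch does not cover.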
Concurrent replication of CLBs: Replication aid block
• Supports state transfer in synchronous gated-clock circuits
[Figure: replicated cell, replica cell and replication aid block, connected from the circuit and to the circuit; signals: CLK, CE, RESET, CC, BY_C, FF_OUT, LOGIC_OUT]
Replication flow: Time & space needed
[Flowchart: the steps listed in the table below, with two wait loops ("> 2 CLK pulse" and "> 1 CLK pulse", Y / N decisions)]
Step                                                                               No. of bytes   Time (ms)
1. Copy the internal logic functionality and place the input signals in parallel   11 289         9.705
2. BY_C=1 & CC=1                                                                   441            0.379
3. CC=0                                                                            277            0.238
4. BY_C=0                                                                          277            0.238
5. Connect the clock enable inputs of both CLBs                                    2 145          1.844
6. Disconnect all the auxiliary relocation circuit signals                         2 217          1.906
7. Place the CLB outputs in parallel                                               4 129          3.550
8. Disconnect the original CLB outputs                                             1 333          1.146
9. Disconnect the original CLB inputs                                              3 986          3.438
Total                                                                              26 094         22.444
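The totals in the replication table can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the figures from the slide; the throughput value it derives (bytes per millisecond) is computed here and is not stated in the presentation, and the shortened step labels are paraphrases.

```python
# (step, bytes, time in ms) as read from the replication flow table.
STEPS = [
    ("Copy logic and parallel the inputs", 11289, 9.705),
    ("BY_C=1 & CC=1", 441, 0.379),
    ("CC=0", 277, 0.238),
    ("BY_C=0", 277, 0.238),
    ("Connect the clock enable inputs", 2145, 1.844),
    ("Disconnect auxiliary relocation signals", 2217, 1.906),
    ("Place the CLB outputs in parallel", 4129, 3.550),
    ("Disconnect the original CLB outputs", 1333, 1.146),
    ("Disconnect the original CLB inputs", 3986, 3.438),
]

total_bytes = sum(size for _, size, _ in STEPS)
total_ms = round(sum(ms for _, _, ms in STEPS), 3)
# Implied configuration throughput (derived here, not stated on the slide).
bytes_per_ms = total_bytes / total_ms

print(total_bytes, total_ms, round(bytes_per_ms))
```

The sums reproduce the slide's totals (26 094 bytes, 22.444 ms), and the ratio comes out at roughly 1163 bytes per millisecond, which suggests that step time scales with the amount of configuration data transferred.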
On-line structural concurrent test
• Fault model, test configurations
• Test application
• Rotation and release for test strategy
• Fault detection latency
Fault model and test configurations
• A hybrid fault model (stuck-at / functional) was adopted and the two CLB slices (each with 13 inputs and 6 outputs) are tested in parallel
Configuration   Number of test vectors   No. of bytes   Time (ms)
1st             16                       18 392         15.813
2nd             16                       3 115          2.678
3rd             2                        623            0.536
4th             2                        634            0.545
5th             2                        613            0.527
6th             2                        512            0.440
Total           40                       23 889         20.539
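As a rough illustration of what these figures imply, the sketch below re-adds the test configuration table and combines its total with the 22.444 ms replication total from the replication flow slide. The function `sweep_time_ms` and the CLB count passed to it are hypothetical, and the model assumes strictly sequential operation with no other overhead, which the presentation does not claim.

```python
# (configuration, test vectors, bytes, time in ms) as read from the table.
CONFIGS = [
    ("1st", 16, 18392, 15.813),
    ("2nd", 16, 3115, 2.678),
    ("3rd", 2, 623, 0.536),
    ("4th", 2, 634, 0.545),
    ("5th", 2, 613, 0.527),
    ("6th", 2, 512, 0.440),
]

vectors = sum(row[1] for row in CONFIGS)
test_bytes = sum(row[2] for row in CONFIGS)
test_ms = round(sum(row[3] for row in CONFIGS), 3)


def sweep_time_ms(n_clbs, replicate_ms=22.444, per_clb_test_ms=test_ms):
    """Rough time to replicate and test n_clbs one after another.

    Assumption: strictly sequential, no other overhead; n_clbs is hypothetical.
    """
    return n_clbs * (replicate_ms + per_clb_test_ms)


print(vectors, test_bytes, test_ms)
print(round(sweep_time_ms(100), 1))
```

Under these assumptions one replicate-and-test cycle costs about 43 ms per CLB, so the time for a full sweep grows linearly with the number of CLBs visited.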
Test application
• CLB testing via BS:
– Test vector application is done through a 13-bit user test data register
– Response capturing takes place through unused BS cells
[Figure: TDI, TDO, MUX, bypass register, instruction register, config. register, User Test Register, CLBs under test (IN / OUT)]
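The serial loading of a test vector into a 13-bit register can be sketched as a plain shift register. This is a generic model of a scan-style test data register, not the register design used by the authors; the class name `UserTestRegister` and the bit ordering are assumptions. The shift-out at the end only demonstrates the serial behaviour; in the presentation, responses are captured through unused BS cells.

```python
class UserTestRegister:
    """Toy model of a serial test data register (hypothetical class name)."""

    def __init__(self, length=13):
        self.bits = [0] * length  # bits[0] is the cell nearest TDI

    def shift(self, tdi):
        """One shift clock: take a bit from TDI, return the bit leaving on TDO."""
        tdo = self.bits[-1]
        self.bits = [tdi] + self.bits[:-1]
        return tdo

    def load(self, vector):
        """Shift a whole vector in and return the bits pushed out on TDO."""
        return [self.shift(bit) for bit in vector]


reg = UserTestRegister()
vector = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1]  # arbitrary 13-bit pattern
reg.load(vector)
# After 13 clocks the first bit shifted in sits in the cell nearest TDO.
applied = reg.bits[::-1]
shifted_out = reg.load([0] * 13)
print(applied == vector, shifted_out == vector)
```

One vector therefore costs 13 shift clocks in this model, before any instruction or update overhead of the real test access port is counted.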
Rotation strategy
• Vertical rotation has an advantage in the case of arithmetic circuits that use the dedicated carry interconnection between (vertically) adjacent CLBs
• In the general case, we should consider such factors as the number of circuits with high fanout and the shape / orientation of the implementation
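The difference between the two rotation directions can be sketched as the order in which the free CLB travels through the array. The traversal orders, the function names and the "same-column move" metric below are assumptions made for illustration; the presentation does not specify them.

```python
def rotation_path(rows, cols, vertical=True):
    """Yield (row, col) positions visited by the free CLB during one sweep.

    Vertical rotation walks down each column before moving to the next one;
    horizontal rotation walks along each row. (Assumed traversal order.)
    """
    if vertical:
        for col in range(cols):
            for row in range(rows):
                yield (row, col)
    else:
        for row in range(rows):
            for col in range(cols):
                yield (row, col)


def same_column_moves(path):
    """Count moves in which the free CLB stays in its column (hypothetical metric)."""
    return sum(1 for a, b in zip(path, path[1:]) if a[1] == b[1])


vertical = list(rotation_path(4, 3, vertical=True))
horizontal = list(rotation_path(4, 3, vertical=False))
print(same_column_moves(vertical), same_column_moves(horizontal))
```

In this model a vertical sweep keeps most relocations inside one column, which is where the dedicated carry lines between vertically adjacent CLBs run, while a horizontal sweep never does.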
Replicate and release-to-test in a 24-bit counter (example)
[Figure: slices CLB_R21C7.S0 to CLB_R24C7.S0 with CIN, COUT, BX and YB pins and dedicated carry lines]
[Figure: maximum frequency of operation (MHz, axis 0 to 160) versus number of relocations (0 to 12), for vertical rotation and horizontal rotation; the slice diagram is annotated with delays Tbxcy and Tbyp and nets U1/C6/C12/C1/O, U1/C6/C14/C1/O and U1/C6/C16/C1/O]
Rotation strategy: ITC’99 benchmarks
Circuit reference   # PI   # PO   # gates   # FF   Carry logic lines   Carry logic segments
B01                 2+2    2      47        5      0                   0
B02                 1+2    1      29        4      0                   0
B03                 4+2    4      150       30     0                   0
B04                 11+2   8      606       66     4                   14
B05                 1+2    36     977       34     4                   16
B06                 2+2    6      61        9      0                   0
B07                 1+2    8      422       49     2                   6
B08                 9+2    4      168       21     0                   0
B09                 1+2    1      160       28     0                   0
B10                 11+2   6      190       17     0                   0
B11                 7+2    6      484       31     1                   4
B12                 5+2    6      1037      121    0                   0
B13                 10+2   10     343       53     1                   4
B14                 32+2   54     4787      245    11                  150
Rotation strategy: ∆f and size for the ITC’99 circuits
Fragmentation: Why?
• The absence of faults does not guarantee acceptable availability, namely when function swapping / partial reconfiguration occurs frequently
• Insufficient contiguous resources will delay incoming functions
[Figure: resource allocation (2-D spatial, x and y) over time, from the initial config. through the 1st, 2nd, ... nth partial reconfig. (temporal dimension)]
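The second point can be made concrete with a small occupancy grid. The sketch below assumes, for illustration only, that an incoming function needs a free rectangular block of CLBs; the 4 x 4 array, the function names and the rectangular-placement rule are hypothetical.

```python
def fits(grid, height, width):
    """Return True if a free height x width rectangle exists in grid (0 = free)."""
    rows, cols = len(grid), len(grid[0])
    for r in range(rows - height + 1):
        for c in range(cols - width + 1):
            if all(grid[r + dr][c + dc] == 0
                   for dr in range(height) for dc in range(width)):
                return True
    return False


def free_count(grid):
    """Number of free CLBs, regardless of where they are."""
    return sum(cell == 0 for row in grid for cell in row)


# Hypothetical 4 x 4 CLB array; 1 marks a CLB used by a running function.
fragmented = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
print(free_count(fragmented), fits(fragmented, 2, 2))
```

Here half of the array is free, yet a function needing a 2 x 2 block cannot be placed, so it would have to wait even though enough resources exist in total.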
Can concurrent replication help?
• Concurrent replication of active CLBs may be used to defragment the FPGA and minimise the implementation delay to incoming functions
– Defragmentation is performed concurrently with all running functions (no need to halt their execution)
– Coherency of the register contents is guaranteed, preserving all state information
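A minimal sketch of the idea, reduced to one dimension: occupied slots are moved one at a time toward one end of a column, and each move copies the register contents before the original slot is released. The slot layout, the `state` field and the compaction order are assumptions; the presentation itself notes that a higher-level defragmentation strategy is still missing.

```python
def largest_free_run(slots):
    """Length of the longest run of free (None) slots."""
    best = run = 0
    for slot in slots:
        run = run + 1 if slot is None else 0
        best = max(best, run)
    return best


def defragment(slots):
    """Compact occupied slots toward index 0, one relocation at a time.

    Each move copies the function and its register state into a free slot
    before the original slot is released, loosely mirroring replication.
    """
    slots = list(slots)
    target = 0
    for i, slot in enumerate(slots):
        if slot is None:
            continue
        if i != target:
            slots[target] = dict(slot)  # replica holds the same state
            slots[i] = None             # original resources are released
        target += 1
    return slots


# Hypothetical column of 8 CLB slots; "state" stands for flip-flop contents.
before = [{"fn": "A", "state": 5}, None, {"fn": "B", "state": 9}, None,
          None, {"fn": "C", "state": 2}, None, None]
after = defragment(before)
print(largest_free_run(before), largest_free_run(after))
print([slot["state"] for slot in after if slot is not None])
```

In this example the largest contiguous free region grows from 2 to 5 slots while every function keeps its state, which is the property the slide attributes to concurrent replication.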
Conclusion
• Summary
• Research topics
Summary
• Concurrent replication offers a powerful and non-intrusive solution to improve reliability and availability of reconfigurable hardware
• Paralleling CLB inputs and outputs doesn’t create any problem
• Boundary-scan provides a valuable contribution to implement an on-line concurrent structural test strategy
Research topics
• Concurrent replication of active CLBs offers a powerful tool for defragmentation purposes, but the higher-level strategy is still missing
• All aspects of the proposed solutions were validated in practice (lab experimentation), but a software tool to fully automate the reconfiguration process is still missing