CHAPTER I
INTRODUCTION
1.1 INTRODUCTION ABOUT THE PROJECT
Nanotechnology provides smaller, faster, and lower energy devices which
allow more powerful and compact circuitry; however, these benefits come with
a cost—the nanoscale devices may be less reliable. Thermal- and shot-noise
estimations alone suggest that the transient fault rate of an individual nanoscale
device (e.g., transistor or nanowire) may be orders of magnitude higher than
today’s devices. As a result, we can expect combinational logic to be
susceptible to transient faults in addition to storage cells and communication
channels. Therefore, the paradigm of protecting only memory cells and
assuming the surrounding circuitries (i.e., encoder and decoder) will never
introduce errors is no longer valid. In this paper, we introduce a fault-tolerant
nanoscale memory architecture which tolerates transient faults both in the
storage unit and in the supporting logic (i.e., encoder, decoder (corrector), and
detector circuitries). Particularly, this involves identifying a class of error-
correcting codes (ECCs) that guarantees the existence of a simple fault-tolerant
detector design. This class satisfies a new, restricted definition for ECCs which
guarantees that the ECC codeword has an appropriate redundancy structure
such that it can detect multiple errors occurring in both the stored codeword in
memory and the surrounding circuitries. We call this type of error-correcting
codes, fault-secure detector capable ECCs (FSD-ECC). The parity-check
Matrix of an FSD-ECC has a particular structure that the decoder circuit,
generated from the parity-check Matrix, is Fault-Secure. The ECCs we identify
in this class are close to optimal in rate and distance, suggesting we can
achieve this property without sacrificing traditional ECC metrics. We use the
fault-secure detection unit to design a fault-tolerant encoder and corrector by
monitoring their outputs. If a detector detects an error in either of these units,
that unit must repeat the operation to generate the correct output vector. Using
this retry technique, we can correct potential transient errors in the encoder and
corrector outputs and provide a fully fault-tolerant memory system.
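The retry technique described above can be sketched in a few lines of code. This is a minimal software model, not the paper's nanowire implementation; `encode`, `parity_check_rows`, and `max_retries` are illustrative names, and the toy detector simply checks that every parity-check sum is zero.

```python
def detector(word, parity_check_rows):
    """Return True if any parity-check sum (syndrome bit) is nonzero."""
    for row in parity_check_rows:
        if sum(word[i] for i in row) % 2 != 0:
            return True  # error detected
    return False

def encode_with_retry(info, encode, parity_check_rows, max_retries=10):
    """Re-run a possibly faulty encoder until the detector accepts its output."""
    for _ in range(max_retries):
        codeword = encode(info)
        if not detector(codeword, parity_check_rows):
            return codeword  # detector sees no error; accept the output
    raise RuntimeError("encoder kept producing detectable errors")
```

The same monitoring loop applies unchanged to the corrector: its output vector is re-checked by a detector, and the operation is repeated on a detected error.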
The novel contributions of this paper include the following:
1. a mathematical definition of ECCs which have a simple FSD and
do not require the addition of further redundancy in order to
achieve the fault-secure property
2. identification and proof that an existing LDPC code (EG-LDPC)
has the FSD property
3. a detailed ECC encoder, decoder, and corrector design that can be
built out of fault-prone circuits when protected by this fault-secure
detector, itself also implemented in fault-prone circuits and
guarded with a simple OR gate built out of reliable circuitry.
To further show the practical viability of these codes, we work through the
engineering design of a nanoscale memory system based on these encoders and
decoders, including the following:
memory banking strategies and scrubbing
reliability analysis
a unified ECC scheme for both permanent memory bit defects and
transient upsets
This allows us to report the area, performance, and reliability achieved for
systems based on these encoders and decoders.
1.2 LITERATURE SURVEY
H. Naeimi and A. DeHon, “Fault secure encoder and decoder for memory applications,” in Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Syst., Sep. 2007, pp. 409–417.
This work proposed the concept of a nanowire-based, sub-lithographic memory
architecture tolerant to transient faults. Both the storage elements and the
supporting ECC encoder and corrector are implemented in dense, but
potentially unreliable, nanowire-based technology. This compactness is made
possible by a recently introduced fault-secure detector design [18]. Using
Euclidean geometry error-correcting codes (ECC), the authors identify particular
codes which correct up to 8 errors in data words, achieving a FIT rate at or
below one for the entire memory system for bit and nanowire transient failure
rates as high as 10^-17 upsets/device/cycle, with a total area below 1.7× the area
of the unprotected memory for memories as small as 0.1 Gbit. Scrubbing
designs are explored, showing that the overhead for serial error
correction and periodic data scrubbing can be below 0.02% for fault rates as
high as 10^-20 upsets/device/cycle. A design is presented to unify the error-
correction coding and circuitry used for permanent defect and transient fault
tolerance.
M. Davey and D. J. MacKay, “Low density parity check codes over GF(q),”
improvements due to area saving from logic folding and parallel data
processing.
CHAPTER V
SYSTEM MODULES
5.1 FAULT TOLERANCE APPROACH
A fault tolerance technique is based on at least one of three
types of redundancy: time, data, or hardware redundancy. Hardware
redundancy means the replication of hardware modules and some kind of result
comparison or voting instance. The inherent redundancy in field-
programmable logic resulting from the regular cell-based structure allows a
very efficient implementation of hardware redundancy. The faulty resource
must not be reused by the new configuration. After the reconfiguration, the
possible effect of the fault must be confined for some applications and the
circuit must be reset to a consistent state. Then the system can continue to
operate. The idea of an autonomous mechanism for fault detection and
reconfiguration at an appropriate speed, in terms of the regarded system, is the
starting point for the fault tolerance technique presented here. The technique
combines a scalable hardware-based fault detection mechanism with a fast
online fault reconfiguration technique and a check pointing and rollback
mechanism for fault recovery. The reconfiguration is based on a hardware-
implemented reconfiguration controller: the reconfiguration control unit
(RCU), in contrast to other online fault test and reconfiguration strategies
described elsewhere. The fault detection mechanism must provide the fault location and
trigger reconfiguration. The reconfiguration step must replace the current
configuration data set by an alternative configuration (which provides a fault-
avoiding mapping of the user circuit) and trigger recovery. The recovery step
must bring the whole system back into a consistent state. For a fast online
technique, such differentiations are too time-consuming and a simpler
approach must be taken: all faults are assumed to be permanent. Even under
this assumption, no general technique is available today which controls the
appropriate reconfiguration procedure.
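The detect, reconfigure, and recover phases above can be modeled schematically. This is only a sketch of the control flow under the section's assumptions (checkpointing plus rollback); the class and method names are invented for illustration and do not correspond to the RCU hardware.

```python
import copy

class FaultTolerantSystem:
    """Toy model of the detect -> reconfigure -> recover cycle."""

    def __init__(self, state):
        self.state = state
        self.checkpoint = copy.deepcopy(state)  # last consistent state
        self.config = "primary"

    def take_checkpoint(self):
        """Record a consistent state to roll back to after a fault."""
        self.checkpoint = copy.deepcopy(self.state)

    def on_fault_detected(self):
        # Reconfiguration: replace the current configuration data set with
        # an alternative that avoids the (assumed permanent) faulty resource.
        self.config = "alternative"
        # Recovery: roll the whole system back to a consistent state.
        self.state = copy.deepcopy(self.checkpoint)
```

A caller would checkpoint periodically during normal operation and invoke `on_fault_detected` when the detection mechanism fires.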
Fig.5.1: Phases of the fault tolerance technique.
The basic characteristics of fault tolerance require:
1. No single point of repair
2. Fault isolation to the failing component
3. Fault containment to prevent propagation of the failure
4. Availability of reversion modes
Fault-tolerant systems are typically based on the concept of redundancy.
5.2 NANOMEMORY ARCHITECTURE MODEL
This section presents the design structure of the encoder, corrector, and
detector units of our proposed fault-tolerant memory system. We also present the
implementation of these units on a sub-lithographic, nanowire-based substrate.
Before going into the design structure details we start with a brief overview of
the sub-lithographic memory architecture model.
Fig. 5.2. Structure of Nano Memory core
We use the Nano Memory and Nano PLA architectures to implement the
memory core and the supporting logic, respectively. Nano Memory and Nano
PLA are based on nanowire crossbars. The Nano Memory architecture
developed in can achieve greater than b/cm density even after including the
lithographic-scale address wires and defects. This design uses a nanowire
crossbar to store memory bits and a limited number of lithographic-scale wires
for address and control lines. Fig. 5.2 shows a schematic overview of this
memory structure. The nanowires can be uniquely selected through the two
address decoders located on the two sides of the memory core. Instead of using
a lithographic-scale interface to read and write into the memory core, we use a
nanowire-based interface. The reason that we can remove the lithographic-
scale interface is that all the blocks interfacing with the memory core (encoder,
corrector and detectors) are implemented with nanowire-based crossbars.
5.3 FAULT SECURE DETECTOR
The core of the detector operation is to generate the syndrome vector,
which is basically implementing the following vector-matrix multiplication on
the received encoded vector C and parity-check matrix H:
S = C · H^T
Fig.5.3: Fault-secure detector for (15, 7, 5) EG-LDPC code
This binary sum is implemented with an XOR gate. Fig. 5.3 shows the
detector circuit for the (15, 7, 5) EG-LDPC code. Since the row weight of the
parity-check matrix is ρ, generating one digit of the syndrome vector requires
one ρ-input XOR gate, or (ρ − 1) two-input XOR gates. The whole detector
therefore takes n(ρ − 1) two-input XOR gates. Table 5.1 illustrates
this quantity for some of the smaller EG-LDPC codes.
TABLE 5.1: Detector, encoder, and corrector circuit area; code parameters
(n, k, d) compared against the Hamming and Gilbert–Varshamov bounds

Hamming bound     EG-LDPC        Gilbert–Varshamov bound
(14,7,5)          (15,7,5)       (17,7,5)
(58,37,9)         (63,37,9)      (67,37,9)
(222,175,17)      (255,175,17)   (255,175,17)
An error is detected if any of the syndrome bits has a nonzero value. The
final error detection signal is implemented as an OR function of all the
syndrome bits; the output of this OR gate is the error detector signal
(see Fig. 5.3). In order to avoid a single point of failure, we must implement
the OR gate with a reliable substrate (e.g., in a system with a sub-lithographic
nanowire substrate, the OR gate is implemented with reliable lithographic
technology, i.e., a lithographic-scale wire-OR).
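As a concrete sketch, the syndrome computation S = C · H^T and the OR-based detector output can be written as follows. For brevity this uses a small (7, 4) Hamming parity-check matrix rather than the (15, 7, 5) EG-LDPC matrix, which is not reproduced in the text; the structure of the computation is the same.

```python
def syndrome(c, H):
    """S = c . H^T over GF(2): one XOR sum (parity check) per row of H."""
    return [sum(ci * hi for ci, hi in zip(c, row)) % 2 for row in H]

def error_detected(c, H):
    """Final detector signal: OR of all syndrome bits."""
    return any(syndrome(c, H))

# Illustrative parity-check matrix of a (7, 4) Hamming code.
H = [[1, 1, 1, 0, 1, 0, 0],
     [1, 1, 0, 1, 0, 1, 0],
     [1, 0, 1, 1, 0, 0, 1]]
```

A valid codeword yields the all-zero syndrome; any single corrupted bit makes at least one syndrome bit nonzero, driving the detector output high.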
5.4 ENCODER
An n-bit codeword c, which encodes a k-bit information vector i, is
generated by multiplying the k-bit information vector by the k × n bit
generator matrix G; i.e., c = i · G. EG-LDPC codes are not systematic, and the
information bits must be decoded from the encoded vector, which is not
desirable for our fault-tolerant approach due to the further complication and
delay that it adds to the operation. However, these codes are cyclic [15]. We
used the standard procedure to convert the cyclic generator matrices to
systematic generator matrices for all the EG-LDPC codes under consideration.
Fig. 5.4: Structure of an encoder circuit for the (15, 7, 5) EG-LDPC code
The above figure shows the encoder circuit that computes the parity bits of
the (15, 7, 5) EG-LDPC code. In this figure, i = (i0, …, i6) is the information
vector; it is copied to bits c0, …, c6 of the encoded vector c, and the
rest of the encoded vector (the parity bits) consists of linear sums (XORs) of
the information bits. If the building blocks are two-input gates, then the
encoder circuitry takes 22 two-input XOR gates. Table 5.1 shows the area of
the encoder circuit for each EG-LDPC code under consideration based on its
generator matrix.
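The systematic encoding operation c = i · G can be sketched as below. The (7, 4) generator matrix shown is a generic systematic example for illustration, not one of the EG-LDPC generator matrices from the text.

```python
def encode(info, G):
    """c = i . G over GF(2). With a systematic G = [I | P], the first k bits
    of c equal the information bits and the rest are XOR parity sums."""
    n = len(G[0])
    return [sum(b * G[r][col] for r, b in enumerate(info)) % 2
            for col in range(n)]

# A systematic (7, 4) generator matrix [I | P], for illustration only.
G = [[1, 0, 0, 0, 1, 1, 1],
     [0, 1, 0, 0, 1, 1, 0],
     [0, 0, 1, 0, 1, 0, 1],
     [0, 0, 0, 1, 0, 1, 1]]
```

Because G is systematic, the first four codeword bits are the information bits themselves, exactly as the figure's copy of i into c0, …, c6 suggests for the (15, 7, 5) code.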
5.5 CORRECTOR
1) ONE-STEP MAJORITY-LOGIC CORRECTOR
One-step majority-logic correction is a procedure that identifies the
correct value of each bit in the codeword directly from the received
codeword; this is in contrast to the general message-passing error correction
strategy (e.g., [23]), which may demand multiple iterations of error diagnosis
and trial correction. Avoiding iteration makes the correction latency both small
and deterministic.
This method consists of two parts:
1) generating a specific set of linear sums of the received vector bits;
2) finding the majority value of the computed linear sums.
A linear sum of the received encoded-vector bits can be formed by computing
the inner product of the received vector and a row of the parity-check matrix.
This sum is called a parity-check sum.
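A minimal sketch of the one-step majority-logic decision for a single bit, using a toy 3-bit repetition code rather than an EG-LDPC code; `check_rows` is an illustrative name for the parity-check rows (given as index sets) whose sums vote on the bit.

```python
def majority_correct_bit(received, check_rows, bit):
    """One-step majority-logic correction of one bit: each parity-check sum
    involving `bit` votes, and the bit is flipped if a majority of the sums
    are nonzero (i.e., unsatisfied)."""
    votes = [sum(received[i] for i in row) % 2 for row in check_rows]
    if sum(votes) > len(votes) // 2:
        received[bit] ^= 1  # a majority of checks say this bit is wrong
    return received
```

For the repetition codeword [1, 1, 1] with bit 0 flipped, both check sums containing bit 0 are unsatisfied, so the majority vote flips it back.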
2) MAJORITY CIRCUIT IMPLEMENTATION
Here we present a compact implementation of the majority gate using
sorting networks.
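One way to realize a majority gate with a sorting network, as suggested above: sort the inputs with compare-and-swap (min/max) stages and read the middle wire. The odd-even transposition network used here is one simple choice of network for illustration, not necessarily the specific network intended by the text.

```python
def majority_by_sorting(bits):
    """Majority of an odd number of bits via a sorting network: after the
    compare-and-swap stages, the middle wire carries the majority value."""
    b = list(bits)
    n = len(b)
    # Odd-even transposition network: n stages of compare-and-swap.
    for stage in range(n):
        for i in range(stage % 2, n - 1, 2):
            lo, hi = min(b[i], b[i + 1]), max(b[i], b[i + 1])
            b[i], b[i + 1] = lo, hi  # compare-and-swap of adjacent wires
    return b[n // 2]
```

For 0/1 inputs each compare-and-swap is just an AND gate (min) paired with an OR gate (max), which is why sorting networks map compactly onto logic.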
5.6 BANKED MEMORY
Large memories are conventionally organized as sets of smaller memory
blocks called banks. The reason for breaking a large memory into smaller
banks is to trade off overall memory density for access speed and reliability.
Excessively small bank sizes incur a large area overhead for memory
drivers and receivers. Large memory banks require long rows and columns,
which result in high-capacitance wires that consequently increase
the delay. Furthermore, long wires are more susceptible to breaks and bridging
defects. Therefore, excessively large memory banks have high defect rates and
low performance.
Fig.5.5. Banked memory organization, with single global corrector.
The number of faults that accumulate in the memory is directly related
to the scrubbing period. The longer the scrubbing period is, the larger the
number of errors that can accumulate in the system. However, scrubbing all
memory words serially can take a long time. If the time to serially scrub the
memory becomes noticeable compared to the scrubbing period, it can reduce
the system performance. To reduce the scrubbing time, we can potentially
scrub all the memory banks in parallel.
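The trade-off between scrubbing period and error accumulation can be quantified with a simple binomial model. This is an illustrative back-of-the-envelope calculation, not the reliability model used in the paper; it assumes independent per-bit, per-cycle upsets and a code correcting up to t errors per word.

```python
from math import comb

def p_uncorrectable(p_bit, n, t, cycles):
    """Probability that an n-bit codeword accumulates more than t upsets
    during one scrubbing period of `cycles` cycles, assuming independent
    per-bit, per-cycle upset probability p_bit."""
    p = 1 - (1 - p_bit) ** cycles  # per-bit upset probability over the period
    ok = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(t + 1))
    return 1 - ok
```

Lengthening the scrubbing period raises the per-bit upset probability and hence the chance of exceeding the code's correction capability, which is exactly why overly long periods hurt reliability.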
CHAPTER VI
SYSTEM IMPLEMENTATION
6.1 PROCESS (Dynamic Reconfiguration)
The feasibility of run-time reconfiguration of FPGAs has been
established by a large number of case studies. However, these systems have
typically involved an ad hoc combination of hardware and software. The
software that manages the dynamic reconfiguration is typically specialized to
one application and one hardware configuration. We present three different
applications of dynamic reconfiguration, based on research activities at
Glasgow University, and extract a set of common requirements. We present the
design of an extensible run-time system for managing the dynamic
reconfiguration of FPGAs, motivated by these requirements. The system is
called RAGE, and incorporates operating-system style services that permit
sophisticated and high level operations on circuits.
ECC stands for "Error Correction Codes" and is a method used to detect
and correct errors introduced during storage or transmission of data. Certain
kinds of RAM chips inside a computer implement this technique to correct
data errors and are known as ECC Memory. ECC Memory chips are
predominantly used in servers rather than in client computers. Memory errors
are proportional to the amount of RAM in a computer as well as the duration of
operation. Since servers typically contain several Gigabytes of RAM and are in
operation 24 hours a day, the likelihood of errors cropping up in their memory
chips is comparatively high and hence they require ECC Memory.
Memory errors are of two types, namely hard and soft. Hard errors are
caused due to fabrication defects in the memory chip and cannot be corrected
once they start appearing. Soft errors on the other hand are caused
predominantly by electrical disturbances. Memory errors that are not corrected
immediately can eventually crash a computer. This again has more relevance
to a server than a client computer in an office or home environment. When a
client crashes, it normally does not affect other computers even when it is
connected to a network, but when a server crashes it brings the entire network
down with it. Hence ECC memory is mandatory for servers but optional for
clients unless they are used for mission critical applications.
ECC Memory chips mostly use Hamming Code or Triple Modular
Redundancy as the method of error detection and correction. These are known
as FEC codes or Forward Error Correction codes that manage error correction
on their own instead of going back and requesting the data source to resend the
original data. These codes can correct single-bit errors occurring in data. Multi-
bit errors are very rare and hence do not pose much of a threat to memory
systems.
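As a concrete example of the single-bit correction that ECC memory performs, here is a textbook Hamming(7,4) decoder. The position numbering (parity bits at positions 1, 2, and 4) follows the classic construction; this is standard material rather than a mechanism specific to this project.

```python
def hamming74_decode(c):
    """Hamming(7,4) with bit positions 1..7 and parity bits at 1, 2, 4.
    The syndrome, read as a binary number, points at the flipped bit."""
    c = list(c)
    s = 0
    for p in (1, 2, 4):
        # Parity over all positions whose index has bit p set.
        if sum(c[i - 1] for i in range(1, 8) if i & p) % 2:
            s += p
    if s:
        c[s - 1] ^= 1  # correct the single-bit error in place
    return c
```

A valid codeword produces syndrome 0 and is returned unchanged; a single flipped bit produces a nonzero syndrome equal to its position.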
ENCODING PROCESS
EGLDPC codes have received tremendous attention in the coding
community because of their excellent error correction capability and near-
capacity performance. Some randomly constructed EGLDPC codes, measured
in Bit Error Rate (BER), come very close to the Shannon limit for the AWGN
channel (within 0.05 dB) with iterative decoding and very long block sizes (on
the order of 106 to 107). However, for many practical applications (e.g.
packet-based communication systems), shorter and variable block-size
EGLDPC codes with good Frame Error Rate (FER) performance are desired.
Communications in packet-based wireless networks usually involve a large
per-frame overhead including both the physical (PHY) layer and MAC layer
headers. As a result, the design for a reliable wireless link often faces a trade-
off between channel utilization (frame size) and error correction capability.
One solution is to use adaptive burst profiles, in which transmission parameters
relevant to modulation and coding may be assigned dynamically on a burst-by-
burst basis. Therefore, LDPC codes with variable block lengths and multiple
code rates for different quality-of service under various channel conditions are
highly desired.
FLOW OF ENCODING PROCESS
Fig 6.1:Flow of encoding process
In the recent literature, there are many EGLDPC decoder architectures
but few of them support variable block-size and multi-rate decoding. For
example, a 1 Gbps 1024-bit, rate 1/2 EGLDPC decoder has been implemented.
However, this architecture supports just one particular EGLDPC code by
wiring the whole Tanner graph into hardware. A code rate programmable
EGLDPC decoder is proposed, but the code length is still fixed to 2048 bit for
simple VLSI implementation. In [3], a EGLDPC decoder that supports three
block sizes and four code rates is designed by storing 12 different parity check
matrices on-chip. As we can see, the main design challenge for supporting
variable block sizes and multiple code rates stems from the random or
unstructured nature of the EGLDPC codes. Generally support for different
block sizes of EGLDPC codes would require different hardware architectures.
To address this problem, we propose a generalized decoder architecture based
on the quasi-cyclic EGLDPC codes that can support a wider range of block
sizes and code rates at a low hardware requirement. To balance the
implementation complexity and the decoding throughput, a structured
EGLDPC code was proposed in recently for modern wireless communication
systems including, but not limited to, IEEE 802.16e and IEEE 802.11n. An
expansion factor P divides the variable nodes and the check nodes into
clusters of size P such that, if there exists an edge between a variable and a
check cluster, P variable nodes connect to P check nodes via a
permutation (cyclic shift) network. Generally, support for different block sizes
and code rates implies the use of multiple PCMs. Storing all the PCMs on-chip is
almost impractical and expensive. A good tradeoff between design complexity
and decoding throughput is partially parallel decoding by grouping a certain
number of variable and check nodes into a cluster for parallel processing.
Furthermore, the layered decoding algorithm can be applied to improve the
decoding convergence time by a factor of two and hence increases the
throughput. The structured EGLDPC code makes it effectively suitable for
efficient VLSI implementation by significantly simplifying the memory access
and message passing. The PCM can be viewed as a group of concatenated
horizontal layers, where the column weight is at most 1 in each layer due to the
cyclic shift structure.
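The cluster/cyclic-shift structure described above can be made concrete with a small expansion routine: each base-matrix entry becomes a P × P block, either all-zero or a cyclically shifted identity matrix. The base matrix below is a toy example, not one of the standard-defined matrices.

```python
def expand_qc(base, P):
    """Expand a base matrix into a QC-LDPC parity-check matrix: entry -1
    means a P x P zero block; entry s >= 0 means the identity matrix
    cyclically shifted right by s columns."""
    rows, cols = len(base) * P, len(base[0]) * P
    H = [[0] * cols for _ in range(rows)]
    for bi, brow in enumerate(base):
        for bj, shift in enumerate(brow):
            if shift < 0:
                continue  # zero block
            for r in range(P):
                H[bi * P + r][bj * P + (r + shift) % P] = 1
    return H
```

Because the full PCM is determined by the small base matrix plus P, a decoder only needs to store shift values, which is what makes on-chip support for many block sizes and rates tractable.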
6.2 TESTING TECHNIQUES
This project describes simple iterative decoders for low-density parity-
check codes based on Euclidean geometries, suitable for practical very-large-
scale-integration implementation in applications requiring very fast decoders.
The decoders are based on shuffled and replica-shuffled versions of iterative
bit-flipping (BF) and quantized weighted BF schemes. The proposed decoders
converge faster and provide better ultimate performance than standard BF
decoders. We present simulations that illustrate the performance versus
complexity tradeoffs for these decoders, and show in some cases, through
importance sampling, that no significant error floor exists. We also present
novel architectures comprising one parallel and two semi-parallel decoder
architectures for popular PG-based LDPC codes.
These architectures have no memory clash and are furthermore reconfigurable
for different lengths (and their corresponding rates). The architectures can be
configured either for regular belief-propagation-based decoding or for majority
logic decoding (MLD). We also analyze storage circuits constructed
from unreliable memory components, and propose a memory
construction, using low-density parity-check codes, based on a construction
originally made by Taylor. The storage circuit consists of unreliable memory
cells along with a correcting circuit. The correcting circuit is also constructed
from unreliable logic gates along with a small number of perfect gates. The
modified construction enables the memory device to perform better than the
original construction. Numerical results supporting these claims are presented.
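A bare-bones version of the bit-flipping (BF) decoding loop mentioned above, using a small Hamming parity-check matrix as a stand-in; real BF decoders for EG-LDPC codes follow the same flip-the-most-suspicious-bits pattern, with the shuffled/replica-shuffled scheduling refinements layered on top.

```python
def bit_flip_decode(received, H, max_iters=20):
    """Gallager-style bit flipping: each iteration, flip every bit that
    participates in more unsatisfied parity checks than satisfied ones."""
    c = list(received)
    n = len(c)
    for _ in range(max_iters):
        checks = [sum(c[j] for j in range(n) if row[j]) % 2 for row in H]
        if not any(checks):
            return c  # all parity checks satisfied: done
        flips = []
        for j in range(n):
            fails = sum(1 for i, row in enumerate(H) if row[j] and checks[i])
            total = sum(1 for row in H if row[j])
            if 2 * fails > total:  # majority of this bit's checks failed
                flips.append(j)
        if not flips:
            break
        for j in flips:
            c[j] ^= 1
    return c

# Small stand-in parity-check matrix (a (7, 4) Hamming code).
H_example = [[1, 1, 1, 0, 1, 0, 0],
             [1, 1, 0, 1, 0, 1, 0],
             [1, 0, 1, 1, 0, 0, 1]]
```

The loop terminates as soon as the syndrome is all-zero, which also makes it a natural building block for the fast, deterministic-latency decoders this section targets.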
CHAPTER VII
PERFORMANCE AND LIMITATIONS
REED-SOLOMON APPLICATIONS
Modem technologies: xDSL, cable modems
CD and DVD players
Digital audio and video broadcast: HDTV/digital TV
Data storage and retrieval systems: hard-disk drives, CD-ROM
Wireless communications: cell phones, base stations, wireless-enabled PDAs
Digital satellite communication and broadcast
RAID controllers with fault tolerance
7.1 APPLICATIONS:
Used in SoC and NoC processors
Used in radios
Used in almost all electronic devices
Loopback BIST model for digital transceivers with limited test
circuitry
Spot-defect models (typical of CMOS technology) based on noise
and nonlinear analysis, using fault abstraction
7.2 MERITS OF SYSTEM
Reduces maintenance cost
High-speed fault tolerance
Faults can be easily identified
Process capability
No external circuitry required
Does not affect the internal architecture of the nanomemory
Multiple faults can be easily handled
7.3 LIMITATIONS OF SYSTEM
Hardware faults cannot be recognized
Only pre-designed regions can be checked
May negatively impact manufacturers' current silicon-chip
technology
Only used in specific applications
7.4 FUTURE ENHANCEMENT
With the advancement of science, electrical and electronic devices
have reached unimaginable levels. The main constraint on any good device
is that it serves its purpose effectively, and BIST enables this efficiency.
Future BIST systems can be designed in such a way that hardware faults can
also be indicated so that they can be corrected. A multiprocessor
system-on-chip is an integrated system that performs real-time tasks at low
power and for low cost.
CHAPTER VIII
OUTPUT RESULTS AND DISCUSSIONS
ENCODER
DECODER
EXISTING METHOD’S RESULT
PROPOSED METHOD’S RESULT
CHAPTER IX
CONCLUSION
This paper presents an algebraic method for constructing modified EG
low-density parity-check (LDPC) codes based on the structural properties of
Euclidean geometries. The construction method results in a class of M-EG-
LDPC codes. The key novel contribution of this paper is identifying and
defining a new class of error-correcting codes whose redundancy makes the
design of fault-secure detectors (FSD) particularly simple. We further quantify
the importance of protecting encoder and decoder circuitry against transient
errors, illustrating a scenario where the system failure rate (FIT) is dominated
by the failure rate of the encoder and decoder. We prove that Euclidean
geometry low-density parity-check (EG-LDPC) codes have the fault-secure
detector capability.
CHAPTER X
REFERENCES
[1] H. Naeimi and A. DeHon, “Fault secure encoder and decoder for memory
applications,” in Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Syst.,
Sep. 2007, pp. 409–417.
[2] M. Davey and D. J. MacKay, “Low density parity check codes over GF(q),”