8/10/2019 VD_chap_12
1/70
Digital Integrated Circuits
A Design Perspective
Designing Memory and Array Structures
Jan M. Rabaey, Anantha Chandrakasan, Borivoje Nikolic
Chapter Overview
Memory Classification
Memory Architectures
The Memory Core
Periphery
Reliability
Case Studies
Memory Classification
The type of memory unit that is preferable for a given application is a function of the required memory size, the time it takes to access the stored data, the access patterns, the application, and the system requirements.
Size
Different means are used to express the size of a memory unit.
The circuit designer tends to define the size of a memory in terms of bits.
The chip designer expresses the memory size in bytes or its multiples: kilobytes, megabytes, gigabytes, and ultimately terabytes.
The system designer likes to quote the storage requirement in terms of words, which represent a basic computational entity.
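The three views of size describe the same capacity in different units. A minimal sketch of the conversion, assuming a hypothetical memory of 4096 words of 16 bits each (these numbers and the function name `memory_size` are illustrative, not from the slides):

```python
def memory_size(num_words: int, bits_per_word: int) -> dict:
    """Express one memory capacity in bits, bytes, and words."""
    total_bits = num_words * bits_per_word
    return {
        "bits": total_bits,                  # circuit designer's view
        "bytes": total_bits // 8,            # chip designer's view
        "kilobytes": total_bits / 8 / 1024,  # a multiple of bytes
        "words": num_words,                  # system designer's view
    }


# Hypothetical example: 4096 words of 16 bits each.
size = memory_size(4096, 16)
print(size)
```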
Timing Parameters
The time it takes to read from the memory is called the read-access time, which is equal to the delay between the read request and the moment the data is available at the output.
Function
Memories are often classified on the basis of memory functionality, access patterns, and the nature of the storage mechanism.
A distinction is made between read-only (ROM) and read-write (RWM) memories.
RWMs belong to the class of volatile memories.
ROMs are nonvolatile memories.
Access Pattern
Memories are also classified by the order in which data can be accessed.
Most memories belong to the random-access class, which means that memory locations can be read or written in a random order.
Some memory types restrict the order of access, which results in either faster access times, smaller area, or a memory with special functionality.
Examples include FIFO, LIFO, and shift register.
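The restricted access orders can be illustrated with a small behavioral sketch. The classes `Fifo` and `Lifo` below are hypothetical software models, not circuit descriptions: a FIFO returns words in the order they were written, and a LIFO returns the most recently written word first.

```python
from collections import deque


class Fifo:
    """First-in, first-out: words are read in the order they were written."""

    def __init__(self):
        self._cells = deque()

    def write(self, word):
        self._cells.append(word)

    def read(self):
        return self._cells.popleft()


class Lifo:
    """Last-in, first-out: the most recently written word is read first."""

    def __init__(self):
        self._cells = []

    def write(self, word):
        self._cells.append(word)

    def read(self):
        return self._cells.pop()


fifo, lifo = Fifo(), Lifo()
for word in (10, 20, 30):
    fifo.write(word)
    lifo.write(word)

fifo_out = [fifo.read() for _ in range(3)]
lifo_out = [lifo.read() for _ in range(3)]
print(fifo_out, lifo_out)
```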
Memory Classification
Read-Write Memory
  Random Access: SRAM, DRAM
  Non-Random Access: FIFO, LIFO, Shift Register, CAM
Non-Volatile Read-Write Memory: EPROM, E2PROM, FLASH
Read-Only Memory: Mask-Programmed, Programmable (PROM)
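Memory Architectures and Building Blocks
When implementing an N-word memory where each word is M bits wide, the most intuitive approach is to arrange the words linearly and select one word at a time with a dedicated select signal per word.
While this approach is relatively simple and works for very small memories, one runs into a number of problems when trying to use it for larger memories.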
Memory Architectures and Building Blocks
Assume that we would like to implement a memory that holds 1 million (N = 10^6) 8-bit words.
The actual number of words equals 2^20 = 1024 * 1024 = 1,048,576.
A decoder is inserted to reduce the number of select signals.
Architecture for N word Memory
A memory word is selected by providing a binary-encoded address word.
The decoder translates this address into N = 2^k select lines, only one of which is active at a time.
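A behavioral sketch of such a decoder may help. The function names `address_bits` and `decode` are illustrative assumptions. The sketch computes how many address bits a memory of about one million words needs and shows that exactly one select line is active for a given address.

```python
def address_bits(num_words: int) -> int:
    """Smallest k such that 2**k >= num_words."""
    return (num_words - 1).bit_length()


def decode(address: int, k: int) -> list:
    """Translate a k-bit address into 2**k one-hot select lines."""
    if not 0 <= address < 2 ** k:
        raise ValueError("address out of range")
    return [1 if line == address else 0 for line in range(2 ** k)]


k = address_bits(10 ** 6)    # address bits for about one million words
select_lines = decode(5, 3)  # small 3-bit example: 8 select lines
print(k, 2 ** k)
print(select_lines)
```

With the decoder in place, the number of external select signals drops from 2^k to k.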
Architecture for N word Memory
While this resolves the selection problem, it does not address the issue of the memory aspect ratio.
Evaluation of the dimensions of the storage array of our example shows that its height is approximately 128,000 times larger than its width (assuming the basic storage cell is approximately square, which is almost always the case).
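The figure of roughly 128,000 follows directly from the example dimensions. A short check, assuming square cells and one word per row as in the linear arrangement:

```python
WORDS = 2 ** 20      # rows: one word per row in the linear arrangement
BITS_PER_WORD = 8    # columns: one cell per bit of the word

# With square cells, height/width equals rows/columns.
aspect_ratio = WORDS / BITS_PER_WORD
print(aspect_ratio)
```

The result is 128 * 1024 = 131,072, which the slide rounds to 128,000.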
Array-Structured Memory Architecture
Problem: ASPECT RATIO or HEIGHT >> WIDTH
[Figure: array-structured memory. The sense amplifiers/drivers amplify the swing to rail-to-rail amplitude; the column decoder selects the appropriate word.]
Array-Structured Memory Architecture
Multiple words are stored in a single row and are selected simultaneously.
To route the correct word to the input/output terminals, an extra piece of circuitry called the column decoder is needed.
The address word is partitioned into a column address and a row address.
The row address enables one row of the memory for R/W, while the column address picks one particular word from the selected row.
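The address partitioning can be sketched as follows. Two things here are assumptions rather than statements from the slides: the low-order bits are taken as the column address, and 9 column bits is a hypothetical split chosen only to show how the array shape changes for the 2^20-word, 8-bit example.

```python
def split_address(address: int, column_bits: int) -> tuple:
    """Split an address into (row, column); low-order bits select the column."""
    row = address >> column_bits
    column = address & ((1 << column_bits) - 1)
    return row, column


def array_shape(address_bits: int, column_bits: int, bits_per_word: int) -> tuple:
    """Rows and bit columns of the storage array for a given partition."""
    rows = 2 ** (address_bits - column_bits)
    columns = (2 ** column_bits) * bits_per_word
    return rows, columns


# Hypothetical split: 9 of the 20 address bits select the word within a row.
row, column = split_address((5 << 9) | 3, column_bits=9)
shape = array_shape(address_bits=20, column_bits=9, bits_per_word=8)
print(row, column, shape)
```

With this split the array has 2048 rows and 4096 bit columns, far closer to square than the linear arrangement.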
Hierarchical Memory Architecture
Advantages:
1. Shorter wires within blocks
2. Block address activates only 1 block => power savings
Hierarchical Memory Architecture
As an example, a 4 Mbit SRAM can be designed as a composition of ______ blocks, each of which contains 128 Kbits.
Number of rows
Number of columns
Content Addressable Memory
Supports three modes of operation: read, write, and match.
The comparand block is filled with the data pattern to match.
The mask indicates which bits are significant.
Every row that matches the pattern is passed to the validity block.
Only valid rows that match are passed to the priority encoder.
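The match operation can be sketched behaviorally. The function `cam_match` and the sample contents are hypothetical, and the sketch assumes that the priority encoder reports the lowest-indexed matching row, which the slides do not specify.

```python
def cam_match(rows, valid, comparand, mask):
    """Return the index of the first valid row that matches the comparand
    on the bits selected by the mask, or None if no row matches."""
    for index, (word, is_valid) in enumerate(zip(rows, valid)):
        matches = (word & mask) == (comparand & mask)
        if matches and is_valid:  # validity block filters stale rows
            return index          # priority encoder: lowest index wins (assumed)
    return None


# Hypothetical contents: four 4-bit rows, row 1 marked invalid.
rows = [0b1010, 0b1100, 0b1110, 0b1101]
valid = [True, False, True, True]

upper_bits_hit = cam_match(rows, valid, comparand=0b1100, mask=0b1100)
all_bits_hit = cam_match(rows, valid, comparand=0b1100, mask=0b1111)
print(upper_bits_hit, all_bits_hit)
```

Row 1 matches in both searches but is filtered out because it is not valid. With only the upper two bits significant, row 2 is reported; with all bits significant, no valid row matches.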
Read-Only Memory
The fact that the contents of a ROM cell are permanently fixed considerably simplifies its design.
The cell should be designed so that a 0 or 1 is presented to the bit line upon activation of its word line.
Read-Only Memory Cells
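[Figure: ROM cells storing a 1 and a 0 in three implementations (diode ROM, MOS ROM 1, MOS ROM 2), each connected between word line WL and bit line BL; the MOS cells connect to VDD or GND.]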
A 4 x 4 OR ROM cell array
[Figure: 4 x 4 OR ROM array with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], cell transistors connected to VDD, and pull-down loads biased by Vbias.]
MOS NOR ROM
[Figure: 4 x 4 MOS NOR ROM array with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], cell transistors connected to GND, and pull-up devices connected to VDD.]
MOS NAND ROM
All word lines high by default with exception of selected row
[Figure: 4 x 4 MOS NAND ROM array with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], and pull-up devices connected to VDD.]
Non Volatile Read Write Memories
The memory core consists of an array of transistors placed on a word-line/bit-line grid.
The memory is programmed by selectively disabling or enabling some of those devices.
A modified transistor that permits its threshold to be altered electrically is used.
Non Volatile Read Write Memories
To reprogram the memory, the programmed values must be erased, after which a new programming round can be started.
The method of erasing is the main differentiating factor between the various classes of reprogrammable nonvolatile memories.
Floating-gate transistor (FAMOS)
The structure is similar to a traditional MOS device, except that an extra polysilicon strip is inserted between the gate and channel. This strip is not connected to anything and is called a floating gate.
Inserting the floating gate doubles the gate oxide thickness, which results in a reduced device transconductance as well as an increased threshold voltage.
Non-Volatile Memories
The Floating-gate transistor
(FAMOS)
[Figure: device cross-section (source, drain, gate, and floating gate over n+ regions in a p substrate, with oxide thickness tox above and below the floating gate) and schematic symbol (G, S, D).]
Floating-gate transistor (FAMOS)
Important property: the threshold voltage is programmable.
Applying a high voltage between the source and the gate-drain terminals creates a high electric field and causes avalanche injection to occur. Electrons acquire sufficient energy to become trapped on the floating gate.
This phenomenon can occur with oxides as thick as 100 nm, which makes it relatively easy to fabricate the device.
Erasable Programmable Read Only Memory (EPROM)
Erased by shining ultraviolet light on the cells through a transparent window in the package.
The erasure process is slow and can take from seconds to several minutes, depending on the intensity of the UV source.
Another problem is limited endurance: the number of erase/program cycles is generally limited to a maximum of one thousand.
Erasable Programmable Read Only Memory (EPROM)
The device threshold might vary with repeated programming cycles.
Most EPROMs therefore contain on-chip circuitry to control the value of the threshold to within a specified range during programming.
The EPROM cell is extremely simple, making it possible to fabricate large memories at a low cost.
FLOTOX EEPROM
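[Figure: FLOTOX transistor cross-section (source, drain, gate, and floating gate over n+ regions in a p substrate; 20-30 nm gate oxide with a 10 nm tunnel oxide) and Fowler-Nordheim I-V characteristic (I versus VGD, with marks at -10 V and 10 V).]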
Flash Electrically Erasable Programmable Read Only Memory (Flash)
Combination of the EPROM and EEPROM approaches.
Most Flash EEPROM devices use the avalanche hot-electron injection approach to program the devices.
Erasure is performed using Fowler-Nordheim tunneling, as for EEPROM cells.
Flash Electrically Erasable Programmable Read Only Memory (Flash)
The main difference is that erasure is performed in bulk for the complete chip or for a subsection of the memory.
Erasing the complete memory core at once makes it possible to carefully monitor the device characteristics during erasure.
Flash Electrically Erasable Programmable Read Only Memory (Flash)
The monitoring control hardware on the memory chip regularly checks the value of the threshold during erasure and adjusts the erasure time dynamically.
This approach is only practical when erasing large chunks of memory at a time; hence the flash concept.
Periphery
Decoders
Sense Amplifiers
Input/Output Buffers
Control / Timing Circuitry
Address Decoders
Address decoders have a substantial impact on the speed and power consumption of the memory.
Two classes of decoders:
Row decoders, whose task is to enable one memory row out of 2^M
Column and block decoders
Row Decoders
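Collection of 2^M complex logic gates
Consider a 10-bit address decoder.
The rows with addresses 0 and 511 are enabled by the following logic functions:
(N)AND decoder: $WL_0 = \overline{A_9}\,\overline{A_8} \cdots \overline{A_1}\,\overline{A_0}$, $WL_{511} = \overline{A_9} A_8 \cdots A_1 A_0$
NOR decoder: $WL_0 = \overline{A_9 + A_8 + \cdots + A_1 + A_0}$, $WL_{511} = \overline{A_9 + \overline{A_8} + \cdots + \overline{A_1} + \overline{A_0}}$
For a single-stage implementation, the AND function can be transformed into a wide NOR using De Morgan's rules.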
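The equivalence of the AND and NOR forms can be checked exhaustively for the 10-bit example. In this sketch the helper names `literals`, `wl_and`, and `wl_nor` are illustrative. The AND form requires every address bit to agree with the row number; the NOR form fires when no bit disagrees, which is the same condition by De Morgan's rule.

```python
WIDTH = 10  # address bits, as in the 10-bit example


def literals(address: int, row: int) -> list:
    """Per bit: 1 if the address bit agrees with the row's bit, else 0."""
    return [1 - (((address >> i) ^ (row >> i)) & 1) for i in range(WIDTH)]


def wl_and(address: int, row: int) -> int:
    """AND form: every literal must be 1."""
    return int(all(literals(address, row)))


def wl_nor(address: int, row: int) -> int:
    """NOR form: none of the complemented literals may be 1."""
    return int(not any(1 - lit for lit in literals(address, row)))


equivalent = all(
    wl_and(a, r) == wl_nor(a, r)
    for a in range(2 ** WIDTH)
    for r in (0, 511)
)
print(equivalent, wl_and(511, 511), wl_and(512, 0))
```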
Static Decoder design
Implementing a wide NOR function in CMOS is impractical.
Possible solution: pseudo-NMOS design style.
Splitting a complex gate into two or more layers most often produces both a faster and cheaper implementation.
Segments of the address are decoded in a first layer called the predecoder.
Hierarchical Decoders
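[Figure: hierarchical row decoder. 2-input predecoders generate all combinations of A0, A1 and of A2, A3 and their complements; second-stage gates combine them to drive WL0, WL1, and the remaining word lines.]
Multi-stage implementation improves performance
NAND decoder using 2-input pre-decoders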
Static Decoder
Advantages of using a predecoder:
It reduces the number of transistors required.
As the number of inputs to the NAND gates is halved, the propagation delay is reduced.
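A functional sketch of two-layer decoding, assuming a hypothetical 8-bit address split into 2-bit segments (the width and the names `predecode` and `word_line` are illustrative). Each segment is first decoded into four one-hot lines; the final gate for a word line then combines one line per segment, so its fan-in is half the address width.

```python
WIDTH = 8  # hypothetical address width, chosen for illustration


def predecode(segment: int) -> list:
    """First layer: decode a 2-bit address segment into 4 one-hot lines."""
    return [int(segment == value) for value in range(4)]


def word_line(address: int, row: int) -> int:
    """Second layer: AND one predecoded line per 2-bit segment."""
    active = 1
    for seg in range(WIDTH // 2):
        lines = predecode((address >> (2 * seg)) & 0b11)
        active &= lines[(row >> (2 * seg)) & 0b11]
    return active


final_gate_inputs = WIDTH // 2  # 4 inputs instead of 8
one_hot = all(
    sum(word_line(a, r) for r in range(2 ** WIDTH)) == 1
    for a in range(2 ** WIDTH)
)
print(final_gate_inputs, one_hot)
```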
Dynamic Decoders
[Figure: 2-input NOR decoder and 2-input NAND decoder, each with precharge devices clocked by φ, inputs A0, A1 and their complements, outputs WL0 to WL3, and VDD and GND rails.]
Column and block decoders
Column decoders should match the bit line pitch of the memory array.
The functionality of a column and block decoder is best described as a 2^K-input multiplexer, where K stands for the size of the address word.
For read-write arrays, these multiplexers can be either separate or shared between read and write operations.
4-input pass-transistor based column decoder
Advantages: speed; only one extra transistor in signal path
Disadvantage: large transistor count
[Figure: 2-input NOR decoder generating select signals S0 to S3 from A0 and A1; pass transistors connect BL0 to BL3 to the data line D.]
Number of transistors for a 2^k-input decoder = (k+1) 2^k + 2^k
4-to-1 tree based column decoder
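Number of devices drastically reduced
Delay increases quadratically with the number of sections; prohibitive for large decoders
[Figure: binary tree of pass transistors controlled by A0, A1 and their complements, connecting BL0 to BL3 to the data line D.]
Number of transistors for a 2^k-input decoder = 2^k + 2^(k-1) + ... + 4 + 2 = 2 (2^k - 1)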
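The two transistor-count expressions can be compared numerically. This sketch takes the formulas as quoted on the slides, (k+1) 2^k + 2^k for the pass-transistor decoder with its NOR predecoder and 2^k + 2^(k-1) + ... + 4 + 2 for the tree decoder; the function names are illustrative.

```python
def pass_transistor_count(k: int) -> int:
    """Pass-transistor column decoder: (k+1) * 2^k + 2^k devices."""
    return (k + 1) * 2 ** k + 2 ** k


def tree_count(k: int) -> int:
    """Tree decoder: 2^k + 2^(k-1) + ... + 4 + 2 devices."""
    return sum(2 ** level for level in range(1, k + 1))


for k in (2, 4, 10):
    print(k, pass_transistor_count(k), tree_count(k))
```

For k = 10 the tree needs 2046 devices against 12288 for the pass-transistor version, at the cost of k pass transistors in series in the signal path.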
Read-Write Memories (RAM)
STATIC (SRAM)
  Data stored as long as supply is applied
  Large (6 transistors/cell)
  Fast
DYNAMIC (DRAM)
  Periodic refresh required
  Small (1-3 transistors/cell)
  Slower
Static Random Access Memory
Quite similar to an SR latch.
It requires six transistors per bit.
Access to the cell is enabled by the word line, which replaces the clock and controls the two pass transistors M5 and M6, shared between the read and write operations.
The value is stored symmetrically: both the true and complement values are stored on the cross-coupled transistors.
SRAM retains its value as long as power is applied to the circuit.
6-transistor CMOS SRAM Cell
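[Figure: 6-transistor CMOS SRAM cell with cross-coupled inverters formed by M1 to M4, access transistors M5 and M6 controlled by WL, complementary bit lines BL and BL-bar, and supply VDD.]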
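CMOS SRAM Analysis (Read)
[Figure: SRAM cell during a read, with Q = 1 and Q-bar = 0, word line WL, access transistors M5 and M6, transistors M1 and M4, and bit lines BL and BL-bar, each loaded by Cbit and precharged to VDD.]
Assume that a 1 is stored at Q and both bit lines are precharged to 2.5 V.
The read cycle is started by asserting the word line, enabling both pass transistors M5 and M6.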
CMOS SRAM Analysis (Write)
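[Figure: SRAM cell during a write, with Q = 1 and Q-bar = 0 initially, BL-bar = 1 and BL = 0, word line WL, access transistors M5 and M6, and transistors M1 and M4.]
Assume that a 1 is stored in the cell (Q = 1).
A 0 is written in the cell by setting BL-bar to 1 and BL to 0.
The gates of transistors M1 and M4 stay at VDD and GND, respectively.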
Performance of SRAM
For an SRAM cell, the read operation is the critical one because it requires discharging the large bit line capacitance through the stack of two small transistors of the selected cell: CBL has to be discharged through the series combination of M5 and M1.
To accelerate the read time, SRAMs use sense amplifiers. The sense amplifier quickly discharges one of the bit lines.
Improved MOS SRAM cells
Resistive-load SRAM cell, also known as the four-transistor SRAM cell.
The cross-coupled CMOS inverter pair is replaced by a pair of resistive-load NMOS inverters.
This reduces the SRAM cell size by approximately one third.
Resistance-load SRAM Cell
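[Figure: resistance-load SRAM cell with load resistors RL connected to VDD, NMOS transistors M1 and M2 holding Q and Q-bar, and access transistors M3 and M4 connecting the cell to BL and BL-bar under control of WL.]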
Dynamic RAM
In dynamic random access memory (DRAM), a transistor and a capacitor are paired to create a memory cell, which represents a single bit of data.
For dynamic memory to work, either the CPU or the memory controller has to recharge all of the capacitors holding a 1 before they discharge. To do this, the memory controller reads the memory and then writes it right back. This refresh operation happens automatically.
3-Transistor DRAM Cell
The cell is written to by placing the appropriate data value on BL1 and asserting the write word line (WWL).
The data is retained as charge on capacitance C once WWL is lowered.
When reading the cell, the read word line (RWL) is raised. The storage transistor M2 is either on or off depending upon the stored value.
3T-DRAM Layout
[Figure: 3T DRAM cell layout with bit lines BL2 and BL1, GND, word lines RWL and WWL, and transistors M1, M2, and M3.]
Dynamic RAM
Further simplifications of the cell structure are possible at the expense of a more complex circuit operation.
Bit lines BL1 and BL2 can be merged into a single wire. This requires the bit line to be driven to both values in a single cycle.
Another option is to merge the RWL and WWL lines.
1-Transistor DRAM Cell
Write: CS is charged or discharged by asserting WL and BL.
Read: charge redistribution takes place between the bit line and the storage capacitance.
During a write cycle, the data value is placed on the bit line BL, and the word line WL is raised.
Before a read operation is performed, the bit line is precharged to a voltage Vpre.
Upon asserting the word line, a charge redistribution takes place. This results in a voltage change on the bit line, the direction of which determines the value of the stored data.
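The bit line voltage change during a read follows from charge conservation between the storage capacitance and the bit line capacitance: ΔV = (V_bit - V_pre) * C_S / (C_S + C_BL). The sketch below evaluates this standard expression. All component values (30 fF, 300 fF, 1.25 V, 2.5 V) are hypothetical and chosen only for illustration.

```python
def bitline_swing(v_bit: float, v_pre: float, c_s: float, c_bl: float) -> float:
    """Voltage change on the bit line after charge redistribution.

    v_bit: voltage initially stored on the cell capacitor
    v_pre: bit line precharge voltage
    c_s, c_bl: storage and bit line capacitance (same unit)
    """
    return (v_bit - v_pre) * c_s / (c_s + c_bl)


# Hypothetical values: 30 fF cell, 300 fF bit line, 1.25 V precharge.
swing_for_one = bitline_swing(v_bit=2.5, v_pre=1.25, c_s=30.0, c_bl=300.0)
swing_for_zero = bitline_swing(v_bit=0.0, v_pre=1.25, c_s=30.0, c_bl=300.0)
print(swing_for_one, swing_for_zero)
```

The sign of the swing identifies the stored value, and its small magnitude relative to the supply is why a sense amplifier is required.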
Programmable logic array (PLA)
Used to implement specialized logic functions.
A PLA decodes only some addresses (input values); a ROM decodes all addresses.
A PLA is used for some logic functions.
PLA structure
The AND plane, OR plane, and inverters together form complete two-level logic functions.
Both the AND and OR planes are implemented as NOR circuits.
Pull-down transistors form the programming of the PLA. These transistors may be referred to as programming tabs.
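A behavioral sketch of the NOR-NOR organization may clarify how two NOR planes plus inverters realize sum-of-products logic. The two output functions, f0 = x0·x1 + NOT x2 and f1 = x0·x1 + (NOT x0)·x2, are hypothetical examples and are not taken from the slides.

```python
from itertools import product


def nor(*inputs: int) -> int:
    """NOR gate: output is 1 only when every input is 0."""
    return int(not any(inputs))


def pla(x0: int, x1: int, x2: int) -> tuple:
    # Input inverters provide the complemented literals.
    nx0, nx1, nx2 = nor(x0), nor(x1), nor(x2)
    # AND plane built from NOR gates: a product of literals equals
    # the NOR of their complements.
    p0 = nor(nx0, nx1)  # x0 AND x1
    p1 = nor(x2)        # NOT x2
    p2 = nor(x0, nx2)   # (NOT x0) AND x2
    # OR plane built from NOR gates, followed by output inverters.
    f0 = nor(nor(p0, p1))
    f1 = nor(nor(p0, p2))
    return f0, f1


truth_table = {bits: pla(*bits) for bits in product((0, 1), repeat=3)}
for bits, outputs in truth_table.items():
    print(bits, outputs)
```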
Programmable Logic Array
[Figure: pseudo-NMOS PLA with inputs X0, X1, X2 and their complements feeding the AND-plane, outputs f0 and f1 from the OR-plane, and VDD and GND rails.]
Dynamic PLA
[Figure: dynamic PLA with inputs X0, X1, X2 and their complements, outputs f0 and f1, an AND-plane and an OR-plane clocked by φAND and φOR, and VDD and GND rails.]
Sense Amplifiers
$t_p = \frac{C \cdot \Delta V}{I_{av}}$
$C$ is large and $I_{av}$ is small, so make $\Delta V$ as small as possible.
Idea: use a sense amplifier (s.a.), which turns a small input transition into a full output swing.
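The benefit of a reduced swing follows directly from t_p = C ΔV / I_av. The sketch below compares a full-swing discharge with a small-swing discharge for the same bit line; the capacitance (1 pF), average current (100 µA), and swings (2.5 V and 0.25 V) are hypothetical values for illustration only.

```python
def discharge_delay(capacitance: float, delta_v: float, avg_current: float) -> float:
    """Delay t_p = C * dV / I_av, in seconds for SI inputs."""
    return capacitance * delta_v / avg_current


C_BIT = 1e-12  # hypothetical bit line capacitance: 1 pF
I_AV = 100e-6  # hypothetical average cell current: 100 uA

full_swing_delay = discharge_delay(C_BIT, 2.5, I_AV)    # wait for the full swing
small_swing_delay = discharge_delay(C_BIT, 0.25, I_AV)  # sense amp detects 0.25 V
print(full_swing_delay, small_swing_delay)
```

With these numbers the delay drops by the same factor as the swing, from about 25 ns to about 2.5 ns.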
Differential Sense Amplifier
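Directly applicable to SRAMs
[Figure: differential sense amplifier built from transistors M1 to M5, with inputs bit and bit-bar, sense enable SE, internal node y, output Out, and supply VDD.]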
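Differential Sensing SRAM
[Figure: (a) SRAM sensing scheme, showing SRAM cell i on word line WL i, bit lines BL and BL-bar with precharge (PC) and equalization (EQ) devices, and the differential sense amplifier enabled by SE; (b) two-stage differential amplifier built from transistors M1 to M5, with inputs x and x-bar, intermediate nodes y and y-bar, sense enable SE, and Output.]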