VD_chap_12

8/10/2019 VD_chap_12


Digital Integrated Circuits: A Design Perspective

Designing Memory and Array Structures

Jan M. Rabaey, Anantha Chandrakasan, Borivoje Nikolic


Chapter Overview

Memory Classification

Memory Architectures

The Memory Core

Periphery

Reliability

Case Studies

The type of memory unit that is preferable for a given application is a function of the required memory size, the time it takes to access the stored data, the access patterns, the application, and the system requirements.

Size

Different means are used to express the size of a memory unit.

The circuit designer tends to define the size of a memory in terms of bits.

The chip designer expresses the memory size in bytes or its multiples: kilobytes, megabytes, gigabytes, and ultimately terabytes.

The system designer likes to quote the storage requirement in terms of words, which represent a basic computational entity.
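The three viewpoints describe the same capacity in different units. A minimal sketch, assuming a hypothetical helper `describe_size` and the binary convention of 1 kilobyte = 1024 bytes (neither comes from the slides):

```python
def describe_size(num_words: int, bits_per_word: int) -> dict:
    """Express one memory capacity in words, bits, and bytes."""
    total_bits = num_words * bits_per_word      # circuit designer's view
    total_bytes = total_bits // 8               # chip designer's view
    return {
        "words": num_words,                     # system designer's view
        "bits": total_bits,
        "bytes": total_bytes,
        "kilobytes": total_bytes / 1024,        # assumes 1 KB = 1024 bytes
    }


size = describe_size(num_words=1024, bits_per_word=32)
print(size)
```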

Timing Parameters

The time it takes to read from the memory is called the read-access time, which is equal to the delay between the read request and the moment the data is available at the output.

Function

Memories are often classified on the basis of memory functionality, access patterns, and the nature of the storage mechanism.

A distinction is made between read-only (ROM) and read-write (RWM) memories.

RWM belongs to the class of volatile memory.

ROM is non-volatile memory.

Access Pattern

A second classification is based on the order in which data can be accessed.

Random-access class: memory locations can be read or written in a random order.

Some memory types restrict the order of access, which results in either faster access times, smaller area, or a memory with special functionality. Examples are the FIFO, the LIFO, and the shift register.
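The difference between the access classes can be illustrated in software. This is only an analogy, using Python containers and made-up values to stand in for the hardware:

```python
from collections import deque

values = [10, 20, 30]

# Random access: any location can be read in any order.
ram = list(values)
random_read = ram[2]          # read the third location directly

# FIFO: data leaves in the order it entered.
fifo = deque(values)
fifo_read = fifo.popleft()    # oldest entry comes out first

# LIFO: data leaves in reverse order of entry.
lifo = list(values)
lifo_read = lifo.pop()        # newest entry comes out first

print(random_read, fifo_read, lifo_read)
```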

Read-Write Memory
    Random Access: SRAM, DRAM
    Non-Random Access: FIFO, LIFO, Shift Register, CAM

Non-Volatile Read-Write Memory: EPROM, E2PROM, FLASH

Read-Only Memory: Mask-Programmed, Programmable (PROM)


Memory Architectures and Building Blocks

When implementing an N-word memory where each word is M bits wide, the intuitive approach is to stack the words linearly and give each word its own select signal.

While this approach is relatively simple and works for very small memories, one runs into a number of problems when trying to use it for larger memories.

Memory Architectures and Building Blocks

Assume that we would like to implement a memory that holds 1 million (N = 10^6) 8-bit words.

The actual number of words equals 2^20 = 1024 * 1024 = 1,048,576.

A decoder is inserted to reduce the number of select signals.

Architecture for N-word Memory

A memory word is selected by providing a binary-encoded address word.

The decoder translates this address into N = 2^k select lines, only one of which is active at a time.
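The relation N = 2^k can be sketched as follows. The helper names `address_bits` and `decode` are hypothetical, and the model is purely behavioral rather than a circuit description:

```python
def address_bits(num_words: int) -> int:
    """Number of address bits k needed so that 2**k >= num_words."""
    return (num_words - 1).bit_length()


def decode(address: int, k: int) -> list[int]:
    """Translate a k-bit address into 2**k one-hot select lines."""
    if not 0 <= address < 2 ** k:
        raise ValueError("address out of range")
    return [1 if line == address else 0 for line in range(2 ** k)]


k = address_bits(10 ** 6)        # the 1-million-word example
select = decode(address=5, k=3)  # small 8-line decoder for readability
print(k, select)
```

With a decoder, the number of external select signals drops from N to k.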

Architecture for N-word Memory

While this resolves the selection problem, it does not address the issue of the memory aspect ratio.

Evaluating the dimensions of the storage array in our example shows that its height is approximately 128,000 times larger than its width (assuming that the shape of the basic storage cell is approximately square, which is almost always the case).
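The aspect-ratio figure follows from simple arithmetic. A sketch, assuming square cells and one word per row (the function name `aspect_ratio` is hypothetical):

```python
def aspect_ratio(num_words: int, bits_per_word: int) -> float:
    """Height/width ratio of a one-word-per-row array of square cells."""
    rows = num_words          # one row per word
    columns = bits_per_word   # one column per bit
    return rows / columns


ratio = aspect_ratio(num_words=2 ** 20, bits_per_word=8)
print(ratio)  # 131072.0, i.e. roughly 128,000
```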

Array-Structured Memory Architecture

Problem: ASPECT RATIO or HEIGHT >> WIDTH

Amplify swing to rail-to-rail amplitude

Selects appropriate word

Array-Structured Memory Architecture

Multiple words are stored in a single row and are selected simultaneously.

To route the correct word to the input/output terminals, an extra piece of circuitry called the column decoder is needed.

The address word is partitioned into a column address and a row address.

The row address enables one row of the memory for R/W, while the column address picks one particular word from the selected row.
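The address partitioning can be sketched behaviorally. It is an assumption of this example that the low-order address bits form the column address, and the small array is hypothetical:

```python
def split_address(address: int, column_bits: int) -> tuple[int, int]:
    """Split an address into (row, column); low bits pick the word in a row."""
    column = address & ((1 << column_bits) - 1)
    row = address >> column_bits
    return row, column


def read_word(array: list[list[int]], address: int, column_bits: int) -> int:
    row, column = split_address(address, column_bits)
    selected_row = array[row]        # row decoder: enables the whole row
    return selected_row[column]      # column decoder: picks one word


# Hypothetical 4-row array holding 4 words per row (16 words total).
# Each word stores its own address so the result is easy to check.
array = [[r * 4 + c for c in range(4)] for r in range(4)]
print(split_address(13, column_bits=2), read_word(array, 13, column_bits=2))
```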

Hierarchical Memory Architecture

Advantages:

1. Shorter wires within blocks

2. Block address activates only 1 block => power savings

Hierarchical Memory Architecture

As an example, a 4 Mbit SRAM can be designed as a composition of ______ blocks, each of which contains 128 Kbits.

Number of rows

Number of columns
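The block count follows from dividing the total capacity by the block capacity. The sketch below assumes binary prefixes (1 Kbit = 1024 bits). Because the slide does not state a row/column split, the code only enumerates the power-of-two shapes that would fit a block rather than asserting one:

```python
KBIT = 1024          # assumption: binary prefixes
MBIT = 1024 * KBIT


def block_count(total_bits: int, block_bits: int) -> int:
    if total_bits % block_bits:
        raise ValueError("memory size is not a whole number of blocks")
    return total_bits // block_bits


def power_of_two_shapes(block_bits: int) -> list[tuple[int, int]]:
    """All (rows, columns) pairs with power-of-two sides that hold block_bits."""
    total_exp = block_bits.bit_length() - 1
    if 2 ** total_exp != block_bits:
        raise ValueError("block size must be a power of two")
    return [(2 ** e, 2 ** (total_exp - e)) for e in range(total_exp + 1)]


blocks = block_count(4 * MBIT, 128 * KBIT)
shapes = power_of_two_shapes(128 * KBIT)
print(blocks, shapes)
```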


Content Addressable Memory

Supports three modes of operation: read, write, and match.

The comparand block is filled with the data pattern to match.

The mask indicates which bits are significant.

Every row that matches the pattern is passed to the validity block.

Only valid rows that match are passed to the priority encoder.
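The match flow (comparand, mask, validity block, priority encoder) can be modeled behaviorally. In this sketch the function name `cam_match` and the stored data are hypothetical, and the lowest-index-wins priority rule is an assumption:

```python
def cam_match(rows, valid, comparand, mask):
    """Return the index of the highest-priority valid matching row, or None.

    rows: stored words (ints); valid: one bool per row.
    mask: bits set to 1 are significant in the comparison.
    """
    # Match step: compare only the bits that the mask marks as significant.
    matches = [((word ^ comparand) & mask) == 0 for word in rows]
    # Validity step: discard matching rows that hold no valid data.
    valid_matches = [m and v for m, v in zip(matches, valid)]
    # Priority encoder: the lowest index wins (an assumption of this sketch).
    for index, hit in enumerate(valid_matches):
        if hit:
            return index
    return None


rows = [0b1010, 0b1100, 0b1011, 0b1110]
valid = [False, True, True, True]
hit = cam_match(rows, valid, comparand=0b1010, mask=0b1110)
print(hit)
```

Row 0 matches the pattern but is not valid, so the encoder reports the next valid match.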

Read-Only Memory

The fact that the contents of a ROM cell are permanently fixed considerably simplifies its design.

The cell should be designed so that a 0 or 1 is presented to the bit line upon activation of its word line.

Read-Only Memory Cells

[Figure: ROM cells storing a 1 and a 0 in three implementations (Diode ROM, MOS ROM 1, MOS ROM 2), each connected to a word line WL and a bit line BL, with VDD and GND connections.]

A 4 x 4 OR ROM cell array

[Figure: word lines WL[0] to WL[3], bit lines BL[0] to BL[3], VDD connections, and pull-down loads biased by Vbias.]

MOS NOR ROM

[Figure: word lines WL[0] to WL[3], bit lines BL[0] to BL[3], GND lines, and pull-up devices connected to VDD.]

MOS NAND ROM

All word lines are high by default, with the exception of the selected row.

[Figure: word lines WL[0] to WL[3], bit lines BL[0] to BL[3], and pull-up devices connected to VDD.]
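The read behavior of the NOR and NAND ROM organizations can be compared with a small switch-level sketch. The programming pattern below is hypothetical (it is not the pattern drawn in the figures), and the functions are simplified models that ignore timing and voltage levels:

```python
def nor_rom_read(transistors, row):
    """NOR ROM: bit lines are pulled up. Only the selected word line is high;
    a transistor on that row pulls its bit line to GND, so it reads as 0."""
    n_rows = len(transistors)
    word_lines = [r == row for r in range(n_rows)]
    data = []
    for col in range(len(transistors[0])):
        pulled_low = any(transistors[r][col] and word_lines[r] for r in range(n_rows))
        data.append(0 if pulled_low else 1)
    return data


def nand_rom_read(transistors, row):
    """NAND ROM: each bit line is a series chain to GND. All word lines are
    high except the selected one. A missing transistor acts as a short."""
    n_rows = len(transistors)
    word_lines = [r != row for r in range(n_rows)]
    data = []
    for col in range(len(transistors[0])):
        conducts = all((not transistors[r][col]) or word_lines[r] for r in range(n_rows))
        data.append(0 if conducts else 1)   # a conducting chain discharges BL
    return data


# Hypothetical 4 x 4 programming pattern (True = transistor present).
pattern = [
    [True, False, True, True],
    [False, True, True, False],
    [True, False, True, True],
    [False, False, False, False],
]
print(nor_rom_read(pattern, 0), nand_rom_read(pattern, 0))
```

With the same transistor placement, the two organizations read out complementary data.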

Non-Volatile Read-Write Memories

The memory core consists of an array of transistors placed on a word-line/bit-line grid.

The memory is programmed by selectively disabling or enabling some of those devices.

A modified transistor that permits its threshold to be altered electrically is used.

Non-Volatile Read-Write Memories

To reprogram the memory, the programmed values must be erased, after which a new programming round can be started.

The method of erasing is the main differentiating factor between the various classes of reprogrammable nonvolatile memories.

Floating-gate transistor (FAMOS)

The structure is similar to a traditional MOS device, except that an extra polysilicon strip is inserted between the gate and channel. This strip is not connected to anything and is called a floating gate.

Inserting this extra gate doubles the gate oxide thickness, which results in a reduced device transconductance as well as an increased threshold voltage.

Non-Volatile Memories: The Floating-gate transistor (FAMOS)

[Figure: device cross-section (source, drain, gate, floating gate, oxide thickness tox above and below the floating gate, n+ regions in a p substrate) and schematic symbol (G, S, D).]

Floating-gate transistor (FAMOS)

Important property: the threshold voltage is programmable.

Applying a high voltage between the source and gate-drain terminals creates a high electric field and causes avalanche injection to occur. Electrons acquire sufficient energy and get trapped on the floating gate.

This phenomenon can occur with oxides as thick as 100 nm, which makes it relatively easy to fabricate the device.

Erasable Programmable Read-Only Memory (EPROM)

Erased by shining ultraviolet light on the cells through a transparent window in the package.

The erasure process is slow and can take from seconds to several minutes, depending on the intensity of the UV source.

Another problem is limited endurance: the number of erase/program cycles is generally limited to a maximum of one thousand.

Erasable Programmable Read-Only Memory (EPROM)

The device threshold might vary with repeated programming cycles.

Most EPROMs therefore contain on-chip circuitry to control the value of the threshold to within a specified range during programming.

The EPROM cell is extremely simple, making it possible to fabricate large memories at a low cost.


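FLOTOX EEPROM

[Figure: FLOTOX transistor cross-section (source, drain, gate, floating gate, n+ regions in a p substrate; oxide 20 to 30 nm thick, thinned to 10 nm) and Fowler-Nordheim I-V characteristic (current I versus VGD, with axis marks at -10 V and 10 V).]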

Flash Electrically Erasable Programmable Read-Only Memory (Flash)

Combination of the EPROM and EEPROM approaches.

Most Flash EEPROM devices use the avalanche hot-electron-injection approach to program the devices.

Erasure is performed using Fowler-Nordheim tunneling, as for EEPROM cells.

Flash Electrically Erasable Programmable Read-Only Memory (Flash)

The main difference is that erasure is performed in bulk for the complete chip or for a subsection of the memory.

Erasing the complete memory core at once makes it possible to carefully monitor the device characteristics during erasure.

Flash Electrically Erasable Programmable Read-Only Memory (Flash)

The monitoring control hardware on the memory chip regularly checks the value of the threshold during erasure and adjusts the erasure time dynamically.

This approach is only practical when erasing large chunks of memory at a time; hence the flash concept.
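The monitored erasure can be pictured as a check-and-repeat loop. The sketch below is a software analogy only. It assumes that each erase pulse lowers the threshold by a fixed step, and all names and voltages are illustrative rather than taken from a real device:

```python
def erase_with_monitoring(threshold_v, target_v, step_v, max_pulses):
    """Apply erase pulses until the monitored threshold reaches the target.

    Returns (final_threshold, pulses_used). Every number is illustrative.
    """
    pulses = 0
    while threshold_v > target_v and pulses < max_pulses:
        threshold_v -= step_v      # one erase pulse lowers the threshold
        pulses += 1                # the threshold is re-checked on each pass
    return threshold_v, pulses


final_v, pulses = erase_with_monitoring(
    threshold_v=5.0, target_v=1.0, step_v=0.5, max_pulses=20
)
print(final_v, pulses)
```

The erasure time is not fixed in advance; it depends on how quickly the monitored threshold reaches its target.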

Periphery

Decoders

Sense Amplifiers

Input/Output Buffers

Control / Timing Circuitry

Address Decoders

Address decoders have a substantial impact on the speed and power consumption of the memory.

Two classes of decoders:

Row decoders, whose task is to enable one memory row out of 2^M

Column and block decoders

Row Decoders

Collection of 2^M complex logic gates

Consider a 10-bit address decoder.

The rows with addresses 0 and 511 are enabled by the following logic functions:

(N)AND Decoder

NOR Decoder

For a single-stage implementation, it can be transformed into a wide NOR.
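The equivalence of the AND and NOR forms follows from De Morgan's theorem and can be checked exhaustively. The sketch assumes that A0 is the least significant address bit, a convention the slide does not state, and the function names are hypothetical:

```python
K = 10  # address width from the slide


def bits(address: int) -> list[int]:
    """Address bits A0..A9; A0 is taken as the least significant bit."""
    return [(address >> i) & 1 for i in range(K)]


def wl_and(row: int, address: int) -> int:
    """AND form: each address bit, true or complemented, must equal 1."""
    a, r = bits(address), bits(row)
    return int(all((ai if ri else 1 - ai) for ai, ri in zip(a, r)))


def wl_nor(row: int, address: int) -> int:
    """NOR form: the same literals, each inverted, must all equal 0."""
    a, r = bits(address), bits(row)
    return int(not any((1 - ai if ri else ai) for ai, ri in zip(a, r)))


for row in (0, 511):
    active = [addr for addr in range(2 ** K) if wl_and(row, addr)]
    agree = all(wl_and(row, addr) == wl_nor(row, addr) for addr in range(2 ** K))
    print(row, active, agree)
```

Each word line is active for exactly one address, and both forms agree for every address.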

Static Decoder Design

Implementing a wide NOR function in CMOS is impractical.

A possible solution is the pseudo-NMOS design style.

Splitting a complex gate into two or more layers most often produces both a faster and a cheaper implementation.

Segments of the address are decoded in a first layer called the pre-decoder.

Hierarchical Decoders

Multi-stage implementation improves performance

[Figure: NAND decoder using 2-input pre-decoders. Pre-decoded combinations of A0, A1 and of A2, A3 (true and complemented) drive the word lines WL0, WL1, and so on.]

Static Decoder

Advantages of using a pre-decoder:

It reduces the number of transistors required.

As the number of inputs to the NAND gates is halved, the propagation delay is reduced.
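A rough transistor count illustrates the saving. The estimate below rests on stated assumptions: static CMOS gates with two transistors per gate input, 2-bit pre-decoders, and no inverters or buffers counted. The function names are hypothetical:

```python
def single_stage_transistors(k: int) -> int:
    """2**k gates with k inputs each, 2 transistors per input."""
    return (2 ** k) * (2 * k)


def predecoded_transistors(k: int) -> int:
    """k/2 two-bit pre-decoders (4 two-input gates each) followed by
    2**k final gates with k/2 inputs each, 2 transistors per input."""
    if k % 2:
        raise ValueError("this sketch assumes an even number of address bits")
    groups = k // 2
    predecoders = groups * 4 * (2 * 2)
    final_gates = (2 ** k) * (2 * groups)
    return predecoders + final_gates


k = 8
print(single_stage_transistors(k), predecoded_transistors(k))
```

Under these assumptions, the pre-decoded version of an 8-bit decoder needs roughly half the devices of the single-stage version.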

Dynamic Decoders

[Figure: two dynamic 2-to-4 decoders with precharge devices clocked by φ, a 2-input NOR decoder and a 2-input NAND decoder. Inputs are A0, A1 and their complements; outputs are WL0 to WL3.]

Column and Block Decoders

Column decoders should match the bit-line pitch of the memory array.

The functionality of a column and block decoder is best described as a 2^K-input multiplexer, where K stands for the size of the address word.

For read-write arrays, these multiplexers can be either separate or shared between read and write operations.

4-input pass-transistor based column decoder

Advantage: speed. Only one extra transistor in the signal path.

Disadvantage: large transistor count.

Number of transistors for a 2^k-input decoder = (k+1) 2^k + 2^k

[Figure: a 2-input NOR decoder driven by A0 and A1 generates select signals S0 to S3, which gate pass transistors connecting bit lines BL0 to BL3 to the data line D.]

4-to-1 tree based column decoder

Number of devices drastically reduced.

Delay increases quadratically with the number of sections; prohibitive for large decoders.

Number of transistors for a 2^k-input decoder = 2^k + 2^(k-1) + ... + 4 + 2 = 2 (2^k - 1)

[Figure: binary tree of pass transistors controlled by A0, A1 and their complements, connecting BL0 to BL3 to D.]
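The transistor-count formulas for the pass-transistor and tree-based column decoders can be compared numerically. In this sketch the function names are hypothetical, and the tree count is summed level by level so that the closed form can be checked against it:

```python
def pass_transistor_count(k: int) -> int:
    """Pass-transistor decoder formula from the slides: (k+1) * 2**k + 2**k."""
    return (k + 1) * 2 ** k + 2 ** k


def tree_count(k: int) -> int:
    """Tree decoder: 2**k + 2**(k-1) + ... + 4 + 2, summed level by level."""
    return sum(2 ** level for level in range(1, k + 1))


for k in (2, 10):
    assert tree_count(k) == 2 * (2 ** k - 1)   # closed form from the slide
    print(k, pass_transistor_count(k), tree_count(k))
```

The gap widens quickly with k, which is the device saving the tree structure buys at the cost of a longer series path.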

Read-Write Memories (RAM)

STATIC (SRAM)
    Data stored as long as supply is applied
    Large (6 transistors/cell)
    Fast

DYNAMIC (DRAM)
    Periodic refresh required
    Small (1-3 transistors/cell)
    Slower

Static Random Access Memory

Quite similar to an SR latch.

It requires six transistors per bit.

Access to the cell is enabled by the word line, which replaces the clock and controls the two transistors M5 and M6, shared between the read and write operations.

The value is stored symmetrically: both the true and the complement values are stored on cross-coupled transistors.

SRAM retains its value as long as power is applied to the circuit.

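6-transistor CMOS SRAM Cell

[Figure: cross-coupled inverters formed by M1 to M4 between VDD and ground store Q and its complement; access transistors M5 and M6, controlled by WL, connect the cell to BL and its complement.]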

CMOS SRAM Analysis (Read)

[Figure: read operation with Q = 1 and its complement = 0. Both bit lines, each with capacitance Cbit, start at VDD. Devices shown: M1, M4, M5, M6.]

Assume that a 1 is stored at Q and both bit lines are precharged to 2.5 V.

The read cycle is started by asserting the word line, enabling both pass transistors M5 and M6.

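CMOS SRAM Analysis (Write)

[Figure: write operation with the complementary bit line at 1 and BL at 0; the cell initially holds Q = 1 and its complement = 0. Devices shown: M1, M4, M5, M6, with WL asserted.]

Assume that a 1 is stored in the cell (Q = 1).

A 0 is written into the cell by setting the complementary bit line to 1 and BL to 0.

The gates of transistors M1 and M4 stay at VDD and GND, respectively.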


Performance of SRAM

For an SRAM cell, the read operation is the critical one because it requires discharging the large bit-line capacitance through the stack of two small transistors of the selected cell: CBL has to be discharged through the series combination of M5 and M1.

To accelerate the read time, SRAMs use sense amplifiers, which quickly discharge one of the bit lines.

Improved MOS SRAM cells

Resistive-load SRAM cell, also known as the four-transistor SRAM cell.

The cross-coupled CMOS inverter pair is replaced by a pair of resistive-load NMOS inverters.

This reduces the SRAM cell size by approximately one third.

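Resistance-load SRAM Cell

[Figure: load resistors RL connected to VDD, pull-down transistors M1 and M2, access transistors M3 and M4 controlled by WL, storage nodes Q and its complement, bit lines BL and its complement.]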


Dynamic RAM

In dynamic random access memory (DRAM), a transistor and a capacitor are paired to create a memory cell, which represents a single bit of data.

For dynamic memory to work, either the CPU or the memory controller has to recharge all of the capacitors holding a 1 before they discharge. To do this, the memory controller reads the memory and then writes it right back. This refresh operation happens automatically.

3-Transistor DRAM Cell

The cell is written to by placing the appropriate data value on BL1 and asserting the write word line (WWL).

The data is retained as charge on capacitance C once WWL is lowered.

When reading the cell, the read word line (RWL) is raised. The storage transistor M2 is either on or off depending upon the stored value.

3T-DRAM Layout

[Figure: layout showing BL2, BL1, GND, RWL, WWL, and transistors M1, M2, M3.]

Dynamic RAM

Further simplifications of the cell structure are possible at the expense of a more complex circuit operation.

Bit lines BL1 and BL2 can be merged into a single wire.

This requires the bit line to be driven to both values in a single cycle.

Another option is to merge the RWL and WWL lines.

1-Transistor DRAM Cell

Write: CS is charged or discharged by asserting WL and BL.

Read: charge redistribution takes place between the bit line and the storage capacitance.

During a write cycle, the data value is placed on the bit line BL, and the word line WL is raised.

Before a read operation is performed, the bit line is precharged to a voltage Vpre.

Upon asserting the word line, a charge redistribution takes place. This results in a voltage change on the bit line, the direction of which determines the value of the data stored.
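The size and sign of the bit-line voltage change follow from charge conservation between the storage capacitance and the bit-line capacitance. The capacitance and voltage values below are illustrative assumptions, not figures from the slides, and `bitline_swing` is a hypothetical helper:

```python
def bitline_swing(v_cell: float, v_pre: float, c_s: float, c_bl: float) -> float:
    """Bit-line voltage change after charge sharing with the storage capacitor.

    Charge conservation: c_s*v_cell + c_bl*v_pre = (c_s + c_bl)*v_final.
    """
    v_final = (c_s * v_cell + c_bl * v_pre) / (c_s + c_bl)
    return v_final - v_pre


# Illustrative values only: 30 fF cell, 1 pF bit line, 1.25 V precharge.
for v_cell in (0.0, 2.5):
    dv = bitline_swing(v_cell, v_pre=1.25, c_s=30e-15, c_bl=1e-12)
    print(f"stored {v_cell} V -> swing {dv * 1e3:+.1f} mV")
```

Because the bit-line capacitance is much larger than the storage capacitance, the swing is a small fraction of the supply, and only its direction distinguishes a stored 0 from a stored 1.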

Programmable logic array (PLA)

Used to implement specialized logic functions.

A PLA decodes only some addresses (input values); a ROM decodes all addresses.

PLA is used for some logic functions.

PLA structure

The AND plane, OR plane, and inverters together form complete two-level logic functions.

Both the AND and OR planes are implemented as NOR circuits.

Pull-down transistors form the programming of the PLA. These transistors may be referred to as programming tabs.
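A NOR-NOR structure with inverters can realize an AND-OR function, as the following behavioral sketch shows. The two output functions are hypothetical examples chosen for illustration; they are not the functions of the PLA drawn in the slides:

```python
def nor(*signals: int) -> int:
    return int(not any(signals))


def pla(x0: int, x1: int, x2: int) -> tuple[int, int]:
    """Hypothetical two-output PLA built from NOR planes:
    f0 = (x0 AND x1) OR (NOT x2), f1 = x0 AND x1."""
    nx0, nx1 = 1 - x0, 1 - x1            # input inverters
    # "AND" plane as NOR gates: a NOR of complemented literals is an AND.
    p0 = nor(nx0, nx1)                   # x0 AND x1
    p1 = nor(x2)                         # NOT x2
    # "OR" plane as NOR gates, followed by output inverters, gives an OR.
    f0 = 1 - nor(p0, p1)
    f1 = 1 - nor(p0)
    return f0, f1


for inputs in [(1, 1, 1), (0, 1, 0), (0, 0, 1)]:
    print(inputs, pla(*inputs))
```

Each argument passed to `nor` corresponds to one pull-down transistor in the plane, which is where the programming takes place.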

Programmable Logic Array

[Figure: pseudo-NMOS PLA with inputs X0, X1, X2 (true and complemented), an AND-plane, an OR-plane, VDD and GND connections, and outputs f0 and f1.]

Dynamic PLA

[Figure: dynamic PLA with inputs X0, X1, X2 (true and complemented), an AND-plane and an OR-plane clocked by φAND and φOR, VDD and GND connections, and outputs f0 and f1.]

Sense Amplifiers

$t_p = \frac{C \cdot \Delta V}{I_{av}}$

C is large and I_av is small, so make ΔV as small as possible.

Idea: use a sense amplifier (s.a.), which takes a small transition at its input and produces the output.
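The benefit of a small ΔV can be shown with a quick calculation of the delay expression t_p = C ΔV / I_av. The capacitance, current, and voltage values below are illustrative assumptions, not figures from the slides:

```python
def propagation_delay(c: float, delta_v: float, i_av: float) -> float:
    """t_p = C * dV / I_av, in seconds for farads, volts, and amperes."""
    return c * delta_v / i_av


# Illustrative values only: 1 pF bit line, 50 uA average cell current.
full_swing = propagation_delay(c=1e-12, delta_v=2.5, i_av=50e-6)
small_swing = propagation_delay(c=1e-12, delta_v=0.1, i_av=50e-6)
print(f"{full_swing * 1e9:.1f} ns vs {small_swing * 1e9:.1f} ns")
```

With C and I_av fixed by the array, the delay scales linearly with the swing that has to develop on the bit line.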


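Differential Sense Amplifier

Directly applicable to SRAMs

[Figure: differential amplifier with transistors M1 to M5, inputs bit and its complement, sense enable SE, supply VDD, internal node y, and output Out.]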

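Differential Sensing SRAM

[Figure: (a) SRAM sensing scheme: SRAM cell i on word line WL i, bit lines BL and its complement with precharge (PC) and equalization (EQ) devices connected to VDD, and a differential sense amplifier enabled by SE that drives Output. (b) Two-stage differential amplifier: transistors M1 to M5, nodes x and y with their complements, sense enable SE, supply VDD, and Output.]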
