VD_chap_12

Jun 02, 2018

Shubham Garg
Transcript
  • 8/10/2019 VD_chap_12

    1/70

    Digital Integrated Circuits: A Design Perspective

    Designing Memory and Array Structures

    Jan M. Rabaey, Anantha Chandrakasan, Borivoje Nikolic

  • 8/10/2019 VD_chap_12

    2/70

    Chapter Overview

    Memory Classification

    Memory Architectures

    The Memory Core

    Periphery

    Reliability

    Case Studies

  • 8/10/2019 VD_chap_12

    3/70

    Memory Classification

    The type of memory unit that is preferable for a given application is a function of the required memory size, the time it takes to access the stored data, the access patterns, the application, and the system requirements.

  • 8/10/2019 VD_chap_12

    4/70

    Size

    Different means are used to express the size of a memory unit.

    The circuit designer tends to define the size of a memory in terms of bits.

    The chip designer expresses the memory size in bytes or its multiples: kilobytes, megabytes, gigabytes and ultimately terabytes.

    The system designer likes to quote the storage requirement in terms of words, which represent a basic computational entity.

  • 8/10/2019 VD_chap_12

    5/70

    Timing Parameters

    The time it takes to read from the memory is called the read-access time, which is equal to the delay between the read request and the moment the data is available at the output.

  • 8/10/2019 VD_chap_12

    6/70

    Function

    Memories are often classified on the basis of memory functionality, access patterns and the nature of the storage mechanism.

    A distinction is made between read-only (ROM) and read-write (RWM) memories.

    RWMs belong to the class of volatile memories.

    ROMs are non-volatile memories.

  • 8/10/2019 VD_chap_12

    7/70

    Access Pattern

    Memories can also be classified based on the order in which data can be accessed.

    Most memories belong to the random-access class, which means memory locations can be read or written in a random order.

    Some memory types restrict the order of access, which results in either faster access times, smaller area, or a memory with special functionality. Examples are the FIFO, the LIFO and the shift register.

  • 8/10/2019 VD_chap_12

    8/70

    Memory Classification

    Read-Write Memory
        Random Access: SRAM, DRAM
        Non-Random Access: FIFO, LIFO, Shift Register, CAM

    Non-Volatile Read-Write Memory
        EPROM, E2PROM, FLASH

    Read-Only Memory
        Mask-Programmed, Programmable (PROM)

  • 8/10/2019 VD_chap_12

    10/70

    Memory Architectures and Building Blocks

    When implementing an N-word memory where each word is M bits wide, the most intuitive approach is to stack the words linearly and select one word at a time with its own select signal.

    While this approach is relatively simple and works for very small memories, one runs into a number of problems when trying to use it for larger memories.

  • 8/10/2019 VD_chap_12

    11/70

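    Memory Architectures and Building Blocks

    Assume that we would like to implement a memory that holds 1 million (N = 10^6) 8-bit words.

    The actual number of words equals 2^20 = 1024 * 1024 = 1,048,576, so one select signal per word would require over a million select lines.

    A decoder is inserted to reduce the number of select signals: 20 address bits suffice to select one of the 2^20 words.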
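    As a quick check of the numbers on this slide, the sketch below computes how many address bits a decoder needs for a given word count. The function name `address_bits` is an illustrative choice and does not come from the slides.

```python
def address_bits(num_words: int) -> int:
    """Smallest k with 2**k >= num_words (k address bits feed the decoder)."""
    if num_words < 1:
        raise ValueError("num_words must be positive")
    return (num_words - 1).bit_length()


k = address_bits(10**6)
print(k, 2**k)  # 20 1048576
```

    For the 1-million-word example, the million select lines collapse to 20 address lines.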

  • 8/10/2019 VD_chap_12

    12/70

    Architecture for N word Memory

    A memory word is selected by providing a binary-encoded address word.

    The decoder translates this address into N = 2^k select lines, only one of which is active at a time.
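    The behaviour of such a decoder can be modelled in a few lines. This is only a functional sketch (the name `decode` is hypothetical), not a description of the circuit implementation.

```python
def decode(address: int, k: int) -> list:
    """Return the 2**k select lines for a k-bit address; exactly one is 1."""
    if not 0 <= address < 2**k:
        raise ValueError("address out of range")
    return [1 if line == address else 0 for line in range(2**k)]


print(decode(5, 3))  # [0, 0, 0, 0, 0, 1, 0, 0]
```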

  • 8/10/2019 VD_chap_12

    13/70

    Architecture for N word Memory

    While this resolves the selection problem, it does not address the issue of the memory aspect ratio.

    Evaluating the dimensions of the storage array in our example shows that its height is approximately 128,000 times larger than its width (2^20 rows of 8 cells each), assuming the basic storage cell is approximately square, which is almost always the case.
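    The "approximately 128,000" figure can be reproduced directly. The sketch assumes perfectly square cells and one word per row; `linear_aspect_ratio` is an illustrative name.

```python
def linear_aspect_ratio(num_words: int, word_bits: int) -> float:
    """Height/width of a one-word-per-row array built from square cells."""
    return num_words / word_bits


print(linear_aspect_ratio(2**20, 8))  # 131072.0, i.e. 128K
```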

  • 8/10/2019 VD_chap_12

    14/70

    Array-Structured Memory Architecture

    Problem: ASPECT RATIO, or HEIGHT >> WIDTH

    [Figure: array-structured memory organization. Annotations: "Amplify swing to rail-to-rail amplitude"; "Selects appropriate word".]

  • 8/10/2019 VD_chap_12

    15/70

    Array-Structured Memory Architecture

    Multiple words are stored in a single row and are selected simultaneously.

    To route the correct word to the input/output terminals, an extra piece of circuitry called the column decoder is needed.

    The address word is partitioned into a column address and a row address.

    The row address enables one row of the memory for R/W, while the column address picks one particular word from the selected row.
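    The address partitioning can be sketched as a simple bit split. Which bits form the row address and which form the column address is an assumption here (the high bits select the row), as is the 10/10 split used in the example; the slides do not fix either.

```python
def split_address(address: int, row_bits: int, col_bits: int) -> tuple:
    """Split an address into (row, column); the low bits select the column."""
    if not 0 <= address < 2 ** (row_bits + col_bits):
        raise ValueError("address out of range")
    row = address >> col_bits
    column = address & ((1 << col_bits) - 1)
    return row, column


# Hypothetical organization of a 2**20-word memory: 2**10 rows of 2**10 words.
print(split_address((768 << 10) | 5, 10, 10))  # (768, 5)
```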

  • 8/10/2019 VD_chap_12

    16/70

    Hierarchical Memory Architecture

    Advantages:

    1. Shorter wires within blocks

    2. Block address activates only 1 block => power savings

  • 8/10/2019 VD_chap_12

    17/70

    Hierarchical Memory Architecture

    As an example, a 4 Mbit SRAM can be designed as a composition of ______ blocks, each of which contains 128 Kbits.

    Number of rows

    Number of columns

  • 8/10/2019 VD_chap_12

    19/70

    Content Addressable Memory

    Supports three modes of operation: read, write and match.

    The comparand block is filled with the data pattern to match.

    The mask indicates which bits are significant.

    Every row that matches the pattern is passed to the validity block.

    Only valid rows that match are passed to the priority encoder.
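    The match operation described above can be sketched behaviourally. The function name, the data layout, and the choice that the priority encoder reports the lowest matching index are assumptions of this sketch, not details given on the slide.

```python
def cam_match(rows, valid, comparand, mask):
    """Index of the first valid row equal to comparand on the masked bits, or None.

    rows: stored words as ints; valid: one bool per row;
    mask: bits set to 1 are significant, bits set to 0 are ignored.
    """
    matches = [((word ^ comparand) & mask) == 0 for word in rows]   # compare
    hits = [i for i, (m, v) in enumerate(zip(matches, valid)) if m and v]  # validity block
    return hits[0] if hits else None  # priority encoder (lowest index assumed)


rows = [0b1010, 0b1011, 0b0011]
print(cam_match(rows, [True, False, True], 0b1011, 0b0011))  # 2
```

    In the example, row 1 matches but is invalid, so the valid match in row 2 is reported.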

  • 8/10/2019 VD_chap_12

    20/70

    Read-Only Memory

    The fact that the contents of a ROM cell are permanently fixed considerably simplifies its design.

    The cell should be designed so that a 0 or 1 is presented to the bit line upon activation of its word line.

  • 8/10/2019 VD_chap_12

    21/70

    Read-Only Memory Cells

    [Figure: ROM cells storing a 1 and a 0 in three styles: Diode ROM, MOS ROM 1, MOS ROM 2. Labeled nodes: WL, BL, VDD, GND.]

  • 8/10/2019 VD_chap_12

    22/70

    A 4 x 4 OR ROM cell array

    [Figure: 4 x 4 OR ROM array with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], VDD, Vbias, and pull-down loads.]

  • 8/10/2019 VD_chap_12

    23/70

    MOS NOR ROM

    [Figure: 4 x 4 MOS NOR ROM array with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], GND lines, and pull-up devices connected to VDD.]

  • 8/10/2019 VD_chap_12

    24/70

    MOS NAND ROM

    All word lines are high by default, with the exception of the selected row.

    [Figure: 4 x 4 MOS NAND ROM array with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], and pull-up devices connected to VDD.]
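    A behavioural sketch of the NOR and NAND ROM read operations may help contrast the two arrays. It assumes ideal switches and a pulled-up bit line in both cases; the cell pattern and function names are made up for illustration.

```python
def read_nor_rom(cells, selected_row):
    """cells[r][c] is True where a pull-down transistor ties BL[c] to GND.

    Only the selected word line is high; a transistor on that row pulls
    the bit line low (reads 0), otherwise the pull-up keeps it high (1).
    """
    rows = range(len(cells))
    word_lines = [r == selected_row for r in rows]
    return [0 if any(word_lines[r] and cells[r][c] for r in rows) else 1
            for c in range(len(cells[0]))]


def read_nand_rom(cells, selected_row):
    """cells[r][c] is True where a series transistor sits in the BL[c] chain.

    All word lines are high except the selected one; the chain conducts
    (bit line low, reads 0) only if no transistor sits on the selected row.
    """
    rows = range(len(cells))
    word_lines = [r != selected_row for r in rows]
    return [0 if all(word_lines[r] or not cells[r][c] for r in rows) else 1
            for c in range(len(cells[0]))]


# Hypothetical 4 x 4 programming pattern.
CELLS = [[True, False, True, False],
         [False, True, True, False],
         [True, False, True, False],
         [False, False, False, False]]
print(read_nor_rom(CELLS, 0))   # [0, 1, 0, 1]
print(read_nand_rom(CELLS, 0))  # [1, 0, 1, 0]
```

    With the same transistor placement the two arrays read out complementary data, which is why the programming convention differs between them.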

  • 8/10/2019 VD_chap_12

    25/70

    Non Volatile Read Write Memories

    The memory core consists of an array of transistors placed on a word-line/bit-line grid.

    The memory is programmed by selectively disabling or enabling some of those devices.

    A modified transistor that permits its threshold to be altered electrically is used.

  • 8/10/2019 VD_chap_12

    26/70

    Non Volatile Read Write Memories

    To reprogram the memory, the programmed values must be erased, after which a new programming round can be started.

    The method of erasing is the main differentiating factor between the various classes of reprogrammable nonvolatile memories.

  • 8/10/2019 VD_chap_12

    27/70

    Floating-gate transistor (FAMOS)

    The structure is similar to a traditional MOS device, except that an extra polysilicon strip is inserted between the gate and the channel. This strip is not connected to anything and is called a floating gate.

    Inserting the floating gate doubles the gate oxide thickness, which results in a reduced device transconductance as well as an increased threshold voltage.

  • 8/10/2019 VD_chap_12

    28/70

    Non-Volatile Memories: The Floating-gate transistor (FAMOS)

    [Figure: device cross-section (floating gate, gate, source, drain, n+ regions, p substrate, oxide thickness tox) and schematic symbol (G, S, D).]

  • 8/10/2019 VD_chap_12

    29/70

    Floating-gate transistor (FAMOS)

    Important property: the threshold voltage is programmable.

    Applying a high voltage between the source and the gate-drain terminals creates a high electric field and causes avalanche injection to occur. Electrons acquire sufficient energy to cross the oxide and get trapped on the floating gate.

    This phenomenon can occur with oxides as thick as 100 nm, which makes it relatively easy to fabricate the device.

  • 8/10/2019 VD_chap_12

    30/70

    Erasable Programmable Read Only Memory (EPROM)

    Erased by shining ultraviolet light on the cells through a transparent window in the package.

    The erasure process is slow and can take from seconds to several minutes, depending on the intensity of the UV source.

    Another problem is limited endurance: the number of erase/program cycles is generally limited to a max of one

  • 8/10/2019 VD_chap_12

    31/70

    Erasable Programmable Read Only Memory (EPROM)

    The device threshold might vary with repeated programming cycles.

    Most EPROMs therefore contain on-chip circuitry to control the value of the threshold to within a specified range during programming.

    The EPROM cell is extremely simple, making it possible to fabricate large memories at a low cost.

  • 8/10/2019 VD_chap_12

    33/70

    FLOTOX EEPROM

    [Figure: FLOTOX transistor cross-section (floating gate, gate, source, drain, n+ regions, p substrate; oxide thicknesses of 20-30 nm and 10 nm) and Fowler-Nordheim I-V characteristic (I versus VGD, with -10 V and 10 V marked).]

  • 8/10/2019 VD_chap_12

    34/70

    Flash Electrically Erasable Programmable Read Only Memory (Flash)

    Combination of the EPROM and EEPROM approaches.

    Most Flash EEPROM devices use the avalanche hot-electron injection approach to program the devices.

    Erasure is performed using Fowler-Nordheim tunneling, as for EEPROM cells.

  • 8/10/2019 VD_chap_12

    35/70

    Flash Electrically Erasable Programmable Read Only Memory (Flash)

    The main difference is that erasure is performed in bulk for the complete chip or for a subsection of the memory.

    Erasing the complete memory core at once makes it possible to carefully monitor the device characteristics during erasure.

  • 8/10/2019 VD_chap_12

    36/70

    Flash Electrically Erasable Programmable Read Only Memory (Flash)

    The monitoring control hardware on the memory chip regularly checks the value of the threshold during erasure and adjusts the erasure time dynamically.

    This approach is only practical when erasing large chunks of memory at a time; hence the flash concept.

  • 8/10/2019 VD_chap_12

    37/70

    Periphery

    Decoders

    Sense Amplifiers

    Input/Output Buffers

    Control / Timing Circuitry

  • 8/10/2019 VD_chap_12

    38/70

    Address Decoders

    Address decoders have a substantial impact on the speed and power consumption of the memory.

    Two classes of decoders:

    Row decoders, whose task is to enable one memory row out of 2^M

    Column and block decoders

  • 8/10/2019 VD_chap_12

    39/70

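    Row Decoders

    Collection of 2^M complex logic gates.

    Consider a 10-bit address decoder. The rows with addresses 0 and 511 are enabled by the following logic functions.

    (N)AND decoder:
    WL0 = NOT(A9) AND NOT(A8) AND ... AND NOT(A1) AND NOT(A0)
    WL511 = NOT(A9) AND A8 AND A7 AND ... AND A1 AND A0

    NOR decoder:
    WL0 = NOT(A9 OR A8 OR ... OR A1 OR A0)
    WL511 = NOT(A9 OR NOT(A8) OR NOT(A7) OR ... OR NOT(A1) OR NOT(A0))

    For a single-stage implementation, the AND function can be transformed into a wide NOR using De Morgan's rules.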
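    The equivalence of the AND and NOR forms of a row-decoder gate can be checked exhaustively for the 10-bit example. The helper names below are illustrative; address bits are listed least significant bit first.

```python
def bits(address: int, k: int) -> list:
    """Address as a list of k bits, least significant first."""
    return [(address >> i) & 1 for i in range(k)]


def wl_and(addr_bits, row: int) -> bool:
    """AND form: each bit is used true or complemented, as the row number dictates."""
    return all(bit == ((row >> i) & 1) for i, bit in enumerate(addr_bits))


def wl_nor(addr_bits, row: int) -> bool:
    """NOR form (De Morgan): NOR over every literal that must be 0 for a match."""
    literals = [bit if ((row >> i) & 1) == 0 else 1 - bit
                for i, bit in enumerate(addr_bits)]
    return not any(literals)


for row in (0, 511):
    assert all(wl_and(bits(a, 10), row) == wl_nor(bits(a, 10), row) == (a == row)
               for a in range(1024))
```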

  • 8/10/2019 VD_chap_12

    40/70

    Static Decoder design

    Implementing a wide NOR function in CMOS is impractical.

    Possible solution: the pseudo-NMOS design style.

    Splitting a complex gate into two or more layers most often produces both a faster and cheaper implementation.

    Segments of the address are decoded in a first layer called the pre-decoder.

  • 8/10/2019 VD_chap_12

    41/70

    Hierarchical Decoders

    Multi-stage implementation improves performance.

    [Figure: NAND decoder using 2-input pre-decoders. Address pairs (A0, A1) and (A2, A3) and their complements are predecoded, and the predecoded signals are combined to drive word lines WL0, WL1, ...]

  • 8/10/2019 VD_chap_12

    42/70

    Static Decoder

    Advantages of using a pre-decoder:

    It reduces the number of transistors required.

    As the number of inputs to the NAND gates is halved, the propagation delay is reduced.
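    Both advantages can be illustrated with a small model. The sketch below checks that a two-level decoder with 2-bit pre-decoders selects the same word line as a flat decoder, and compares transistor counts under the stated assumption that a static CMOS gate needs 2 transistors per input; buffers and inverters are ignored. All function names are hypothetical.

```python
def flat_decode(address: int, k: int) -> list:
    """Single-stage decoder: one k-input gate per word line."""
    return [int(address == row) for row in range(2**k)]


def predecoded_decode(address: int, k: int) -> list:
    """Two-level decoder: 2-bit pre-decoders, then one k/2-input gate per word line."""
    if k % 2:
        raise ValueError("k must be even for 2-bit pre-decoding")
    groups = k // 2
    # Pre-decoder g turns address bits (2g, 2g+1) into four one-hot signals.
    pre = [[int(((address >> (2 * g)) & 0b11) == v) for v in range(4)]
           for g in range(groups)]
    return [int(all(pre[g][(row >> (2 * g)) & 0b11] for g in range(groups)))
            for row in range(2**k)]


def transistor_counts(k: int) -> tuple:
    """(flat, predecoded) counts, assuming 2 transistors per gate input."""
    flat = 2**k * 2 * k
    predecoded = 2**k * 2 * (k // 2) + (k // 2) * 4 * 4
    return flat, predecoded


print(transistor_counts(8))  # (4096, 2112)
```

    For an 8-bit address the gate fan-in drops from 8 to 4 and, under these assumptions, the transistor count falls to roughly half.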

  • 8/10/2019 VD_chap_12

    43/70

    Dynamic Decoders

    [Figure: dynamic 2-input NOR decoder and dynamic 2-input NAND decoder, each with precharge devices clocked by φ, inputs A0, A1 and their complements, word lines WL0 to WL3, and VDD/GND connections.]

  • 8/10/2019 VD_chap_12

    44/70

    Column and block decoders

    Column decoders should match the bit-line pitch of the memory array.

    The functionality of a column and block decoder is best described as a 2^K-input multiplexer, where K stands for the size of the address word.

    For read-write arrays, these multiplexers can be either separate or shared between read and write operations.

  • 8/10/2019 VD_chap_12

    45/70

    4-input pass-transistor based column decoder

    Advantage: speed. Only one extra transistor in the signal path.

    Disadvantage: large transistor count.

    [Figure: a 2-input NOR decoder driven by A0 and A1 generates select signals S0 to S3, which gate bit lines BL0 to BL3 onto the data line D.]

    Number of transistors for a 2^k-input decoder = (k+1) * 2^k + 2^k

  • 8/10/2019 VD_chap_12

    46/70

    4-to-1 tree based column decoder

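    Number of devices drastically reduced.

    Delay increases quadratically with the number of sections; prohibitive for large decoders.

    [Figure: bit lines BL0 to BL3 routed to the data line D through a binary tree of pass transistors controlled by A0, A1 and their complements.]

    Number of transistors for a 2^k-input decoder = 2^k + 2^(k-1) + ... + 4 + 2 = 2 * (2^k - 1)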
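    The two transistor-count formulas (pass-transistor decoder and tree decoder) are easy to compare numerically. The function names are illustrative; the formulas are the ones quoted on the slides.

```python
def pass_transistor_decoder_count(k: int) -> int:
    """(k + 1) * 2**k for the decoder plus 2**k pass transistors."""
    return (k + 1) * 2**k + 2**k


def tree_decoder_count(k: int) -> int:
    """2**k + 2**(k-1) + ... + 4 + 2, which equals 2 * (2**k - 1)."""
    return sum(2**i for i in range(1, k + 1))


for k in (2, 4, 10):
    print(k, pass_transistor_decoder_count(k), tree_decoder_count(k))
```

    For k = 10 the tree decoder needs 2046 devices against 12288 for the pass-transistor version, at the cost of k series transistors in the signal path.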

  • 8/10/2019 VD_chap_12

    47/70

    Read-Write Memories (RAM)

    STATIC (SRAM)
        Data stored as long as supply is applied
        Large (6 transistors/cell)
        Fast

    DYNAMIC (DRAM)
        Periodic refresh required
        Small (1-3 transistors/cell)
        Slower

  • 8/10/2019 VD_chap_12

    48/70

    Static Random Access Memory

    Quite similar to an SR latch.

    It requires six transistors per bit.

    Access to the cell is enabled by the word line, which replaces the clock and controls the two transistors M5 and M6, shared between the read and write operations.

    The value is stored symmetrically: both true and complement are stored on cross-coupled transistors.

    An SRAM retains its value as long as power is applied to the circuit.

  • 8/10/2019 VD_chap_12

    49/70

    6-transistor CMOS SRAM Cell

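    [Figure: 6-transistor CMOS SRAM cell with transistors M1 to M6, word line WL, bit lines BL and BL-bar, storage nodes Q and Q-bar, and supply VDD.]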

  • 8/10/2019 VD_chap_12

    50/70

    CMOS SRAM Analysis (Read)

    [Figure: 6T cell during a read with Q = 1 and Q-bar = 0; word line WL, pass transistors M5 and M6, transistors M1 and M4, both bit lines at VDD, each loaded by Cbit.]

    Assume that a 1 is stored at Q and both bit lines are precharged to 2.5 V.

    The read cycle is started by asserting the word line, enabling both pass transistors M5 and M6.

  • 8/10/2019 VD_chap_12

    51/70

    CMOS SRAM Analysis (Write)

    [Figure: 6T cell during a write with BL-bar = 1 and BL = 0; initial state Q = 1, Q-bar = 0; transistors M1, M4, M5, M6; WL and VDD.]

    Assume that a 1 is stored in the cell (Q = 1).

    A 0 is written in the cell by setting BL-bar to 1 and BL to 0.

    The gates of transistors M1 and M4 stay at VDD and GND, respectively.

  • 8/10/2019 VD_chap_12

    53/70

    Performance of SRAM

    For the SRAM cell, the read operation is the critical one because it requires discharging the large bit-line capacitance through the stack of two small transistors of the selected cell: CBL has to be discharged through the series combination of M5 and M1.

    To accelerate the read, SRAMs use sense amplifiers. Once a small voltage difference has built up between the bit lines, the sense amplifier quickly discharges one of them.

  • 8/10/2019 VD_chap_12

    54/70

    Improved MOS SRAM cells

    Resistive-load SRAM cell, also known as the four-transistor SRAM cell.

    The cross-coupled CMOS inverter pair is replaced by a pair of resistive-load NMOS inverters.

    This reduces the SRAM cell size by approximately one third.

  • 8/10/2019 VD_chap_12

    55/70

    Resistance-load SRAM Cell

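    [Figure: resistance-load SRAM cell with load resistors RL, transistors M1 to M4, word line WL, bit lines BL and BL-bar, storage nodes Q and Q-bar, and supply VDD.]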

  • 8/10/2019 VD_chap_12

    57/70

    Dynamic RAM

    In dynamic random access memory (DRAM), a transistor and a capacitor are paired to create a memory cell, which represents a single bit of data.

    For dynamic memory to work, either the CPU or the memory controller has to recharge all of the capacitors holding a 1 before they discharge. To do this, the memory controller reads the memory and then writes it right back. This refresh operation happens automatically.

  • 8/10/2019 VD_chap_12

    58/70

    3-Transistor DRAM Cell

    The cell is written to by placing the appropriate data value on BL1 and asserting the write word line (WWL).

    The data is retained as charge on capacitance C once WWL is lowered.

    When reading the cell, the read word line (RWL) is raised. The storage transistor M2 is either on or off depending upon the stored value.

  • 8/10/2019 VD_chap_12

    59/70

    3T-DRAM Layout

    [Figure: 3T DRAM cell layout showing BL2, BL1, GND, RWL, WWL and transistors M1, M2, M3.]

  • 8/10/2019 VD_chap_12

    60/70

    Dynamic RAM

    Further simplifications of the cell structure are possible at the expense of a more complex circuit operation.

    Bit lines BL1 and BL2 can be merged into a single wire. This requires the bit line to be driven to both values in a single cycle.

    Another option is to merge the RWL and WWL lines.

  • 8/10/2019 VD_chap_12

    61/70

    1-Transistor DRAM Cell

    Write: CS is charged or discharged by asserting WL and BL.

    Read: charge redistribution takes place between the bit line and the storage capacitance.

    During a write cycle, the data value is placed on the bit line BL, and the word line WL is raised.

    Before a read operation is performed, the bit line is precharged to a voltage Vpre.

    Upon asserting the word line, a charge redistribution takes place. This results in a voltage change on the bit line, the direction of which determines the value of the stored data.
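    The size of the bit-line voltage change follows from charge conservation between the storage capacitance CS and the bit-line capacitance CBL: dV = (Vcell - Vpre) * CS / (CS + CBL). This relation is the standard charge-sharing result rather than something stated on the slide, and the capacitance and voltage values below are hypothetical.

```python
def bitline_swing(v_cell: float, v_pre: float, c_s: float, c_bl: float) -> float:
    """Bit-line voltage change after the cell capacitor shares charge with it."""
    return (v_cell - v_pre) * c_s / (c_s + c_bl)


def read_bit(v_cell: float, v_pre: float, c_s: float, c_bl: float) -> int:
    """The direction of the swing determines the value that is read."""
    return 1 if bitline_swing(v_cell, v_pre, c_s, c_bl) > 0 else 0


# Hypothetical values: CS = 50 fF, CBL = 1 pF, Vpre = 1.25 V.
for v_cell in (0.0, 2.5):
    swing = bitline_swing(v_cell, 1.25, 50e-15, 1e-12)
    print(f"Vcell = {v_cell} V -> swing = {swing * 1e3:.1f} mV")
```

    With these numbers the swing is only about 60 mV in either direction, which is why a 1T DRAM read depends on a sense amplifier.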

  • 8/10/2019 VD_chap_12

    62/70

    Programmable logic array

  • 8/10/2019 VD_chap_12

    63/70

    Programmable logic array (PLA)

    Used to implement specialized logic functions.

    A PLA decodes only some addresses (input values); a ROM decodes all addresses.

    PLA is used for some logic functions.

  • 8/10/2019 VD_chap_12

    64/70

    PLA structure

    AND plane, OR plane, and inverters together form complete two-level logic functions.

    Both AND and OR planes are implemented as NOR circuits.

    Pull-down transistors form the programming of the PLA. These transistors may be referred to as programming tabs.
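    The NOR-NOR structure can be sketched as two lists of programming tabs. The data layout, the names, and the example function f = x0·x1 + NOT(x2) are all assumptions made for illustration; the output inverter is applied explicitly at the end.

```python
from itertools import product


def nor(values) -> int:
    """NOR of an iterable of 0/1 values: 1 only if every value is 0."""
    return int(not any(values))


def pla_eval(inputs, and_plane, or_plane):
    """Evaluate a NOR-NOR PLA and return the raw (uninverted) output lines.

    and_plane[p]: (input_index, complemented) tabs on product line p; each tab
    is a pull-down driven by the input or by its complement.
    or_plane[o]: indices of product lines with a pull-down tab on output line o.
    """
    products = [nor(inputs[i] ^ int(complemented) for i, complemented in tabs)
                for tabs in and_plane]
    return [nor(products[p] for p in tabs) for tabs in or_plane]


# x0·x1 = NOR(NOT x0, NOT x1); NOT x2 = NOR(x2).
AND_PLANE = [[(0, True), (1, True)], [(2, False)]]
OR_PLANE = [[0, 1]]

for x0, x1, x2 in product((0, 1), repeat=3):
    raw = pla_eval([x0, x1, x2], AND_PLANE, OR_PLANE)[0]
    f = 1 - raw  # output inverter restores the sum of products
    assert f == int((x0 and x1) or not x2)
```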

  • 8/10/2019 VD_chap_12

    65/70

    Programmable Logic Array

    [Figure: pseudo-NMOS PLA with AND-plane and OR-plane, inputs X0, X1, X2 and their complements, outputs f0 and f1, and VDD/GND connections.]

  • 8/10/2019 VD_chap_12

    66/70

    Dynamic PLA

    [Figure: dynamic PLA with AND-plane and OR-plane, inputs X0, X1, X2 and their complements, outputs f0 and f1, clock signals φAND and φOR, and VDD/GND connections.]

  • 8/10/2019 VD_chap_12

    67/70

    Sense Amplifiers

    $t_p = \dfrac{C \cdot \Delta V}{I_{av}}$

    C is large and I_av is small, so make ΔV as small as possible.

    Idea: use a sense amplifier, which turns a small transition at its input into a full-swing output.
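    A rough numerical illustration of the delay relation tp = C * ΔV / Iav shows why a small ΔV helps. The capacitance, current and swing values are hypothetical and chosen only to make the scaling visible.

```python
def discharge_time(c: float, delta_v: float, i_av: float) -> float:
    """t_p = C * dV / I_av, all quantities in SI units."""
    return c * delta_v / i_av


C_BL = 1e-12    # hypothetical bit-line capacitance: 1 pF
I_AV = 50e-6    # hypothetical average cell current: 50 uA

full_swing = discharge_time(C_BL, 2.5, I_AV)
small_swing = discharge_time(C_BL, 0.25, I_AV)
print(f"{full_swing * 1e9:.0f} ns vs {small_swing * 1e9:.0f} ns")  # 50 ns vs 5 ns
```

    Waiting for a tenth of the swing cuts the bit-line delay by the same factor; the sense amplifier then restores the full logic level.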

  • 8/10/2019 VD_chap_12

    69/70

    Differential Sense Amplifier

    Directly applicable to SRAMs.

    [Figure: differential sense amplifier with transistors M1 to M5, inputs bit and bit-bar, sense enable SE, supply VDD, and output Out (node y).]

  • 8/10/2019 VD_chap_12

    70/70

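    Differential Sensing SRAM

    [Figure: (a) SRAM sensing scheme: SRAM cell i on word line WL i, bit lines BL and BL-bar, precharge (PC) and equalize (EQ) devices, and a differential sense amplifier enabled by SE; (b) two-stage differential amplifier: transistors M1 to M5, signals x and y with their complements, sense enable SE, supply VDD, and Output.]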