Shawn Present

Apr 14, 2018

Amit Choudhary
  • 7/27/2019 Shawn Present

    1/28

    Low-Power, High-Speed Multiplier Architectures

    Shawn Nicholl

    ELEC-5705y

    March 7, 2005

  • 7/27/2019 Shawn Present

    2005/03/07 Low-Power, High-Speed Multiplier Architectures 2

    Agenda/Overview

    Design Abstraction

    Numbering Systems

    Addition and Subtraction

    Adder Architectures

    Multiplication

    Traditional Multiplier Architectures

    Advanced Multiplier Architectures

  • 7/27/2019 Shawn Present

    Levels of Abstraction in Digital ICs

    Higher levels of abstraction have a greater effect on overall system performance

    Systems

    Modules

    Logic Gates

    Circuits

    Devices

    [Figure: the five levels listed from Systems down to Devices beside an arrow labelled "Increasing Abstraction", with a callout labelled "Multiplier Architectures"]

    Low-power, high-speed techniques can be used at many levels of abstraction

  • 7/27/2019 Shawn Present

    Numbering Systems: A Quick Review

    Some common numbering systems:

    Decimal: $D = \sum_{i=0}^{n-1} d_i \cdot 10^i$, Range: 0 to $10^n - 1$

    Unsigned Binary: $B = \sum_{i=0}^{n-1} b_i \cdot 2^i$, Range: 0 to $2^n - 1$

    Twos-Complement: Range: $-2^{n-1}$ to $+(2^{n-1} - 1)$

    Sign  Decimal   Sign  Unsigned Binary   Sign  Twos Complement
    +     10        +     0000 1010         N/A   0000 1010
    -     45        -     0010 1101         N/A   1101 0011

    E.g. converting -45d to twos complement:

    45d = 0 + 0 + 2^5 + 0 + 2^3 + 2^2 + 0 + 2^0 = 0 0 1 0 1 1 0 1

    Invert every bit: 1 1 0 1 0 0 1 0

    Add 1 (2s Comp): 1 1 0 1 0 0 1 1
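    The conversion on this slide can be sketched in Python. This is a minimal illustration, not part of the original deck; the function names and the default 8-bit width are assumptions chosen to match the example.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Return the `bits`-wide twos-complement bit string for `value`."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking a negative Python int yields its twos-complement pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Decode a twos-complement bit string back to a signed integer."""
    raw = int(pattern, 2)
    # The top bit carries weight -2^(n-1) instead of +2^(n-1).
    if pattern[0] == "1":
        raw -= 1 << len(pattern)
    return raw


def negate_by_invert_and_increment(pattern: str) -> str:
    """Negate a bit string by inverting every bit and then adding 1."""
    bits = len(pattern)
    mask = (1 << bits) - 1
    inverted = int(pattern, 2) ^ mask
    return format((inverted + 1) & mask, f"0{bits}b")
```

    For example, `negate_by_invert_and_increment("00101101")` reproduces the 45d to -45d conversion shown in the example.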

  • 7/27/2019 Shawn Present

    Adding and Subtracting

    Twos-complement algorithm is consistent: addition and subtraction behave the same

    Negative numbers are treated the same as positive numbers

    Example: Add -45d to 10d

    Signed Decimal Method

    Step 1) Initialize: 10d + (-45d)

    Step 2) Compare so that augend holds larger number: -45d + 10d

    Step 3) Treat as a subtraction: 45d - 10d

    Step 4) Do subtraction (borrows may be required): 45d - 10d = 35d

    Step 5) Negate result (knowing that augend was negative): -35d

    Twos Complement Method

    Step 1) Initialize: 10d = 0000 1010b, -45d = 1101 0011b

    Step 2) Add (no special rules): 0000 1010b + 1101 0011b = 1101 1101b

    Converting 2s Comp back to decimal: 1101 1101b = -35d
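    As a rough check of the twos-complement addition above, the following Python sketch adds the two 8-bit patterns with a plain unsigned add. The helper names are illustrative assumptions, not from the slides.

```python
def add_8bit(a: int, b: int) -> int:
    """Add two 8-bit patterns and discard any carry out of bit 7."""
    return (a + b) & 0xFF


def as_signed_8bit(pattern: int) -> int:
    """Interpret an 8-bit pattern as a twos-complement value."""
    return pattern - 256 if pattern & 0x80 else pattern


# 10d + (-45d), using the bit patterns from the example
total = add_8bit(0b0000_1010, 0b1101_0011)
```

    No sign comparison or operand swap is needed: the same adder handles positive and negative operands.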

  • 7/27/2019 Shawn Present

    Adding and Subtracting (Example 2)

    Example 2: Subtract -45d from 10d

    Signed Decimal Method

    Step 1) Initialize: 10d - (-45d)

    Step 2) Subtrahend is negative, so negate it and do an addition: 10d + 45d = 55d

    Twos Complement Method

    Step 1) Initialize: 10d = 0000 1010b, -45d = 1101 0011b

    Step 2) Invert subtrahend and set CIN = 1: 0000 1010b + 0010 1100b + 1b = 0011 0111b

    Converting 2s Comp back to decimal: 0011 0111b = 55d

    Subtraction logic can be shared with addition logic!
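    The shared add/subtract datapath can be modelled in a few lines of Python. This is a sketch under the assumption of an 8-bit adder with a `subtract` control input; both names are hypothetical.

```python
MASK = 0xFF  # 8-bit datapath, matching the example


def add_sub_8bit(a: int, b: int, subtract: bool) -> int:
    """One adder for both operations.

    When `subtract` is set, the second operand is inverted and the
    carry-in is forced to 1, which adds the twos complement of b.
    """
    operand = (b ^ MASK) if subtract else b
    carry_in = 1 if subtract else 0
    return (a + operand + carry_in) & MASK
```

    Calling `add_sub_8bit(0b00001010, 0b11010011, True)` follows the same steps as the example: the subtrahend becomes 0010 1100b and CIN supplies the extra 1.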

  • 7/27/2019 Shawn Present

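    Adder Building Blocks

    Half Adder: $S_n = A_n \oplus B_n$, $CO_n = A_n \cdot B_n$

    Full Adder: $S_n = A_n \oplus B_n \oplus CIN_n$, $COUT_n = A_n B_n + A_n CIN_n + B_n CIN_n$

    [Figure: half-adder block with inputs An, Bn and outputs Sn, COn; full-adder block with inputs An, Bn, CINn and outputs Sn, COUTn]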
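    A minimal Python model of the two building blocks is given here for illustration. Building the full adder from two half adders is one common construction and is an assumption of this sketch, not something the slide specifies.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry_out) for two input bits."""
    return a ^ b, a & b


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (sum, carry_out) for two input bits plus a carry-in."""
    partial_sum, carry_ab = half_adder(a, b)
    total, carry_cin = half_adder(partial_sum, cin)
    # At most one of the two half adders can produce a carry.
    return total, carry_ab | carry_cin
```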

  • 7/27/2019 Shawn Present

    Adder Architectures (CRA)

    Carry Ripple Adder (CRA)

    Gate count ∝ N

    Area ∝ N

    Delay ∝ N

    Power ∝ N

    Layout friendly (low fan-in/fan-out; regular structure)

    [Figure: chain of full adders (FA) for bits 0 to N; stage n takes An, Bn and produces Sn, with CIN0 entering bit 0, the carry rippling from stage to stage, and COUTN leaving bit N]
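    The ripple behaviour can be sketched in Python as a loop over bit positions, where each stage must wait for the carry of the stage before it. The bit-list helpers are illustrative assumptions.

```python
def to_bits(value: int, width: int) -> list[int]:
    """Split an unsigned value into bits, least significant first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits: list[int]) -> int:
    """Reassemble an unsigned value from LSB-first bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two LSB-first bit lists; return (sum_bits, carry_out)."""
    sum_bits = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        # One full adder per bit; its carry feeds the next stage.
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (a & carry) | (b & carry)
    return sum_bits, carry
```

    The loop runs once per bit, which mirrors why the delay grows linearly with N.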

  • 7/27/2019 Shawn Present

    Adder Architectures (CLA)

    Carry Lookahead Adder (CLA)

    Generate: $G_n = A_n \cdot B_n$

    Propagate: $P_n = A_n + B_n$

    Recursive Relationship:

    $CIN_n = G_{n-1} + P_{n-1} \cdot CIN_{n-1}$

    (stage n receives a carry if stage n-1 generates one, or if stage n-1 propagates its own carry-in $CIN_{n-1}$)

    Expanded:

    $CIN_n = G_{n-1} + P_{n-1} G_{n-2} + \dots + P_{n-1} P_{n-2} \cdots P_1 G_0 + P_{n-1} P_{n-2} \cdots P_0 \, CIN_0$

    CLA: Delay ∝ log2 N (if built right)

    Gate count, power are greater than CRA

    Not layout friendly (high fan-in; difficult to route)

    [Figure: carry-lookahead logic forming CINN from GN-1 ... G0, PN-1 ... P0 and CIN0, feeding adder stages with inputs AN BN ... A1 B1, A0 B0 and outputs SN ... S1, S0. Source: Patterson and Hennessy, Figure A.14]
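    The expanded carry equation can be evaluated directly, without rippling, as in the Python sketch below. It computes every carry from the G and P signals and CIN0 only; the function name and LSB-first bit lists are assumptions made for illustration.

```python
def lookahead_carries(a_bits, b_bits, cin0=0):
    """Return [CIN_0, CIN_1, ..., CIN_N] for LSB-first bit lists."""
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate
    p = [a | b for a, b in zip(a_bits, b_bits)]  # propagate
    carries = [cin0]
    for n in range(1, len(a_bits) + 1):
        carry = 0
        # Term for each stage k < n: G_k AND all propagates above it.
        for k in range(n):
            term = g[k]
            for j in range(k + 1, n):
                term &= p[j]
            carry |= term
        # Final term: CIN_0 propagated through every stage below n.
        tail = cin0
        for j in range(n):
            tail &= p[j]
        carries.append(carry | tail)
    return carries
```

    Each carry depends only on the primary inputs, which is what lets hardware compute them in parallel at the cost of wide, high fan-in gates.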

  • 7/27/2019 Shawn Present

    Adder Architectures (CSA)

    Carry Save Adder

    Adders work independently, so very fast

    Pipelined architecture results in flops and control logic, which increase area and latency

    [Figure: a row of independent full adders (FA), bit n taking CINn, An, Bn and producing Sn and COUTn for n = 0 to N, above three further rows of full adders]
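    One row of a carry-save adder can be modelled in Python as a bitwise operation on whole words, since no bit position waits on its neighbour. This sketch covers a single row only and leaves out the pipelining flops mentioned above; the function name is an assumption.

```python
def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """Compress three unsigned operands into a sum word and a carry word.

    Every bit position is an independent full adder: the sum bit stays
    in place and the carry bit is saved, shifted one position left,
    rather than being propagated.
    """
    sum_word = x ^ y ^ z
    carry_word = ((x & y) | (x & z) | (y & z)) << 1
    return sum_word, carry_word
```

    The two output words still have to be combined by a conventional carry-propagate adder to obtain the final result.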

  • 7/27/2019 Shawn Present

    Unsigned Multiplication

    Shift-and-Add Algorithm

    Example: Multiply 118d by 99d

    Step 1) Initialize: 118d (multiplicand) x 99d (multiplier)

    Step 2) Find partial products: 1062d and 1062d

    Step 3) Sum up the shifted partial products: 1062d + 1062d shifted one digit left = 11682d

    Twos Complement Method

    Step 1) Initialize: 118d = 0111 0110b, 99d = 0110 0011b

    Step 2) Find partial products (one per multiplier bit, low-order bit first):

    Multiplier bit   Partial product
    1                01110110b
    1                01110110b
    0                00000000b
    0                00000000b
    0                00000000b
    1                01110110b
    1                01110110b
    0                00000000b

    Step 3) Sum up the shifted partial products: 010110110100010b

    Convert 2s-Comp back to decimal: 0010 1101 1010 0010 = 11682d
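    The three steps can be sketched in Python as follows. The function name and the 8-bit default width are assumptions; the operands are the ones from the example.

```python
def shifted_partial_products(multiplicand: int, multiplier: int, bits: int = 8):
    """Return one shifted partial product per multiplier bit, LSB first."""
    products = []
    for i in range(bits):
        if (multiplier >> i) & 1:
            products.append(multiplicand << i)  # copy of the multiplicand
        else:
            products.append(0)                  # all-zero row
    return products


rows = shifted_partial_products(0b0111_0110, 0b0110_0011)
product = sum(rows)
```

    Eight rows are produced for an 8-bit multiplier, four of them zero in this example, which is the overhead that Booth recoding and tree structures aim to reduce.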

  • 7/27/2019 Shawn Present

    Shift-and-Add Multiplier

    B: Multiplicand x A: Multiplier = P: Product

    [Figure: datapath with N-bit registers A (Load A) and B (Load B), a stage forming An x B, an N-bit adder whose sum S and carry COUT form an (N+1)-bit result, and a 2N-bit product register P with Shift and Add controls]

    Takes N cycles to complete:

    $T_{Lat} = (T_{N\text{-bit ADD}} + T_{shift}) \times N$

    Requires minimal logic (most logic is in the adder)
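    A cycle-by-cycle model of such a datapath might look like the Python sketch below. It assumes the common arrangement in which the multiplier A starts in the low half of the 2N-bit product register and is shifted out as the product shifts in; the slide's figure does not survive in enough detail to confirm that arrangement.

```python
def sequential_shift_add(a: int, b: int, n: int = 8) -> tuple[int, int]:
    """Multiply unsigned n-bit a (multiplier) by b (multiplicand).

    Returns (product, cycles). Each cycle does one conditional add
    followed by one right shift of the combined product register.
    """
    mask = (1 << n) - 1
    high, low = 0, a & mask  # upper and lower halves of register P
    cycles = 0
    for _ in range(n):
        if low & 1:
            high += b & mask  # N-bit add, (N+1)-bit result with carry
        # Shift the whole register right by one bit.
        low = (low >> 1) | ((high & 1) << (n - 1))
        high >>= 1
        cycles += 1
    return (high << n) | low, cycles
```

    The cycle count always equals N, consistent with the latency expression above.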

  • 7/27/2019 Shawn Present

    Basic Signed Multiplication

    Basic Idea

    1. Convert to Unsigned

    2. Use Shift-and-Add Multiplier

    3. Convert to Signed

    Extra Hardware!

    [Figure: N-bit inputs A and B each pass through a "Convert to Unsigned" block into the Shift-and-Add Multiplier; a "Determine Sign of Result" block controls a "Convert to Signed" block that outputs the 2N-bit product P]
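    The three-step idea can be sketched in Python as a wrapper around an unsigned multiplier. The function names are assumptions, and the unsigned core is a plain shift-and-add loop used only for illustration.

```python
def unsigned_shift_add(a: int, b: int) -> int:
    """Unsigned shift-and-add multiply of two non-negative integers."""
    product = 0
    shift = 0
    while a:
        if a & 1:
            product += b << shift
        a >>= 1
        shift += 1
    return product


def signed_multiply(a: int, b: int) -> int:
    """Signed multiply built on an unsigned core."""
    negative = (a < 0) != (b < 0)                   # determine sign of result
    magnitude = unsigned_shift_add(abs(a), abs(b))  # convert to unsigned, multiply
    return -magnitude if negative else magnitude    # convert back to signed
```

    The sign logic and the two conversions correspond to the extra hardware that the slide points out.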

  • 7/27/2019 Shawn Present

    Signed Multiplication

    Booth Recoding

    Reduce the number of partial products by re-coding the multiplier operand

    Works for signed numbers

    Example: Multiply -118d by -99d

    Recall, 99d = 0110 0011b

    Invert: 1001 1100b

    Add 1b: -99d = 1001 1101b

    Radix-2 Booth Recoding

    An (Low-order Bit)   An-1 (Last Bit Shifted Out)   Partial Product
    0                    0                             0
    0                    1                             +B
    1                    0                             -B
    1                    1                             0

    -99d = -1 0 +1 0 0 -1 +1 -1 (recoded digits, most significant first)
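    The radix-2 recoding table can be applied mechanically, as in this Python sketch. The function name is an assumption, and digits are returned least significant first.

```python
def booth_radix2_digits(a: int, bits: int = 8) -> list[int]:
    """Recode a signed multiplier into digits from {-1, 0, +1}, LSB first.

    Each digit is (last bit shifted out) - (current low-order bit),
    which matches the table: 01 gives +1, 10 gives -1, 00 and 11 give 0.
    """
    pattern = a & ((1 << bits) - 1)  # twos-complement bit pattern
    digits = []
    previous = 0                     # implicit bit to the right of bit 0
    for i in range(bits):
        current = (pattern >> i) & 1
        digits.append(previous - current)
        previous = current
    return digits
```

    Weighting each digit by its power of two recovers the original signed value, which is why the recoding works for negative multipliers without a separate sign step.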

  • 7/27/2019 Shawn Present

    Radix-2 Booth Multiplication

    Example: Multiply -118d by -99d

    Step 1) Initialize

    B = -118d = 1000 1010b

    -B = 118d = 0111 0110b

    A = -99d = 1001 1101b, recoded as -1 0 +1 0 0 -1 +1 -1 (most significant digit first)

    Step 2) Find partial products (low-order digit first; each row is shifted left by its digit position, with sign extension to 16 bits)

    Digit   Partial product   Shifted, sign-extended row
    -1      -B                0000 0000 0111 0110b
    +1      B                 1111 1111 0001 0100b
    -1      -B                0000 0001 1101 1000b
    0       0                 0000 0000 0000 0000b
    0       0                 0000 0000 0000 0000b
    +1      B                 1111 0001 0100 0000b
    0       0                 0000 0000 0000 0000b
    -1      -B                0011 1011 0000 0000b

    Step 3) Sum up the shifted partial products: 0010 1101 1010 0010b

    Convert 2s-Comp back to decimal: 0010 1101 1010 0010 = 11682d
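    A compact Python sketch of radix-2 Booth multiplication follows. It scans the multiplier one bit at a time and adds or subtracts the shifted multiplicand; Python's unbounded integers stand in for the sign extension, and the function name is an assumption.

```python
def booth_radix2_multiply(b: int, a: int, bits: int = 8) -> int:
    """Multiply signed multiplicand b by signed multiplier a."""
    pattern = a & ((1 << bits) - 1)
    product = 0
    previous = 0  # last bit shifted out, initially 0
    for i in range(bits):
        current = (pattern >> i) & 1
        if (current, previous) == (0, 1):
            product += b << i   # +B row
        elif (current, previous) == (1, 0):
            product -= b << i   # -B row
        # (0, 0) and (1, 1) contribute an all-zero row.
        previous = current
    return product
```

    For the example operands, `booth_radix2_multiply(-118, -99)` gives 11682 without converting either operand to unsigned form first.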

  • 7/27/2019 Shawn Present

    Array Multiplier

    Combinatorial, so it is very fast

    Delay ∝ N

    Can be pipelined

    Very regular structure

    [Figure: the radix-2 Booth partial products of the -118d x -99d example (rows labelled 0, B and -B) summed by five rows of carry-save adders (CSA), each built from full adders (FA), followed by a carry-propagate adder (CPA)]

  • 7/27/2019 Shawn Present

    Array Multiplier Structure

    Source: J. Kuo and J. Lou, Low-Voltage CMOS VLSI Circuits, 1999

  • 7/27/2019 Shawn Present

    Radix-4 Booth Multiplication

    Similar to Radix-2, but looks at two low-order bits at a time (instead of 1)

    A2n+1  A2n  A2n-1   Partial Product
    0      0    0       0
    0      0    1       +B
    0      1    0       +B
    0      1    1       +2B
    1      0    0       -2B
    1      0    1       -B
    1      1    0       -B
    1      1    1       0

    (A2n+1 and A2n are the low-order bits; A2n-1 is the last bit shifted out)

    Recall, 99d = 0110 0011b

    Invert: 1001 1100b

    Add 1b: -99d = 1001 1101b

    Radix-4 Booth Recoding

    -99d = -2 +2 -1 +1 (recoded digits, most significant first)
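    The radix-4 table can be applied as in the Python sketch below, which reads overlapping three-bit groups from the multiplier. The function name is an assumption, an even bit width is assumed, and digits are returned least significant first.

```python
def booth_radix4_digits(a: int, bits: int = 8) -> list[int]:
    """Recode a signed multiplier into digits from {-2..+2}, LSB first."""
    pattern = a & ((1 << bits) - 1)
    extended = pattern << 1  # append the implicit 0 to the right of bit 0
    digits = []
    for i in range(0, bits, 2):
        group = (extended >> i) & 0b111  # A(2n+1), A(2n), A(2n-1)
        high, middle, low = (group >> 2) & 1, (group >> 1) & 1, group & 1
        # Equivalent to the lookup table: -2*A(2n+1) + A(2n) + A(2n-1)
        digits.append(-2 * high + middle + low)
    return digits
```

    An 8-bit multiplier yields four digits, each weighted by a power of four.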

  • 7/27/2019 Shawn Present

    Radix-4 Booth Multiplication

    Example: Multiply -118d by -99d

    Step 1) Initialize

    B = -118d = 1000 1010b

    -B = 118d = 0111 0110b

    2B = -236d = 1 0001 0100b

    -2B = 236d = 0 1110 1100b

    A = -99d = 1001 1101b, recoded as -2 +2 -1 +1 (most significant digit first)

    Step 2) Find partial products (low-order digit first; each row is shifted left by two bits per digit position, with sign extension to 16 bits)

    Digit   Partial product   Shifted, sign-extended row
    +1      B                 1111 1111 1000 1010b
    -1      -B                0000 0001 1101 1000b
    +2      2B                1111 0001 0100 0000b
    -2      -2B               0011 1011 0000 0000b

    Step 3) Sum up the shifted partial products: 0010 1101 1010 0010b

    Convert 2s-Comp back to decimal: 0010 1101 1010 0010 = 11682d

    Reduces number of partial products by half!
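    The halving of partial products can be seen in a short Python sketch of radix-4 Booth multiplication. The lookup table mirrors the recoding table for this method; the names and the even 8-bit default width are assumptions.

```python
# Keys are the bit groups A(2n+1) A(2n) A(2n-1); values are multiples of B.
RADIX4_TABLE = {
    0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
    0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0,
}


def booth_radix4_multiply(b: int, a: int, bits: int = 8):
    """Return (product, rows) for signed multiplicand b and multiplier a."""
    extended = (a & ((1 << bits) - 1)) << 1  # implicit 0 right of bit 0
    rows = []
    for i in range(0, bits, 2):
        digit = RADIX4_TABLE[(extended >> i) & 0b111]
        rows.append((digit * b) << i)        # shifted partial product
    return sum(rows), rows
```

    For 8-bit operands this produces four rows where the radix-2 method produces eight.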

  • 7/27/2019 Shawn Present

    Tree Multiplier

    Wallace Tree

    Reduces the total number of full-adders

    Uses 3:2 Compressor (aka Full Adder)

    Delay ∝ log_{3/2} N

    Irregular structure is difficult to lay out

    [Figure: "Original Structure" and "Tree Structure" of the adder network, with partial-product terms such as B7A0, B0A0, B7A8 and B0A8]

    Source: J. Kuo, et al., Low-Voltage CMOS VLSI Circuits, 1999
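    The logarithmic depth can be illustrated with a word-level Python sketch that repeatedly compresses groups of three rows into two. This is a simplified model working on whole unsigned rows rather than individual bit columns, so it shows the level count, not an actual Wallace layout; the function name is an assumption.

```python
def wallace_reduce(rows):
    """Reduce unsigned rows with 3:2 compressors; return (total, levels)."""
    rows = list(rows)
    levels = 0
    while len(rows) > 2:
        full = len(rows) // 3 * 3
        reduced = []
        for i in range(0, full, 3):
            x, y, z = rows[i:i + 3]
            reduced.append(x ^ y ^ z)                           # sum word
            reduced.append(((x & y) | (x & z) | (y & z)) << 1)  # carry word
        reduced.extend(rows[full:])  # leftover rows pass straight through
        rows = reduced
        levels += 1
    # A final carry-propagate add combines the last two rows.
    return sum(rows), levels


partials = [(118 << i) if (99 >> i) & 1 else 0 for i in range(8)]
total, levels = wallace_reduce(partials)
```

    Eight rows shrink to 6, 4, 3 and then 2, so four compressor levels precede the final adder, compared with a chain of adders in the array form.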

  • 7/27/2019 Shawn Present

    Twin Pipe Serial-Parallel Multiplier

    Features

    Low Area

    High Latency

    Low Power

    Parallel Feed: One Operand

    Serial Feed: One Operand

    Even data bits on rising clock

    Odd data bits on falling clock

    Source: S. Shah, et al., Comparison of 32-bit Multipliers for Various Performance Measures, 2000.

  • 7/27/2019 Shawn Present

    Cluster Multiplication

    Divide circuit into clusters of nibble-wide multiplications

    If all bits in a nibble are zeroes, then use clock-gating to gate multiplication for that nibble

    [Figure: grid of 4-bit cluster multipliers forming A0xB0, A1xB0, ..., A(N-1)xB(N-1) from nibbles A0 ... A(N-1) and B0 ... B(N-1)]

    Source: A. Fayed, M. Bayoumi, A Novel Architecture for Low-Power Design of Parallel Multipliers, 2001.

    Features

    Low Power (claims 13% savings)
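    The clustering idea can be modelled in Python by splitting both operands into nibbles and skipping any nibble product with an all-zero input. This is only a behavioural sketch: the skip stands in for clock gating, the exact gating condition used in the cited paper is not given on the slide, and the names are assumptions.

```python
def cluster_multiply(a: int, b: int, nibbles: int = 2) -> tuple[int, int]:
    """Unsigned multiply built from 4-bit x 4-bit clusters.

    Returns (product, active_clusters). A cluster counts as active
    only when neither of its input nibbles is zero.
    """
    product = 0
    active = 0
    for i in range(nibbles):
        a_nibble = (a >> (4 * i)) & 0xF
        for j in range(nibbles):
            b_nibble = (b >> (4 * j)) & 0xF
            if a_nibble == 0 or b_nibble == 0:
                continue  # modelled clock gating: this cluster stays idle
            active += 1
            product += (a_nibble * b_nibble) << (4 * (i + j))
    return product, active
```

    With operands 0x70 and 0x03 only one of the four clusters does any work, which hints at where the power saving comes from for operands containing zero nibbles.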

  • 7/27/2019 Shawn Present

    Multiplexer-Based Array Multiplier

    Characteristics

    Fast (because it is array-based)

    Unlike Booth, does not require encoding logic

    Processes 1 bit of multiplier and 1 bit of multiplicand at a time, thus it is symmetric

    Has a zigzag shape, thus not layout-friendly

    Source: K. Pekmestzi, Multiplexer-Based Array Multipliers, 1999.

  • 7/27/2019 Shawn Present

    Area-Efficient Multiplexer-Based Multiplier

    Characteristics

    Increases each row to have N+1 cells (instead of N)

    Depth is cut in half (increases squareness)

    Source: Y. Wang, Y. Jiang, E. Sha, On Area-Efficient Low Power Array Multipliers, 2001.

  • 7/27/2019 Shawn Present

    Low Latency Booth-Encoding-based Pipeline Multiplier

    Features

    Delay ∝ N/4

    Needs (N+N/2)-bit addition at end

    Uses CLAs instead of CSAs because longest stage (i.e. adder at end) determines fastest operating frequency

    Source: X. Wu, H. Chen, S. Wei, Design of a Low Latency High Speed Pipelining Multiplier, 2001.

  • 7/27/2019 Shawn Present

    Twos Complement Gray-Encoded Array Multiplier

    Characteristics

    Uses gray code to reduce the switching activity of multiplier

    Claims that traditional Booth uses 45% more power

    Greater area than traditional Booth

    Source: E. Costa, et al., A New Architecture for 2s Complement Gray Encoded Array Multiplier, 2002.
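    Why Gray code lowers switching activity can be shown with a generic Python sketch that counts bit toggles across a counting sequence. It illustrates standard reflected Gray code only and is not a model of the cited multiplier architecture; all names are assumptions.

```python
def to_gray(n: int) -> int:
    """Convert an unsigned binary value to reflected Gray code."""
    return n ^ (n >> 1)


def from_gray(g: int) -> int:
    """Convert reflected Gray code back to unsigned binary."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def count_toggles(sequence) -> int:
    """Total number of bit flips between consecutive values."""
    values = list(sequence)
    return sum(bin(x ^ y).count("1") for x, y in zip(values, values[1:]))


binary_toggles = count_toggles(range(16))
gray_toggles = count_toggles(to_gray(n) for n in range(16))
```

    Counting from 0 to 15 flips 26 bits in plain binary but only 15 in Gray code, since consecutive Gray codewords differ in exactly one bit.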

  • 7/27/2019 Shawn Present

    References

    S. Shah, A.J. Al-Khalili, D. Al-Khalili, Comparison of 32-bit Multipliers for Various Performance Measures, Proc. 2000 Int'l Conf. Microelectronics, pp. 75-80, 2000.

    D. Patterson, J. Hennessy, Computer Architecture: A Quantitative Approach, 2nd ed., San Francisco, CA: Morgan Kaufmann Publishers, Inc., 1996.

    X. Wu, H. Chen, S. Wei, Design of a Low Latency High Speed Pipelining Multiplier, Proc. 2001 Int'l Conf. on ASIC, pp. 551-554, 2001.

    J. Wakerly, Digital Design: Principles and Practices, 2nd ed., Englewood Cliffs, NJ: Prentice Hall, 1994.

    J. Kuo and J. Lou, Low-Voltage CMOS VLSI Circuits, New York, NY: John Wiley & Sons, Inc., 1999.

    K. Pekmestzi, Multiplexer-Based Array Multipliers, IEEE Trans. on Computers, vol. 48, pp. 15-23, 1999.

    A. Fayed, M. Bayoumi, A Novel Architecture for Low-Power Design of Parallel Multipliers, Proc. 2001 IEEE Computer Society Workshop on VLSI, pp. 149-154, 2001.

    Y. Wang, Y. Jiang, E. Sha, On Area-Efficient Low Power Array Multipliers, Proc. 2001 IEEE Int'l Conf. on Electronics, Circuits and Systems, vol. 3, pp. 1429-1432, 2001.