  • COMPUTER ARCHITECTURE, CS 45101 / CS 55101

    CHAPTER 4

    Arithmetic for Computers

    H. Peyravi

    Department of Computer Science

    Kent State University

    Fall 2001

    Computer Architecture Fall 2001

  • Contents

    Introduction
    MIPS: 32 bit Signed Numbers
    Arithmetic Operations
    Boolean Algebra & Gates
    Constructing an ALU
    A 32-bit ALU
    Addition
    Multiplication
    Floating Point Representation
    IEEE 754 Floating-point Standard


  • Chapter 4: Arithmetic for Computers Introduction

    Introduction

    Goals

    To introduce 2's complement numbers and their addition and subtraction

    To introduce basic logic operations (AND, OR, logical shift) and hardware building blocks (AND, OR, NOT gates and the multiplexor)

    To explain the construction and operation of a 32-bit ALU that performs AND, OR, add, and slt

    To show algorithms that perform multiplication and division, and the hardware that implements these algorithms

    To demonstrate floating-point representation and arithmetic operations

    To introduce MIPS instructions for unsigned operands, for logic operations, and for floating-point numbers

    Numbers

    In the binary system, how do we represent negative numbers?

    What is the largest or smallest number that can fit into a word?

    What happens in the overflow and underflow cases?

    How are fractions and real numbers represented?


  • Chapter 4: Arithmetic for Computers Introduction

    Possible Representation

    Sign Magnitude    One's Complement    Two's Complement
    000 = +0          000 = +0            000 = +0
    001 = +1          001 = +1            001 = +1
    010 = +2          010 = +2            010 = +2
    011 = +3          011 = +3            011 = +3
    100 = -0          100 = -3            100 = -4
    101 = -1          101 = -2            101 = -3
    110 = -2          110 = -1            110 = -2
    111 = -3          111 = -0            111 = -1

    Issues:

    Balanced ?

    Number of zeros ?

    Ease of Operations ?

    Which one is best ?
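The three columns of the table can be reproduced with a short Python sketch. The function names and the default width `n = 3` are illustrative choices, not part of the original slides; Python has no negative zero for integers, so the two "-0" patterns decode to plain 0.

```python
def sign_magnitude(bits: int, n: int = 3) -> int:
    """Top bit is the sign, remaining bits are the magnitude."""
    magnitude = bits & ((1 << (n - 1)) - 1)
    return -magnitude if bits >> (n - 1) else magnitude


def ones_complement(bits: int, n: int = 3) -> int:
    """A negative value is the bitwise inverse of the positive one."""
    if bits >> (n - 1):
        return -(~bits & ((1 << n) - 1))
    return bits


def twos_complement(bits: int, n: int = 3) -> int:
    """The top bit carries weight -2^(n-1)."""
    return bits - (1 << n) if bits >> (n - 1) else bits


for pattern in range(8):
    print(f"{pattern:03b}",
          sign_magnitude(pattern),
          ones_complement(pattern),
          twos_complement(pattern))
```

Counting distinct decoded values answers two of the questions above: only two's complement has a single zero, at the price of one extra negative number.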


  • Chapter 4: Arithmetic for Computers MIPS: 32 bit Signed Numbers

    MIPS: 32 bit Signed Numbers

    Negative Numbers

    Numbers can be represented in any base (humans use base 10, machines use base 2)

    The value of the $i$th digit $d$ in a number with base $b$ is

    $d \times b^i$

    The bits are numbered from right to left

    A MIPS word is 32 bits long, so it can represent unsigned values from 0 to

    $2^{32} - 1 = 4{,}294{,}967{,}295_{10}$

    Although the number of 32-bit patterns, $2^{32}$, is even, it is not possible to divide them into equal numbers of positive and negative numbers, since 0 needs to be represented as well

    The alternative is to have unbalanced positive and negative numbers

    Positive numbers: 0 to $2^{31} - 1 = 2{,}147{,}483{,}647_{10}$

    Negative numbers: $-2^{31} = -2{,}147{,}483{,}648_{10}$ to $-1$ (the same bit pattern as unsigned $2^{32} - 1$)

    There is one negative number, $-2^{31} = -2{,}147{,}483{,}648_{10}$, that has no corresponding positive number


  • Chapter 4: Arithmetic for Computers MIPS: 32 bit Signed Numbers

    Bit pattern (base 2)                      Value (base 10)
    0000 0000 0000 0000 0000 0000 0000 0000 = 0
    0000 0000 0000 0000 0000 0000 0000 0001 = 1
    0000 0000 0000 0000 0000 0000 0000 0010 = 2
    ...
    0111 1111 1111 1111 1111 1111 1111 1101 = 2,147,483,645
    0111 1111 1111 1111 1111 1111 1111 1110 = 2,147,483,646
    0111 1111 1111 1111 1111 1111 1111 1111 = 2,147,483,647   Max int
    1000 0000 0000 0000 0000 0000 0000 0000 = -2,147,483,648  Min int
    1000 0000 0000 0000 0000 0000 0000 0001 = -2,147,483,647
    1000 0000 0000 0000 0000 0000 0000 0010 = -2,147,483,646
    ...
    1111 1111 1111 1111 1111 1111 1111 1101 = -3
    1111 1111 1111 1111 1111 1111 1111 1110 = -2
    1111 1111 1111 1111 1111 1111 1111 1111 = -1

    This convention is called 2's complement; it ensures that

    $x + (-x) = 0$

    0 has a single representation

    A hardware test for a negative number requires only checking the most significant bit (the sign bit), hence a hardware simplification

    Further, address locations are all positive, and the programmer can convert a negative number to an unsigned integer (as in C)

    A positive or negative number can be described as

    $(x_{31} \times -2^{31}) + (x_{30} \times 2^{30}) + \cdots + (x_1 \times 2^1) + (x_0 \times 2^0)$

    When comparing a negative integer with a positive integer, the MSB of the negative integer (1) is greater than the MSB of the positive integer (0) even though the negative value is smaller. This is not the case for unsigned integers, where a larger MSB always means a larger value
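As a quick check of the weighted-sum formula, the following Python sketch evaluates it bit by bit. The function name `signed_value` is an illustrative choice and is not from the slides.

```python
def signed_value(word: int) -> int:
    """Evaluate x31*(-2^31) + x30*2^30 + ... + x0*2^0 for a 32-bit pattern."""
    bits = [(word >> i) & 1 for i in range(32)]
    return -bits[31] * 2**31 + sum(bits[i] * 2**i for i in range(31))


print(signed_value(0xFFFFFFFF))  # all ones
print(signed_value(0x80000000))  # only the sign bit set
```

Only bit 31 carries a negative weight, which is why testing that single bit is enough to tell whether the number is negative.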


  • Chapter 4: Arithmetic for Computers MIPS: 32 bit Signed Numbers

    Twos Complement Operations

    In order to negate a number, complement each bit and then add 1 to the result. For example

     2 (base 10) = 0000 0000 0000 0000 0000 0000 0000 0010 (base 2)
    -2 (base 10) = 1111 1111 1111 1111 1111 1111 1111 1101 (base 2) + 1
                 = 1111 1111 1111 1111 1111 1111 1111 1110 (base 2)

    Converting n-bit numbers to m-bit numbers, where m > n

    Example of sign extension:

    A MIPS 16-bit immediate gets converted to 32 bits

    Copy the MSB (sign bit) into the other bits

    0010 -> 0000 0010
    1010 -> 1111 1010

    Example:

    lb = load byte (the loaded byte is sign-extended)
    lbu = load byte unsigned (the loaded byte is zero-extended)
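Negation and sign extension can be sketched in a few lines of Python. The names `negate` and `sign_extend` and the 32-bit mask constant are assumptions made for this illustration; bit patterns are held as non-negative Python integers.

```python
MASK32 = 0xFFFFFFFF


def negate(word: int) -> int:
    """Complement each bit, add 1, and keep the low 32 bits."""
    return (~word + 1) & MASK32


def sign_extend(value: int, from_bits: int, to_bits: int = 32) -> int:
    """Copy the sign bit of a from_bits-wide pattern into the upper bits."""
    if (value >> (from_bits - 1)) & 1:
        upper = ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
        value |= upper
    return value


print(f"{negate(2):032b}")
print(f"{sign_extend(0b1010, 4, 8):08b}")
```

Zero extension, as performed by lbu, would simply leave the upper bits at 0.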


  • Chapter 4: Arithmetic for Computers Arithmetic Operations

    Arithmetic Operations

    Addition

    Except for the rightmost digit, each position receives a carry (0 or 1) from the position to its right and hence has 3 operands

        0 1 0 0 1 0 1 1
      + 1 1 1 1 0 0 0 1
      -----------------
    (1) 0 0 1 1 1 1 0 0

    Carries out of each position, from left to right: (1) (1) (0) (0) (0) (0) (1) (1)

    The carry bit beyond the end of the word is ignored.

      1001 = -7      1100 = -4      0011 = +3
    + 0101 = +5    + 0100 = +4    + 0100 = +4
      ---------      ---------      ---------
      1110 = -2      0000 =  0      0111 = +7

      1100 = -4      0101 = +5      1001 = -7
    + 1111 = -1    + 0100 = +4    + 1010 = -6
      ---------      ---------      ---------
      1011 = -5      1001 = OF      0011 = OF

    How do we detect overflow?

    Overflow occurs if and only if both operands have the same sign and the result has the opposite sign
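The six 4-bit additions above can be replayed with a small Python sketch. `add_nbit` is a hypothetical helper written for this illustration; it drops the carry out of the word and applies the sign rule for overflow.

```python
def add_nbit(a: int, b: int, n: int = 4):
    """Add two n-bit two's complement patterns; return (sum pattern, overflow)."""
    mask = (1 << n) - 1
    sign = 1 << (n - 1)
    total = (a + b) & mask  # the carry beyond the word is ignored
    overflow = (a & sign) == (b & sign) and (total & sign) != (a & sign)
    return total, overflow


for a, b in [(0b1001, 0b0101), (0b1100, 0b0100), (0b0011, 0b0100),
             (0b1100, 0b1111), (0b0101, 0b0100), (0b1001, 0b1010)]:
    total, overflow = add_nbit(a, b)
    print(f"{a:04b} + {b:04b} = {total:04b}", "OF" if overflow else "")
```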


  • Chapter 4: Arithmetic for Computers Arithmetic Operations

    Subtraction

    To subtract one number (the subtrahend) from another (the minuend), take the 2's complement of the subtrahend and add it to the minuend

    2 - 7            5 - (-2)         5 - 2
      0010 = +2        0101 = +5        0101 = +5
    + 1001 = -7      + 0010 = +2      + 1110 = -2
      ---------        ---------        ---------
      1011 = -5        0111 = +7        0011 = +3

    7 - (-7)         -5 - 2           -6 - 4
      0111 = +7        1011 = -5        1010 = -6
    + 0111 = +7      + 1110 = -2      + 1100 = -4
      ---------        ---------        ---------
      1110 = OF        1001 = -7        0110 = OF

    How to Detect Overflow?

    No overflow when adding a positive and a negative number

    No overflow when the signs are the same for subtraction

    Overflow occurs when:

    adding two positives yields a negative, or

    adding two negatives yields a positive, or

    subtract a negative from a positive and get a negative, or

    subtract a positive from a negative and get a positive
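The subtraction rules above can be sketched the same way. `sub_nbit` is a hypothetical helper for this illustration: it negates the subtrahend, adds, and flags overflow when the operands have different signs and the result's sign differs from the minuend's.

```python
def sub_nbit(a: int, b: int, n: int = 4):
    """Compute a - b on n-bit two's complement patterns; return (pattern, overflow)."""
    mask = (1 << n) - 1
    sign = 1 << (n - 1)
    neg_b = (~b + 1) & mask  # 2's complement of the subtrahend
    diff = (a + neg_b) & mask
    overflow = (a & sign) != (b & sign) and (diff & sign) != (a & sign)
    return diff, overflow


print(sub_nbit(0b0010, 0b0111))  # 2 - 7
print(sub_nbit(0b0111, 0b1001))  # 7 - (-7)
```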


  • Chapter 4: Arithmetic for Computers Arithmetic Operations

    Effects of Overflow

    An exception (interrupt) occurs

    Control jumps to predefined address for exception

    Address of the offending instruction is saved for possible

    resumption

    MIPS saves the address of the instruction that overflowed in a

    register called Exception Program Counter, EPC, and an interrupt

    handler routine is called

    Details based on software system/language

    flight control vs. homework assignment

    MIPS uses registers $k0 and $k1 to hold the return address copied from EPC

    Unsigned integers can also wrap around, but this is not treated as overflow, since unsigned integers are commonly used for memory addresses

    Note that the term overflow is somewhat misleading; it does not mean that a carry overflowed

    In MIPS

    add, addi and sub cause exceptions for overflow

    addu, addiu and subu do not cause exceptions for overflow


  • Chapter 4: Arithmetic for Computers Arithmetic Operations

    Supporting slt

    MIPS offers two sets of set-on-less-than instructions

    For signed integers

    Set-on-less-than, slt

    Set-on-less-than-immediate, slti

    For unsigned integers

    Set-on-less-than-unsigned, sltu

    Set-on-less-than-immediate-unsigned, sltiu

    Example: Suppose

    $s0 = 1111 1111 1111 1111 1111 1111 1111 1111 (base 2) = -1 (signed, base 10) = 4,294,967,295 (unsigned, base 10)

    $s1 = 0000 0000 0000 0000 0000 0000 0000 0001 (base 2) = 1 (signed, base 10) = 1 (unsigned, base 10)

    What would the following do?

    slt $t0, $s0, $s1 # signed comparison

    sltu $t1, $s0, $s1 # unsigned comparison

    Register $t0 has value 1, since -1 < 1, and register $t1 has value 0, since 4,294,967,295 > 1
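The difference between the two comparisons can be mimicked in Python. The functions below are a sketch that models the instructions on 32-bit patterns held as non-negative integers; they are named after the instructions for readability and are not an emulator.

```python
def to_signed(word: int) -> int:
    """Interpret a 32-bit pattern as a two's complement value."""
    return word - (1 << 32) if word & 0x80000000 else word


def slt(rs: int, rt: int) -> int:
    """Signed set-on-less-than."""
    return 1 if to_signed(rs) < to_signed(rt) else 0


def sltu(rs: int, rt: int) -> int:
    """Unsigned set-on-less-than."""
    return 1 if rs < rt else 0


s0, s1 = 0xFFFFFFFF, 0x00000001
print(slt(s0, s1), sltu(s0, s1))
```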


  • Chapter 4: Arithmetic for Computers Boolean Algebra & Gates

    Boolean Algebra & Gates

    Logical Operations

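    Operations on bytes within a word can be supported using shift operations

    MIPS provides:

    shift left logical = sll, and

    shift right logical = srl operations

    sll $t2, $s0, 8 # $t2 = $s0 << 8 bits
    srl $t2, $s0, 8 # $t2 = $s0 >> 8 bits

    Logical operation    C operator    MIPS instructions
    AND                  &             and, andi
    OR                   |             or, ori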


  • Chapter 4: Arithmetic for Computers Constructing an ALU

    Constructing an ALU

    ALUs are constructed from four hardware building blocks, AND,

    OR, Inverter, and Multiplexor(MUX)

    [Figure: symbols and truth tables for the AND gate, OR gate, inverter, and multiplexor]

    1. AND gate ($c = a \cdot b$)

    2. OR gate ($c = a + b$)

    3. Inverter ($c = \bar{a}$)

    4. Multiplexor (if d == 0, c = a; else c = b)

    MIPS has a 32-bit wide ALU


  • Chapter 4: Arithmetic for Computers Constructing an ALU

    A 1-bit ALU

    A 1-bit logical unit can be constructed from an AND gate, an OR gate, and a MUX

    [Figure: 1-bit logical unit; inputs a and b feed an AND gate and an OR gate, and a multiplexor controlled by Operation selects the Result]

    A 1-bit half adder has two input operands and two outputs, the Sum and the CarryOut

    Inputs    Outputs
    a  b      CarryOut  Sum
    0  0      0         0
    0  1      0         1
    1  0      0         1
    1  1      1         0

    $\text{Sum}_{SOP} = \bar{a}b + a\bar{b}$

    $C_{SOP} = ab$


  • Chapter 4: Arithmetic for Computers Constructing an ALU

    A 1-bit full adder has two input operands and one input CarryIn, and produces the Sum and one output CarryOut

    [Figure: full adder block with inputs a, b, CarryIn and outputs Sum, CarryOut]

    Inputs             Outputs
    a  b  CarryIn      CarryOut  Sum    Comments
    0  0  0            0         0      0+0+0=00
    0  0  1            0         1      0+0+1=01
    0  1  0            0         1      0+1+0=01
    0  1  1            1         0      0+1+1=10
    1  0  0            0         1      1+0+0=01
    1  0  1            1         0      1+0+1=10
    1  1  0            1         0      1+1+0=10
    1  1  1            1         1      1+1+1=11

    The truth table can be expressed as SOP

    $\text{Sum}_{SOP} = \bar{a}\bar{b}c_i + \bar{a}b\bar{c_i} + a\bar{b}\bar{c_i} + abc_i$

    $C_{SOP} = \bar{a}bc_i + a\bar{b}c_i + ab\bar{c_i} + abc_i$

    $C_{SOP} = (\bar{a}bc_i + abc_i) + (a\bar{b}c_i + abc_i) + (ab\bar{c_i} + abc_i)$

    $C_{SOP} = ab + ac_i + bc_i$

    There are many ways to make adders using different boolean

    expressions
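A short Python sketch can confirm that the sum-of-products expressions match the truth table. The function name `full_adder` is an illustrative choice; bits are plain 0/1 integers and `1 - x` stands for the complement of `x`.

```python
from itertools import product


def full_adder(a: int, b: int, cin: int):
    """Return (CarryOut, Sum) computed from the SOP expressions."""
    na, nb, nc = 1 - a, 1 - b, 1 - cin
    total = (na & nb & cin) | (na & b & nc) | (a & nb & nc) | (a & b & cin)
    carry_out = (a & b) | (a & cin) | (b & cin)
    return carry_out, total


# Compare against ordinary integer addition for all eight input rows.
for a, b, cin in product((0, 1), repeat=3):
    assert full_adder(a, b, cin) == divmod(a + b + cin, 2)
    print(a, b, cin, *full_adder(a, b, cin))
```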


  • Chapter 4: Arithmetic for Computers Constructing an ALU

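    Carry out gates

    [Figure: gate network that produces CarryOut from the inputs a, b, and CarryIn]

    The Sum logic can be integrated into this circuit

    A 1-bit ALU that performs the AND, OR, and add operations

    [Figure: 1-bit ALU; a multiplexor controlled by Operation selects among the AND output (0), the OR output (1), and the adder output (2); the adder has CarryIn and CarryOut]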


  • Chapter 4: Arithmetic for Computers Constructing an ALU

    Subtraction

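    Subtraction can be performed using 2's complement and a 2:1 mux

    To compute the 2's complement, b is inverted (Binvert = 1) and the CarryIn of the LSB is set to 1

    A 1-bit ALU that performs AND, OR, addition, and subtraction

    [Figure: 1-bit ALU in which a 2:1 multiplexor controlled by Binvert selects b or its inverse before the AND, OR, and adder stages; Operation selects the Result]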


  • Chapter 4: Arithmetic for Computers Constructing an ALU

    A 32-bit Adder

    A 32-bit adder can be constructed by replicating 32 full adders (ripple carry adder)

    [Figure: 32-bit ripple-carry ALU; ALU0 to ALU31 are chained so that the CarryOut of each 1-bit ALU feeds the CarryIn of the next; inputs a0 to a31, b0 to b31, and Operation; outputs Result0 to Result31]


  • Chapter 4: Arithmetic for Computers A 32-bit ALU

    A 32-bit ALU

    Tailoring the ALU to the MIPS

    Need to support the set-on-less-than instruction (slt)

    Remember: slt is an arithmetic instruction

    It produces a 1 if rs < rt, that is, if (rs - rt) < 0, and 0 otherwise

    Use subtraction: (a - b) < 0 implies a < b

    Need to support the test for equality (beq $t5, $t6, Label)

    Use subtraction: (a - b) = 0 implies a = b, page 18 or Fig. 4.17

    A 32-bit ALU is constructed from 31 copies of the 1-bit ALU of type a, page 18 or Fig. 4.17, and one copy of the 1-bit ALU of type b (with overflow detection)

    The sign bit of the operation rs - rt can be routed to the least significant bit of the result (if there is no overflow), page 19 or Fig. 4.18

    When overflow occurs, the sign bit is wrong and we need the inverse of the sign bit

    To accommodate the conditional branches beq and bne, we can test the result of rs - rt: if it is 0 the operands are equal (beq is taken), otherwise they differ (bne is taken). The simplest way is to NOR all the result bits together, page 20


  • Chapter 4: Arithmetic for Computers A 32-bit ALU

    Computer Architecture Fall 2001 18 of 61

  • Chapter 4: Arithmetic for Computers A 32-bit ALU

    [Figure 4.18: 32-bit ALU built from ALU0 to ALU31 with Binvert, CarryIn, and Operation control inputs; the Set output of ALU31 is connected to the Less input of ALU0, the Less inputs of ALU1 to ALU31 are 0, and ALU31 also produces Overflow]

    Computer Architecture Fall 2001 19 of 61

  • Chapter 4: Arithmetic for Computers A 32-bit ALU

    [Figure: final 32-bit ALU with the Bnegate control line and the Zero and Overflow outputs, followed by the ALU symbol with inputs a, b, and ALU operation and outputs Zero, Result, Overflow, and CarryOut]

    Computer Architecture Fall 2001 20 of 61

  • Chapter 4: Arithmetic for Computers A 32-bit ALU

    Notice control lines:

    000 = and

    001 = or

    010 = add

    110 = subtract

    111 = slt

    Note: zero is a 1 when the result is zero
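The control-line encoding can be exercised with a behavioural Python sketch. This models the ALU's function, not its gate structure; the name `alu`, the returned tuple, and the decision to report Overflow only for add and subtract are assumptions of this illustration. The top control bit acts as Binvert and as the CarryIn of the least significant bit.

```python
MASK = 0xFFFFFFFF
SIGN = 0x80000000


def alu(a: int, b: int, control: int):
    """Return (result, zero, overflow) for 32-bit patterns a and b."""
    overflow = False
    if control == 0b000:
        result = a & b
    elif control == 0b001:
        result = a | b
    elif control in (0b010, 0b110, 0b111):
        binvert = control >> 2  # 1 for subtract and slt
        operand = (~b & MASK) if binvert else b
        result = (a + operand + binvert) & MASK
        overflow = (a & SIGN) == (operand & SIGN) and (result & SIGN) != (a & SIGN)
        if control == 0b111:
            # slt: use the sign bit of a - b, inverted when the subtraction overflowed
            result = (result >> 31) ^ int(overflow)
            overflow = False
    else:
        raise ValueError("unsupported control value")
    return result, int(result == 0), overflow


print(alu(5, 5, 0b110))           # subtract: result 0 sets Zero
print(alu(0xFFFFFFFF, 1, 0b111))  # slt: -1 < 1
```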

    A Few Conclusions

    We can build an ALU that supports the basic MIPS instructions

    Key ideas

    Use multiplexor to select the output we want

    Use twos complement to perform subtraction efficiently

    Replicate a 1-bit ALU to produce a 32-bit ALU

    Important points about hardware

    All of the gates are always working

    The speed of a gate is affected by the number of inputs to the

    gate

    The speed of a circuit is affected by the number of gates in

    series (critical path)

    Our focus is comprehension; however,

    Clever changes to organization can improve performance

    This is similar to using better algorithms in software

    We will look at two examples, for addition and for multiplication, in the next sections


  • Chapter 4: Arithmetic for Computers Addition

    Addition

    Problem

    Ripple carry adder is too slow

    Is a 32-bit ALU as fast as a 1-bit ALU?

    Is there more than one way to do addition?

    Two extremes:

    Ripple carry

    Not feasible! Why?

    SOP using infinite hardware

    Not feasible! Why?

    Carry-lookahead is an approach between the two extremes

    Carry Lookahead

    Motivation:

    If we didn't know the value of CarryIn, what could we do?

    When would we always generate a carry? $g_i = a_i b_i$

    When would we propagate the carry? $p_i = a_i + b_i$

    $c_1 = g_0 + p_0 c_0$
    $c_2 = g_1 + p_1 c_1$
    $c_3 = g_2 + p_2 c_2$
    $c_4 = g_3 + p_3 c_3$


  • Chapter 4: Arithmetic for Computers Addition

    Did we get rid of the ripple?

    Is it feasible? Why?

    Several parallel approaches have been used to speed up the ALU so that its delay grows as a function of $\log_2 32$, at the cost of more gate complexity

    Carry lookahead employs special carry circuits to generate each $c_i$ (the CarryIn for the $i$th bit) rapidly.

    $c_1 = b_0 c_0 + a_0 c_0 + a_0 b_0$

    $c_2 = b_1 c_1 + a_1 c_1 + a_1 b_1$; substituting for $c_1$,

    $c_2 = (a_1 a_0 b_0) + (a_1 a_0 c_0) + (a_1 b_0 c_0) + (b_1 a_0 b_0) + (b_1 a_0 c_0) + (b_1 b_0 c_0) + (a_1 b_1)$

    The complexity grows as we compute $c_i$ for higher bits

    We can limit the hardware complexity and still make a substantial improvement

    Define generate $g_i = a_i b_i$

    Define propagate $p_i = a_i + b_i$

    $c_i = 1$ if $g_{i-1} = 1$, or if $p_{i-1} = 1$ and $c_{i-1} = 1$

    $p_i$ and $g_i$ can be used to express the $c_i$s

    For a 4-bit ALU

    $c_1 = g_0 + p_0 c_0$
    $c_2 = g_1 + p_1 g_0 + p_1 p_0 c_0$
    $c_3 = g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 c_0$
    $c_4 = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + p_3 p_2 p_1 p_0 c_0$
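The four lookahead equations can be checked against a plain ripple computation. This is a minimal sketch: the function names are illustrative, and operands are lists of four bits with index 0 as the least significant bit.

```python
from itertools import product


def lookahead_carries(a, b, c0):
    """Return [c1, c2, c3, c4] from the generate/propagate equations."""
    g = [ai & bi for ai, bi in zip(a, b)]
    p = [ai | bi for ai, bi in zip(a, b)]
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    return [c1, c2, c3, c4]


def ripple_carries(a, b, c0):
    """Return the same carries, computed one bit position at a time."""
    carries, carry = [], c0
    for ai, bi in zip(a, b):
        carry = (ai & bi) | (ai & carry) | (bi & carry)
        carries.append(carry)
    return carries


# Exhaustive comparison over every 4-bit a, b and both values of c0.
for bits in product((0, 1), repeat=9):
    a, b, c0 = list(bits[0:4]), list(bits[4:8]), bits[8]
    assert lookahead_carries(a, b, c0) == ripple_carries(a, b, c0)
```

Each lookahead carry depends only on the inputs and $c_0$, so all four can be produced in parallel, whereas the ripple loop must wait for the previous carry.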


  • Chapter 4: Arithmetic for Computers Addition

    This means that ci = 1 if some earlier adder generates a carry

    and all intermediary adders propagate a carry

    Using the Principle to Build Bigger Adders

    [Figure: 16-bit adder built from four 4-bit ALUs (ALU0 to ALU3, producing Result0-3, Result4-7, Result8-11, and Result12-15) and a carry-lookahead unit that takes P0 to P3, G0 to G3, and CarryIn and produces C1 to C4; C4 is the CarryOut]


  • Chapter 4: Arithmetic for Computers Multiplication

    Multiplication

    Multiplication is performed as the sum of partial products

    The partial product is 0 when the multiplier bit is 0, and the

    partial product is the multiplicand when multiplier bit is 1

    The total product is the sum of the left-shifted partial products

    The multiplication of two n-bit binary integers results in a product of up to 2n bits in length

            1011      Multiplicand (11)
          x 1101      Multiplier (13)
          ------
            1011
           0000
          1011
         1011
        --------
        10001111      Product (143)

    The operation can be made more efficient by

    Performing addition on the partial products as they are produced rather than waiting until the end (saving storage)

    Skipping the addition when the multiplier bit is 0; the shift is required in both cases


  • Chapter 4: Arithmetic for Computers Multiplication

    Multiplication Implementation: First Version

    32-bit multiplier, 64-bit multiplicand, 64-bit ALU, and 64-bit

    product

    [Figure: first multiplication hardware; 64-bit Multiplicand register (shift left), 64-bit ALU, 64-bit Product register (write), 32-bit Multiplier register (shift right), and control test]


  • Chapter 4: Arithmetic for Computers Multiplication

    Flowchart of the first version:

    Start

    1. Test Multiplier0

    1a. If Multiplier0 = 1, add the multiplicand to the product and place the result in the Product register (if Multiplier0 = 0, skip this step)

    2. Shift the Multiplicand register left 1 bit

    3. Shift the Multiplier register right 1 bit

    32nd repetition? No (< 32 repetitions): go back to step 1. Yes (32 repetitions): Done

    Second Version

    Half of the bits of the 64-bit Multiplicand register are always 0, so a full 64-bit ALU is wasteful

    Instead of shifting the multiplicand left, we shift the product to

    the right, after adding multiplicand to the left half of the product

    The algorithm uses 32-bit multiplicand, 32-bit multiplier, 32-bit

    adder, and 64-bit product


  • Chapter 4: Arithmetic for Computers Multiplication

    [Figure: second multiplication hardware; 32-bit Multiplicand register, 32-bit ALU, 64-bit Product register (shift right, write), 32-bit Multiplier register (shift right), and control test]

    Flowchart of the second version:

    Start

    1. Test Multiplier0

    1a. If Multiplier0 = 1, add the multiplicand to the left half of the product and place the result in the left half of the Product register (if Multiplier0 = 0, skip this step)

    2. Shift the Product register right 1 bit

    3. Shift the Multiplier register right 1 bit

    32nd repetition? No (< 32 repetitions): go back to step 1. Yes (32 repetitions): Done


  • Chapter 4: Arithmetic for Computers Multiplication

    Third Version

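    The right half of the Product register can be used to store the multiplier, so no separate Multiplier register is needed. The hardware and the procedure follow.

    [Figure: third multiplication hardware; 32-bit Multiplicand register, 32-bit ALU, 64-bit Product register (shift right, write), and control test]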


  • Chapter 4: Arithmetic for Computers Multiplication

    Flowchart of the third version:

    Start

    1. Test Product0

    1a. If Product0 = 1, add the multiplicand to the left half of the product and place the result in the left half of the Product register (if Product0 = 0, skip this step)

    2. Shift the Product register right 1 bit

    32nd repetition? No (< 32 repetitions): go back to step 1. Yes (32 repetitions): Done
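The third version can be sketched in Python for unsigned operands. The name `multiply_v3` and the parameter `n` (the operand width) are illustrative assumptions. A Python integer stands in for the Product register, so the carry out of the addition is kept and shifted back in rather than lost.

```python
def multiply_v3(multiplicand: int, multiplier: int, n: int = 32) -> int:
    """Unsigned shift-and-add multiply with the multiplier in the right half of Product."""
    mask = (1 << n) - 1
    product = multiplier & mask  # left half starts at 0
    for _ in range(n):
        if product & 1:  # test Product0
            product += (multiplicand & mask) << n  # add to the left half
        product >>= 1  # shift the Product register right 1 bit
    return product


print(multiply_v3(0b1011, 0b1101, n=4))  # 11 x 13
```

With `n = 4` the call reproduces the 1011 x 1101 example, and the multiplier bits are consumed from the right end of the same register that accumulates the product.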

    The above algorithms will not work unless both numbers are positive

    Unsigned:
        1001      9
      x 0011      3
      --------
      00001001
      00010010
      --------
      00011011    27

    2's complement:
        1001      -7
      x 0011      3
      --------
      11111001
      11110010
      --------
      11101011    -21


  • Chapter 4: Arithmetic for Computers Multiplication

    Signed Multiplication

    Consider the following multiplication of two unsigned 4-bit integers yielding an 8-bit result

    1011 x 1101

    00001011    1011 x 1 x 2^0
    00000000    1011 x 0 x 2^1
    00101100    1011 x 1 x 2^2
    01011000    1011 x 1 x 2^3
    --------
    10001111    143

    Each partial product (other than that for $2^0$) consists of the 4-bit multiplicand (1011) shifted to the left, with the unoccupied positions on the right filled with 0s

    However, this straightforward multiplication will not work if the multiplicand is negative

    Each contribution of the negative multiplicand as a partial product must be a negative number in a 2n-bit (64-bit) field

    The sign bits of the partial products must line up

    Unsigned integers: 1001 (9) x 0011 (3)

    00001001    1001 x 1 x 2^0
    00010010    1001 x 1 x 2^1
    00000000    1001 x 0 x 2^2
    00000000    1001 x 0 x 2^3
    --------
    00011011    27

    2's complement integers: 1001 (-7) x 0011 (3)

    11111001    (-7) x 1 x 2^0 = -7
    11110010    (-7) x 1 x 2^1 = -14
    00000000    (-7) x 0 x 2^2 = 0
    00000000    (-7) x 0 x 2^3 = 0
    --------
    11101011    -21


  • Chapter 4: Arithmetic for Computers Multiplication

    If the multiplier is negative, straightforward multiplication will not

    work. The reason is that the bits of the multiplier no longer

    correspond to the shifts or multiplications that must take place.

    There are a number of ways out of this dilemma:

    Convert both multiplier and multiplicand to positive numbers, perform the multiplication, and then take the 2's complement of the result if and only if the signs of the two original numbers differed

    Use Booth's algorithm: no 2's complement conversion is needed, so it is faster

    Booth's Algorithm

    First consider a positive multiplier consisting of one block of 1s surrounded by 0s

    $M \times (00011110) = M \times (2^4 + 2^3 + 2^2 + 2^1) = M \times (16 + 8 + 4 + 2) = M \times 30$

    The number of such operations can be reduced to two if we observe that

    $2^n + 2^{n-1} + \cdots + 2^{n-k} = 2^{n+1} - 2^{n-k}$    (1)

    Therefore,

    $M \times (00011110) = M \times (2^5 - 2^1) = M \times (32 - 2) = M \times 30$

    So, the product can be generated by one addition and one subtraction of the multiplicand


  • Chapter 4: Arithmetic for Computers Multiplication

    This scheme can be extended to any number of blocks of 1s in a multiplier, including a block of a single 1. Thus

    $M \times (01111010) = M \times (2^6 + 2^5 + 2^4 + 2^3 + 2^1) = M \times (2^7 - 2^3 + 2^2 - 2^1) = M \times 122$

    Therefore, subtraction is performed when the first 1 of the block is encountered (1-0), and addition is performed when the end of the block is encountered (0-1)

    As an example, consider $M \times (-6)$, where $-6 = 11111010_2 = -2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^1$

    $M \times (11111010) = M \times (-2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^1) = M \times (-2^3 + 2^2 - 2^1)$

    It performs a subtraction when the first 1 is encountered (1-0)

    It performs an addition when the end of that block of 1s is encountered (0-1)

    It performs a subtraction when the first 1 of the next block of 1s is encountered (1-0)

    Booths algorithm is based on the fact that a string of 0s avoids

    arithmetic (only shift)

    Otherwise, the string can be classified into

    Beginning of a run of 1s

    Middle of a run of 1s

    End of a run of 1s


  • Chapter 4: Arithmetic for Computers Multiplication

    The Algorithm:

    Examine 2 bits of the multiplier at a time (the current bit and the bit to its right), and if they are

    00  Middle of a string of 0s, no arithmetic operation.
    01  End of a string of 1s, add the multiplicand to the left half of the product.
    10  Beginning of a string of 1s, subtract the multiplicand from the left half of the product.
    11  Middle of a string of 1s, no arithmetic operation.


  • Chapter 4: Arithmetic for Computers Multiplication

    There is a 1-bit register placed logically to the right of the LSB of

    the product

    Example: 0111 x 0011 (7 x 3). Multiplicand = 0111.

    Step          Product      Extra bit
    initial       0000 0011    0    Initial value, extra bit is padded
    (10) sub      1001 0011    0
         shift    1100 1001    1
    (11) shift    1110 0100    1
    (01) add      0101 0100    1
         shift    0010 1010    0
    (00) shift    0001 0101    0    = 21

    Example: 0111 x 1101 (7 x -3). Multiplicand = 0111.

    Step          Product      Extra bit
    initial       0000 1101    0    Initial value, extra bit is padded
    (10) sub      1001 1101    0
         shift    1100 1110    1
    (01) add      0011 1110    1
         shift    0001 1111    0
    (10) sub      1010 1111    0
         shift    1101 0111    1
    (11) shift    1110 1011    1    = -21

    Example: 0010 x 0110 (2 x 6). Multiplicand = 0010.

    Step          Product      Extra bit
    initial       0000 0110    0    Initial value, extra bit is padded
    (00) shift    0000 0011    0
    (10) sub      1110 0011    0
         shift    1111 0001    1
    (11) shift    1111 1000    1
    (01) add      0001 1000    1
         shift    0000 1100    0    = 12


  • Chapter 4: Arithmetic for Computers Multiplication

    Example: 1001 x 1101 (-7 x -3). Multiplicand = 1001.

    Step          Product      Extra bit
    initial       0000 1101    0    Initial value, extra bit is padded
    (10) sub      0111 1101    0
         shift    0011 1110    1
    (01) add      1100 1110    1
         shift    1110 0111    0
    (10) sub      0101 0111    0
         shift    0010 1011    1
    (11) shift    0001 0101    1    = 21
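The traces above can be reproduced with a Python sketch of Booth's algorithm. The name `booth_multiply` and the width parameter `n` are illustrative. One stated assumption: the left half is kept to exactly n bits, as in the traces, so the sketch should not be given the most negative multiplicand (1000 for n = 4), whose negation does not fit in n bits.

```python
def booth_multiply(multiplicand: int, multiplier: int, n: int = 4) -> int:
    """Multiply two n-bit two's complement patterns; return the signed product."""
    mask = (1 << n) - 1
    top = 1 << (2 * n - 1)  # sign bit of the 2n-bit Product register
    m = multiplicand & mask
    product = multiplier & mask  # right half holds the multiplier
    extra = 0  # 1-bit register to the right of the LSB
    for _ in range(n):
        pair = ((product & 1) << 1) | extra
        left = product >> n
        if pair == 0b10:  # beginning of a run of 1s: subtract
            left = (left - m) & mask
        elif pair == 0b01:  # end of a run of 1s: add
            left = (left + m) & mask
        product = (left << n) | (product & mask)
        extra = product & 1
        product = (product >> 1) | (product & top)  # arithmetic shift right
    return product - (1 << (2 * n)) if product & top else product


print(booth_multiply(0b0111, 0b1101))  # 7 x -3
```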


  • Chapter 4: Arithmetic for Computers Multiplication

    Division

    It is somewhat more complex than multiplication

    Procedure:

    1. The bits of the dividend are examined from left to right, until the set of bits examined represents a number greater than or equal to the divisor. Until this event occurs, 0s are placed in the quotient from left to right

    2. When the event occurs, a 1 is placed in the quotient and the divisor is subtracted from the partial dividend

    3. Append additional bits from the dividend (if any) to the new partial remainder. While the result is less than the divisor, continue as in step 1, placing 0s in the quotient; once it is greater than or equal to the divisor, apply step 2 again

    An example of long division of unsigned binary integers

              00001101      Quotient
       1011 ) 10010011      Dividend (divisor on the left)
               1011
              ------
              001110        Partial remainder
                1011
              ------
                001111      Partial remainder
                  1011
                ------
                   100      Remainder

  • Chapter 4: Arithmetic for Computers Multiplication

    The size of the dividend is limited by the sum of the sizes of the

    divisor and quotient

    First Version

    A 32-bit Quotient register, initially set to 0

    The divisor is placed in the left half of the 64-bit Divisor register

    At each step, the Divisor register is shifted to the right (1 bit) to

    be aligned with the Dividend.

    [Figure: first division hardware; 64-bit Divisor register (shift right), 64-bit ALU, 64-bit Remainder register (write), 32-bit Quotient register (shift left), and control test]


  • Chapter 4: Arithmetic for Computers Multiplication

    Flowchart of the first version:

    Start

    1. Subtract the Divisor register from the Remainder register and place the result in the Remainder register

    Test Remainder

    2a. If Remainder >= 0: shift the Quotient register to the left, setting the new rightmost bit to 1

    2b. If Remainder < 0: restore the original value by adding the Divisor register to the Remainder register and place the sum in the Remainder register. Also shift the Quotient register to the left, setting the new least significant bit to 0

    3. Shift the Divisor register right 1 bit

    33rd repetition? No (< 33 repetitions): go back to step 1. Yes (33 repetitions): Done


  • Chapter 4: Arithmetic for Computers Multiplication

    Second Version

    By switching the order of the operation to shift and then subtract,

    One iteration of the algorithm can be removed

    The ALU and Divisor register can be cut in half

    Shifting the remainder to the left instead of shifting the divisor to

    the right produces the same alignment

    [Figure: second division hardware; 32-bit Divisor register, 32-bit ALU, 64-bit Remainder register (shift left, write), 32-bit Quotient register (shift left), and control test]


  • Chapter 4: Arithmetic for Computers Multiplication

    Third Version

    The Quotient register could be eliminated by shifting the bits of the quotient into the Remainder register instead of shifting in 0s

    Both the remainder and the quotient will be shifted when a shift is made on the Remainder register

    This causes the remainder to be shifted an extra position to the left, so a correction is required

    [Figure: third division hardware; 32-bit Divisor register, 32-bit ALU, 64-bit Remainder register (shift left, shift right, write), and control test]


  • Chapter 4: Arithmetic for Computers Multiplication

    Flowchart of the third version:

    Start

    1. Shift the Remainder register left 1 bit

    2. Subtract the Divisor register from the left half of the Remainder register and place the result in the left half of the Remainder register

    Test Remainder

    3a. If Remainder >= 0: shift the Remainder register to the left, setting the new rightmost bit to 1

    3b. If Remainder < 0: restore the original value by adding the Divisor register to the left half of the Remainder register and place the sum in the left half of the Remainder register. Also shift the Remainder register to the left, setting the new rightmost bit to 0

    32nd repetition? No (< 32 repetitions): go back to step 2. Yes (32 repetitions): Done. Shift the left half of the Remainder register right 1 bit
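The third division version can be sketched in Python for unsigned operands. The name `divide_v3` and the width parameter `n` are illustrative. Two assumptions are made: the subtract-test-restore sequence is modelled as a comparison followed by a conditional subtraction, and a Python integer stands in for the Remainder register, so a bit shifted out of the left half is kept rather than lost.

```python
def divide_v3(dividend: int, divisor: int, n: int = 32):
    """Unsigned restoring division; return (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    mask = (1 << n) - 1
    remainder = (dividend & mask) << 1  # step 1: shift left 1 bit
    for _ in range(n):
        if (remainder >> n) >= divisor:  # left half - Divisor >= 0
            remainder -= divisor << n
            remainder = (remainder << 1) | 1  # shift in a quotient bit of 1
        else:
            remainder <<= 1  # value restored; shift in a 0
    quotient = remainder & mask  # right half
    return quotient, (remainder >> n) >> 1  # undo the extra left shift


print(divide_v3(0b10010011, 0b1011, n=8))  # the long-division example
```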


  • Chapter 4: Arithmetic for Computers Multiplication

    Signed Division

    For signed numbers, the signs of the divisor and dividend should be saved

    Sign of Quotient = Sign of Dividend x Sign of Divisor (negative if and only if the two signs differ)

    For a nonzero remainder,

    Sign of Remainder = Sign of Dividend
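The sign rules can be illustrated with a Python sketch that divides the magnitudes and then fixes the signs. `signed_divide` is a hypothetical helper written for this illustration; note that Python's own `//` and `%` operators follow a different convention (they round toward negative infinity).

```python
def signed_divide(dividend: int, divisor: int):
    """Return (quotient, remainder) using the sign rules stated above."""
    quotient, remainder = divmod(abs(dividend), abs(divisor))
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient  # the signs differ
    if dividend < 0:
        remainder = -remainder  # the remainder takes the dividend's sign
    return quotient, remainder


for x, y in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    print(x, y, signed_divide(x, y))
```

In every case dividend = quotient x divisor + remainder still holds.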

    The same hardware can be used for both multiply and divide using

    a 64-bit shift register for product (in multiply) or remainder (in

    divide)

    MIPS has two multiply instructions: multiply (mult) and multiply unsigned (multu)

    MIPS has two divide instructions: divide (div) and divide unsigned (divu)

    To fetch the 32-bit product (or the quotient after a divide), MIPS has the move from lo (mflo) instruction; the remainder is left in Hi

    To avoid overflow, Hi must be 0 for multu or must be the replicated sign of Lo for mult

    The instruction move from hi (mfhi) transfers Hi to a register to test for overflow

    MIPS divide instructions ignore overflow

    Both overflow and divide-by-0 must be checked by MIPS software


  • Chapter 4: Arithmetic for Computers Floating Point Representation

    Floating Point Representation

    With a fixed-point notation (2's complement) it is possible to represent a range of positive and negative integers centered on 0

    By assuming a fixed binary (radix) point, this format also allows the representation of numbers with a fractional component

    This approach has some limitations

    Very large numbers or very small fractions cannot be represented

    The fractional part of the quotient in a division of two large numbers could be lost

    Scientific notation gets around these limitations

    $976{,}000{,}000{,}000{,}000 = 9.76 \times 10^{14}$

    $0.0000000000000976 = 9.76 \times 10^{-14}$

    Dynamically slide the decimal point to a convenient location and

    use the exponent of 10 to keep track of the decimal point

    A range of very large and very small numbers can be represented with only a few digits

    This approach can be used with binary (or any other base)

    The number $\pm S \times B^{\pm E}$ can be stored in a binary word with three fields:

    Sign     Biased Exponent    Significand (or mantissa)
    1 bit    8 bits             23 bits

    32-bit floating-point format

  • Chapter 4: Arithmetic for Computers Floating Point Representation

    There must be a compromise between the size of the significand

    and the size of the exponent

    In this case, the 8-bit field yields the numbers 0 through 255

    A fixed value, called the bias, is subtracted from the field to get the true exponent value

    With a bias of 128, the true exponent values are in the range -128 to +127, with a true exponent of 0 stored as 10000000

    Any floating-point number can be expressed in many ways, for example $0.110_2 \times 2^{5} = 110_2 \times 2^{2} = 0.0110_2 \times 2^{6}$

    To simplify operations on floating-point numbers, it is typically required that they be normalized

    A normalized number is one in the form

    $\pm 0.1bbb \ldots b \times 2^{\pm E}$, where b is either 0 or 1

    Since the MSB of the significand field is always 1, there is no need to store it. Thus the 23-bit field is used to store a 24-bit significand with values between 0.5 and 1.0

    Examples

    $+0.11010001 \times 2^{+10100}$ = 0 10010100 10100010000000000000000

    $-0.11010001 \times 2^{+10100}$ = 1 10010100 10100010000000000000000

    $+0.11010001 \times 2^{-10100}$ = 0 01101100 10100010000000000000000

    $-0.11010001 \times 2^{-10100}$ = 1 01101100 10100010000000000000000

    Note that the value 128 is added to the true exponent to be

    stored in the exponent field
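    The encoding in these examples can be reproduced with a short Python sketch. The function name `encode_excess128` and its argument layout are assumptions for illustration; it implements the bias-128 format with a hidden leading 1 described on this slide, not the IEEE 754 format.

```python
def encode_excess128(sign, fraction_bits, exponent):
    """Encode +/-0.1bbb... x 2^exponent as 'sign exponent significand' (illustrative).

    fraction_bits: the bits after the radix point, e.g. "11010001".
    """
    if not fraction_bits.startswith("1"):
        raise ValueError("number must be normalized (0.1bbb...)")
    stored = fraction_bits[1:].ljust(23, "0")[:23]   # leading 1 is not stored
    return f"{sign} {exponent + 128:08b} {stored}"    # add the bias of 128


print(encode_excess128(0, "11010001", 20))
# 0 10010100 10100010000000000000000
print(encode_excess128(1, "11010001", -20))
# 1 01101100 10100010000000000000000
```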

    $-(1 - 2^{-24}) \times 2^{127} \le$ Negative Numbers $\le -0.5 \times 2^{-128}$

  • Chapter 4: Arithmetic for Computers Floating Point Representation

    $0.5 \times 2^{-128} \le$ Positive Numbers $\le (1 - 2^{-24}) \times 2^{127}$

    Five regions on the number line are not included in these ranges

    Negative overflow: negative numbers $< -(1 - 2^{-24}) \times 2^{127}$

    Negative underflow: negative numbers $> -0.5 \times 2^{-128}$

    Zero

    Positive underflow: positive numbers $< 0.5 \times 2^{-128}$

    Positive overflow: positive numbers $> (1 - 2^{-24}) \times 2^{127}$

    There is a trade-off between accuracy and range

    Actual floating-point representations include a special bit pattern

    to designate zero

    To reduce the chances of underflow or overflow, most

    programming languages offer double precision floating-point

    operations, by expanding the exponent field

    MIPS provides an extraordinary range

    $2.0 \times 10^{-38} \le$ Single precision $\le 2.0 \times 10^{38}$

    $2.0 \times 10^{-308} \le$ Double precision $\le 2.0 \times 10^{308}$

  • Chapter 4: Arithmetic for Computers IEEE 754 Floating-point Standard

    IEEE 754 Floating-point Standard

    The IEEE 754 floating-point standard is found in virtually every computer designed since 1980

    The IEEE 754 standard specifies

    Significand of 24 (23 + 1 implicit) bits for single precision

    Significand of 53 (52 + 1 implicit) bits for double precision

    Since 0 has no leading 1, it is given the reserved exponent value 0 and the hardware won't attach a leading 1 to the significand

    A bias of 127 for single precision. This means that the value is

    $(-1)^{S} \times (1 + \text{significand}) \times 2^{\text{exponent} - \text{bias}}$

    where S is the sign bit

    Sign     Biased Exponent    Significand (or mantissa)
    1 bit    8 bits             23 bits

    IEEE single precision floating-point format

    Sign     Biased Exponent    Significand (or mantissa)
    1 bit    11 bits            52 bits

    IEEE double precision floating-point format
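    The single precision layout and the value formula can be checked against a real machine encoding. The sketch below uses Python's standard `struct` module; the helper names `decode_single` and `value_of` are hypothetical, and `value_of` handles normalized numbers only (not 0, infinity, or NaN).

```python
import struct


def decode_single(x):
    """Split the IEEE 754 single precision encoding of x into its three fields."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                  # 1 bit
    exponent = (bits >> 23) & 0xFF     # 8-bit biased exponent
    fraction = bits & 0x7FFFFF         # 23-bit stored significand
    return sign, exponent, fraction


def value_of(sign, exponent, fraction, bias=127):
    """(-1)^S x (1 + significand) x 2^(exponent - bias), normalized numbers only."""
    return (-1) ** sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - bias)


fields = decode_single(-0.75)
print(fields)              # (1, 126, 4194304)
print(value_of(*fields))   # -0.75
```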

  • Chapter 4: Arithmetic for Computers IEEE 754 Floating-point Standard

    Leading 1 bit of significand is implicit

    The mantissa is stored in sign-magnitude form (the sign is a separate bit), not in 2's complement

    The exponent is represented in a biased (excess) form to make sorting easier

    All 0s is smallest exponent, all 1s is largest

    The range 0 to $2^{m} - 1$ of m bits is relabeled $-2^{m-1}$ to $+2^{m-1} - 1$ by subtracting a constant value (bias) of $2^{m-1}$

    Bias of 127 for single precision and 1023 for double precision

    IEEE format labels the exponent differently: $-126 \le E \le +127$ rather than $-127 \le E \le +128$

    -127 is used to encode 0

    +128 is used to encode $\pm\infty$ or NaN (not a number)

    Binary       True exponent    Biased form
    0000 0000    -127             0
    0000 0001    -126             1
    0000 0010    -125             2
    ...
    0111 1111    0                127
    1000 0000    +1               128
    ...
    1111 1101    +126             253
    1111 1110    +127             254
    1111 1111    +128             255
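    A short Python sketch can regenerate the rows of this table and illustrate why the biased form makes sorting easier: for non-negative numbers, comparing the raw bit patterns as unsigned integers gives the same order as comparing the values. The helper name `single_bits` and the sample values are assumptions for illustration.

```python
import struct

BIAS = 127


def single_bits(x):
    """Raw 32-bit pattern of x in IEEE 754 single precision."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


# (binary field, true exponent, biased form) for the table rows
rows = [(f"{f >> 4:04b} {f & 15:04b}", f - BIAS, f)
        for f in (0, 1, 2, 127, 128, 253, 254, 255)]
for row in rows:
    print(row)

# For non-negative values, sorting by bit pattern equals sorting by value
values = [0.15625, 3.5, 1.0, 1024.0, 0.5]
print(sorted(values, key=single_bits) == sorted(values))   # True
```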

  • Chapter 4: Arithmetic for Computers IEEE 754 Floating-point Standard

    Parameter                  Single precision             Double precision
    Word size (bits)           32                           64
    Exponent size (bits)       8                            11
    Exponent base              2                            2
    Max exponent               127                          1023
    Min exponent               -126                         -1022
    Range (base 10)            $10^{-38}$ to $10^{+38}$     $10^{-308}$ to $10^{+308}$
    Significand size (bits)    23                           52
    Number of exponents        254                          2046
    Number of fractions        $2^{23}$                     $2^{52}$
    Number of values           $1.98 \times 2^{31}$         $1.99 \times 2^{63}$

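    Floating Point Operations

    Floating Point Numbers

    $X = X_s \times b^{X_e}$

    $Y = Y_s \times b^{Y_e}$

    Arithmetic Operations (the sum and difference forms assume $X_e \le Y_e$)

    $X + Y = (X_s \times b^{X_e - Y_e} + Y_s) \times b^{Y_e}$

    $X - Y = (X_s \times b^{X_e - Y_e} - Y_s) \times b^{Y_e}$

    $X \times Y = (X_s \times Y_s) \times b^{X_e + Y_e}$

    $X / Y = (X_s / Y_s) \times b^{X_e - Y_e}$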
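    These identities can be checked numerically with exact rational arithmetic. In the sketch below a number is a (significand, exponent) pair; the function names and the sample operands (0.3 × 10^2 and 0.2 × 10^3) are assumptions chosen for illustration, and no rounding or normalization is modeled.

```python
from fractions import Fraction


def fp_value(s, e, b=2):
    """Value of s x b^e as an exact fraction."""
    return Fraction(s) * Fraction(b) ** e


def fp_add(x, y, b=2):
    (xs, xe), (ys, ye) = x, y
    if xe > ye:                        # make sure X has the smaller exponent
        (xs, xe), (ys, ye) = (ys, ye), (xs, xe)
    return xs * Fraction(b) ** (xe - ye) + ys, ye


def fp_mul(x, y):
    (xs, xe), (ys, ye) = x, y
    return xs * ys, xe + ye            # multiply significands, add exponents


def fp_div(x, y):
    (xs, xe), (ys, ye) = x, y
    return Fraction(xs) / ys, xe - ye  # divide significands, subtract exponents


X = (Fraction(3, 10), 2)   # 0.3 x 10^2 = 30
Y = (Fraction(2, 10), 3)   # 0.2 x 10^3 = 200
print(fp_value(*fp_add(X, Y, b=10), b=10))   # 230
print(fp_value(*fp_mul(X, Y), b=10))         # 6000
```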

  • Chapter 4: Arithmetic for Computers IEEE 754 Floating-point Standard

    Floating Point Complexities

    Operations are somewhat more complicated

    In addition to overflow, we have underflow

    Accuracy can be a big problem

    IEEE 754 keeps two extra bits, guard and round

    Four rounding modes

    Positive divided by zero yields infinity

    Zero divided by zero yields not a number

    Other complexities

    Implementing the standard can be tricky

    Not using the standard can be even worse

    See text for description of 80x86 and Pentium Bug!

  • Chapter 4: Arithmetic for Computers IEEE 754 Floating-point Standard

    Guard Bits

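    When a significand is loaded into an ALU register, the register is usually longer than the significand; the extra bits on the right are called guard bits

    They affect the precision of the result

    Consider $X = 1.000 \ldots 000 \times 2^{1}$ and $Y = 1.111 \ldots 111 \times 2^{0}$. What is $Z = X - Y$?

    X - Y Without Guard Bits

    $X = 1.000 \ldots 000 \times 2^{1}$

    $Y = 0.111 \ldots 111 \times 2^{1}$ (Y loses 1 bit of significand)

    $Z = 0.000 \ldots 001 \times 2^{1} = 1.000 \ldots 000 \times 2^{-23}$

    X - Y With Guard Bits

    $X = 1.000 \ldots 000\ 00000000 \times 2^{1}$

    $Y = 0.111 \ldots 111\ 10000000 \times 2^{1}$

    $Z = 0.000 \ldots 000\ 10000000 \times 2^{1} = 1.000 \ldots 000\ 00000000 \times 2^{-24}$

    No bit lost, result differs by factor 2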
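    The effect of guard bits can be reproduced with integer arithmetic. The sketch below assumes a 6-bit fraction for readability (the slide elides the actual width), so the exponents differ from those above, but the factor of 2 between the two results is the same. The function name and parameters are hypothetical.

```python
from fractions import Fraction


def subtract_aligned(x_sig, x_exp, y_sig, y_exp, frac_bits, guard_bits):
    """Compute X - Y after aligning Y to X's exponent (assumes x_exp >= y_exp).

    Significands are integers scaled by 2**frac_bits, e.g. 1.111111 -> 0b1111111.
    """
    x = x_sig << guard_bits
    # Right shift for alignment: bits pushed past the guard bits are lost
    y = (y_sig << guard_bits) >> (x_exp - y_exp)
    diff = x - y
    return Fraction(diff, 1 << (frac_bits + guard_bits)) * Fraction(2) ** x_exp


X_SIG, Y_SIG = 0b1000000, 0b1111111     # 1.000000 and 1.111111
print(subtract_aligned(X_SIG, 1, Y_SIG, 0, frac_bits=6, guard_bits=0))   # 1/32
print(subtract_aligned(X_SIG, 1, Y_SIG, 0, frac_bits=6, guard_bits=8))   # 1/64
```

    The exact difference is 2 - (2 - 2^-6) = 2^-6 = 1/64, which is the result obtained with guard bits.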

  • Chapter 4: Arithmetic for Computers IEEE 754 Floating-point Standard

    Rounding

    The result of an ALU operation on significands is put back into a

    floating point format

    Extra bits must be disposed of

    Some techniques:

    Round to nearest representable number

    Round toward $+\infty$

    Round toward $-\infty$

    Round toward 0

  • Chapter 4: Arithmetic for Computers IEEE 754 Floating-point Standard

    Round to Nearest

    If the extra bits amount to more than one-half of the last representable bit, add 1 to the LSB of the significand; if they amount to less than one-half, drop them

    Extra bits 10010: add 1 to the LSB of the significand

    Extra bits 01111: drop the extra bits

    If the extra bits are exactly halfway between two representable values, one option is to round up or down at random to avoid cumulative bias

    In IEEE, if the result of a computation is exactly midway between

    two representable numbers:

    Round up if the last representable bit is 1

    Truncate if the last representable bit is 0
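    The round-to-nearest rule, including the IEEE tie-breaking case, can be sketched on integers whose low-order bits are the extra bits. The function name `round_nearest_even` and the 4-bit sample significands are assumptions for illustration.

```python
def round_nearest_even(value, extra_bits):
    """Drop the low `extra_bits` bits of value, rounding to nearest (ties to even)."""
    kept = value >> extra_bits
    extra = value & ((1 << extra_bits) - 1)
    half = 1 << (extra_bits - 1)          # exactly one-half of the last kept bit
    if extra > half or (extra == half and kept & 1):
        kept += 1                         # round up
    return kept                           # otherwise truncate


print(bin(round_nearest_even(0b1011_10010, 5)))   # 0b1100  (more than half: add 1)
print(bin(round_nearest_even(0b1011_01111, 5)))   # 0b1011  (less than half: drop)
print(bin(round_nearest_even(0b1011_10000, 5)))   # 0b1100  (tie, last bit 1: round up)
print(bin(round_nearest_even(0b1010_10000, 5)))   # 0b1010  (tie, last bit 0: truncate)
```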

  • Chapter 4: Arithmetic for Computers IEEE 754 Floating-point Standard

    Examples

    157.3 + 257.1, assume a 6-bit exponent (excess 32) and a 10-bit mantissa

    Convert to binary and normalize

    $157.3 \approx 10011101.01001_2 = 0.1001110101001 \times 2^{8}$

    $257.1 \approx 100000001.0001_2 = 0.1000000010001 \times 2^{9}$

    Round the mantissa to ten bits (the values below show nine bits after the radix point)

    $157.3 = 0.1001110101001 \times 2^{8} \approx 0.100111011 \times 2^{8}$

    $257.1 = 0.1000000010001 \times 2^{9} \approx 0.100000001 \times 2^{9}$

    Make the smaller exponent equal to the larger exponent and add

    $157.3 = 0.010011101 \times 2^{9}$

    $257.1 = 0.100000001 \times 2^{9}$

    $414.0 = 0.110011110 \times 2^{9}$

    The result is already normalized. The result differs from the exact

    answer (414.4) by 0.4
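    The worked example can be replayed in Python. The sketch assumes nine bits after the radix point, as the values on this slide show, and only handles positive inputs of at least 0.5; the helper names `normalize` and `add` are hypothetical.

```python
def normalize(value, frac_bits=9):
    """Return (mantissa, exponent) with value ~= mantissa / 2**frac_bits * 2**exponent."""
    exponent = 0
    while value >= 1.0:                 # bring the value into the form 0.1bbb...
        value /= 2.0
        exponent += 1
    mantissa = round(value * 2 ** frac_bits)   # round to frac_bits bits
    return mantissa, exponent


def add(a, b, frac_bits=9):
    (ma, ea), (mb, eb) = a, b
    if ea > eb:                         # make `a` the operand with the smaller exponent
        (ma, ea), (mb, eb) = (mb, eb), (ma, ea)
    ma >>= eb - ea                      # align: bits shifted off the right are lost
    return (ma + mb) / 2 ** frac_bits * 2 ** eb


x = normalize(157.3)    # (315, 8): 0.100111011 x 2^8
y = normalize(257.1)    # (257, 9): 0.100000001 x 2^9
print(add(x, y))        # 414.0
```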

  • Chapter 4: Arithmetic for Computers IEEE 754 Floating-point Standard

    Floating-Point Arithmetic

    For addition and subtraction, it is necessary to ensure that both operands have the same exponent. This may require shifting the radix point on one of the operands to achieve alignment

    Problems may arise as the result of these operations

    Exponent overflow: A positive exponent exceeds the maximum possible exponent value. In some systems this may be designated as $+\infty$ or $-\infty$

    Exponent underflow: A negative exponent is less than the minimum possible exponent value. This means the number is too small to be represented, and in some systems this may be reported as 0

    Significand underflow: In the process of aligning significands, digits may flow off the right end of the significand

    Significand overflow: The addition of two significands of the same sign may result in a carry out of the most significant bit

  • Chapter 4: Arithmetic for Computers IEEE 754 Floating-point Standard

    Addition and Subtraction

    In floating-point arithmetic, addition and subtraction are more

    complex than multiplication and division. This is because of the

    need for alignment

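    The algorithm for addition and subtraction has five basic steps. Example: $9.999 \times 10^{1} + 1.610 \times 10^{-1}$

    1. Check for zeros

    2. Align the significand of the number with the smaller exponent to match the larger exponent, $1.610 \times 10^{-1} = 0.0161 \times 10^{1}$

    3. Add or subtract the significands (keeping four digits), $9.999 + 0.016 = 10.015$

    4. Normalize the result, $10.015 \times 10^{1} = 1.0015 \times 10^{2}$

    5. Round the result, $1.0015 \times 10^{2} \approx 1.002 \times 10^{2}$, Fig. 4.44

    Example: $0.5 - 0.4375$

    1. $0.5 = 1.000_2 \times 2^{-1}$, $0.4375 = 1.110_2 \times 2^{-2}$

    2. $1.000_2 \times 2^{-1} - 0.111_2 \times 2^{-1}$

    3. $1.000_2 \times 2^{-1} - 0.111_2 \times 2^{-1} = 0.001_2 \times 2^{-1}$

    4. $0.001_2 \times 2^{-1} = 1.000_2 \times 2^{-4}$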

    Many machines have dedicated hardware to perform floating point

    operations as fast as possible, Fig. 4.45
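    The decimal example on this slide (9.999 × 10^1 + 1.610 × 10^-1 with four significant digits) can be traced with a small Python sketch. Significands are held as integers scaled by 10^3, so 9.999 is 9999. The function name is hypothetical, operands are assumed non-negative, and digits shifted out during alignment are simply dropped, as in the slide.

```python
def fp_add_decimal(a, b, digits=4):
    """Add two (significand, exponent) decimals with `digits` significant digits."""
    (sa, ea), (sb, eb) = a, b
    if sa == 0:                          # 1. check for zeros
        return b
    if sb == 0:
        return a
    if ea < eb:                          # make `a` the operand with the larger exponent
        (sa, ea), (sb, eb) = (sb, eb), (sa, ea)
    sb //= 10 ** (ea - eb)               # 2. align: 1610 -> 16 (0.016)
    total, exp = sa + sb, ea             # 3. add the significands
    limit = 10 ** digits
    while total >= limit:                # 4. normalize and 5. round (half up)
        total = (total + 5) // 10
        exp += 1
    return total, exp


print(fp_add_decimal((9999, 1), (1610, -1)))   # (1002, 2), i.e. 1.002 x 10^2
```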

  • Chapter 4: Arithmetic for Computers IEEE 754 Floating-point Standard

    Start

    1. Compare the exponents of the two numbers. Shift the smaller number to the right until its exponent would match the larger exponent

    2. Add the significands

    3. Normalize the sum, either shifting right and incrementing the exponent or shifting left and decrementing the exponent

    Overflow or underflow? Yes: Exception. No: continue to step 4

    4. Round the significand to the appropriate number of bits

    Still normalized? No: return to step 3. Yes: Done
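    The flowchart can be sketched in Python for small binary significands. This is a simplified model under stated assumptions: significands are signed integers scaled by 2^frac_bits (so 1.110 is 14 when frac_bits is 3), rounding uses only the single bit dropped during a right shift, and the exponent limits `emin` and `emax` are illustrative parameters. The function name is hypothetical.

```python
def fp_add_binary(a, b, frac_bits=3, emin=-126, emax=127):
    """Add two (significand, exponent) pairs following the four flowchart steps."""
    (sa, ea), (sb, eb) = a, b
    # Step 1: compare exponents, shift the smaller number right
    if ea < eb:
        (sa, ea), (sb, eb) = (sb, eb), (sa, ea)
    shifted = abs(sb) >> (ea - eb)
    sb = shifted if sb >= 0 else -shifted
    # Step 2: add the significands
    total = sa + sb
    if total == 0:
        return 0, 0
    sign = -1 if total < 0 else 1
    mag, exp = abs(total), ea
    one = 1 << frac_bits                 # integer encoding of 1.000
    while True:
        # Step 3: normalize so that 1.000 <= mag < 10.000 (binary)
        round_bit = 0
        while mag >= 2 * one:
            round_bit = mag & 1
            mag >>= 1
            exp += 1
        while mag < one:
            mag <<= 1
            exp -= 1
        if exp > emax or exp < emin:
            raise OverflowError("exponent overflow or underflow")
        # Step 4: round, then check whether the result is still normalized
        mag += round_bit
        if mag < 2 * one:
            return sign * mag, exp


# 0.5 - 0.4375 = 1.000 x 2^-1 + (-1.110 x 2^-2)
print(fp_add_binary((0b1000, -1), (-0b1110, -2)))   # (8, -4), i.e. 1.000 x 2^-4
```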

  • Chapter 4: Arithmetic for Computers IEEE 754 Floating-point Standard

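    [Figure: block diagram of floating-point addition hardware. Labeled blocks: Sign, Exponent, and Significand fields of the two operands; Small ALU (exponent difference); Control; multiplexors; Shift right (shift smaller number right); Big ALU (add); Shift left or right with Increment or decrement (normalize); Rounding hardware (round); Sign, Exponent, and Significand of the result. Stages: compare exponents, shift smaller number right, add, normalize, round.]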

  • Chapter 4: Arithmetic for Computers IEEE 754 Floating-point Standard

    Floating Point Multiplication and Division

    Floating-point multiplication and division are much simpler than

    addition and subtraction

    1. Check whether either operand is 0

    2. Add the exponents. If the exponents are stored in biased form, the exponent sum would have doubled the bias. Thus the bias must be subtracted from the sum

    3. The result could be either an exponent overflow or underflow, which would be reported

    4. If the exponent of the product is within the proper range, the next step is to multiply the significands (taking into account their signs). The product will be double the length of the multiplier and multiplicand. The extra bits will be lost during rounding

    5. The result is then normalized and rounded. Normalization could result in exponent underflow

    Rounding requires extra bits for calculation. IEEE 754 keeps 2 extra bits on the right during intermediate calculations, called guard and round bits

  • Chapter 4: Arithmetic for Computers IEEE 754 Floating-point Standard

    Example

    $0.5 \times (-0.4375) = (1.000_2 \times 2^{-1}) \times (-1.110_2 \times 2^{-2})$

    1. New exponent = -1 + (-2) = -3, or using the biased representation: (-1 + 127) + (-2 + 127) - 127 = 124

    2. Multiplying the significands: $1.000_2 \times 1.110_2 = 1.110000_2$, so the product is $1.110000_2 \times 2^{-3} = 1.110_2 \times 2^{-3}$

    3. The product is already normalized

    4. No rounding is required

    5. The sign of the result is negative, giving $-1.110_2 \times 2^{-3}$
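    The multiplication example can be traced in Python. In this sketch a number is a (sign, biased exponent, significand) triple with the significand scaled by 2^frac_bits; the function name is hypothetical, the exponent range check uses the single precision limits 1 to 254, and the extra product bits are truncated rather than rounded to keep the code short.

```python
BIAS = 127


def fp_multiply(a, b, frac_bits=3):
    """Multiply two (sign, biased_exponent, significand) triples (illustrative)."""
    (sign_a, exp_a, sig_a), (sign_b, exp_b, sig_b) = a, b
    if sig_a == 0 or sig_b == 0:             # 1. check for zero operands
        return 0, 0, 0
    exp = exp_a + exp_b - BIAS               # 2. add exponents, remove the extra bias
    product = sig_a * sig_b                  # 4. double-length product
    shift = frac_bits
    if product >= (2 << (2 * frac_bits)):    # 5. normalize if the product is >= 2.0
        shift += 1
        exp += 1
    if not 1 <= exp <= 254:                  # 3. report exponent overflow or underflow
        raise OverflowError("exponent overflow or underflow")
    sig = product >> shift                   # extra bits are truncated in this sketch
    return sign_a ^ sign_b, exp, sig


half = (0, -1 + BIAS, 0b1000)                # +1.000 x 2^-1
y = (1, -2 + BIAS, 0b1110)                   # -1.110 x 2^-2
print(fp_multiply(half, y))                  # (1, 124, 14), i.e. -1.110 x 2^-3
```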

    MIPS supports IEEE single and double precision formats:

    add.s, sub.s, mul.s, and div.s, for single precision

    add.d, sub.d, mul.d, and div.d, for double precision

    MIPS also supports comparison and branch instructions for single and double precision

    MIPS has separate floating-point registers to manipulate floating-point numbers. Only 16 of the 32 MIPS floating-point registers can be used for single precision operations. Double precision is computed using pairs of these registers

    MIPS also has load and store instructions to move floating point

    numbers
