COMPUTER ARCHITECTURE
CS 45101 / CS 55101
CHAPTER 4
Arithmetic for Computers
H. Peyravi
Department of Computer Science
Kent State University
Fall 2001
Computer Architecture Fall 2001
Contents
Introduction
MIPS: 32 bit Signed Numbers
Arithmetic Operations
Boolean Algebra & Gates
Constructing an ALU
A 32-bit ALU
Addition
Multiplication
Floating Point Representation
IEEE 754 Floating-point Standard
Chapter 4: Arithmetic for Computers Introduction
Introduction
Goals
To introduce 2's complement numbers and their addition and
subtraction
To introduce basic logic operations (AND, OR, logical shift), and
hardware building blocks (AND, OR, NOT gates and multiplexor)
To explain the construction and operation of a 32-bit ALU that
performs AND, OR, add, and slt
To show algorithms that perform multiplication and division and
hardware that implements these algorithms
To demonstrate floating point representation and arithmetic
operations
To introduce MIPS instructions for unsigned operands, for logic
operations, and for floating point numbers
Numbers
In the binary system, how do we represent negative numbers?
What is the largest or smallest number that can fit into a word?
What happens in the overflow and underflow cases?
How are fractions and real numbers represented?
Possible Representation
Sign Magnitude    One's Complement    Two's Complement
000 = +0          000 = +0            000 = +0
001 = +1          001 = +1            001 = +1
010 = +2          010 = +2            010 = +2
011 = +3          011 = +3            011 = +3
100 = -0          100 = -3            100 = -4
101 = -1          101 = -2            101 = -3
110 = -2          110 = -1            110 = -2
111 = -3          111 = -0            111 = -1
Issues:
Balanced ?
Number of zeros ?
Ease of Operations ?
Which one is best ?
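The three encodings in the table can be compared with a small sketch. The function names and the `width` parameter are illustrative assumptions, not part of the slides. Python integers have no negative zero, so the patterns shown as -0 in the table decode to plain 0 here.

```python
def sign_magnitude(bits: int, width: int = 3) -> int:
    """Top bit is the sign; the remaining bits are the magnitude."""
    magnitude = bits & ((1 << (width - 1)) - 1)
    return -magnitude if bits >> (width - 1) else magnitude


def ones_complement(bits: int, width: int = 3) -> int:
    """A negative value is the bitwise inverse of its magnitude."""
    if bits >> (width - 1):
        return -(~bits & ((1 << width) - 1))
    return bits


def twos_complement(bits: int, width: int = 3) -> int:
    """The top bit carries the weight -2**(width - 1)."""
    return bits - (1 << width) if bits >> (width - 1) else bits


if __name__ == "__main__":
    for pattern in range(8):
        print(f"{pattern:03b}",
              sign_magnitude(pattern),
              ones_complement(pattern),
              twos_complement(pattern))
```

Only the two's complement column has a single zero, which is one reason it is the representation used in the rest of the chapter.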
MIPS: 32 bit Signed Numbers
Negative Numbers
Numbers can be represented in any base (humans use base 10, machines use base 2)
The value of the $i$th digit $d$ in a number with base $b$ is
$d \times b^i$
The bits are numbered from right to left
A MIPS word is 32 bits long and can represent the unsigned values 0 to
$2^{32} - 1 = 4{,}294{,}967{,}295_{10}$
Although $2^{32}$ is an even number, it is not possible to divide the bit patterns into
equal numbers of positive and negative numbers, since 0 needs
to be represented as well
The alternative is to have unbalanced positive and negative
numbers
Positive numbers: $0$ to $2^{31} - 1 = 2{,}147{,}483{,}647_{10}$
Negative numbers: $-2^{31} = -2{,}147{,}483{,}648_{10}$ to $-1$ (whose bit pattern is that of unsigned $2^{32} - 1$)
There is one negative number, $-2^{31} = -2{,}147{,}483{,}648_{10}$, that has no corresponding positive number
0000 0000 0000 0000 0000 0000 0000 0000 = 0
0000 0000 0000 0000 0000 0000 0000 0001 = 1
0000 0000 0000 0000 0000 0000 0000 0010 = 2
...
0111 1111 1111 1111 1111 1111 1111 1101 = 2,147,483,645
0111 1111 1111 1111 1111 1111 1111 1110 = 2,147,483,646
0111 1111 1111 1111 1111 1111 1111 1111 = 2,147,483,647    Max int
1000 0000 0000 0000 0000 0000 0000 0000 = -2,147,483,648   Min int
1000 0000 0000 0000 0000 0000 0000 0001 = -2,147,483,647
1000 0000 0000 0000 0000 0000 0000 0010 = -2,147,483,646
...
1111 1111 1111 1111 1111 1111 1111 1101 = -3
1111 1111 1111 1111 1111 1111 1111 1110 = -2
1111 1111 1111 1111 1111 1111 1111 1111 = -1
This convention is called 2's complement; it ensures that
$x + (-x) = 0$
0 has one representation
A hardware test for a negative number requires only checking the
most significant bit (the sign bit), which simplifies the hardware
Further, address locations are all positive, and the programmer
can convert a negative number to an unsigned integer (as in C)
A positive or negative number can be described as
$(x_{31} \times -2^{31}) + (x_{30} \times 2^{30}) + \cdots + (x_1 \times 2^1) + (x_0 \times 2^0)$
When comparing a negative integer with a positive integer, the MSB
of the negative integer (1) is greater than the MSB of the positive
integer (0) even though its value is smaller; for unsigned integers, a larger MSB does mean a larger value
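The weighted-sum formula can be checked directly. This is a minimal sketch; `signed_value` and `is_negative` are hypothetical helper names, and the word is passed as a non-negative Python integer holding the 32-bit pattern.

```python
def signed_value(word: int) -> int:
    """Value of a 32-bit pattern: bit 31 weighs -2**31, bit i weighs 2**i."""
    bits = [(word >> i) & 1 for i in range(32)]
    return -bits[31] * 2**31 + sum(bits[i] * 2**i for i in range(31))


def is_negative(word: int) -> bool:
    """The hardware test for a negative number: look only at the sign bit."""
    return ((word >> 31) & 1) == 1


if __name__ == "__main__":
    for word in (0x00000002, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF):
        print(f"{word:032b}", signed_value(word), is_negative(word))
```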
Twos Complement Operations
To negate a number, complement each bit and then add 1
to the result. For example
 2 (decimal) = 0000 0000 0000 0000 0000 0000 0000 0010
-2 (decimal) = 1111 1111 1111 1111 1111 1111 1111 1101 + 1
             = 1111 1111 1111 1111 1111 1111 1111 1110
Converting n-bit numbers to m-bit numbers, where m > n
Example, sign extension:
A MIPS 16-bit immediate is converted to 32 bits
Copy the MSB (sign bit) into the other bits
0010 -> 0000 0010
1010 -> 1111 1010
Example:
lb = load byte (sign-extends the loaded byte), lbu = load byte unsigned (fills the upper bits with 0s)
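Negation and sign extension can be sketched as follows. The helper names and the `from_bits`/`to_bits` parameters are assumptions made for illustration; values are held as non-negative integers representing bit patterns.

```python
MASK32 = 0xFFFFFFFF


def negate(word: int) -> int:
    """Two's complement negation: invert every bit, then add 1."""
    return (~word + 1) & MASK32


def sign_extend(value: int, from_bits: int, to_bits: int = 32) -> int:
    """Copy the sign bit of a from_bits-wide value into the new upper bits."""
    sign = (value >> (from_bits - 1)) & 1
    if sign:
        upper_ones = ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
        value |= upper_ones
    return value


if __name__ == "__main__":
    print(f"{negate(2):032b}")                 # pattern for -2
    print(f"{sign_extend(0b0010, 4, 8):08b}")  # positive value gains 0s
    print(f"{sign_extend(0b1010, 4, 8):08b}")  # negative value gains 1s
```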
Arithmetic Operations
Addition
Except for the rightmost digit, each position also receives a carry (0 or 1) from the
previous position, and hence has 3 operands
   (1)(0)(0)(0)(0)(1)(1)       Carries
    0  1  0  0  1  0  1  1
 +  1  1  1  1  0  0  0  1
 -------------------------
(1) 0  0  1  1  1  1  0  0
The carry bit beyond the end of the word is ignored
  1001 = -7      1100 = -4      0011 = +3
+ 0101 = +5    + 0100 = +4    + 0100 = +4
  ---------      ---------      ---------
  1110 = -2      0000 =  0      0111 = +7

  1100 = -4      0101 = +5      1001 = -7
+ 1111 = -1    + 0100 = +4    + 1010 = -6
  ---------      ---------      ---------
  1011 = -5      1001 = OF      0011 = OF
How do we detect overflow?
Overflow occurs iff both operands have the same sign and the result has the opposite sign
Subtraction
To subtract one number (the subtrahend) from another (the minuend),
take the 2's complement of the subtrahend and add it to the minuend
  2 - 7            5 - (-2)         5 - 2
  0010 = +2        0101 = +5        0101 = +5
+ 1001 = -7      + 0010 = +2      + 1110 = -2
  ---------        ---------        ---------
  1011 = -5        0111 = +7        0011 = +3

  7 - (-7)         -5 - 2           -6 - 4
  0111 = +7        1011 = -5        1010 = -6
+ 0111 = +7      + 1110 = -2      + 1100 = -4
  ---------        ---------        ---------
  1110 = OF        1001 = -7        0110 = OF
How to Detect Overflow?
No overflow when adding a positive and a negative number
No overflow when the signs are the same for subtraction
Overflow occurs when:
adding two positives yields a negative, or
adding two negatives yields a positive, or
subtract a negative from a positive and get a negative, or
subtract a positive from a negative and get a positive
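The four overflow conditions above reduce to two sign checks, sketched here on 4-bit patterns so the results can be compared with the worked examples. The function names and the `width` parameter are illustrative assumptions.

```python
def add_with_overflow(a: int, b: int, width: int = 4):
    """Add two width-bit patterns; return (result pattern, overflow flag)."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    result = (a + b) & mask  # the carry beyond the word is ignored
    # Overflow: operands share a sign and the result has the other sign.
    overflow = (a & sign) == (b & sign) and (result & sign) != (a & sign)
    return result, overflow


def sub_with_overflow(a: int, b: int, width: int = 4):
    """Compute a - b on width-bit patterns; return (result, overflow flag)."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    result = (a + ((~b + 1) & mask)) & mask  # add the 2's complement of b
    # Overflow: operands differ in sign and the result's sign differs from a's.
    overflow = (a & sign) != (b & sign) and (result & sign) != (a & sign)
    return result, overflow


if __name__ == "__main__":
    print(add_with_overflow(0b0101, 0b0100))  # 5 + 4 does not fit in 4 bits
    print(sub_with_overflow(0b0111, 0b1001))  # 7 - (-7) does not fit either
```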
Effects of Overflow
An exception (interrupt) occurs
Control jumps to predefined address for exception
Address of the offending instruction is saved for possible
resumption
MIPS saves the address of the instruction that overflowed in a
register called Exception Program Counter, EPC, and an interrupt
handler routine is called
Details based on software system/language
flight control vs. homework assignment
MIPS uses register $k0 and $k1 to save the return address
from EPC
Unsigned arithmetic can also produce results that do not fit in a word, but these are not
treated as overflow, since unsigned integers are commonly used for memory
addresses
Note that the term overflow is somewhat misleading; it does not mean that
a carry overflowed
In MIPS
add, addi and sub cause exceptions for overflow
addu, addiu and subu do not cause exceptions for overflow
Supporting slt
MIPS offers two sets of set-on-less-than instructions
For signed integers
Set-on-less-than, slt
Set-on-less-than-immediate, slti
For unsigned integers
Set-on-less-than-unsigned, sltu
Set-on-less-than-immediate-unsigned, sltiu
Example: Suppose
$s0 = 1111 1111 1111 1111 1111 1111 1111 1111 (binary) = -1 (signed) = 4,294,967,295 (unsigned)
$s1 = 0000 0000 0000 0000 0000 0000 0000 0001 (binary) = 1 (signed) = 1 (unsigned)
What would the following do?
slt $t0, $s0, $s1 # signed comparison
sltu $t1, $s0, $s1 # unsigned comparison
Register $t0 has value 1, since -1 < 1, and register $t1 has value 0, since 4,294,967,295 > 1
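The difference between the two comparisons can be reproduced with a short sketch. The Python functions below only model the comparison semantics described above; they are not MIPS code, and `to_signed` is a hypothetical helper.

```python
def to_signed(word: int) -> int:
    """Interpret a 32-bit pattern as a two's complement integer."""
    word &= 0xFFFFFFFF
    return word - (1 << 32) if word & 0x80000000 else word


def slt(rs: int, rt: int) -> int:
    """Signed set-on-less-than: 1 if rs < rt as signed values, else 0."""
    return 1 if to_signed(rs) < to_signed(rt) else 0


def sltu(rs: int, rt: int) -> int:
    """Unsigned set-on-less-than: compare the raw 32-bit patterns."""
    return 1 if (rs & 0xFFFFFFFF) < (rt & 0xFFFFFFFF) else 0


if __name__ == "__main__":
    s0, s1 = 0xFFFFFFFF, 0x00000001
    print(slt(s0, s1), sltu(s0, s1))
```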
Boolean Algebra & Gates
Logical Operations
Bytes can be supported using shift operations
MIPS provides:
shift left logical = sll, and
shift right logical = srl operations
sll $t2, $s0, 8 # $t2 = $s0 << 8 bits
srl $t2, $s0, 8 # $t2 = $s0 >> 8 bits
Operation    C operator    MIPS instructions
AND          &             and, andi
OR           |             or, ori
Constructing an ALU
ALUs are constructed from four hardware building blocks, AND,
OR, Inverter, and Multiplexor(MUX)
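1. AND gate ($c = a \cdot b$)
   a  b  |  c
   0  0  |  0
   0  1  |  0
   1  0  |  0
   1  1  |  1
2. OR gate ($c = a + b$)
   a  b  |  c
   0  0  |  0
   0  1  |  1
   1  0  |  1
   1  1  |  1
3. Inverter ($c = \bar{a}$)
   a  |  c
   0  |  1
   1  |  0
4. Multiplexor (if d == 0, c = a; else c = b)
[Figure: gate symbols for the AND gate, OR gate, inverter, and multiplexor; images not recovered]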
MIPS has a 32-bit wide ALU
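The four building blocks can be modeled on single bits. This is a sketch only; the function names are assumptions, and `mux_from_gates` shows one common way to build the multiplexor from the other three gates, which the slides do not spell out.

```python
def and_gate(a: int, b: int) -> int:
    return a & b


def or_gate(a: int, b: int) -> int:
    return a | b


def inverter(a: int) -> int:
    return a ^ 1


def mux(d: int, a: int, b: int) -> int:
    """Behavioral multiplexor: select a when d == 0, otherwise b."""
    return a if d == 0 else b


def mux_from_gates(d: int, a: int, b: int) -> int:
    """The same selection built from AND, OR, and an inverter."""
    return or_gate(and_gate(a, inverter(d)), and_gate(b, d))


if __name__ == "__main__":
    for d in (0, 1):
        for a in (0, 1):
            for b in (0, 1):
                print(d, a, b, mux(d, a, b), mux_from_gates(d, a, b))
```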
A 1-bit ALU
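A 1-bit logical unit can be constructed from an AND gate, an OR gate, and a
MUX
[Figure: 1-bit logical unit. Inputs a and b feed an AND gate and an OR gate; Operation selects the AND output (0) or the OR output (1) as Result]
A 1-bit half adder has two input operands and two outputs, Sum and
CarryOut
Inputs    Outputs            Term for each row (minterm if the output is 1, maxterm if it is 0)
a  b      CarryOut  Sum      CarryOut               Sum
0  0      0         0        $a + b$                $a + b$
0  1      0         1        $a + \bar{b}$          $\bar{a}b$
1  0      0         1        $\bar{a} + b$          $a\bar{b}$
1  1      1         0        $ab$                   $\bar{a} + \bar{b}$
$\text{Sum}_{SOP} = \bar{a}b + a\bar{b}$
$C_{SOP} = ab$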
A 1-bit full adder has two input operands, one input
CarryIn, and two outputs, Sum and CarryOut
[Figure: full adder block with inputs a, b, CarryIn and outputs Sum, CarryOut]
Inputs           Outputs          Comments     Minterms
a  b  CarryIn    CarryOut  Sum                 CarryOut            Sum
0  0  0          0         0      0+0+0=00
0  0  1          0         1      0+0+1=01                         $\bar{a}\bar{b}c_i$
0  1  0          0         1      0+1+0=01                         $\bar{a}b\bar{c_i}$
0  1  1          1         0      0+1+1=10     $\bar{a}bc_i$
1  0  0          0         1      1+0+0=01                         $a\bar{b}\bar{c_i}$
1  0  1          1         0      1+0+1=10     $a\bar{b}c_i$
1  1  0          1         0      1+1+0=10     $ab\bar{c_i}$
1  1  1          1         1      1+1+1=11     $abc_i$             $abc_i$
The truth table can be expressed as SOP
$\text{Sum}_{SOP} = \bar{a}\bar{b}c_i + \bar{a}b\bar{c_i} + a\bar{b}\bar{c_i} + abc_i$
$C_{SOP} = \bar{a}bc_i + a\bar{b}c_i + ab\bar{c_i} + abc_i$
$C_{SOP} = (ab\bar{c_i} + abc_i) + (a\bar{b}c_i + abc_i) + (\bar{a}bc_i + abc_i)$
$C_{SOP} = ab + ac_i + bc_i$
There are many ways to make adders using different boolean
expressions
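The sum-of-products expressions for the full adder can be checked exhaustively against ordinary addition. The function name `full_adder` is an assumption; the Boolean expressions are the ones given above, with `cin` standing for CarryIn.

```python
def full_adder(a: int, b: int, cin: int):
    """Return (carry_out, sum) computed from the SOP expressions."""
    na, nb, nc = a ^ 1, b ^ 1, cin ^ 1  # complemented inputs
    s = (na & nb & cin) | (na & b & nc) | (a & nb & nc) | (a & b & cin)
    carry_out = (a & b) | (a & cin) | (b & cin)
    return carry_out, s


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for cin in (0, 1):
                print(a, b, cin, full_adder(a, b, cin))
```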
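Carry-out gates
[Figure: CarryOut logic built from the inputs a, b, and CarryIn]
The Sum logic can be integrated into this circuit
A 1-bit ALU that performs the AND, OR, and add operations
[Figure: 1-bit ALU. Inputs a, b, CarryIn; Operation selects AND (0), OR (1), or add (2) as Result; output CarryOut]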
Subtraction
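Subtraction can be performed using 2's complement and a 2:1
MUX that selects b or its inverse
To complete the 2's complement (invert, then add 1), the CarryIn of the LSB is set to 1
A 1-bit ALU that performs AND, OR, addition, and subtraction
[Figure: 1-bit ALU with Binvert. Binvert selects b (0) or the inverse of b (1); inputs a, CarryIn; Operation selects AND (0), OR (1), or add (2) as Result; output CarryOut]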
A 32-bit Adder
A 32-bit adder can be constructed by replicating 32 full adders
(a ripple carry adder)
[Figure: ripple carry adder. ALU0 to ALU31 take inputs a0..a31 and b0..b31 and a shared Operation; the CarryOut of each ALU feeds the CarryIn of the next; outputs Result0..Result31]
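A ripple carry adder can be sketched by chaining one full adder per bit. The names `full_adder_bit` and `ripple_carry_add` and the `width` parameter are illustrative assumptions; the loop order mirrors the carry rippling from bit 0 upward.

```python
def full_adder_bit(a: int, b: int, carry_in: int):
    """One full adder: return (carry_out, sum) for three input bits."""
    total = a + b + carry_in
    return total >> 1, total & 1


def ripple_carry_add(a: int, b: int, carry_in: int = 0, width: int = 32):
    """Add two width-bit patterns; return (result pattern, final carry out)."""
    result = 0
    carry = carry_in
    for i in range(width):  # the carry of bit i feeds bit i + 1
        carry, s = full_adder_bit((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry


if __name__ == "__main__":
    print(ripple_carry_add(0b01001011, 0b11110001, width=8))
```

Each bit must wait for the carry from the bit below it, so the delay grows linearly with the word width.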
A 32-bit ALU
Tailoring the ALU to the MIPS
Need to support the set-on-less-than instruction (slt)
Remember: slt is an arithmetic instruction
It produces a 1 if rs < rt, that is, if (rs - rt) < 0, and 0 otherwise
Use subtraction: (a - b) < 0 implies a < b
Need to support test for equality (beq $t5, $t6, $t7)
Use subtraction: (a - b) = 0 implies a = b, page 18 or Fig 4.17
A 32-bit ALU is constructed from 31 copies of the 1-bit ALU of type a,
page 18 or Fig. 4.17, and one copy of the 1-bit ALU of type b
(with overflow detection)
The sign bit of the operation rs - rt can be routed to the least significant bit (if there is no overflow), page 19 or Fig. 4.18
On overflow, the sign bit is wrong and we need the inverse of
the sign bit
To accommodate the conditional branches beq and bne, we can test
the result of rs - rt: if it is 0 the operands are equal (beq), otherwise they differ (bne). The simple way is to NOR all the result bits together, page 20
[Figure, page 18 (Fig. 4.17): 1-bit ALUs of type a and type b; image not recovered]
[Figure, page 19 (Fig. 4.18): 32-bit ALU built from ALU0 to ALU31. Inputs a0..a31, b0..b31, Binvert, CarryIn, Operation; the Less inputs of ALU1..ALU31 are 0; the Set output of ALU31 drives the Less input of ALU0; outputs Result0..Result31 and Overflow]
[Figure, page 20: the same 32-bit ALU with a single Bnegate control line and a Zero output derived from Result0..Result31, followed by the ALU symbol with inputs a, b, ALU operation and outputs Result, Zero, Overflow, CarryOut]
Notice control lines:
000 = and
001 = or
010 = add
110 = subtract
111 = slt
Note: zero is a 1 when the result is zero
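A behavioral sketch of the 32-bit ALU, driven by the control-line codes listed above, is given below. It models what the outputs are rather than the gate structure; the function name `alu`, the returned tuple layout, and reporting Overflow as 0 for slt are assumptions made for this illustration.

```python
MASK32 = 0xFFFFFFFF


def alu(control: int, a: int, b: int):
    """Return (result, zero, overflow) for 32-bit patterns a and b."""
    a &= MASK32
    b &= MASK32
    overflow = 0
    if control == 0b000:        # and
        result = a & b
    elif control == 0b001:      # or
        result = a | b
    elif control in (0b010, 0b110, 0b111):
        subtract = control != 0b010
        operand = (~b & MASK32) if subtract else b  # Binvert
        carry_in = 1 if subtract else 0             # completes the 2's complement
        total = (a + operand + carry_in) & MASK32
        sa, so, st = a >> 31, operand >> 31, total >> 31
        overflow = int(sa == so and st != sa)
        if control == 0b111:    # slt: sign of a - b, inverted on overflow
            result = st ^ overflow
            overflow = 0        # assumption: slt does not signal overflow
        else:                   # add or subtract
            result = total
    else:
        raise ValueError("unsupported ALU control code")
    zero = int(result == 0)     # NOR of all result bits
    return result, zero, overflow


if __name__ == "__main__":
    print(alu(0b110, 5, 5))           # subtract: result 0 sets zero
    print(alu(0b111, 0xFFFFFFFF, 1))  # slt: -1 < 1
```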
A Few Conclusions
We can build an ALU that supports the basic MIPS instructions
Key ideas
Use multiplexor to select the output we want
Use twos complement to perform subtraction efficiently
Replicate a 1-bit ALU to produce a 32-bit ALU
Important points about hardware
All of the gates are always working
The speed of a gate is affected by the number of inputs to the
gate
The speed of a circuit is affected by the number of gates in
series (critical path)
Our focus: comprehension, however,
Clever changes to organization can improve performance
Similar to using better algorithms in software
We will look at two examples for addition and multiplications
in the next section
Addition
Problem
Ripple carry adder is too slow
Is a 32-bit ALU as fast as a 1-bit ALU?
Is there more than one way to do addition?
Two extremes:
Ripple carry
Not feasible! why?
SOP using infinite hardware
Not feasible! why?
Carry-lookahead is an approach between the two extremes
Carry Lookahead
Motivation:
If we didn't know the value of CarryIn, what could we do?
When would we always generate a carry? $g_i = a_i b_i$
When would we propagate the carry? $p_i = a_i + b_i$
$c_1 = g_0 + p_0 c_0$
$c_2 = g_1 + p_1 c_1$
$c_3 = g_2 + p_2 c_2$
$c_4 = g_3 + p_3 c_3$
Did we get rid of the ripple?
Is it feasible? Why?
Several parallel approaches have been used to make the ALU delay grow
with $\log_2 32$ rather than with 32, at the cost of more gate complexity
Carry lookahead employs special carry circuits to generate each $c_i$ (the CarryIn for the $i$th bit) rapidly
$c_1 = b_0 c_0 + a_0 c_0 + a_0 b_0$
$c_2 = b_1 c_1 + a_1 c_1 + a_1 b_1$
Substituting $c_1$ into $c_2$,
$c_2 = (a_1 a_0 b_0) + (a_1 a_0 c_0) + (a_1 b_0 c_0) + (b_1 a_0 b_0) + (b_1 a_0 c_0) + (b_1 b_0 c_0) + (a_1 b_1)$
The complexity grows as we compute $c_i$ for higher bits
We can limit the hardware complexity and still make a substantial
improvement
Define generate $g_i = a_i b_i$
Define propagate $p_i = a_i + b_i$
$c_i = 1$ if $g_{i-1} = 1$ or $p_{i-1} = c_{i-1} = 1$
$p_i$ and $g_i$ can be used to express the $c_i$s
For a 4-bit ALU
$c_1 = g_0 + p_0 c_0$
$c_2 = g_1 + p_1 g_0 + p_1 p_0 c_0$
$c_3 = g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 c_0$
$c_4 = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + p_3 p_2 p_1 p_0 c_0$
This means that ci = 1 if some earlier adder generates a carry
and all intermediary adders propagate a carry
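The 4-bit lookahead equations can be exercised with a short sketch. The function name `carry_lookahead_4` is an assumption; every carry is computed directly from the generate and propagate signals and `c0`, with no carry waiting on another carry.

```python
def carry_lookahead_4(a: int, b: int, c0: int = 0):
    """Add two 4-bit patterns; return (4-bit sum, carry out c4)."""
    abit = [(a >> i) & 1 for i in range(4)]
    bbit = [(b >> i) & 1 for i in range(4)]
    g = [abit[i] & bbit[i] for i in range(4)]  # generate
    p = [abit[i] | bbit[i] for i in range(4)]  # propagate
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    carries = [c0, c1, c2, c3]
    total = 0
    for i in range(4):
        total |= (abit[i] ^ bbit[i] ^ carries[i]) << i
    return total, c4


if __name__ == "__main__":
    print(carry_lookahead_4(0b1011, 0b0110))
```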
Using the Principle to Build Bigger Adders
[Figure: 16-bit adder built from four 4-bit ALUs (ALU0 to ALU3, inputs a0..a15 and b0..b15, outputs Result0-3, Result4-7, Result8-11, Result12-15) and a carry-lookahead unit. Each ALU supplies its block propagate and generate signals (P0, G0 to P3, G3) and the unit returns the carries C1 to C4, with C4 as CarryOut]
Multiplication
Multiplication is performed as the sum of partial products
The partial product is 0 when the multiplier bit is 0, and the
partial product is the multiplicand when multiplier bit is 1
The total product is the sum of the left-shifted partial products
The multiplication of two n-bit binary integers results in a product
of up to 2n bits in length
        1011      multiplicand (11)
      x 1101      multiplier (13)
      ------
        1011
       0000
      1011
     1011
    --------
    10001111      product (143)
The operation can be made more efficient by
Performing addition on the partial products rather than waiting
until the end (saving storage)
No addition is needed when the multiplier bit is 0, but a shift is
required in both cases
Multiplication Implementation: First Version
32-bit Multiplier register, 64-bit Multiplicand register, 64-bit ALU, and 64-bit
Product register
[Figure: first-version multiplication hardware. Multiplicand (64 bits, shift left), 64-bit ALU, Product (64 bits, write), Multiplier (32 bits, shift right), Control test]
Start
1. Test Multiplier0
Multiplier0 = 1: 1a. Add multiplicand to product and place the result in Product register
Multiplier0 = 0: no addition
2. Shift the Multiplicand register left 1 bit
3. Shift the Multiplier register right 1 bit
32nd repetition?
No (< 32 repetitions): go to step 1
Yes (32 repetitions): Done
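The first-version procedure translates almost line for line into code. This is a sketch for unsigned operands; the function name `multiply_v1` is an assumption, and the registers are modeled as Python integers masked to their hardware widths.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def multiply_v1(multiplicand: int, multiplier: int) -> int:
    """Unsigned 32-bit x 32-bit multiply giving a 64-bit product."""
    multiplicand_reg = multiplicand & MASK32  # right half of the 64-bit register
    multiplier_reg = multiplier & MASK32
    product = 0
    for _ in range(32):
        if multiplier_reg & 1:                # step 1: test Multiplier0
            product = (product + multiplicand_reg) & MASK64  # step 1a
        multiplicand_reg = (multiplicand_reg << 1) & MASK64   # step 2
        multiplier_reg >>= 1                  # step 3
    return product


if __name__ == "__main__":
    print(f"{multiply_v1(0b1011, 0b1101):08b}")
```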
Second Version
At any time, half of the bits of the 64-bit Multiplicand register are 0, so a
64-bit ALU is wasteful
Instead of shifting the multiplicand left, we shift the product to
the right after adding the multiplicand to the left half of the product
The algorithm uses a 32-bit multiplicand, 32-bit multiplier, 32-bit
adder, and 64-bit product
[Figure: second-version multiplication hardware. Multiplicand (32 bits), 32-bit ALU, Product (64 bits, shift right, write), Multiplier (32 bits, shift right), Control test]
Start
1. Test Multiplier0
Multiplier0 = 1: 1a. Add multiplicand to the left half of the product and place the result in the left half of the Product register
Multiplier0 = 0: no addition
2. Shift the Product register right 1 bit
3. Shift the Multiplier register right 1 bit
32nd repetition?
No (< 32 repetitions): go to step 1
Yes (32 repetitions): Done
Third Version
The right half of the Product register can be used to store the multiplier
The hardware and the procedure are as follows
[Figure: third-version multiplication hardware. Multiplicand (32 bits), 32-bit ALU, Product (64 bits, shift right, write), Control test]
Start
1. Test Product0
Product0 = 1: 1a. Add multiplicand to the left half of the product and place the result in the left half of the Product register
Product0 = 0: no addition
2. Shift the Product register right 1 bit
32nd repetition?
No (< 32 repetitions): go to step 1
Yes (32 repetitions): Done
The above algorithms do not work unless both numbers are
positive
Unsigned                     2's complement
    1001      9                  1001     -7
  x 0011      3                x 0011      3
  --------                     --------
  00001001                     11111001
  00010010                     11110010
  --------                     --------
  00011011   27                11101011   -21
Signed Multiplication
Consider the following multiplication of two unsigned 4-bit integers, yielding an 8-bit
result
    1011
  x 1101
  --------
  00001011    1011 x 1 x 2^0
  00000000    1011 x 0 x 2^1
  00101100    1011 x 1 x 2^2
  01011000    1011 x 1 x 2^3
  --------
  10001111    143
Each partial product (other than that for $2^0$) consists of the 4-bit
multiplicand (1011) shifted to the left, with the unoccupied
positions on the right filled with 0s
However, this straightforward multiplication will not work if the
multiplicand is negative
Each contribution of the negative multiplicand as a partial
product must be a negative number in a 2n-bit field (64 bits for a 32-bit word)
The sign bits of the partial products must line up
Unsigned Integers                    2's Complement Integers
    1001     9                           1001    -7
  x 0011     3                         x 0011     3
  --------                             --------
  00001001   1001 x 1 x 2^0            11111001   (-7) x 1 x 2^0 = -7
  00010010   1001 x 1 x 2^1            11110010   (-7) x 1 x 2^1 = -14
  00000000   1001 x 0 x 2^2            00000000   (-7) x 0 x 2^2 = 0
  00000000   1001 x 0 x 2^3            00000000   (-7) x 0 x 2^3 = 0
  --------                             --------
  00011011   27                        11101011   -21
If the multiplier is negative, straightforward multiplication will not
work. The reason is that the bits of the multiplier no longer
correspond to the shifts or multiplications that must take place.
There are a number of ways out of this dilemma:
Convert both multiplier and multiplicand to positive numbers,
perform the multiplication, and then take the 2's complement of the
result if and only if the signs of the two original numbers differed
Use Booth's algorithm: no 2's complement conversion is needed, so it is faster
Booth's Algorithm
First consider a positive multiplier consisting of one block of 1s surrounded by 0s
$M \times (00011110) = M \times (2^4 + 2^3 + 2^2 + 2^1) = M \times (16 + 8 + 4 + 2) = M \times 30$
The number of such operations can be reduced to two if we
observe that
$2^n + 2^{n-1} + \cdots + 2^{n-k} = 2^{n+1} - 2^{n-k}$   (1)
Therefore,
$M \times (00011110) = M \times (2^5 - 2^1) = M \times (32 - 2) = M \times 30$
So, the product can be generated by one addition and one
subtraction of the multiplicand
This scheme can be extended to any number of blocks of 1s in a multiplier, including a block of a single 1. Thus
$M \times (01111010) = M \times (2^6 + 2^5 + 2^4 + 2^3 + 2^1) = M \times (2^7 - 2^3 + 2^2 - 2^1) = M \times 122$
Therefore, a subtraction is performed when the first 1 of a block is
encountered (1-0), and an addition is performed when the end of the
block is encountered (0-1)
As an example, consider $M \times (-6)$, where $-6 = 11111010 = -2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^1$
$M \times (11111010) = M \times (-2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^1) = M \times (-2^3 + 2^2 - 2^1)$
It performs a subtraction when the first 1 is encountered (1-0)
It performs an addition when the 0 that ends this block is encountered (0-1)
It performs a subtraction when the first 1 of the next block of
1s is encountered (1-0)
Booth's algorithm is based on the fact that a string of 0s avoids
arithmetic (only shifts are needed)
Otherwise, the string can be classified into
Beginning of a run of 1s
Middle of a run of 1s
End of a run of 1s
The Algorithm:
Examine 2 bits of the multiplier at a time (the current bit and the bit to its right), and if they are
00 Middle of a string of 0s, no arithmetic operation.
01 End of a string of 1s, add the multiplicand to the left half of the product.
10 Beginning of a string of 1s, subtract the multiplicand from the left half of the product.
11 Middle of a string of 1s, no arithmetic operation.
After each step, shift the product right 1 bit, preserving the sign bit
There is a 1-bit register placed logically to the right of the LSB of
the product
Example: 0111 × 0011 (7 × 3). Multiplicand = 0111.
Step          Product
initial       0000 0011 0    extra bit appended on the right
(10) sub      1001 0011 0
     shift    1100 1001 1
(11) shift    1110 0100 1
(01) add      0101 0100 1
     shift    0010 1010 0
(00) shift    0001 0101 0    = 21
Example: 0111 × 1101 (7 × -3). Multiplicand = 0111.
Step          Product
initial       0000 1101 0    extra bit appended on the right
(10) sub      1001 1101 0
     shift    1100 1110 1
(01) add      0011 1110 1
     shift    0001 1111 0
(10) sub      1010 1111 0
     shift    1101 0111 1
(11) shift    1110 1011 1    = -21
Example: 0010 × 0110 (2 × 6). Multiplicand = 0010.
Step          Product
initial       0000 0110 0    extra bit appended on the right
(00) shift    0000 0011 0
(10) sub      1110 0011 0
     shift    1111 0001 1
(11) shift    1111 1000 1
(01) add      0001 1000 1
     shift    0000 1100 0    = 12
Example: 1001 × 1101 (-7 × -3). Multiplicand = 1001.
Step          Product
initial       0000 1101 0    extra bit appended on the right
(10) sub      0111 1101 0
     shift    0011 1110 1
(01) add      1100 1110 1
     shift    1110 0111 0
(10) sub      0101 0111 0
     shift    0010 1011 1
(11) shift    0001 0101 1    = 21
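The worked examples can be reproduced with a short sketch of Booth's algorithm on n-bit registers. The function name `booth_multiply` and the register variable names are assumptions. Like the n-bit layout in the examples, this sketch does not handle the most negative multiplicand (-8 when n = 4), because its negation does not fit in n bits.

```python
def booth_multiply(multiplicand: int, multiplier: int, n: int = 4) -> int:
    """Multiply two signed n-bit values; return the signed 2n-bit product."""
    mask = (1 << n) - 1
    m = multiplicand & mask
    high = 0                  # left half of the product
    low = multiplier & mask   # right half initially holds the multiplier
    extra = 0                 # the 1-bit register right of the LSB
    for _ in range(n):
        pair = ((low & 1) << 1) | extra
        if pair == 0b10:      # beginning of a run of 1s: subtract
            high = (high - m) & mask
        elif pair == 0b01:    # end of a run of 1s: add
            high = (high + m) & mask
        # Arithmetic right shift of high:low:extra by one bit.
        extra = low & 1
        low = (low >> 1) | ((high & 1) << (n - 1))
        high = (high >> 1) | (high & (1 << (n - 1)))
    product = (high << n) | low
    if product >> (2 * n - 1):  # reinterpret the 2n-bit pattern as signed
        product -= 1 << (2 * n)
    return product


if __name__ == "__main__":
    for x, y in ((7, 3), (7, -3), (2, 6), (-7, -3)):
        print(x, y, booth_multiply(x, y))
```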
Division
It is somewhat more complex than multiplication
Procedure:
1. The bits of the dividend are examined from left to right
until the set of bits examined represents a number greater than or equal to the divisor. Until this event occurs, 0s are placed in the quotient from left
to right
2. When the event occurs, a 1 is placed in the quotient and the
divisor is subtracted from the partial dividend
3. Append additional bits from the dividend (if any) to the new
partial remainder. Compare the result with the divisor and continue from step 1
An example of long division of unsigned binary integers
              00001101      Quotient
Divisor 1011 )10010011      Dividend
               1011
              ------
              001110        Partial remainder
                1011
                ----
                001111      Partial remainder
                  1011
                  ----
                   100      Remainder
The size of the dividend is limited by the sum of the sizes of the
divisor and quotient
First Version
A 32-bit Quotient register, initially set to 0
The divisor is placed in the left half of the 64-bit Divisor register
At each step, the Divisor register is shifted to the right (1 bit) to
be aligned with the Dividend.
[Figure: first-version division hardware. Divisor (64 bits, shift right), 64-bit ALU, Remainder (64 bits, write), Quotient (32 bits, shift left), Control test]
Start
1. Subtract the Divisor register from the Remainder register and place the result in the Remainder register
Test Remainder
Remainder >= 0: 2a. Shift the Quotient register to the left, setting the new rightmost bit to 1
Remainder < 0: 2b. Restore the original value by adding the Divisor register to the Remainder register and place the sum in the Remainder register. Also shift the Quotient register to the left, setting the new least significant bit to 0
3. Shift the Divisor register right 1 bit
33rd repetition?
No (< 33 repetitions): go to step 1
Yes (33 repetitions): Done
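The first-version division procedure can be sketched for unsigned operands as follows. The function name `divide_v1` is an assumption; the Remainder register is modeled as a signed Python integer so that the "Remainder < 0" test can be written directly.

```python
MASK32 = 0xFFFFFFFF


def divide_v1(dividend: int, divisor: int):
    """Unsigned 32-bit restoring division; return (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divide by 0 must be caught by software")
    remainder = dividend & MASK32            # dividend starts in the Remainder
    divisor_reg = (divisor & MASK32) << 32   # divisor in the left half
    quotient = 0
    for _ in range(33):
        remainder -= divisor_reg                       # step 1
        if remainder >= 0:                             # step 2a
            quotient = ((quotient << 1) | 1) & MASK32
        else:                                          # step 2b: restore
            remainder += divisor_reg
            quotient = (quotient << 1) & MASK32
        divisor_reg >>= 1                              # step 3
    return quotient, remainder


if __name__ == "__main__":
    print(divide_v1(0b10010011, 0b1011))
```

The first of the 33 iterations can never succeed, because the divisor sits entirely in the left half; this is the iteration that the second version removes.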
Second Version
By switching the order of the operation to shift and then subtract,
One iteration of the algorithm can be removed
The ALU and Divisor register can be cut in half
Shifting the remainder to the left instead of shifting the divisor to
the right produces the same alignment
[Figure: second-version division hardware. Divisor (32 bits), 32-bit ALU, Remainder (64 bits, shift left, write), Quotient (32 bits, shift left), Control test]
Third Version
The Quotient register could be eliminated by shifting the bits of the
quotient into the Remainder register instead of shifting in 0s
Both the remainder and the quotient are shifted when the
Remainder register is shifted
This causes the remainder to be shifted one extra position to the
left, so a correction is required
[Figure: third-version division hardware. Divisor (32 bits), 32-bit ALU, Remainder (64 bits, shift left, shift right, write), Control test]
Start
1. Shift the Remainder register left 1 bit
2. Subtract the Divisor register from the left half of the Remainder register and place the result in the left half of the Remainder register
Test Remainder
Remainder >= 0: 3a. Shift the Remainder register to the left, setting the new rightmost bit to 1
Remainder < 0: 3b. Restore the original value by adding the Divisor register to the left half of the Remainder register and place the sum in the left half of the Remainder register. Also shift the Remainder register to the left, setting the new rightmost bit to 0
32nd repetition?
No (< 32 repetitions): go to step 2
Yes (32 repetitions): Done. Shift left half of Remainder right 1 bit
Signed Division
For signed numbers, the signs of the divisor and dividend should be
saved
Sign of Quotient = Sign of Dividend XOR Sign of Divisor (negative when the two signs differ)
For a nonzero remainder,
Sign of Remainder = Sign of Dividend
The same hardware can be used for both multiply and divide using
a 64-bit shift register for the product (in multiply) or the remainder (in
divide)
MIPS has two multiply instructions: multiply (mult) and
multiply unsigned (multu)
MIPS has two divide instructions: divide (div) and
divide unsigned (divu)
To fetch the 32-bit product (or quotient) from Lo, MIPS has the
move from lo (mflo) instruction; the remainder of a division is left in Hi
A product fits in 32 bits only if Hi is 0 for multu or is the
replicated sign of Lo for mult
The instruction move from hi (mfhi) transfers Hi to a register to test
for overflow
MIPS divide instructions ignore overflow
Both overflow and divide-by-0 must be checked for by MIPS software
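The sign rules for the quotient and remainder can be sketched by dividing magnitudes and fixing up the signs afterwards. The function name `signed_divide` is an assumption. Note that Python's own `//` and `%` operators round toward negative infinity, so they do not follow these rules for negative operands.

```python
def signed_divide(dividend: int, divisor: int):
    """Return (quotient, remainder) following the sign rules above."""
    if divisor == 0:
        raise ZeroDivisionError("divide by 0 must be caught by software")
    quotient, remainder = divmod(abs(dividend), abs(divisor))
    if (dividend < 0) != (divisor < 0):  # signs differ: negative quotient
        quotient = -quotient
    if dividend < 0:                     # remainder takes the dividend's sign
        remainder = -remainder
    return quotient, remainder


if __name__ == "__main__":
    for x, y in ((7, 2), (-7, 2), (7, -2), (-7, -2)):
        print(x, y, signed_divide(x, y))
```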
Floating Point Representation
With a fixed-point notation (2's complement) it is possible to
represent a range of positive and negative integers centered on 0
By assuming a fixed binary (radix) point, this format also allows
the representation of numbers with a fractional component
This approach has some limitations
Very large numbers and very small fractions cannot be
represented
The fractional part of the quotient in a division of two large
numbers could be lost
Using scientific notation, we get around these limitations
$976{,}000{,}000{,}000{,}000 = 9.76 \times 10^{14}$
$0.0000000000000976 = 9.76 \times 10^{-14}$
Dynamically slide the decimal point to a convenient location and
use the exponent of 10 to keep track of the decimal point
A range of very large and very small numbers can be represented
with only a few digits
This approach can be used with binary (or any other base)
The number $\pm S \times B^{\pm E}$ can be stored in a binary word with three fields:
Sign     Biased Exponent    Significand (or mantissa)
1 bit    8 bits             23 bits
32-bit floating-point format
There must be a compromise between the size of the significand
and the size of the exponent
In this case, the 8-bit exponent field yields the numbers 0 through 255
A fixed value, called the bias, is subtracted from the field to get the
true exponent value
With a bias of 128, the true exponent values are in the range -128
to +127, with a true exponent of 0 stored as 10000000
Any floating-point number can be expressed in many
ways
To simplify operations on floating-point numbers, it is typically
required that they be normalized
A normalized number is one in the form
$\pm 0.1bbb \ldots b \times 2^{\pm E}$, where $b$ is either 0 or 1
Since the MSB of the significand is always 1, there is no need
to store it. Thus the 23-bit field is used to store a 24-bit
significand with values between 0.5 and 1.0
Examples
$0.11010001 \times 2^{10100}$ = 0 10010100 10100010000000000000000
$-0.11010001 \times 2^{10100}$ = 1 10010100 10100010000000000000000
$0.11010001 \times 2^{-10100}$ = 0 01101100 10100010000000000000000
$-0.11010001 \times 2^{-10100}$ = 1 01101100 10100010000000000000000
Note that the value 128 is added to the true exponent before it is
stored in the exponent field
Negative numbers between $-(1 - 2^{-24}) \times 2^{127}$ and $-0.5 \times 2^{-128}$
Positive numbers between $0.5 \times 2^{-128}$ and $(1 - 2^{-24}) \times 2^{127}$
Five regions on the number line are not included in these ranges
Negative overflow: negative numbers less than $-(1 - 2^{-24}) \times 2^{127}$
Negative underflow: negative numbers greater than $-0.5 \times 2^{-128}$
Zero
Positive underflow: positive numbers less than $0.5 \times 2^{-128}$
Positive overflow: positive numbers greater than $(1 - 2^{-24}) \times 2^{127}$
There is a trade-off between accuracy and range
Actual floating-point representations include a special bit pattern
to designate zero
To reduce the chances of underflow or overflow, most
programming languages offer double precision floating-point
operations, which use a wider exponent field (and a wider significand)
MIPS provides an extraordinary range
Single precision: magnitudes from about $2.0 \times 10^{-38}$ to $2.0 \times 10^{38}$
Double precision: magnitudes from about $2.0 \times 10^{-308}$ to $2.0 \times 10^{308}$
IEEE 754 Floating-point Standard
Since 1980, the IEEE 754 floating-point standard has been
implemented in virtually every computer
The IEEE 754 standard recommends
Significand of 24 (23 + 1 implicit) bits for single precision
Significand of 53 (52 + 1 implicit) bits for double precision
Since 0 has no leading 1, it is given the reserved exponent value
0, and the hardware won't attach a leading 1 to the significand
A bias of 127. This means that the value is
$(-1)^S \times (1 + \text{significand}) \times 2^{\text{exponent} - \text{bias}}$
where $S$ is the sign bit
Sign     Biased Exponent    Significand (or mantissa)
1 bit    8 bits             23 bits
IEEE single precision floating-point format
Sign     Biased Exponent    Significand (or mantissa)
1 bit    11 bits            52 bits
IEEE double precision floating-point format
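The single precision fields can be inspected with Python's standard `struct` module, which packs a float into its 32-bit IEEE 754 pattern. The helper names `decode_single` and `value_of` are assumptions; `value_of` applies the value formula above and is only meant for normalized numbers (biased exponent 1 through 254).

```python
import struct


def decode_single(x: float):
    """Return (sign, biased exponent, 23-bit fraction) of x as a single."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign = word >> 31
    exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    return sign, exponent, fraction


def value_of(sign: int, exponent: int, fraction: int) -> float:
    """(-1)^S x (1 + significand) x 2^(exponent - bias), with bias 127."""
    return (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)


if __name__ == "__main__":
    fields = decode_single(-0.75)
    print(fields, value_of(*fields))
```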
Leading 1 bit of the significand is implicit
The significand is stored in sign-magnitude form (a separate sign
bit), not as a 2's complement number
The exponent is represented in a biased (excess) form to make
sorting easier
All 0s is the smallest exponent, all 1s is the largest
In plain excess notation, the range 0 to $2^{m} - 1$ of m bits is relabeled $-2^{m-1}$ to $+2^{m-1} - 1$ by subtracting a constant value (bias) of $2^{m-1}$
IEEE 754 uses a bias of $2^{m-1} - 1$ instead: 127 for single precision and 1023 for double precision
IEEE format labels the exponent differently: $-126 \le E \le +127$ rather than $-127 \le E \le +128$
$-127$ (biased form 0) is used to encode 0
$+128$ (biased form 255) is used to encode $\pm\infty$ or NaN (not a number)

Binary      True exponent   Biased form
0000 0000   -127            0
0000 0001   -126            1
0000 0010   -125            2
0111 1111   0               127
1000 0000   +1              128
1111 1101   +126            253
1111 1110   +127            254
1111 1111   +128            255
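The "sorting easier" claim can be checked with a short sketch: for positive single precision values, comparing the raw bit patterns as unsigned integers gives the same order as comparing the values, because the biased exponent sits above the fraction and grows with the true exponent. The helper names are illustrative, and the sample values are chosen to be exactly representable.

```python
import struct

def bits_of(x):
    """Raw 32-bit pattern of x as an unsigned integer (IEEE single precision)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits

def biased_exponent(x):
    """The 8-bit exponent field of x."""
    return (bits_of(x) >> 23) & 0xFF

values = [1024.0, 0.15625, 2.0, 1.0, 1.5, 0.5]   # positive values only
by_bits = sorted(values, key=bits_of)            # order by integer bit pattern
print(by_bits)
print([biased_exponent(v) for v in by_bits])
```

Negative numbers are not covered here: with a sign-magnitude format their bit patterns sort in reverse order.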
Parameter                 Single precision    Double precision
Word size (bits)          32                  64
Exponent size (bits)      8                   11
Exponent base             2                   2
Max exponent              127                 1023
Min exponent              -126                -1022
Range (base 10)           10^-38 to 10^+38    10^-308 to 10^+308
Significand size (bits)   23                  52
Number of exponents       254                 2046
Number of fractions       2^23                2^52
Number of values          1.98 x 2^31         1.99 x 2^63
Floating Point Operations
Floating Point Numbers
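$X = X_s \times b^{X_e}$
$Y = Y_s \times b^{Y_e}$
Arithmetic Operations
$X + Y = (X_s \times b^{X_e - Y_e} + Y_s) \times b^{Y_e}$
$X - Y = (X_s \times b^{X_e - Y_e} - Y_s) \times b^{Y_e}$
(both for $X_e \le Y_e$)
$X \times Y = (X_s \times Y_s) \times b^{X_e + Y_e}$
$X / Y = (X_s / Y_s) \times b^{X_e - Y_e}$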
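The four identities can be tried out with exact rational arithmetic. This is only a sketch of the algebra, not of any hardware: a number is a hypothetical `(significand, exponent)` pair, nothing is normalized or rounded, and the function names are made up for illustration.

```python
from fractions import Fraction

def value(num, b=2):
    """Numeric value of a (significand, exponent) pair in base b."""
    s, e = num
    return Fraction(s) * Fraction(b) ** e

def add(x, y, b=2):
    """(Xs * b^(Xe - Ye) + Ys) * b^Ye, after ordering so that Xe <= Ye."""
    (xs, xe), (ys, ye) = x, y
    if xe > ye:
        (xs, xe), (ys, ye) = (ys, ye), (xs, xe)
    return (Fraction(xs) * Fraction(b) ** (xe - ye) + ys, ye)

def sub(x, y, b=2):
    """X - Y, computed as X + (-Y)."""
    return add(x, (-Fraction(y[0]), y[1]), b)

def mul(x, y):
    """(Xs * Ys) * b^(Xe + Ye)."""
    return (Fraction(x[0]) * y[0], x[1] + y[1])

def div(x, y):
    """(Xs / Ys) * b^(Xe - Ye)."""
    return (Fraction(x[0]) / y[0], x[1] - y[1])

X = (Fraction(3, 2), 2)   # 1.5  x 2^2 = 6
Y = (Fraction(5, 4), 3)   # 1.25 x 2^3 = 10
print(value(add(X, Y)), value(sub(X, Y)), value(mul(X, Y)), value(div(X, Y)))
```

Only addition and subtraction need the exponents compared first; multiplication and division work on the significands and exponents independently.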
Floating Point Complexities
Operations are somewhat more complicated
In addition to overflow, we have underflow
Accuracy can be a big problem
IEEE 754 keeps two extra bits, guard and round
Four rounding modes
Positive divided by zero yields infinity
Zero divided by zero yields not a number
Other complexities
Implementing the standard can be tricky
Not using the standard can be even worse
See text for description of 80x86 and Pentium Bug!
Guard Bits
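ALU registers are usually longer than the significand loaded into
them; the extra bits on the right are called guard bits
They affect the precision of the result
Consider $X = 1.000\ldots00 \times 2^{1}$ and $Y = 1.111\ldots11 \times 2^{0}$, each with 23 fraction bits (assumed here, as in IEEE single precision). What is $Z = X - Y$?
X - Y without guard bits
$X = 1.000\ldots00 \times 2^{1}$
$Y = 0.111\ldots11 \times 2^{1}$ (Y loses 1 bit of significand)
$Z = 0.000\ldots01 \times 2^{1} = 1.000\ldots00 \times 2^{-22}$
X - Y with guard bits
$X = 1.000\ldots00\ 00000000 \times 2^{1}$
$Y = 0.111\ldots11\ 10000000 \times 2^{1}$
$Z = 0.000\ldots00\ 10000000 \times 2^{1} = 1.000\ldots00\ 00000000 \times 2^{-23}$
With guard bits no bit is lost; the two results differ by a factor of 2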
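The effect can be reproduced with integer significands. The sketch below assumes 23 fraction bits and 8 guard bits; both widths and the function name `subtract` are assumptions made for illustration. Alignment is a right shift, so any bit shifted past the guard positions is simply lost.

```python
from fractions import Fraction

FRAC_BITS = 23  # assumed fraction width

def subtract(x_sig, x_exp, y_sig, y_exp, guard_bits):
    """Compute X - Y for x_exp >= y_exp.

    Significands are integers scaled by 2**FRAC_BITS (leading 1 included).
    """
    shift = x_exp - y_exp
    x_wide = x_sig << guard_bits
    y_wide = (y_sig << guard_bits) >> shift   # bits shifted out are lost
    diff = x_wide - y_wide
    return Fraction(diff) * Fraction(2) ** (x_exp - FRAC_BITS - guard_bits)

X_SIG = 1 << FRAC_BITS                 # 1.000...00
Y_SIG = (1 << (FRAC_BITS + 1)) - 1     # 1.111...11

without_guard = subtract(X_SIG, 1, Y_SIG, 0, guard_bits=0)
with_guard = subtract(X_SIG, 1, Y_SIG, 0, guard_bits=8)
exact = Fraction(2) - Fraction(Y_SIG, 1 << FRAC_BITS)
print(without_guard / with_guard, with_guard == exact)
```

Under these assumptions the guarded result equals the exact difference, while the unguarded one is twice as large.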
Rounding
The result of an ALU operation on significands is put back into a
floating point format
Extra bits must be disposed of
Some techniques:
Round to nearest representable number
Round toward $+\infty$
Round toward $-\infty$
Round toward 0
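The four techniques can be compared side by side using Python's `decimal` module, which offers the same four directions. Note that this is a decimal analogy, not binary floating-point hardware, and the helper name `round_all` and the two-decimal-place target are illustrative choices.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_CEILING, ROUND_FLOOR, ROUND_DOWN

MODES = {
    "nearest": ROUND_HALF_EVEN,     # round to nearest representable number
    "toward +inf": ROUND_CEILING,
    "toward -inf": ROUND_FLOOR,
    "toward 0": ROUND_DOWN,         # truncate
}

def round_all(text, places="0.01"):
    """Round the decimal string `text` to `places` under each mode."""
    x = Decimal(text)
    return {name: x.quantize(Decimal(places), rounding=mode)
            for name, mode in MODES.items()}

for sample in ("1.236", "-1.236"):
    print(sample, round_all(sample))
```

The modes only disagree in sign-dependent ways: rounding toward $+\infty$ and toward 0 coincide for negative inputs, while rounding toward $-\infty$ and toward 0 coincide for positive ones.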
Round to Nearest
If the extra bits amount to more than one-half of the last
representable bit, add 1 to the LSB of the significand; if they
amount to less than one-half, drop them
Extra bits 10010: add 1 to the LSB of the significand
Extra bits 01111: drop the extra bits
If the extra bits fall exactly halfway between two representable
values, one option is to round up or down at random to avoid
cumulative bias
In IEEE, if the result of a computation is exactly midway between
two representable numbers, it is rounded so that the last bit is 0:
Round up if the last representable bit is 1
Truncate if the last representable bit is 0
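The rule above can be sketched on plain integers, where the low `extra_bits` bits of a value play the role of the extra bits. The function name `round_nearest_even` is illustrative, and the sketch assumes a non-negative value with at least one extra bit.

```python
def round_nearest_even(value, extra_bits):
    """Drop the low `extra_bits` bits of a non-negative integer.

    Rounds to nearest; an exact tie goes to the result whose last bit is 0.
    Assumes extra_bits >= 1.
    """
    kept = value >> extra_bits                 # representable part
    rem = value & ((1 << extra_bits) - 1)      # the extra bits
    half = 1 << (extra_bits - 1)               # one-half of the last kept bit
    if rem > half or (rem == half and kept & 1):
        kept += 1
    return kept

print(bin(round_nearest_even(0b101_10010, 5)))  # more than half: add 1 to LSB
print(bin(round_nearest_even(0b101_01111, 5)))  # less than half: drop
print(bin(round_nearest_even(0b101_10000, 5)))  # tie, last bit 1: round up
print(bin(round_nearest_even(0b100_10000, 5)))  # tie, last bit 0: truncate
```

In a full floating-point unit a round-up can carry out of the significand, which then needs renormalizing; that step is left out here.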
Examples
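157.3 + 257.1, assuming a 6-bit exponent (excess 32) and a 10-bit
mantissa (read here as a sign bit plus the nine magnitude bits shown)
Convert to binary and normalize
$157.3 \approx 10011101.01001_2 = 0.1001110101001_2 \times 2^{8}$
$257.1 \approx 100000001.0001_2 = 0.1000000010001_2 \times 2^{9}$
Round the mantissa to nine magnitude bits
$157.3 \approx 0.1001110101001_2 \times 2^{8} \approx 0.100111011_2 \times 2^{8}$
$257.1 \approx 0.1000000010001_2 \times 2^{9} \approx 0.100000001_2 \times 2^{9}$
Make the smaller exponent equal to the larger exponent and add
$157.3 \approx 0.010011101_2 \times 2^{9}$ (the bit shifted out is lost)
$257.1 \approx 0.100000001_2 \times 2^{9}$
$414.0 = 0.110011110_2 \times 2^{9}$
The result is already normalized. The result differs from the exact
answer (414.4) by 0.4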
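The example can be replayed in a few lines. This is a sketch under stated assumptions: nine magnitude bits in a mantissa of the form 0.1xxx, round-to-nearest on conversion, truncation on alignment, positive non-zero inputs only, and no handling of the rare case where conversion rounds the mantissa up to 1.0. All names are illustrative.

```python
from fractions import Fraction

MANT_BITS = 9   # assumed number of magnitude bits in the mantissa

def to_float(text):
    """Convert a positive decimal string to (mantissa, exponent), 0.5 <= m < 1.

    The mantissa is returned as an integer scaled by 2**MANT_BITS.
    """
    value = Fraction(text)
    exp = 0
    while value >= 1:
        value /= 2
        exp += 1
    while value < Fraction(1, 2):
        value *= 2
        exp -= 1
    return round(value * 2 ** MANT_BITS), exp

def add(a, b):
    """Align the operand with the smaller exponent, then add mantissas."""
    (ma, ea), (mb, eb) = a, b
    if ea > eb:
        (ma, ea), (mb, eb) = (mb, eb), (ma, ea)
    ma >>= eb - ea                    # bits shifted out are lost
    total = ma + mb
    if total >= 1 << MANT_BITS:       # carry out: renormalize
        total >>= 1
        eb += 1
    return total, eb

def to_value(m, e):
    """Numeric value of a (mantissa, exponent) pair."""
    return Fraction(m, 2 ** MANT_BITS) * Fraction(2) ** e

a, b = to_float("157.3"), to_float("257.1")
total = add(a, b)
print(bin(a[0]), bin(b[0]), bin(total[0]), to_value(*total))
```

The computed sum is 414, so the 0.4 error comes from the two conversions and the bit lost during alignment.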
Floating-Point Arithmetic
For addition and subtraction, it is necessary to ensure that both
operands have the same exponent. This may require shifting the
radix point on one of the operands to achieve alignment
Problems may arise as the result of these operations
Exponent overflow: A positive exponent exceeds the maximum
value. In some systems this may be designated as $+\infty$ or $-\infty$
Exponent underflow: A negative exponent is below the minimum
value. This means the number is too small to represent, and in
some systems it may be reported as 0
Significand underflow: In the process of aligning significands,
digits may flow off the right end of the significand
Significand overflow: The addition of two significands of the
same sign may result in a carry out of the most significant bit
Addition and Subtraction
In floating-point arithmetic, addition and subtraction are more
complex than multiplication and division. This is because of the
need for alignment
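There are four basic phases of the algorithm for addition and
subtraction, followed by rounding.
Example: $9.999 \times 10^{1} + 1.610 \times 10^{-1}$, keeping four significant digits
1. Check for zeros
2. Align the significand of the number with the smaller exponent to
match the larger exponent: $1.610 \times 10^{-1} = 0.0161 \times 10^{1}$, kept as $0.016 \times 10^{1}$
3. Add or subtract the significands: $9.999 + 0.016 = 10.015$
4. Normalize the result: $10.015 \times 10^{1} = 1.0015 \times 10^{2}$
5. Round the result: $1.0015 \times 10^{2} \approx 1.002 \times 10^{2}$ (Fig. 4.44)
Example: $0.5 - 0.4375$
1. $0.5 = 1.000_2 \times 2^{-1}$ and $0.4375 = 1.110_2 \times 2^{-2}$
2. Align: $1.000_2 \times 2^{-1} - 0.111_2 \times 2^{-1}$
3. Subtract: $1.000_2 \times 2^{-1} - 0.111_2 \times 2^{-1} = 0.001_2 \times 2^{-1}$
4. Normalize: $0.001_2 \times 2^{-1} = 1.000_2 \times 2^{-4}$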
Many machines have dedicated hardware to perform floating point
operations as fast as possible, Fig. 4.45
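The phases of the addition algorithm can be sketched in software for the decimal example, using Python's `decimal` module with a four-digit significand of the form d.ddd. This is an illustration of the steps only, not of the hardware: the function name `fp_add`, truncation during alignment, and round-half-even in the last step are assumptions, and exponent overflow and underflow are not checked.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

STEP = Decimal("0.001")   # four significant digits: d.ddd

def fp_add(sig_a, exp_a, sig_b, exp_b):
    """Add sig_a x 10^exp_a and sig_b x 10^exp_b; return (significand, exponent)."""
    # 1. Check for zeros
    if sig_a == 0:
        return sig_b, exp_b
    if sig_b == 0:
        return sig_a, exp_a
    # 2. Align the operand with the smaller exponent (digits shifted out are lost)
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    sig_b = sig_b.scaleb(exp_b - exp_a).quantize(STEP, rounding=ROUND_DOWN)
    # 3. Add the significands
    total, exp = sig_a + sig_b, exp_a
    if total == 0:
        return Decimal(0), 0
    # 4. Normalize to one non-zero digit before the point
    while abs(total) >= 10:
        total, exp = total.scaleb(-1), exp + 1
    while abs(total) < 1:
        total, exp = total.scaleb(1), exp - 1
    # 5. Round; renormalize if rounding carried into a new digit
    total = total.quantize(STEP, rounding=ROUND_HALF_EVEN)
    if abs(total) >= 10:
        total, exp = total.scaleb(-1).quantize(STEP, rounding=ROUND_HALF_EVEN), exp + 1
    return total, exp

print(fp_add(Decimal("9.999"), 1, Decimal("1.610"), -1))
```

Subtraction is the same routine with a negated second significand, for example `fp_add(Decimal("5.000"), -1, Decimal("-4.375"), -1)` for 0.5 - 0.4375 in decimal.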
[Figure: flowchart of floating-point addition (Fig. 4.44 in the text)]
Start
1. Compare the exponents of the two numbers. Shift the smaller number
to the right until its exponent would match the larger exponent
2. Add the significands
3. Normalize the sum, either shifting right and incrementing the
exponent or shifting left and decrementing the exponent
Overflow or underflow? Yes: exception. No: continue
4. Round the significand to the appropriate number of bits
Still normalized? No: go back to step 3. Yes: done
[Figure: block diagram of the floating-point addition hardware (Fig. 4.45 in the text). Labeled blocks: control, small ALU (exponent difference), multiplexors, shift right, big ALU, shift left or right, increment or decrement, rounding hardware. Inputs and output each have sign, exponent, and significand fields. Stages: compare exponents, shift smaller number right, add, normalize, round]
Floating Point Multiplication and Division
Floating-point multiplication and division are much simpler than
addition and subtraction
1. Check whether either operand is 0
2. Add the exponents. If the exponents are stored in biased
form, the exponent sum would have doubled the bias. Thus
the bias must be subtracted from the sum
3. The result could be either an exponent overflow or underflow,
which would be reported
4. If the exponent of the product is within the proper range, the
next step is to multiply the significands (taking into account
their signs). The product will be double the length of the
multiplier and multiplicand. The extra bits will be lost during
rounding
5. The result is then normalized and rounded. Normalization
could result in exponent underflow
Rounding requires extra bits for calculation. IEEE 754 keeps 2
extra bits on the right during intermediate calculations, called
guard and round bits
Example
$0.5 \times (-0.4375) = (1.000_2 \times 2^{-1}) \times (-1.110_2 \times 2^{-2})$
1. New exponent = -1 + (-2) = -3, or using the biased
representation: (-1 + 127) + (-2 + 127) - 127 = 124
2. Multiplying the significands:
$1.000_2 \times 1.110_2 = 1.110000_2$, so the product is $1.110000_2 \times 2^{-3} = 1.110_2 \times 2^{-3}$
3. The product is already normalized
4. No rounding is required
5. The sign of the result is negative, since the operand signs differ
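The multiplication steps can be sketched with integer significands and biased exponents. The sketch assumes 3 fraction bits to match the worked example, a bias of 127, and round-to-nearest-even; the function name `fp_multiply` and the tuple layout are illustrative. For brevity it checks the exponent range once at the end rather than right after the exponents are added.

```python
BIAS = 127
FRAC_BITS = 3   # fraction bits in this small example (assumed)

def fp_multiply(a, b):
    """Multiply two numbers given as (sign, biased_exponent, significand).

    Significands are integers scaled by 2**FRAC_BITS, leading 1 included.
    """
    sign_a, exp_a, sig_a = a
    sign_b, exp_b, sig_b = b
    sign = sign_a ^ sign_b                  # negative if the signs differ
    if sig_a == 0 or sig_b == 0:            # check for zero operands
        return sign, 0, 0
    exp = exp_a + exp_b - BIAS              # the sum holds the bias twice
    product = sig_a * sig_b                 # 2 * FRAC_BITS fraction bits
    drop = FRAC_BITS
    if product >= 2 << (2 * FRAC_BITS):     # product in [2, 4): normalize
        drop += 1
        exp += 1
    kept = product >> drop
    rem = product & ((1 << drop) - 1)
    half = 1 << (drop - 1)
    if rem > half or (rem == half and kept & 1):   # round to nearest even
        kept += 1
    if kept == 2 << FRAC_BITS:              # rounding carried: renormalize
        kept >>= 1
        exp += 1
    if not 1 <= exp <= 254:
        raise OverflowError("exponent overflow or underflow")
    return sign, exp, kept

result = fp_multiply((0, 126, 0b1000), (1, 125, 0b1110))   # 0.5 x -0.4375
sign, exp, sig = result
print(result, (-1) ** sign * sig / 2 ** FRAC_BITS * 2.0 ** (exp - BIAS))
```

The biased exponent 124 corresponds to a true exponent of -3, matching step 1 of the example.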
MIPS supports IEEE single and double precision formats:
add.s, sub.s, mul.s, and div.s for single precision
add.d, sub.d, mul.d, and div.d for double precision
MIPS also supports comparison and branch instructions for single
and double precision
MIPS has separate floating-point registers to manipulate
floating-point numbers. Only 16 of the 32 MIPS floating-point
registers can be used for single precision operations. Double
precision is computed using pairs of these registers
MIPS also has load and store instructions to move floating point
numbers