
COMPUTER ARCHITECTURE CS 45101 / CS 55101

CHAPTER 4

Arithmetic for Computers

H. Peyravi

Department of Computer Science

Kent State University

Fall 2001

Computer Architecture Fall 2001

Contents

Introduction
MIPS: 32 bit Signed Numbers
Arithmetic Operations
Boolean Algebra & Gates
Constructing an ALU
A 32-bit ALU
Addition
Multiplication
Floating Point Representation
IEEE 754 Floating-point Standard


Chapter 4: Arithmetic for Computers Introduction

Introduction

Goals

To introduce two's complement numbers and their addition and subtraction

To introduce basic logic operations (AND, OR, logical shift) and hardware building blocks (AND, OR, NOT gates and the multiplexor)

To explain the construction and operation of a 32-bit ALU that performs AND, OR, add, and slt

To show algorithms that perform multiplication and division, and the hardware that implements these algorithms

To demonstrate floating point representation and arithmetic operations

To introduce MIPS instructions for unsigned operands, for logic operations, and for floating point numbers

Numbers

In a binary system, how do we represent negative numbers?

What is the largest or smallest number that can fit into a word?

What happens in the case of overflow or underflow?

How are fractions and real numbers represented?


Possible Representations

Bits   Sign Magnitude   One's Complement   Two's Complement
000         +0                +0                 +0
001         +1                +1                 +1
010         +2                +2                 +2
011         +3                +3                 +3
100         -0                -3                 -4
101         -1                -2                 -3
110         -2                -1                 -2
111         -3                -0                 -1

Issues:

Balanced ?

Number of zeros ?

Ease of Operations ?

Which one is best ?



MIPS: 32 bit Signed Numbers

Negative Numbers

Numbers can be represented in any base (humans use base 10, machines use base 2)

The value of the $i$th digit $d$ in a number with base $b$ is

$d \times b^i$

The bits are numbered from right to left

A MIPS word is 32 bits long, so it can represent the unsigned values 0 to $2^{32} - 1 = 4{,}294{,}967{,}295_{10}$

Although $2^{32}$ is an even number, the bit patterns cannot be divided into equal numbers of positive and negative values, since 0 needs to be represented as well

The alternative is to have unbalanced positive and negative ranges

Positive numbers: 0 to $2^{31} - 1 = 2{,}147{,}483{,}647_{10}$

Negative numbers: $-2^{31} = -2{,}147{,}483{,}648_{10}$ to $-1$ (whose bit pattern, read as unsigned, is $2^{32} - 1$)

There is one negative number, $-2^{31} = -2{,}147{,}483{,}648_{10}$, that has no corresponding positive number



Bit pattern (binary)                      Value (decimal)
0000 0000 0000 0000 0000 0000 0000 0000 = 0
0000 0000 0000 0000 0000 0000 0000 0001 = 1
0000 0000 0000 0000 0000 0000 0000 0010 = 2
...
0111 1111 1111 1111 1111 1111 1111 1101 = 2,147,483,645
0111 1111 1111 1111 1111 1111 1111 1110 = 2,147,483,646
0111 1111 1111 1111 1111 1111 1111 1111 = 2,147,483,647   (Max int)
1000 0000 0000 0000 0000 0000 0000 0000 = -2,147,483,648  (Min int)
1000 0000 0000 0000 0000 0000 0000 0001 = -2,147,483,647
1000 0000 0000 0000 0000 0000 0000 0010 = -2,147,483,646
...
1111 1111 1111 1111 1111 1111 1111 1101 = -3
1111 1111 1111 1111 1111 1111 1111 1110 = -2
1111 1111 1111 1111 1111 1111 1111 1111 = -1

This convention is called two's complement. It ensures that

$x + (-x) = 0$

0 has exactly one representation

Testing for a negative number in hardware requires checking only the most significant bit (the sign bit), which simplifies the hardware

Further, address locations are all positive, and the programmer can convert a negative number to an unsigned integer (as in C)

A positive or negative number can be described as

$(x_{31} \times -2^{31}) + (x_{30} \times 2^{30}) + \cdots + (x_1 \times 2^1) + (x_0 \times 2^0)$

When a negative integer is compared with a positive integer, the MSB of the negative integer (1) is greater than the MSB of the positive integer (0) even though the negative value is the smaller one; for unsigned integers a 1 in the MSB always indicates the larger value
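The weighted-sum description of a two's complement number can be checked with a few lines of Python. This is a minimal sketch; the function name `twos_complement_value` is an illustrative choice, not something from the slides.

```python
def twos_complement_value(bits: str) -> int:
    """Value of a two's complement bit string (MSB first).

    The most significant bit has weight -2**(n-1); every other bit
    has its usual positive weight.
    """
    n = len(bits)
    value = -int(bits[0]) * 2 ** (n - 1)      # sign bit carries the negative weight
    for i, bit in enumerate(bits[1:], start=1):
        value += int(bit) * 2 ** (n - 1 - i)  # remaining bits are positive weights
    return value


print(twos_complement_value("1" * 32))        # all ones
print(twos_complement_value("1" + "0" * 31))  # Min int
print(twos_complement_value("0" + "1" * 31))  # Max int
```

The same function works for any word length, so the 3-bit patterns from the representation table can be checked as well.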



Twos Complement Operations

In order to negate a number, complement each bit and then add 1 to the result. For example

 2 (decimal) = 0000 0000 0000 0000 0000 0000 0000 0010 (binary)
-2 (decimal) = 1111 1111 1111 1111 1111 1111 1111 1101 (binary) + 1
             = 1111 1111 1111 1111 1111 1111 1111 1110 (binary)

Converting n-bit numbers to m-bit numbers, where m > n

Example of sign extension:

A MIPS 16-bit immediate gets converted to 32 bits

Copy the MSB (sign bit) into the other bits

0010 -> 0000 0010
1010 -> 1111 1010

Example:

lb = load byte (sign-extends the loaded byte), lbu = load byte unsigned (zero-extends the loaded byte)
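Negation and sign extension can be sketched on raw bit patterns as follows. The helper names `negate` and `sign_extend` are hypothetical, and the values are treated as unsigned Python integers holding the bit pattern.

```python
def negate(x: int, bits: int = 32) -> int:
    """Two's complement negation: complement every bit, then add 1."""
    mask = (1 << bits) - 1
    return ((~x & mask) + 1) & mask


def sign_extend(x: int, from_bits: int, to_bits: int) -> int:
    """Copy the sign bit of a from_bits-wide pattern into the upper bits."""
    sign = (x >> (from_bits - 1)) & 1
    if sign:
        # Set every bit between from_bits and to_bits.
        x |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return x


print(format(negate(2), "032b"))
print(format(sign_extend(0b1010, 4, 8), "08b"))
```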



Arithmetic Operations

Addition

Except for the rightmost digit, each digit position also receives a carry (0 or 1) from the previous position, and hence has 3 operands

              0 1 0 0 1 0 1 1
            + 1 1 1 1 0 0 0 1
            -----------------
Sum           0 0 1 1 1 1 0 0
Carries     1 1 0 0 0 0 1 1      (each carry is written under the column it feeds)

The carry bit beyond the end of the word is ignored.

  1001 = -7     1100 = -4     0011 = +3
+ 0101 = +5   + 0100 = +4   + 0100 = +4
  1110 = -2     0000 =  0     0111 = +7

  1100 = -4     0101 = +5     1001 = -7
+ 1111 = -1   + 0100 = +4   + 1010 = -6
  1011 = -5     1001 = OF     0011 = OF

How do we detect overflow?

Overflow occurs iff both operands have the same sign and the result has the opposite sign



Subtraction

To subtract one number (the subtrahend) from another (the minuend), take the two's complement of the subtrahend and add it to the minuend

    2 - 7         5 - (-2)         5 - 2
  0010 = +2      0101 = +5      0101 = +5
+ 1001 = -7    + 0010 = +2    + 1110 = -2
  1011 = -5      0111 = +7      0011 = +3

   7 - (-7)        -5 - 2         -6 - 4
  0111 = +7      1011 = -5      1010 = -6
+ 0111 = +7    + 1110 = -2    + 1100 = -4
  1110 = OF      1001 = -7      0110 = OF

How to Detect Overflow?

No overflow when adding a positive and a negative number

No overflow when the signs are the same for subtraction

Overflow occurs when:

adding two positives yields a negative, or

adding two negatives yields a positive, or

subtract a negative from a positive and get a negative, or

subtract a positive from a negative and get a positive
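The four overflow conditions can be expressed as two sign tests. The sketch below works on 4-bit patterns so that it can reproduce the small examples above; the function names are illustrative assumptions.

```python
def add_with_overflow(a: int, b: int, bits: int = 4):
    """Add two bit patterns; return (result pattern, overflow flag)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    result = (a + b) & mask  # the carry out of the word is ignored
    # Overflow: operands share a sign and the result has the other sign.
    overflow = ((a & sign) == (b & sign)) and ((result & sign) != (a & sign))
    return result, overflow


def sub_with_overflow(a: int, b: int, bits: int = 4):
    """Compute a - b by adding the two's complement of b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    result = (a + ((~b & mask) + 1)) & mask
    # Overflow: operands differ in sign and the result's sign differs from a's.
    overflow = ((a & sign) != (b & sign)) and ((result & sign) != (a & sign))
    return result, overflow


print(add_with_overflow(0b0101, 0b0100))  # +5 + +4 does not fit in 4 bits
print(sub_with_overflow(0b0010, 0b0111))  # 2 - 7
```

Testing the subtraction rule on the original operands, rather than on the negated subtrahend, keeps the check correct even when the subtrahend is the most negative number.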



Effects of Overflow

An exception (interrupt) occurs

Control jumps to predefined address for exception

Address of the offending instruction is saved for possible

resumption

MIPS saves the address of the instruction that overflowed in a

register called Exception Program Counter, EPC, and an interrupt

handler routine is called

Details based on software system/language

flight control vs. homework assignment

MIPS uses register $k0 and $k1 to save the return address

from EPC

Unsigned integer operations can also overflow, but this is not treated as an exception, since unsigned integers are commonly used for memory addresses

Note that the term overflow is somewhat misleading: it does not mean that a carry overflowed

In MIPS

add, addi and sub cause exceptions for overflow

addu, addiu and subu do not cause exceptions for overflow



Supporting slt

MIPS offers two sets of set-on-less-than instructions

For signed integers

Set-on-less-than, slt

Set-on-less-than-immediate, slti

For unsigned integers

Set-on-less-than-unsigned, sltu

Set-on-less-than-immediate-unsigned, sltiu

Example: Suppose

$s0 = 1111 1111 1111 1111 1111 1111 1111 1111 (binary) = -1 (signed decimal) = 4,294,967,295 (unsigned decimal)

$s1 = 0000 0000 0000 0000 0000 0000 0000 0001 (binary) = 1 (signed decimal) = 1 (unsigned decimal)

What would the following do?

slt $t0, $s0, $s1 # signed comparison

sltu $t1, $s0, $s1 # unsigned comparison

Register $t0 has value 1, since -1 < 1 (signed), and register $t1 has value 0, since 4,294,967,295 > 1 (unsigned)
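The difference between the two comparisons can be modelled in Python. This is only a sketch of the semantics; `slt` and `sltu` here are ordinary functions named after the instructions, and registers are modelled as unsigned 32-bit patterns.

```python
def to_signed(x: int, bits: int = 32) -> int:
    """Reinterpret an unsigned bit pattern as a two's complement value."""
    return x - (1 << bits) if x & (1 << (bits - 1)) else x


def slt(rs: int, rt: int) -> int:
    """Signed set-on-less-than."""
    return 1 if to_signed(rs) < to_signed(rt) else 0


def sltu(rs: int, rt: int) -> int:
    """Unsigned set-on-less-than."""
    return 1 if rs < rt else 0


s0, s1 = 0xFFFFFFFF, 0x00000001
print(slt(s0, s1), sltu(s0, s1))
```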



Boolean Algebra & Gates

Logical Operations

Bytes can be supported using shift operations

MIPS provides:

shift left logical = sll, and

shift right logical = srl operations

sll $t2, $s0, 8 # $t2 = $s0 << 8 bits

Operation   C operator   MIPS instructions
AND         &            and, andi
OR          |            or, ori



Constructing an ALU

ALUs are constructed from four hardware building blocks: the AND gate, the OR gate, the inverter, and the multiplexor (MUX)

[Figure: gate symbols and truth tables for the four building blocks]

1. AND gate ($c = a \cdot b$)

2. OR gate ($c = a + b$)

3. Inverter ($c = \bar{a}$)

4. Multiplexor (if d == 0, c = a; else c = b)

MIPS has a 32-bit wide ALU



A 1-bit ALU

A 1-bit logical unit can be constructed from an AND gate, an OR gate, and a MUX

[Figure: 1-bit logical unit; inputs a and b feed an AND gate and an OR gate, and the Operation line selects Result through a MUX]

A 1-bit half adder has two input operands (a and b) and two outputs, the Sum and the CarryOut

Inputs     Outputs
a   b      CarryOut   Sum
0   0      0          0
0   1      0          1
1   0      0          1
1   1      1          0

$\text{Sum} = \bar{a}b + a\bar{b}$

$\text{CarryOut} = ab$



A 1-bit full adder has two input operands, one input CarryIn, and two outputs, the Sum and the CarryOut

[Figure: 1-bit full adder with inputs a, b, CarryIn and outputs Sum, CarryOut]

Inputs            Outputs
a  b  CarryIn     CarryOut  Sum    Comments
0  0  0           0         0      0+0+0=00
0  0  1           0         1      0+0+1=01
0  1  0           0         1      0+1+0=01
0  1  1           1         0      0+1+1=10
1  0  0           0         1      1+0+0=01
1  0  1           1         0      1+0+1=10
1  1  0           1         0      1+1+0=10
1  1  1           1         1      1+1+1=11

The truth table can be expressed as a sum of products (SOP), writing $c_i$ for CarryIn

$\text{Sum} = \bar{a}\bar{b}c_i + \bar{a}b\bar{c_i} + a\bar{b}\bar{c_i} + abc_i$

$\text{CarryOut} = \bar{a}bc_i + a\bar{b}c_i + ab\bar{c_i} + abc_i$

$\text{CarryOut} = (\bar{a}bc_i + abc_i) + (a\bar{b}c_i + abc_i) + (ab\bar{c_i} + abc_i)$

$\text{CarryOut} = ab + ac_i + bc_i$

There are many ways to make adders using different boolean

expressions
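The sum-of-products form of the full adder can be checked against ordinary integer addition. This is a minimal sketch; `full_adder_sop` is a hypothetical name, and each input is assumed to be 0 or 1.

```python
def full_adder_sop(a: int, b: int, cin: int):
    """1-bit full adder written directly as sums of products."""
    na, nb, nc = 1 - a, 1 - b, 1 - cin  # complemented inputs
    total = (na & nb & cin) | (na & b & nc) | (a & nb & nc) | (a & b & cin)
    carry_out = (a & b) | (a & cin) | (b & cin)  # simplified CarryOut
    return total, carry_out


# Print the full truth table: a, b, CarryIn, CarryOut, Sum.
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            s, cout = full_adder_sop(a, b, cin)
            print(a, b, cin, cout, s)
```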



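[Figure: CarryOut logic built from AND and OR gates with inputs a, b, and CarryIn]

The Sum logic can be integrated into this circuit

A 1-bit ALU that performs AND, OR, and add operations

[Figure: 1-bit ALU; a MUX controlled by Operation selects among the AND (0), OR (1), and adder (2) outputs; the adder has CarryIn and CarryOut]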



Subtraction

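Subtraction can be performed using two's complement and a 2:1 MUX that selects either b or its inverse (Binvert)

To complete the two's complement, the CarryIn of the LSB is set to 1

A 1-bit ALU that performs AND, OR, addition, and subtraction

[Figure: 1-bit ALU with a Binvert-controlled 2:1 MUX on input b, CarryIn, CarryOut, and an Operation-controlled output MUX]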



A 32-bit Adder

A 32-bit adder can be constructed by replication of 32 full adders,

(ripple carry adder)

[Figure: 32-bit ripple carry adder; ALU0 through ALU31 are chained so that the CarryOut of each 1-bit ALU feeds the CarryIn of the next; inputs a0..a31 and b0..b31, outputs Result0..Result31, shared Operation lines]
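The ripple carry structure can be simulated by chaining 1-bit full adders in a loop. This is a sketch only; the function names are illustrative, and operands are unsigned bit patterns.

```python
def full_adder(a: int, b: int, cin: int):
    """1-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ cin, (a & b) | (a & cin) | (b & cin)


def ripple_carry_add(x: int, y: int, carry_in: int = 0, bits: int = 32):
    """Add two bit patterns with `bits` chained full adders.

    The carry out of bit i becomes the carry in of bit i + 1, so the
    carry has to ripple through every position in turn.
    """
    result = 0
    carry = carry_in
    for i in range(bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry


print(ripple_carry_add(0b01001011, 0b11110001, bits=8))
```

The loop makes the weakness visible: the last sum bit cannot be produced until the carry has passed through all earlier positions.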



A 32-bit ALU

Tailoring the ALU to the MIPS

Need to support the set-on-less-than instruction (slt)

Remember: slt is an arithmetic instruction

It produces a 1 if rs < rt, that is, if (rs - rt) < 0, and 0 otherwise

Use subtraction: (a - b) < 0 implies a < b

Need to support test for equality (beq $t5, $t6, $t7)

Use subtraction: (a - b) = 0 implies a = b, page 18 or Fig 4.17

A 32-bit ALU is constructed from 31 copies of the 1-bit ALU of type a, page 18 or Fig. 4.17, and one copy of the 1-bit ALU of type b (with overflow detection)

The sign bit of the operation rs - rt can be routed to the least significant bit (if there is no overflow), page 19 or Fig. 4.18

When there is overflow, the sign bit is wrong and we need the inverse of the sign bit

To accommodate the conditional branches beq and bne, we can test the result of rs - rt: if it is 0 the operands are equal (beq), otherwise they are not (bne). The simple way is to NOR all the result bits together, page 20



[Figure: 32-bit ALU supporting slt; the Set output of ALU31 is connected to the Less input of ALU0, the Less inputs of ALU1..ALU31 are 0, and ALU31 produces Overflow]

[Figure: final 32-bit ALU; Binvert and the CarryIn of ALU0 are combined into a single Bnegate line, and a Zero output is formed from Result0..Result31]

[Figure: ALU symbol with inputs a and b, the ALU operation control lines, and outputs Result, Zero, Overflow, and CarryOut]



Notice control lines:

000 = and

001 = or

010 = add

110 = subtract

111 = slt

Note: zero is a 1 when the result is zero
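The control-line encodings can be collected into a small behavioural model. This is a sketch under stated assumptions: operands are unsigned 32-bit patterns, the function name `alu` and its `(result, zero)` return shape are illustrative, and slt is modelled as the sign bit of a - b corrected by the overflow condition.

```python
MASK = 0xFFFFFFFF  # 32-bit word


def alu(control: int, a: int, b: int):
    """Return (result, zero) for the five control-line encodings."""
    if control == 0b000:
        result = a & b
    elif control == 0b001:
        result = a | b
    elif control == 0b010:
        result = (a + b) & MASK
    elif control in (0b110, 0b111):
        # Subtract by inverting b and setting the carry in to 1.
        diff = (a + (~b & MASK) + 1) & MASK
        if control == 0b110:
            result = diff
        else:
            sign = diff >> 31
            a31, b31 = a >> 31, b >> 31
            overflow = int(a31 != b31 and sign != a31)
            result = sign ^ overflow  # slt: corrected sign bit goes to bit 0
    else:
        raise ValueError("unsupported control value")
    zero = int(result == 0)
    return result, zero


print(alu(0b110, 7, 7))           # equal operands set zero
print(alu(0b111, 0xFFFFFFFF, 1))  # -1 < 1
```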

A Few Conclusions

We can build an ALU that supports basic MIPS instruction

Key ideas

Use multiplexor to select the output we want

Use twos complement to perform subtraction efficiently

Replicate a 1-bit ALU to produce a 32-bit ALU

Important points about hardware

All of the gates are always working

The speed of a gate is affected by the number of inputs to the

gate

The speed of a circuit is affected by the number of gates in

series (critical path)

Our focus: comprehension, however,

Clever changes to organization can improve performance

Similar to using better algorithms in software

We will look at two examples for addition and multiplications

in the next section



Addition

Problem

Ripple carry adder is too slow

Is a 32-bit ALU as fast as a 1-bit ALU?

Is there more than one way to do addition?

Two extremes:

Ripple carry

Not feasible! why?

SOP using infinite hardware

Not feasible! why?

Carry-lookahead is an approach between two extremes

Carry Lookahead

Motivation:

If we didn't know the value of CarryIn, what could we do?

When would we always generate a carry? $g_i = a_i b_i$

When would we propagate the carry? $p_i = a_i + b_i$

$c_1 = g_0 + p_0 c_0$
$c_2 = g_1 + p_1 c_1$
$c_3 = g_2 + p_2 c_2$
$c_4 = g_3 + p_3 c_3$



Did we get rid of the ripple?

Is it feasible! Why?

Several parallel approaches have been used to speed up the ALU so that its delay grows as a function of $\log_2 32$, at the cost of more gate complexity

Carry lookahead employs special carry circuits to generate each $c_i$ (the CarryIn for the $i$th bit) rapidly.

$c_1 = b_0 c_0 + a_0 c_0 + a_0 b_0$

$c_2 = b_1 c_1 + a_1 c_1 + a_1 b_1$

Substituting for $c_1$ in $c_2$,

$c_2 = (a_1 a_0 b_0) + (a_1 a_0 c_0) + (a_1 b_0 c_0) + (b_1 a_0 b_0) + (b_1 a_0 c_0) + (b_1 b_0 c_0) + (a_1 b_1)$

The complexity grows as we compute $c_i$ for higher bits

We can limit the hardware complexity and still make a substantial improvement

Define generate $g_i = a_i b_i$

Define propagate $p_i = a_i + b_i$

$c_i = 1$ if $g_{i-1} = 1$, or if $p_{i-1} = 1$ and $c_{i-1} = 1$

$p_i$ and $g_i$ can be used to express the $c_i$'s

For a 4-bit ALU

$c_1 = g_0 + p_0 c_0$
$c_2 = g_1 + p_1 g_0 + p_1 p_0 c_0$
$c_3 = g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 c_0$
$c_4 = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + p_3 p_2 p_1 p_0 c_0$

This means that $c_i = 1$ if some earlier adder generates a carry and all intermediary adders propagate a carry
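The four lookahead equations can be evaluated directly, with no carry passed from one bit to the next. A minimal sketch for 4-bit operands follows; `carry_lookahead_4` is a hypothetical name.

```python
def carry_lookahead_4(a: int, b: int, c0: int = 0):
    """4-bit adder whose carries come from the generate/propagate equations."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]    # generate: a_i AND b_i
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]  # propagate: a_i OR b_i

    # Every carry depends only on g, p, and c0, never on another carry.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))

    carries = [c0, c1, c2, c3]
    total = 0
    for i in range(4):
        total |= (((a >> i) ^ (b >> i) ^ carries[i]) & 1) << i
    return total, c4


print(carry_lookahead_4(0b1011, 0b0110))
```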

Using the Principle to Build Bigger Adders

[Figure: four 4-bit ALUs (Result0--3, Result4--7, Result8--11, Result12--15) connected to a carry-lookahead unit; each ALU supplies block propagate and generate signals P0..P3 and G0..G3, and the unit supplies the carries C1..C4]



Multiplication

Multiplication is performed as the sum of partial products

The partial product is 0 when the multiplier bit is 0, and the

partial product is the multiplicand when multiplier bit is 1

The total product is the sum of the left-shifted partial products

The multiplication of two n-bit binary integers results in a product of up to 2n bits in length

      1011     (multiplicand, 11)
    x 1101     (multiplier, 13)
  --------
      1011
     0000
    1011
   1011
  --------
  10001111     (product, 143)

The operation can be made more efficient by

Performing addition on the partial products rather than waiting

until the end (saving storage)

No need for addition when the multiplier is 0, but shift is

required for both



Multiplication Implementation: First Version

Uses a 32-bit Multiplier register, a 64-bit Multiplicand register, a 64-bit ALU, and a 64-bit Product register

[Figure: first multiplication hardware; the 64-bit Multiplicand register shifts left, the 32-bit Multiplier register shifts right, and the 64-bit ALU writes the 64-bit Product register under control of a test on Multiplier0]

Start

1. Test Multiplier0

1a. If Multiplier0 = 1, add the multiplicand to the product and place the result in the Product register (if Multiplier0 = 0, skip this step)

2. Shift the Multiplicand register left 1 bit

3. Shift the Multiplier register right 1 bit

32nd repetition? No (< 32 repetitions): go to step 1. Yes (32 repetitions): Done

Second Version

Half of the bits of the 64-bit Multiplicand register are always 0, so a 64-bit ALU seemed wasteful

Instead of shifting the multiplicand left, we shift the product to

the right, after adding multiplicand to the left half of the product

The algorithm uses 32-bit multiplicand, 32-bit multiplier, 32-bit

adder, and 64-bit product



[Figure: second multiplication hardware; a 32-bit Multiplicand register, a 32-bit ALU, a 64-bit Product register that shifts right, and a 32-bit Multiplier register that shifts right]

Start

1. Test Multiplier0

1a. If Multiplier0 = 1, add the multiplicand to the left half of the product and place the result in the left half of the Product register (if Multiplier0 = 0, skip this step)

2. Shift the Product register right 1 bit

3. Shift the Multiplier register right 1 bit

32nd repetition? No (< 32 repetitions): go to step 1. Yes (32 repetitions): Done



Third Version

The right half of the Product register can be used to store the multiplier, so a separate Multiplier register is no longer needed

[Figure: third multiplication hardware; a 32-bit Multiplicand register, a 32-bit ALU, and a 64-bit Product register that shifts right and initially holds the multiplier in its right half]

Start

1. Test Product0

1a. If Product0 = 1, add the multiplicand to the left half of the product and place the result in the left half of the Product register (if Product0 = 0, skip this step)

2. Shift the Product register right 1 bit

32nd repetition? No (< 32 repetitions): go to step 1. Yes (32 repetitions): Done
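The third version of the algorithm can be sketched for unsigned operands as below. The name `multiply_v3` is hypothetical. One modelling assumption is that Python's unbounded integers stand in for the Product register together with the adder's carry out, which is shifted back in on the next right shift.

```python
def multiply_v3(multiplicand: int, multiplier: int, bits: int = 32) -> int:
    """Unsigned shift-and-add multiply with the multiplier held in the
    right half of the Product register."""
    mask = (1 << bits) - 1
    product = multiplier & mask  # right half starts out as the multiplier
    for _ in range(bits):
        if product & 1:  # step 1: test Product0
            # Step 1a: add the multiplicand to the left half.
            product += (multiplicand & mask) << bits
        product >>= 1    # step 2: shift the Product register right
    return product


print(multiply_v3(0b1011, 0b1101, bits=4))  # 11 x 13
```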

The above algorithms will not work unless both numbers are positive

Unsigned                      Two's complement
    1001   (9)                    1001   (-7)
  x 0011   (3)                  x 0011   (3)
--------                      --------
00001001                      11111001
00010010                      11110010
--------                      --------
00011011   (27)               11101011   (-21)



Signed Multiplication

Consider the following multiplication of two unsigned integers yielding an 8-bit result

    1011
  x 1101
--------
00001011     1011 x 1 x 2^0
00000000     1011 x 0 x 2^1
00101100     1011 x 1 x 2^2
01011000     1011 x 1 x 2^3
--------
10001111     = 143

Each partial product (other than that for $2^0$) consists of the 4-bit multiplicand (1011) shifted to the left, with the unoccupied positions on the right filled with 0s

However, this straightforward multiplication will not work if the multiplicand is negative

Each contribution of the negative multiplicand as a partial product must be a negative number in a 2n-bit (64-bit) field

The sign bits of the partial products must line up

Unsigned integers                    Two's complement integers
    1001   (9)                           1001   (-7)
  x 0011   (3)                         x 0011   (3)
--------                             --------
00001001   1001 x 1 x 2^0            11111001   (-7) x 1 x 2^0 = -7
00010010   1001 x 1 x 2^1            11110010   (-7) x 1 x 2^1 = -14
00000000   1001 x 0 x 2^2            00000000   (-7) x 0 x 2^2 = 0
00000000   1001 x 0 x 2^3            00000000   (-7) x 0 x 2^3 = 0
--------                             --------
00011011   = 27                      11101011   = -21



If the multiplier is negative, straightforward multiplication will not

work. The reason is that the bits of the multiplier no longer

correspond to the shifts or multiplications that must take place.

There are a number of ways out of this dilemma:

Convert both multiplier and multiplicand to positive numbers, perform the multiplication, and then take the two's complement of the result if and only if the signs of the two original numbers differed

Use Booth's algorithm: no two's complement conversion is needed, so it is faster

Booths Algorithm

First consider a positive multiplier consisting of one block of 1s surrounded by 0s

$M \times (00011110) = M \times (2^4 + 2^3 + 2^2 + 2^1) = M \times (16 + 8 + 4 + 2) = M \times 30$

The number of such operations can be reduced to two if we observe that

$2^n + 2^{n-1} + \cdots + 2^{n-k} = 2^{n+1} - 2^{n-k}$   (1)

Therefore,

$M \times (00011110) = M \times (2^5 - 2^1) = M \times (32 - 2) = M \times 30$

So, the product can be generated by one addition and one subtraction of the multiplicand

This scheme can be extended to any number of blocks of 1s in a multiplier, including a block of a single 1. Thus

$M \times (01111010) = M \times (2^6 + 2^5 + 2^4 + 2^3 + 2^1) = M \times (2^7 - 2^3 + 2^2 - 2^1) = M \times 122$

Scanning the multiplier from right to left, a subtraction is performed when the first 1 of a block is encountered (1-0), and an addition is performed when the end of the block is encountered (0-1)

As an example, consider $M \times (-6)$, where $-6 = 11111010 = -2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^1$

$M \times (11111010) = M \times (-2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^1) = M \times (-2^3 + 2^2 - 2^1)$

It performs a subtraction when the first 1 is encountered (1-0)

It performs an addition when the end of that block of 1s is encountered (0-1)

It performs a subtraction when the first 1 of the next block of 1s is encountered (1-0)

Booths algorithm is based on the fact that a string of 0s avoids

arithmetic (only shift)

Otherwise, the string can be classified into

Beginning of a run of 1s

Middle of a run of 1s

End of a run of 1s



The Algorithm:

Examine 2 bits of the multiplier at a time (the current bit and the bit to its right), and if they are

00: Middle of a string of 0s, no arithmetic operation.

01: End of a string of 1s, add the multiplicand to the left half of the product.

10: Beginning of a string of 1s, subtract the multiplicand from the left half of the product.

11: Middle of a string of 1s, no arithmetic operation.



There is a 1-bit register placed logically to the right of the LSB of

the product

Example: 0111 x 0011 (7 x 3). Multiplicand = 0111.

Step         Product
             0000 0011 0     Initial value, extra bit is padded
(10) sub     1001 0011 0
     shift   1100 1001 1
(11) shift   1110 0100 1
(01) add     0101 0100 1
     shift   0010 1010 0
(00) shift   0001 0101 0     = 21

Example: 0111 x 1101 (7 x -3). Multiplicand = 0111.

Step         Product
             0000 1101 0     Initial value, extra bit is padded
(10) sub     1001 1101 0
     shift   1100 1110 1
(01) add     0011 1110 1
     shift   0001 1111 0
(10) sub     1010 1111 0
     shift   1101 0111 1
(11) shift   1110 1011 1     = -21

Example: 0010 x 0110 (2 x 6). Multiplicand = 0010.

Step         Product
             0000 0110 0     Initial value, extra bit is padded
(00) shift   0000 0011 0
(10) sub     1110 0011 0
     shift   1111 0001 1
(11) shift   1111 1000 1
(01) add     0001 1000 1
     shift   0000 1100 0     = 12

Example: 1001 x 1101 (-7 x -3). Multiplicand = 1001.

Step         Product
             0000 1101 0     Initial value, extra bit is padded
(10) sub     0111 1101 0
     shift   0011 1110 1
(01) add     1100 1110 1
     shift   1110 0111 0
(10) sub     0101 0111 0
     shift   0010 1011 1
(11) shift   0001 0101 1     = 21
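The traces above can be reproduced with a short simulation of the Product register, split into its left half, right half, and the extra bit. This is a sketch: `booth_multiply` is an illustrative name, operands are passed as bit patterns, and like the 4-bit traces it does not handle the most negative multiplicand (1000 for 4 bits), whose negation does not fit in the left half.

```python
def booth_multiply(multiplicand: int, multiplier: int, bits: int = 4) -> int:
    """Booth's algorithm on two's complement bit patterns."""
    mask = (1 << bits) - 1
    top = 1 << (bits - 1)
    m = multiplicand & mask
    neg_m = (-multiplicand) & mask  # two's complement of the multiplicand
    left, right, extra = 0, multiplier & mask, 0

    for _ in range(bits):
        pair = (right & 1, extra)
        if pair == (1, 0):    # beginning of a run of 1s: subtract
            left = (left + neg_m) & mask
        elif pair == (0, 1):  # end of a run of 1s: add
            left = (left + m) & mask
        # Arithmetic right shift of left | right | extra.
        extra = right & 1
        right = (right >> 1) | ((left & 1) << (bits - 1))
        left = (left >> 1) | (left & top)  # keep the sign bit

    product = (left << bits) | right
    if product & (1 << (2 * bits - 1)):    # interpret as a signed 2n-bit value
        product -= 1 << (2 * bits)
    return product


print(booth_multiply(0b0111, 0b1101))  # 7 x -3
```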



Division

It is somewhat more complex than multiplication

Procedure:

1. The bits of the dividend are examined from left to right, until the set of bits examined represents a number greater than or equal to the divisor. Until this event occurs, 0s are placed in the quotient from left to right

2. When the event occurs, a 1 is placed in the quotient and the divisor is subtracted from the partial dividend

3. Append additional bits from the dividend (if any) to the new partial remainder and go to 1 to compare the result with the divisor again

An example of long division of unsigned binary integers

       00001101        Quotient
1011 ) 10010011        Dividend
        1011
       ------
       001110          Partial remainder
         1011
         ------
         001111        Partial remainder
           1011
           ----
            100        Remainder


The size of the dividend is limited by the sum of the sizes of the

divisor and quotient

First Version

A 32-bit Quotient register, initially set to 0

The divisor is placed in the left half of the 64-bit Divisor register

At each step, the Divisor register is shifted to the right (1 bit) to

be aligned with the Dividend.

[Figure: first division hardware; the 64-bit Divisor register shifts right, the 32-bit Quotient register shifts left, and the 64-bit ALU writes the 64-bit Remainder register under control of a test on the Remainder]

Start

1. Subtract the Divisor register from the Remainder register and place the result in the Remainder register

Test Remainder

2a. If Remainder >= 0: shift the Quotient register to the left, setting the new rightmost bit to 1

2b. If Remainder < 0: restore the original value by adding the Divisor register to the Remainder register and place the sum in the Remainder register. Also shift the Quotient register to the left, setting the new least significant bit to 0

3. Shift the Divisor register right 1 bit

33rd repetition? No (< 33 repetitions): go to step 1. Yes (33 repetitions): Done



Second Version

By switching the order of the operation to shift and then subtract,

One iteration of the algorithm can be removed

The ALU and Divisor register can be cut in half

Shifting the remainder to the left instead of shifting the divisor to

the right produces the same alignment

[Figure: second division hardware; a 32-bit Divisor register, a 32-bit ALU, a 64-bit Remainder register that shifts left, and a 32-bit Quotient register that shifts left]



Third Version

The Quotient register could be eliminated by shifting the bits of the quotient into the Remainder register instead of shifting in 0s

Both the remainder and the quotient are shifted whenever the Remainder register is shifted

This causes the remainder to be shifted one extra position to the left, so a final correction is required

[Figure: third division hardware; a 32-bit Divisor register, a 32-bit ALU, and a 64-bit Remainder register that can shift left and right]

Start

1. Shift the Remainder register left 1 bit

2. Subtract the Divisor register from the left half of the Remainder register and place the result in the left half of the Remainder register

Test Remainder

3a. If Remainder >= 0: shift the Remainder register to the left, setting the new rightmost bit to 1

3b. If Remainder < 0: restore the original value by adding the Divisor register to the left half of the Remainder register and place the sum in the left half of the Remainder register. Also shift the Remainder register to the left, setting the new rightmost bit to 0

32nd repetition? No (< 32 repetitions): go to step 2. Yes (32 repetitions): Done. Shift left half of Remainder right 1 bit
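The third division version can be sketched for unsigned operands as follows. The name `divide_v3` is hypothetical. Two modelling assumptions are made: Python's unbounded integers stand in for the Remainder register, so the bit that can spill past 2n bits during the last shift is kept, and the restore step is modelled by simply not committing the subtraction.

```python
def divide_v3(dividend: int, divisor: int, bits: int = 32):
    """Restoring division with the quotient shifted into the Remainder
    register. Returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divide-by-0 must be checked in software")
    mask = (1 << bits) - 1
    rem = dividend & mask  # dividend starts in the right half
    rem <<= 1              # step 1: shift the Remainder register left
    for _ in range(bits):
        left = (rem >> bits) - divisor  # step 2: subtract from the left half
        if left >= 0:
            rem = (left << bits) | (rem & mask)
            rem = (rem << 1) | 1        # step 3a: shift in a quotient bit of 1
        else:
            rem = rem << 1              # step 3b: restore, shift in a 0
    quotient = rem & mask
    remainder = rem >> (bits + 1)       # final correction: left half right 1 bit
    return quotient, remainder


print(divide_v3(0b10010011, 0b1011, bits=8))  # the long-division example
```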



Signed Division

For signed numbers, the signs of the divisor and dividend should be saved

Sign of Quotient = Sign of Dividend XOR Sign of Divisor (the quotient is negative if and only if the two signs differ)

For a nonzero remainder,

Sign of Remainder = Sign of Dividend

The same hardware can be used for both multiply and divide, using a 64-bit shift register for the product (in multiply) or the remainder (in divide)

MIPS has two multiply instructions: multiply (mult) and multiply unsigned (multu)

MIPS has two divide instructions: divide (div) and divide unsigned (divu)

To fetch the 32-bit product (or, after a divide, the quotient), MIPS has the move from lo (mflo) instruction

To avoid overflow, Hi must be 0 for multu or must be the replicated sign of Lo for mult

The instruction move from hi (mfhi) transfers Hi to a register to test for overflow

MIPS divide instructions ignore overflow

Both overflow and divide-by-0 must be checked for by software



Floating Point Representation

With a fixed-point notation (2s complement) it is possible to

represent a range of positive and negative integers centered on 0

By assuming a fixed point binary or radix point, this format allows

the representation of numbers with fractional component as well

This approach has some limitations

Very large numbers or very small fractions can not be

represented

The fractional part of the quotient in a division of two large

numbers could be lost

Using scientific notations, we get around this limitation

$976{,}000{,}000{,}000{,}000 = 9.76 \times 10^{14}$
$0.0000000000000976 = 9.76 \times 10^{-14}$

Dynamically slide the decimal point to a convenient location and

use the exponent of 10 to keep track of the decimal point

A range of very large and very small numbers can be represented

with only few digits

This approach can be used with binary (or any other base)

The number $\pm S \times B^{\pm E}$ can be stored in a binary word with three fields:

Field:   Sign    Biased Exponent   Significand (or mantissa)
Width:   1 bit   8 bits            23 bits

32-bit floating-point format



There must be a compromise between the size of the significand

and the size of the exponent

In this case, the 8-bit field yields the number 0 through 255

A fixed value, called the bias, is subtracted from the field to get the true exponent value

With a bias of 128, the true exponent values are in the range -128 to +127, with a true exponent of 0 stored as 10000000

Therefore any floating point number can be expressed in many

ways

To simplify operations on floating-point numbers, it is typically

required that they be normalized

A normalized number is one in the form

$\pm 0.1bbb \ldots b \times 2^{\pm E}$, where $b$ is either 0 or 1

Since the MSB of the significand field is always 1, there is no need

to store it. Thus the 23-bit field is used to store a 24-bit

significand with values between 0.5 and 1.0

Examples

$+0.11010001 \times 2^{10100}$  = 0 10010100 10100010000000000000000
$-0.11010001 \times 2^{10100}$  = 1 10010100 10100010000000000000000
$+0.11010001 \times 2^{-10100}$ = 0 01101100 10100010000000000000000
$-0.11010001 \times 2^{-10100}$ = 1 01101100 10100010000000000000000

Note that the value 128 is added to the true exponent to be stored in the exponent field

The representable ranges are

Negative numbers between $-(1 - 2^{-24}) \times 2^{127}$ and $-0.5 \times 2^{-128}$

Positive numbers between $0.5 \times 2^{-128}$ and $(1 - 2^{-24}) \times 2^{127}$

Five regions on the number line are not included in these ranges

Negative overflow: negative numbers $< -(1 - 2^{-24}) \times 2^{127}$

Negative underflow: negative numbers $> -0.5 \times 2^{-128}$

Zero

Positive underflow: positive numbers $< 0.5 \times 2^{-128}$

Positive overflow: positive numbers $> (1 - 2^{-24}) \times 2^{127}$

There is a trade-off between accuracy and range

Actual floating-point representations include a special bit pattern

to designate zero

To reduce the chances of underflow or overflow, most

programming languages offer double precision floating-point

operations, by expanding the exponent field

MIPS provides an extraordinary range

$2.0 \times 10^{-38} \le$ single precision fraction $\le 2.0 \times 10^{38}$

$2.0 \times 10^{-308} \le$ double precision fraction $\le 2.0 \times 10^{308}$



IEEE 754 Floating-point Standard

Since 1980, IEEE 754 floating point standard has been

implemented in virtually every computer

IEEE 754 standard recommends

Significand of 24 (23+ 1 implicit) bits for single precision

Significand of 53 (52+ 1 implicit) bits for double precision

Since 0 has no leading 1, it is given the reserved exponent value

0 and the hardware wont attach a leading 1 to the significand

A bias of 127. This means that the value is

$(-1)^S \times (1 + \text{significand}) \times 2^{\text{exponent} - \text{bias}}$

where S is the sign bit
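The three fields of a single precision number can be pulled apart with Python's standard `struct` module and then recombined using the value formula. This is a sketch for normalized numbers only; the function names are illustrative, and the reserved exponent values 0 and 255 are not handled.

```python
import struct


def decode_single(x: float):
    """Return (sign, biased exponent, 23-bit fraction) of x as a float32."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign = word >> 31
    exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    return sign, exponent, fraction


def value_from_fields(sign: int, exponent: int, fraction: int, bias: int = 127) -> float:
    """(-1)^S x (1 + significand) x 2^(exponent - bias), normalized numbers only."""
    return (-1) ** sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - bias)


fields = decode_single(-0.4375)
print(fields, value_from_fields(*fields))
```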


IEEE single precision floating-point format


IEEE double precision floating-point format



Leading 1 bit of significand is implicit

The significand is stored as a magnitude with a separate sign bit (sign and magnitude), not in two's complement

The exponent is represented in a biased (excess) form to make

sorting easier

All 0s is smallest exponent, all 1s is largest

The range 0 to $2^m - 1$ of $m$ bits is relabeled $-2^{m-1}$ to $+2^{m-1} - 1$ by subtracting a constant value (bias) of $2^{m-1}$

Bias of 127 for single precision and 1023 for double precision

The IEEE format labels the exponent differently: $-126 \le E \le +127$ rather than $-127 \le E \le +128$

-127 is used to encode 0

+128 is used to encode $\pm\infty$ or NaN (not a number)

Binary       True exponent   Biased form
0000 0000    -127            0
0000 0001    -126            1
0000 0010    -125            2
...
0111 1111    0               127
1000 0000    +1              128
...
1111 1101    +126            253
1111 1110    +127            254
1111 1111    +128            255



Parameter                  Single                    Double
Word size (bits)           32                        64
Exponent size (bits)       8                         11
Exponent base              2                         2
Max exponent               127                       1023
Min exponent               -126                      -1022
Range (base 10)            10^-38 to 10^+38          10^-308 to 10^+308
Significand size (bits)    23                        52
Number of exponents        254                       2046
Number of fractions        2^23                      2^52
Number of values           1.98 x 2^31               1.99 x 2^63

Floating Point Operations

Floating Point Numbers

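$X = X_s \times b^{X_e}$

$Y = Y_s \times b^{Y_e}$

Arithmetic Operations

If $X_e \le Y_e$:

$X + Y = (X_s \times b^{X_e - Y_e} + Y_s) \times b^{Y_e}$

$X - Y = (X_s \times b^{X_e - Y_e} - Y_s) \times b^{Y_e}$

$X \times Y = (X_s \times Y_s) \times b^{X_e + Y_e}$

$X / Y = (X_s / Y_s) \times b^{X_e - Y_e}$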



Floating Point Complexities

Operations are somewhat more complicated

In addition to overflow, we have underflow

Accuracy can be a big problem

IEEE 754 keeps two extra bits, guard and round

Four rounding modes

Positive divided by zero yields infinity

Zero divided by zero yields not a number

Other complexities

Implementing the standard can be tricky

Not using the standard can be even worse

See text for description of 80x86 and Pentium Bug!



Guard Bits

Loading mantissa or exponent into ALU registers results in some

extra bits called guard bits

They affect the precision of the result

Consider $X = 1.000 \ldots 000 \times 2^1$ and $Y = 1.111 \ldots 111 \times 2^0$. What is $Z = X - Y$?

X - Y without guard bits

 X = 1.000...000 x 2^1
-Y = 0.111...111 x 2^1     (Y loses 1 bit of significand)
 Z = 0.000...001 x 2^1
   = 1.000...000 x 2^-23

X - Y with guard bits

 X = 1.000...000 00000000 x 2^1
-Y = 0.111...111 10000000 x 2^1
 Z = 0.000...000 10000000 x 2^1
   = 1.000...000 00000000 x 2^-24

No bit is lost; the two results differ by a factor of 2



Rounding

The result of an ALU operation on significands is put back into a

floating point format

Extra bits must be disposed

Some techniques:

Round to nearest representable number

Round to $+\infty$

Round to $-\infty$

Round to 0



Round to Nearest

If the extra bits amount to more than one-half of the last representable bit, add 1 to the LSB of the significand; if they amount to less than one-half, drop them

Extra bits 10010: add 1 to the LSB of the significand

Extra bits 01111: drop the extra bits

If the extra bits amount exactly halfway between two representable

values randomly round up or down to avoid cumulative bias

In IEEE, if the result of a computation is exactly midway between

two representable numbers:

Round up if the last representable bit is 1

Truncate if the last representable bit is 0
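The IEEE round-to-nearest rule, including the halfway case, can be sketched on integer bit fields. The function name and argument layout are assumptions for illustration; a carry out of the significand after rounding up would need a renormalization step that is not shown.

```python
def round_to_nearest_even(significand: int, extra: int, extra_bits: int) -> int:
    """Round a significand given the value of its discarded extra bits."""
    half = 1 << (extra_bits - 1)  # the pattern 100...0
    if extra > half:
        return significand + 1    # more than halfway: round up
    if extra < half:
        return significand        # less than halfway: drop the extra bits
    # Exactly halfway: round up only if the last kept bit is 1,
    # so the result always ends in 0.
    return significand + (significand & 1)


print(bin(round_to_nearest_even(0b1010, 0b10010, 5)))
print(bin(round_to_nearest_even(0b1011, 0b10000, 5)))
```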



Examples

157.3 + 257.1, assume 6-bit bias (excess 32) and 10-bit mantissa

Convert to binary and normalize

$157.3 = 10011101.01001_2 = 0.1001110101001 \times 2^8$
$257.1 = 100000001.0001_2 = 0.1000000010001 \times 2^9$

Round the mantissa to ten bits

$157.3 = 0.1001110101001 \times 2^8 = 0.100111011 \times 2^8$
$257.1 = 0.1000000010001 \times 2^9 = 0.100000001 \times 2^9$

Make the smaller exponent equal to the larger exponent and add

157.3 = 0.010011101 x 2^9
257.1 = 0.100000001 x 2^9
414.0 = 0.110011110 x 2^9

The result is already normalized. The result differs from the exact

answer (414.4) by 0.4



Floating-Point Arithmetic

For addition and subtraction, it is necessary to ensure that both

operands have the same exponent. This may require shifting the

radix point on one of the operands to achieve alignment

Problems may arise as the result of these operations

Exponent overflow: A positive exponent exceeds the maximum possible exponent value. In some systems this may be designated as $+\infty$ or $-\infty$

Exponent underflow: A negative exponent is less than the minimum possible exponent value. This means the number is too small to be represented, and in some systems it may be reported as 0

Significand underflow: In the process of aligning significands, digits may flow off the right end of the significand

Significand overflow: The addition of two significands of the same sign may result in a carry out of the most significant bit



Addition and Subtraction

In floating-point arithmetic, addition and subtraction are more

complex than multiplication and division. This is because of the

need for alignment

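There are four basic phases of the algorithm for addition and subtraction, followed by rounding. Example: $9.999 \times 10^1 + 1.610 \times 10^{-1}$

1. Check for zeros

2. Align the significand of the number with the smaller exponent to match the larger exponent, $1.610 \times 10^{-1} = 0.0161 \times 10^1$, which is kept as $0.016 \times 10^1$ with four digits

3. Add or subtract the significands, $9.999 + 0.016 = 10.015$

4. Normalize the result, $10.015 \times 10^1 = 1.0015 \times 10^2$

5. Round the result, $1.0015 \times 10^2 = 1.002 \times 10^2$, Fig. 4.44

Example: $0.5 - 0.4375$

1. $0.5 = 1.000_2 \times 2^{-1}$ and $-0.4375 = -1.110_2 \times 2^{-2}$

2. Align: $1.000_2 \times 2^{-1}$ and $-0.111_2 \times 2^{-1}$

3. $1.000_2 \times 2^{-1} - 0.111_2 \times 2^{-1} = 0.001_2 \times 2^{-1}$

4. $0.001_2 \times 2^{-1} = 1.000_2 \times 2^{-4}$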

Many machines have dedicated hardware to perform floating point

operations as fast as possible, Fig. 4.45



Start

1. Compare the exponents of the two numbers. Shift the smaller number to the right until its exponent would match the larger exponent

2. Add the significands

3. Normalize the sum, either shifting right and incrementing the exponent or shifting left and decrementing the exponent

Overflow or underflow? Yes: Exception. No: continue with step 4

4. Round the significand to the appropriate number of bits

Still normalized? No: go back to step 3. Yes: Done

[Figure: floating-point addition hardware; a small ALU computes the exponent difference, which controls a right shift of the smaller number's significand; a big ALU adds the significands; the result passes through a shift left or right (normalize) stage with an exponent increment or decrement, and then through rounding hardware]



Floating Point Multiplication and Division

Floating-point multiplication and division are much simpler than

addition and subtraction

1. Check whether either operand is 0

2. Add the exponents. If the exponents are stored in biased

form, the exponent sum would have doubled the bias. Thus

the bias must be subtracted from the sum

3. The result could be either an exponent overflow or underflow,

which would be reported

4. If the exponent of the product is within the proper range, the

next step is to multiply the significands (taking into account

their signs). The product will be double the length of the

multiplier and multiplicand. The extra bits will be lost during

rounding

5. The result is then normalized and rounded. Normalization

could result in exponent underflow

Rounding requires extra bits for calculation. IEEE 754 keeps 2 extra bits on the right during intermediate calculations, called the guard and round bits



Example

$0.5 \times -0.4375 = (1.000_2 \times 2^{-1}) \times (-1.110_2 \times 2^{-2})$

1. New exponent = -1 + (-2) = -3, or using the biased representation: (-1 + 127) + (-2 + 127) - 127 = 124

2. Multiplying the significands

$1.000_2 \times 1.110_2 = 1.110000_2$, so the product is $1.110000_2 \times 2^{-3} = 1.110_2 \times 2^{-3}$

3. The product is already normalized

4. No rounding is required,

5. The sign of the result is negative
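The multiplication steps can be sketched on small integer fields. Everything about the representation here is an assumption for illustration: each operand is a `(sign, biased exponent, significand)` triple, the significand is an integer holding 1.fff with `frac_bits` fraction bits, extra product bits are truncated rather than rounded, and exponent overflow or underflow is not checked.

```python
def fp_multiply(a, b, frac_bits: int = 3, bias: int = 127):
    """Multiply two (sign, biased_exponent, significand) triples."""
    sign_a, exp_a, sig_a = a
    sign_b, exp_b, sig_b = b
    sign = sign_a ^ sign_b           # signs differ -> negative result
    if sig_a == 0 or sig_b == 0:     # step 1: check for a zero operand
        return (sign, 0, 0)
    exponent = exp_a + exp_b - bias  # step 2: add exponents, remove one bias
    product = sig_a * sig_b          # step 4: product has 2*frac_bits fraction bits
    if product >> (2 * frac_bits + 1):  # step 5: product >= 2.0, normalize
        product >>= 1
        exponent += 1
    significand = product >> frac_bits  # drop the extra bits (truncation)
    return (sign, exponent, significand)


# 0.5 x -0.4375 = (1.000 x 2^-1) x (-1.110 x 2^-2)
print(fp_multiply((0, 126, 0b1000), (1, 125, 0b1110)))
```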

MIPS supports IEEE single and double precision formats:

add.s, sub.s, mul.s, and div.s for single precision

add.d, sub.d, mul.d, and div.d for double precision

MIPS also supports comparison and branch instructions for single and double precision

MIPS has floating point registers to manipulate floating point numbers. Only 16 of the 32 MIPS floating-point registers can be used for single precision operations. Double precision is computed using pairs of these registers

MIPS also has load and store instructions to move floating point

numbers

