EI 209 Chapter 3.1 CSE, 2015
EI 209 Computer Organization
Fall 2015
Chapter 3: Arithmetic for Computers
Haojin Zhu (http://tdt.sjtu.edu.cn/~hjzhu/)
[Adapted from Computer Organization and Design, 4th Edition, Patterson & Hennessy, © 2012, MK]
Review: MIPS (RISC) Design Principles
Simplicity favors regularity
• fixed-size instructions
• small number of instruction formats
• opcode always the first 6 bits
Smaller is faster
• limited instruction set
• limited number of registers in register file
• limited number of addressing modes
Make the common case fast
• arithmetic operands from the register file (load-store machine)
• allow instructions to contain immediate operands
Good design demands good compromises
• three instruction formats
Specifying Branch Destinations
Use a register (like in lw and sw) added to the 16-bit offset
• which register? The Instruction Address Register (the PC)
  - its use is automatically implied by the instruction
  - the PC gets updated (PC+4) during the fetch cycle so that it holds the address of the next instruction
• limits the branch distance to -2^15 to +2^15-1 (word) instructions from the (instruction after the) branch instruction, but most branches are local anyway
[Figure: the 16-bit offset from the low order 16 bits of the branch instruction is sign-extended to 32 bits, followed by 00, and added to PC + 4 to form the 32-bit branch dst address]
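The branch address calculation described above can be sketched in a few lines. This is a minimal illustration, not part of the original slides; the function names `sign_extend` and `branch_target` and the sample instruction words are assumptions chosen for the example.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):      # sign bit set: the value is negative
        return value - (1 << bits)
    return value


def branch_target(pc, instruction):
    """Branch destination = (PC + 4) + (sign-extended 16-bit offset, shifted left by 2)."""
    offset = sign_extend(instruction & 0xFFFF, 16)   # word offset from the low 16 bits
    return (pc + 4 + (offset << 2)) & 0xFFFFFFFF


# Hypothetical beq words (op = 4, rs = rt = 0) with offsets +3 and -1
forward = branch_target(0x00400000, 0x10000003)
backward = branch_target(0x00400000, 0x1000FFFF)
print(hex(forward), hex(backward))
```

The offset counts words relative to the instruction after the branch, which is why PC + 4 is the base and the offset is multiplied by 4.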
Other Control Flow Instructions
MIPS also has an unconditional branch instruction, or jump instruction:
    j label        # go to label
Instruction Format (J Format):
    op = 0x02 | 26-bit address
[Figure: the 26-bit address from the low order 26 bits of the jump instruction is followed by 00 and concatenated with the upper 4 bits of the PC to form the 32-bit jump address]
Why shift left by two bits?
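One way to see the answer: MIPS instructions are word aligned, so the two low-order address bits are always 00 and need not be stored in the instruction. The sketch below is an illustration only; `jump_target` is a hypothetical helper name, and it assumes the upper 4 bits are taken from PC + 4.

```python
def jump_target(pc, instruction):
    """Pseudo-direct address: upper 4 bits of PC + 4, then the 26-bit field, then 00."""
    address26 = instruction & 0x03FFFFFF        # low order 26 bits of the j instruction
    upper4 = (pc + 4) & 0xF0000000              # assumption: taken from PC + 4
    return upper4 | (address26 << 2)            # shifting left by 2 appends the 00 bits


# Hypothetical j instruction (op = 0x02) whose field encodes address 0x00400020
word = (0x02 << 26) | (0x00400020 >> 2)
print(hex(word), hex(jump_target(0x00400000, word)))
```

Dropping the two always-zero bits lets a 26-bit field span 2^28 bytes instead of 2^26.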
Review: MIPS Addressing Modes Illustrated
1. Register addressing
   op rs rt rd funct
   the word operand is in a register
2. Base (displacement) addressing
   op rs rt offset
   the word or byte operand is in memory, at base register + offset
3. Immediate addressing
   op rs rt operand
   the operand is in the instruction itself
4. PC-relative addressing
   op rs rt offset
   the branch destination instruction is in memory, at Program Counter (PC) + offset
5. Pseudo-direct addressing
   op jump address
   the jump destination instruction is in memory, at Program Counter (PC) upper bits || jump address
Number Representations
32-bit signed numbers (2's complement):
0000 0000 0000 0000 0000 0000 0000 0000two = 0ten
0000 0000 0000 0000 0000 0000 0000 0001two = + 1ten
...
0111 1111 1111 1111 1111 1111 1111 1110two = + 2,147,483,646ten
0111 1111 1111 1111 1111 1111 1111 1111two = + 2,147,483,647ten (maxint)
1000 0000 0000 0000 0000 0000 0000 0000two = – 2,147,483,648ten (minint)
1000 0000 0000 0000 0000 0000 0000 0001two = – 2,147,483,647ten
...
1111 1111 1111 1111 1111 1111 1111 1110two = – 2ten
1111 1111 1111 1111 1111 1111 1111 1111two = – 1ten
(the leftmost bit is the MSB, the rightmost bit is the LSB)
Converting <32-bit values into 32-bit values
• copy the most significant bit (the sign bit) into the "empty" bits
    0010 -> 0000 0010
    1010 -> 1111 1010
• sign extend versus zero extend (lb vs. lbu)
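The difference between sign extension and zero extension can be sketched as follows. This is an illustrative model only; the helper names `sign_extend` and `zero_extend` are assumptions, and the byte value 0xFF stands in for what lb and lbu would read from memory.

```python
def sign_extend(value, from_bits, to_bits=32):
    """Copy the sign bit of a from_bits-wide value into the new upper bits."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):                       # sign bit is 1
        upper = ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
        value |= upper                                 # fill the "empty" bits with 1s
    return value


def zero_extend(value, from_bits):
    """Fill the new upper bits with 0s regardless of the sign bit."""
    return value & ((1 << from_bits) - 1)


print(format(sign_extend(0b0010, 4, 8), "08b"))   # 4-bit 0010 widened to 8 bits
print(format(sign_extend(0b1010, 4, 8), "08b"))   # 4-bit 1010 widened to 8 bits
print(hex(sign_extend(0xFF, 8)), hex(zero_extend(0xFF, 8)))   # lb-like vs. lbu-like
```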
MIPS Arithmetic Logic Unit (ALU)
Must support the Arithmetic/Logic operations of the ISA
    add, addi, addiu, addu
    sub, subu
    mult, multu, div, divu
    sqrt
    and, andi, nor, or, ori, xor, xori
    beq, bne, slt, slti, sltiu, sltu
[Figure: ALU with 32-bit inputs A and B, a 4-bit operation select m, a 32-bit result, and 1-bit zero and ovf outputs]
With special handling for
• sign extend – addi, addiu, slti, sltiu
• zero extend – andi, ori, xori
• overflow detection – add, addi, sub
Dealing with Overflow
Overflow occurs when the result of an operation cannot be represented in 32 bits, i.e., when the sign bit contains a value bit of the result and not the proper sign bit
• When adding operands with different signs or when subtracting operands with the same sign, overflow can never occur

Operation   Operand A   Operand B   Result indicating overflow
A + B       ≥ 0         ≥ 0         < 0
A + B       < 0         < 0         ≥ 0
A - B       ≥ 0         < 0         < 0
A - B       < 0         ≥ 0         ≥ 0

• MIPS signals overflow with an exception (aka interrupt) – an unscheduled procedure call where the EPC contains the address of the instruction that caused the exception
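The four rows of the overflow table translate directly into sign checks. The sketch below is an illustration under the assumption that `a` and `b` are Python integers already within the signed 32-bit range; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Wrap an integer to 32 bits and read it as two's complement."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def add_overflows(a, b):
    """Rows 1 and 2 of the table: same-sign operands, result of the other sign."""
    result = to_signed32(a + b)
    return (a >= 0 and b >= 0 and result < 0) or (a < 0 and b < 0 and result >= 0)


def sub_overflows(a, b):
    """Rows 3 and 4 of the table: operands of different signs, result with B's sign."""
    result = to_signed32(a - b)
    return (a >= 0 and b < 0 and result < 0) or (a < 0 and b >= 0 and result >= 0)


print(add_overflows(2147483647, 1), add_overflows(5, -3))
print(sub_overflows(-2147483648, 1), sub_overflows(7, 6))
```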
Addition & Subtraction
Just like in grade school (carry/borrow 1s)
      0111        0111        0110
    + 0110      - 0110      - 0101
    = 1101      = 0001      = 0001
Two's complement operations are easy
• do subtraction by negating and then adding
      0111          0111
    - 0110        + 1010
    = 0001      = 1 0001
Overflow (result too large for finite computer word)
• e.g., adding two n-bit numbers does not yield an n-bit number
      0111
    + 0001
    = 1000
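The "negate and then add" rule can be checked on the 4-bit example above. This is a small sketch, with `negate` and `subtract` as hypothetical helper names and a 4-bit word assumed by default.

```python
def negate(x, bits=4):
    """Two's complement negation: complement all the bits, then add 1."""
    mask = (1 << bits) - 1
    return ((~x) + 1) & mask


def subtract(a, b, bits=4):
    """Compute a - b as a + (-b); returns (n-bit result, carry out of the top bit)."""
    mask = (1 << bits) - 1
    total = a + negate(b, bits)
    return total & mask, total >> bits


print(format(negate(0b0110), "04b"))      # bit pattern of -6 in 4 bits
print(subtract(0b0111, 0b0110))           # 7 - 6
print(format((0b0111 + 0b0001) & 0xF, "04b"))   # 7 + 1 no longer fits in 4 signed bits
```

The carry out of the top bit is simply discarded in two's complement subtraction.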
Building a 1-bit Binary Adder
[Figure: 1-bit Full Adder with inputs A, B, carry_in and outputs S, carry_out]

A   B   carry_in   carry_out   S
0   0   0          0           0
0   0   1          0           1
0   1   0          0           1
0   1   1          1           0
1   0   0          0           1
1   0   1          1           0
1   1   0          1           0
1   1   1          1           1

S = A xor B xor carry_in
carry_out = A&B | A&carry_in | B&carry_in (majority function)
How can we use it to build a 32-bit adder?
How can we modify it easily to build an adder/subtractor?
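The two equations for the full adder can be checked against the truth table with a short script. This is a sketch for illustration; `full_adder` is a hypothetical name.

```python
from itertools import product


def full_adder(a, b, carry_in):
    """1-bit full adder: returns (S, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)   # majority function
    return s, carry_out


# Print the truth table in the same column order as the slide
for a, b, carry_in in product((0, 1), repeat=3):
    s, carry_out = full_adder(a, b, carry_in)
    print(a, b, carry_in, carry_out, s)
```

In every row, `2 * carry_out + s` equals the arithmetic sum `a + b + carry_in`.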
Building 32-bit Adder
[Figure: a chain of 32 1-bit FAs; the FA for bit i takes A_i, B_i and carry c_i and produces S_i and c_(i+1), with c0 = carry_in and c32 = carry_out]
Just connect the carry-out of the least significant bit FA to the carry-in of the next least significant bit and connect . . .
Ripple Carry Adder (RCA)
• advantage: simple logic, so small (low cost)
• disadvantage: slow and lots of glitching (so lots of energy consumption)
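A ripple carry adder can be modeled by looping over the bit positions and passing each carry-out to the next carry-in. The sketch below is illustrative only; the function names and the default width of 32 bits are assumptions.

```python
def full_adder(a, b, carry_in):
    """1-bit full adder: returns (S, carry_out)."""
    return a ^ b ^ carry_in, (a & b) | (a & carry_in) | (b & carry_in)


def ripple_carry_add(a, b, carry_in=0, bits=32):
    """Chain `bits` full adders; the carry ripples from bit 0 up to bit bits-1."""
    result, carry = 0, carry_in
    for i in range(bits):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry          # carry is c32 for a 32-bit adder


print(ripple_carry_add(0b0111, 0b0110, bits=4))
print(ripple_carry_add(0xFFFFFFFF, 1))
```

The loop is sequential because bit i cannot settle before c_i is known, which is the source of the slowness noted above.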
A 32-bit Ripple Carry Adder/Subtractor
Remember 2's complement is just
• complement all the bits
• add a 1 in the least significant bit
    A     0111          0111
    B   - 0110        + 1001
                      +    1
        = 0001      = 1 0001
[Figure: 32-bit ripple carry adder (1-bit FAs producing S0 ... S31, carries c1 ... c31, c32 = carry_out) in which each B_i passes through an XOR gate with the add/sub control line, and add/sub also drives c0 = carry_in]
control (0=add, 1=sub): the XOR gate passes B0 if control = 0, !B0 if control = 1
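The adder/subtractor can be sketched by XOR-ing every B bit with the control line and feeding the same control line into c0. This is a minimal model assuming a 4-bit width by default; `add_sub` is a hypothetical name.

```python
def add_sub(a, b, control, bits=4):
    """control = 0 computes a + b; control = 1 computes a + !b + 1 = a - b."""
    result, carry = 0, control            # add/sub also drives c0 (the "add 1" step)
    for i in range(bits):
        a_i = (a >> i) & 1
        b_i = ((b >> i) & 1) ^ control    # XOR gate: B_i if control = 0, !B_i if control = 1
        s = a_i ^ b_i ^ carry
        carry = (a_i & b_i) | (a_i & carry) | (b_i & carry)
        result |= s << i
    return result, carry


print(add_sub(0b0111, 0b0110, control=1))   # 7 - 6
print(add_sub(0b0111, 0b0110, control=0))   # 7 + 6
```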
Overflow Detection Logic
Overflow occurs when Carry into MSB != Carry out of MSB
• For an N-bit ALU: Overflow = CarryIn[N-1] XOR CarryOut[N-1]

X   Y   X XOR Y
0   0   0
0   1   1
1   0   1
1   1   0

[Figure: 4-bit ALU built from four 1-bit ALUs (inputs A0 ... A3 and B0 ... B3, outputs Result0 ... Result3), with CarryIn0 = 0 and each CarryOut feeding the next CarryIn; Overflow is the XOR of CarryIn3 and CarryOut3]
why?
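One way to build confidence in the carry rule is to compare it with the sign-based definition of overflow for every pair of 4-bit operands. The sketch below is an illustration; the function names and the 4-bit width are assumptions.

```python
def add_with_overflow(a, b, bits=4):
    """Ripple add; overflow = carry into the MSB XOR carry out of the MSB."""
    result, carry, carry_in = 0, 0, 0     # CarryIn0 = 0
    for i in range(bits):
        a_i, b_i = (a >> i) & 1, (b >> i) & 1
        carry_in = carry                  # after the last pass this is CarryIn[N-1]
        result |= (a_i ^ b_i ^ carry_in) << i
        carry = (a_i & b_i) | (a_i & carry_in) | (b_i & carry_in)
    return result, carry_in ^ carry


def to_signed(x, bits=4):
    """Read an n-bit pattern as a two's complement number."""
    return x - (1 << bits) if x & (1 << (bits - 1)) else x


# Compare against "the true sum does not fit in 4 signed bits" for all operand pairs
mismatches = [
    (a, b)
    for a in range(16)
    for b in range(16)
    if add_with_overflow(a, b)[1] != int(not -8 <= to_signed(a) + to_signed(b) <= 7)
]
print(mismatches)
```

An empty list means the XOR of the two carries flags exactly the cases where the true sum is not representable.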
Multiply
Binary multiplication is just a bunch of right shifts and adds
[Figure: an n-bit multiplicand times an n-bit multiplier forms a partial product array of n rows, which is summed into a 2n-bit double precision product]
The partial products can be formed in parallel and added in parallel for faster multiplication
Multiplication
More complicated than addition
• Can be accomplished via shifting and adding
        0010    (multiplicand)
      x 1011    (multiplier)
        0010
       0010     (partial product
      0000       array)
     0010
    00010110    (product)
In every step
• multiplicand is shifted
• next bit of multiplier is examined (also a shifting step)
• if this bit is 1, shifted multiplicand is added to the product
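The three per-step actions listed above map onto a short loop. This is a sketch of the shift-and-add idea with a 4-bit default width; `shift_add_multiply` is a hypothetical name.

```python
def shift_add_multiply(multiplicand, multiplier, bits=4):
    """Unsigned shift-and-add multiplication of two `bits`-wide numbers."""
    product = 0
    for _ in range(bits):
        if multiplier & 1:              # examine the next bit of the multiplier
            product += multiplicand     # add the shifted multiplicand to the product
        multiplicand <<= 1              # shift the multiplicand left
        multiplier >>= 1                # shift the multiplier right
    return product


print(format(shift_add_multiply(0b0010, 0b1011), "08b"))   # 2 x 11
```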
Multiplication Algorithm 1
In every step
• multiplicand is shifted
• next bit of multiplier is examined (also a shifting step)
• if this bit is 1, shifted multiplicand is added to the product
Comments on Multiplication Algorithm 1
Performance
• Three basic steps for each bit
• If each step took a clock cycle, it requires almost 100 clock cycles (3 x 32 = 96) to multiply two 32-bit numbers
How to improve it?
• Motivation (performing the operations in parallel): put the multiplier and the product together and shift them together
Refined Multiplication Algorithm 2
[Figure: the multiplicand register feeds a 32-bit ALU; the product and the multiplier share one register that shifts right; Control issues add and shift right]
• the 32-bit ALU and the multiplicand are untouched
• the sum keeps shifting right
• at every step, number of bits in product + multiplier = 64, hence they share a single 64-bit register
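Add and Right Shift Multiplier Hardware
[Figure: the multiplicand register feeds a 32-bit ALU; the product and the multiplier share one register that shifts right; Control issues add and shift right]
multiplicand                  0 1 1 0             = 6
product | multiplier          0 0 0 0  0 1 0 1    multiplier = 5
add                           0 1 1 0  0 1 0 1
shift right                   0 0 1 1  0 0 1 0
add                           0 0 1 1  0 0 1 0
shift right                   0 0 0 1  1 0 0 1
add                           0 1 1 1  1 0 0 1
shift right                   0 0 1 1  1 1 0 0
add                           0 0 1 1  1 1 0 0
shift right                   0 0 0 1  1 1 1 0    = 30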
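The add and right shift trace for 6 x 5 can be reproduced with a small model of the shared product/multiplier register. This is a sketch with assumed names (`multiply_shared_register`, `trace`) and a 4-bit default width; a Python integer stands in for the register, so the carry out of an add is kept automatically.

```python
def multiply_shared_register(multiplicand, multiplier, bits=4):
    """Product and multiplier share one 2*bits register that shifts right."""
    mask = (1 << bits) - 1
    register = multiplier & mask          # upper (product) half starts at 0
    trace = []
    for _ in range(bits):
        if register & 1:                  # low bit is the current multiplier bit
            register += (multiplicand & mask) << bits   # add into the upper half
        trace.append(register)            # register value after the add step
        register >>= 1                    # shift product and multiplier right together
        trace.append(register)            # register value after the shift step
    return register, trace


product, trace = multiply_shared_register(0b0110, 0b0101)
print(product)
for value in trace:
    print(format(value, "08b"))
```

Because only the upper half is ever added to, the ALU stays at the operand width while the register holds the full double-width product.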
Exercise
Using 4-bit numbers to save space, multiply 2ten*3ten, or 0010two * 0011two
Division
Division is just a bunch of quotient digit guesses and left shifts and subtracts
dividend = quotient x divisor + remainder
[Figure: an n-bit divisor divides the dividend; each row of the partial remainder array yields one bit of the n-bit quotient and the last row leaves the n-bit remainder]
Division
                         1001ten     Quotient
    Divisor 1000ten | 1001010ten     Dividend
                     -1000
                         10
                         101
                         1010
                        -1000
                           10ten     Remainder
At every step,
• shift divisor right and compare it with current dividend
• if divisor is larger, shift 0 as the next bit of the quotient
• if divisor is smaller, subtract to get new dividend and shift 1 as the next bit of the quotient
First Version of Hardware for Division
A comparison requires a subtract; the sign of the result is examined; if the result is negative, the divisor must be added back
Divide Algorithm
Start
1. Subtract the Divisor register from the Remainder register, and place the result in the Remainder register.
Test Remainder
• Remainder >= 0: 2a. Shift the Quotient register to the left, setting the new rightmost bit to 1.
• Remainder < 0: 2b. Restore the original value by adding the Divisor register to the Remainder register and place the sum in the Remainder register. Also shift the Quotient register to the left, setting the new LSB to 0.
3. Shift the Divisor register right 1 bit.
33rd repetition?
• No: < 33 repetitions, go back to step 1
• Yes: 33 repetitions, Done
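The flowchart steps can be sketched as a loop that runs n + 1 times for n-bit operands (33 repetitions for 32 bits). This is an illustrative model with a 4-bit default width; `restoring_divide` and its variable names are assumptions, and the divisor is assumed to be nonzero.

```python
def restoring_divide(dividend, divisor, bits=4):
    """First-version divide: the divisor starts in the left half of a 2*bits register."""
    remainder = dividend
    divisor_reg = divisor << bits
    quotient = 0
    for _ in range(bits + 1):                 # n + 1 repetitions
        remainder -= divisor_reg              # step 1: subtract
        if remainder >= 0:
            quotient = (quotient << 1) | 1    # step 2a: shift in a 1
        else:
            remainder += divisor_reg          # step 2b: restore the remainder
            quotient = quotient << 1          #          and shift in a 0
        divisor_reg >>= 1                     # step 3: shift the divisor right
    return quotient & ((1 << bits) - 1), remainder


print(restoring_divide(0b0111, 0b0010))       # 7 divided by 2
```

The first repetition always produces a 0 bit for an n-bit dividend, which is why the quotient still fits in n bits after n + 1 steps.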
Divide Example
• Divide 7ten (0000 0111two) by 2ten (0010two)

Iter  Step                                   Quot   Divisor     Remainder
0     Initial values                         0000   0010 0000   0000 0111
1     Rem = Rem – Div                        0000   0010 0000   1110 0111
      Rem < 0: +Div, shift 0 into Q          0000   0010 0000   0000 0111
      Shift Div right                        0000   0001 0000   0000 0111
2     Same steps as 1                        0000   0001 0000   1111 0111
                                             0000   0001 0000   0000 0111
                                             0000   0000 1000   0000 0111
3     Same steps as 1                        0000   0000 0100   0000 0111
4     Rem = Rem – Div                        0000   0000 0100   0000 0011
      Rem >= 0: shift 1 into Q               0001   0000 0100   0000 0011
      Shift Div right                        0001   0000 0010   0000 0011
5     Same steps as 4                        0011   0000 0001   0000 0001
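Efficient Division
[Figure: first version of the hardware: 64-bit Divisor register (Shift Right), 64-bit ALU, 64-bit Remainder register (Write), 32-bit Quotient register (Shift Left), Control]
[Figure: refined hardware: the divisor register feeds a 32-bit ALU; the remainder and the quotient share one register, initially holding the dividend, that shifts left; Control issues subtract and shift left]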
Left Shift and Subtract Division Hardware
[Figure: the divisor register feeds a 32-bit ALU; the remainder and the quotient share one register, initially holding the dividend, that shifts left; Control issues subtract and shift left]
divisor               0 0 1 0             = 2
dividend              0 0 0 0  0 1 1 0    = 6
shift left            0 0 0 0  1 1 0 0
sub                   1 1 1 0  1 1 0 0    rem neg, so quotient bit = 0
restore remainder     0 0 0 0  1 1 0 0
shift left            0 0 0 1  1 0 0 0
sub                   1 1 1 1  1 0 0 0    rem neg, so quotient bit = 0
restore remainder     0 0 0 1  1 0 0 0
shift left            0 0 1 1  0 0 0 0
sub                   0 0 0 1  0 0 0 1    rem pos, so quotient bit = 1
shift left            0 0 1 0  0 0 1 0
sub                   0 0 0 0  0 0 1 1    rem pos, so quotient bit = 1
= 3 with 0 remainder
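The left shift and subtract trace for 6 / 2 can be modeled with a single shared register whose upper half ends up as the remainder and whose lower half ends up as the quotient. This is a sketch with an assumed name (`divide_shared_register`) and a 4-bit default width; restoring is modeled by simply not writing back a negative difference, and the divisor is assumed to be nonzero.

```python
def divide_shared_register(dividend, divisor, bits=4):
    """Remainder (upper half) and quotient (lower half) share one shifting register."""
    mask = (1 << bits) - 1
    register = dividend                     # upper half starts at 0
    for _ in range(bits):
        register <<= 1                      # shift left; a free bit opens at the LSB
        upper = (register >> bits) - divisor          # subtract from the upper half
        if upper >= 0:                      # remainder positive: keep it, quotient bit = 1
            register = (upper << bits) | (register & mask) | 1
        # remainder negative: leave the register as is (restore), quotient bit stays 0
    return register & mask, register >> bits          # (quotient, remainder)


print(divide_shared_register(0b0110, 0b0010))   # 6 divided by 2
```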
Restoring Unsigned Integer Division
s(0) = z
for j = 1 to k
    if 2 s(j-1) - 2^k d >= 0
        q_{k-j} = 1
        s(j) = 2 s(j-1) - 2^k d
    else
        q_{k-j} = 0
        s(j) = 2 s(j-1)
• at every step the remainder shifts left by 1 bit
• in the case of R-D >= 0, there is no need to restore the remainder
• in the case of R-D < 0, restore the remainder
• k = 32; put the divisor in the left 32 bits of the register
Non-Restoring Unsigned Integer Division
s(1) = 2 z - 2^k d
for j = 2 to k
    if s(j-1) >= 0
        q_{k-(j-1)} = 1
        s(j) = 2 s(j-1) - 2^k d
    else
        q_{k-(j-1)} = 0
        s(j) = 2 s(j-1) + 2^k d
end for
if s(k) >= 0
    q_0 = 1
else
    q_0 = 0
    Correction step
• If in the last step remainder – divisor >= 0, perform subtraction
• If in the last step remainder – divisor < 0, perform addition
why?
Restoring Unsigned Integer Division and Non-Restoring Unsigned Integer Division are equal. Why?
Consider two consecutive steps j-1 and j, in particular the case 2 s(j-2) - 2^k d < 0
• In step j-1, the Restoring Algorithm computes
      q_{k-(j-1)} = 0
      s(j-1) = 2 s(j-2)
• In step j-1, the Non-Restoring Algorithm computes
      s(j-1) = 2 s(j-2) - 2^k d
• In the subsequent step j, the Restoring Algorithm computes
      2 s(j-1) - 2^k d = 2*2 s(j-2) - 2^k d
• In the subsequent step j, the Non-Restoring Algorithm computes
      2 s(j-1) + 2^k d = 2*2 s(j-2) - 2*2^k d + 2^k d = 2*2 s(j-2) - 2^k d
Both reach the same value because 2x - y = 2(x - y) + y
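The equivalence argued above can also be checked numerically by running both recurrences on every small input. This is a sketch under stated assumptions: `z` is the dividend, `d` the nonzero divisor, `k` the quotient width, and the final remainder is read from the upper half as s(k) / 2^k; the function names are hypothetical.

```python
def restoring(z, d, k):
    """Restoring recurrence: s(j) = 2 s(j-1) - 2^k d if that is >= 0, else 2 s(j-1)."""
    s, q = z, 0
    for j in range(1, k + 1):
        trial = 2 * s - (d << k)
        if trial >= 0:
            q |= 1 << (k - j)
            s = trial
        else:
            s = 2 * s
    return q, s >> k


def non_restoring(z, d, k):
    """Non-restoring recurrence with the final correction step."""
    s, q = 2 * z - (d << k), 0
    for j in range(2, k + 1):
        if s >= 0:
            q |= 1 << (k - (j - 1))
            s = 2 * s - (d << k)
        else:
            s = 2 * s + (d << k)
    if s >= 0:
        q |= 1
    else:
        s += d << k                      # correction step: add the divisor back once
    return q, s >> k


print(restoring(7, 2, 4), non_restoring(7, 2, 4))
```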
Non-restoring algorithm
set subtract_bit true
1: If subtract_bit is true:
       Subtract the Divisor register from the Remainder and place the result in the Remainder register
   else:
       Add the Divisor register to the Remainder and place the result in the Remainder register
2: If Remainder >= 0:
       Shift the Quotient register to the left, setting rightmost bit to 1, and set subtract_bit to true
   else:
       Shift the Quotient register to the left, setting rightmost bit to 0, and set subtract_bit to false
3: Shift the Divisor register right 1 bit
   if < 33rd rep:
       goto 1
   else:
       if the Remainder is negative, add the divisor value used in step 1 of this repetition (the Divisor register before this last shift) to the Remainder and place the result in the Remainder register
       exit
Example: Perform n + 1 iterations for n bits
Remainder 0000 1011
Divisor   0011 0000
-----------------------------------
Iteration 1: (subtract)
Rem       1101 1011
Quotient  0
Divisor   0001 1000
-----------------------------------
Iteration 2: (add)
Rem       1111 0011
Q         00
Divisor   0000 1100
-----------------------------------
Iteration 3: (add)
Rem       1111 1111
Q         000
Divisor   0000 0110
-----------------------------------
Iteration 4: (add)
Rem       0000 0101
Q         0001
Divisor   0000 0011
-----------------------------------
Iteration 5: (subtract)
Rem       0000 0010
Q         00011
Divisor   0000 0001
Since remainder is positive, done.
Q = 0011 and Rem = 0010
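The worked example (11 divided by 3) can be replayed with a register-level sketch of non-restoring division. The names `non_restoring_divide` and `last_divisor` are assumptions for illustration, the width defaults to 4 bits, and the divisor is assumed to be nonzero; the final correction is applied only when the remainder ends up negative.

```python
def non_restoring_divide(dividend, divisor, bits=4):
    """Subtract while the remainder is non-negative, add while it is negative."""
    remainder = dividend
    divisor_reg = divisor << bits           # divisor starts in the left half
    last_divisor = divisor_reg
    quotient = 0
    subtract = True
    for _ in range(bits + 1):               # n + 1 iterations for n bits
        remainder += -divisor_reg if subtract else divisor_reg
        subtract = remainder >= 0           # choose the next operation
        quotient = (quotient << 1) | (1 if subtract else 0)
        last_divisor = divisor_reg          # value used in this iteration
        divisor_reg >>= 1
    if remainder < 0:
        remainder += last_divisor           # correction step
    return quotient & ((1 << bits) - 1), remainder


print(non_restoring_divide(0b1011, 0b0011))   # 11 divided by 3
```

Each iteration performs exactly one add or subtract, whereas the restoring version may need a subtract followed by a restoring add.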
Exercise
Calculate A divided by B using restoring and non-restoring division. A=26, B=5
MIPS Divide Instruction
Divide (div and divu) generates the remainder in hi and the quotient in lo
    div $s0, $s1    # lo = $s0 / $s1
                    # hi = $s0 mod $s1
    op = 0 | rs = 16 | rt = 17 | rd = 0 | shamt = 0 | funct = 0x1A
Instructions mfhi rd and mflo rd are provided to move the remainder and quotient to (user accessible) registers in the register file
As with multiply, divide ignores overflow so software must determine if the quotient is too large. Software must also check the divisor to avoid division by 0.
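The field values 0, 16, 17, 0, 0, 0x1A can be packed into a 32-bit instruction word, and the hi/lo behavior can be modeled for unsigned operands. This is a sketch only: `encode_r_type` and `divu` are hypothetical helper names, the field widths follow the standard R format (6, 5, 5, 5, 5, 6 bits), and the zero check stands in for the test that software must perform itself.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def divu(rs_value, rt_value):
    """Model of unsigned divide: returns (hi, lo) = (remainder, quotient)."""
    if rt_value == 0:
        # The hardware does not trap on a zero divisor; this check is the software's job.
        raise ZeroDivisionError("software must check the divisor before dividing")
    return rs_value % rt_value, rs_value // rt_value


word = encode_r_type(0, 16, 17, 0, 0, 0x1A)    # div $s0, $s1 ($s0 = 16, $s1 = 17)
hi, lo = divu(7, 2)
print(hex(word), hi, lo)
```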
Lecture 1