Lecture 12: Integer Arithmetic and Floating Point
CS 2011
Fall 2014, Dr. Rozier
FULL ADDER SOLUTIONS
INTEGER ARITHMETIC
Putting Together Multiple Bits
Making it Faster
Carry Look Ahead Adder
Making it Even Faster
Carry-Select Adder
Kogge-Stone Adder
How do we get subtraction?
X      B2T(X)  B2U(X)
0000     0       0
0001     1       1
0010     2       2
0011     3       3
0100     4       4
0101     5       5
0110     6       6
0111     7       7
1000    –8       8
1001    –7       9
1010    –6      10
1011    –5      11
1100    –4      12
1101    –3      13
1110    –2      14
1111    –1      15
  1 0 0 1 0 1 1 1     x
+ 0 1 1 0 1 0 0 0    ~x
  1 1 1 1 1 1 1 1    –1
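The table and the x + ~x = –1 identity above can be checked with a short Python sketch. The helper names (b2u, b2t, negate, subtract) and the default word size are assumptions made for illustration; they are not from the lecture. Since x + ~x = –1, it follows that –x = ~x + 1, so a subtractor can reuse the adder.

```python
def b2u(bits):
    """Interpret a bit string as an unsigned integer."""
    return int(bits, 2)

def b2t(bits):
    """Interpret a bit string as a two's complement integer."""
    u = b2u(bits)
    # The leading bit carries weight -2^(w-1) instead of +2^(w-1).
    return u - (1 << len(bits)) if bits[0] == "1" else u

def negate(x, w=8):
    """Two's complement negation in w bits: -x = ~x + 1."""
    return (~x + 1) & ((1 << w) - 1)

def subtract(a, b, w=8):
    """Compute a - b in w bits by adding the negation of b."""
    return (a + negate(b, w)) & ((1 << w) - 1)

x = 0b10010111
print(format(x + (~x & 0xFF), "08b"))      # x + ~x: all ones
print(format(subtract(3, 5, w=4), "04b"))  # 3 - 5 in 4 bits
```

Reading the 4-bit result of 3 – 5 with b2t gives –2, while b2u reads the same bits as 14.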
FLOATING POINT
Carnegie Mellon
Fractional binary numbers
• What is 1011.101₂?
• Bit positions: b_i b_(i-1) ••• b_2 b_1 b_0 . b_(-1) b_(-2) b_(-3) ••• b_(-j)
• Bit weights: 2^i 2^(i-1) ••• 4 2 1 . 1/2 1/4 1/8 ••• 2^(-j)
Fractional Binary Numbers
• Representation
– Bits to right of “binary point” represent fractional powers of 2
– Represents rational number: Σ (k = -j to i) b_k × 2^k
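As a minimal sketch of this representation, the function below sums bit × weight over a string such as "1011.101". The name parse_fractional_binary is an assumption for illustration, and Fraction is used so the result stays exact.

```python
from fractions import Fraction

def parse_fractional_binary(text):
    """Evaluate a binary string with an optional binary point, exactly."""
    int_part, _, frac_part = text.partition(".")
    value = Fraction(0)
    # Bits left of the point have weights 1, 2, 4, ... (right to left).
    for k, bit in enumerate(reversed(int_part)):
        value += int(bit) * 2 ** k
    # Bits right of the point have weights 1/2, 1/4, 1/8, ...
    for j, bit in enumerate(frac_part, start=1):
        value += int(bit) * Fraction(1, 2 ** j)
    return value

print(parse_fractional_binary("1011.101"))         # 93/8
print(float(parse_fractional_binary("1011.101")))  # 11.625
```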
Fractional Binary Numbers: Examples
Value     Representation
5 3/4     101.11₂
2 7/8     010.111₂
1 7/16    001.0111₂
• Observations
– Divide by 2 by shifting right
– Multiply by 2 by shifting left
– Numbers of form 0.111111…₂ are just below 1.0
– 1/2 + 1/4 + 1/8 + … + 1/2^i + … ➙ 1.0
– Use notation 1.0 – ε
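The observations above can be checked with exact rational arithmetic. This is only a sketch: the helper name just_below_one is an assumption, and dividing a Fraction by 2 stands in for shifting the bit pattern right by one place.

```python
from fractions import Fraction

def just_below_one(n):
    """Value of 0.11...1 (n one-bits after the binary point)."""
    return sum(Fraction(1, 2 ** k) for k in range(1, n + 1))

# Shifting right by one place halves the value: 101.11 -> 10.111 -> 1.0111
value = Fraction(23, 4)           # 5 3/4
print(value / 2, value / 4)       # 23/8 (2 7/8) and 23/16 (1 7/16)

# The gap to 1.0 (the slide's epsilon) halves with every extra bit.
for n in (1, 2, 3, 8):
    print(n, just_below_one(n), 1 - just_below_one(n))
```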
Representable Numbers
• Limitation
– Can only exactly represent numbers of the form x/2^k
– Other rational numbers have repeating bit representations
• Value    Representation
– 1/3      0.0101010101[01]…₂
– 1/5      0.001100110011[0011]…₂
– 1/10     0.0001100110011[0011]…₂
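The repeating patterns above can be reproduced by repeated doubling, where each doubling yields the next bit after the binary point. The function name binary_fraction_digits is an assumption for illustration.

```python
from fractions import Fraction

def binary_fraction_digits(value, n):
    """First n bits after the binary point of a value in [0, 1)."""
    bits = []
    for _ in range(n):
        value *= 2
        bit = int(value >= 1)   # the next bit is the integer part after doubling
        bits.append(str(bit))
        value -= bit
    return "0." + "".join(bits)

for denominator in (3, 5, 10):
    print(f"1/{denominator}", binary_fraction_digits(Fraction(1, denominator), 13))

# A double cannot hold 1/10 exactly; it stores the nearest value of the form x/2^k.
print(Fraction(0.1) == Fraction(1, 10))   # False
```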
Floating Point Standard
• Defined by IEEE Std 754-1985
• Developed in response to divergence of representations
– Portability issues for scientific code
• Now almost universally adopted
• Two representations
– Single precision (32-bit)
– Double precision (64-bit)
IEEE Floating-Point Format
• S: sign bit (0 non-negative, 1 negative)
• Normalize significand: 1.0 ≤ |significand| < 2.0
– Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit)
– Significand is Fraction with the “1.” restored
• Exponent: excess representation: actual exponent + Bias
– Ensures exponent is unsigned
– Single: Bias = 127; Double: Bias = 1023
• Field layout: S (1 bit) | Exponent (single: 8 bits, double: 11 bits) | Fraction (single: 23 bits, double: 52 bits)
• x = (–1)^S × (1 + Fraction) × 2^(Exponent – Bias)
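A minimal sketch of the single-precision layout, using Python's standard struct module to get at the raw bits. The function names decode_single and value_from_fields are assumptions for illustration, and the reconstruction covers normalized numbers only (no zeros, denormals, infinities, or NaNs).

```python
import struct

def decode_single(x):
    """Split a number, stored as IEEE single precision, into (S, Exponent, Fraction)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with Bias = 127
    fraction = bits & 0x7FFFFF       # 23 bits, hidden leading 1 not stored
    return sign, exponent, fraction

def value_from_fields(sign, exponent, fraction, bias=127, frac_bits=23):
    """x = (-1)^S * (1 + Fraction) * 2^(Exponent - Bias), normalized numbers only."""
    return (-1) ** sign * (1 + fraction / 2 ** frac_bits) * 2.0 ** (exponent - bias)

fields = decode_single(-0.75)        # -0.75 = -1.1 (binary) * 2^-1
print(fields)                        # (1, 126, 4194304)
print(value_from_fields(*fields))    # -0.75
```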
Floating-Point Addition
• Consider a 4-digit decimal example
– 9.999 × 10^1 + 1.610 × 10^–1
• 1. Align decimal points
– Shift number with smaller exponent
– 9.999 × 10^1 + 0.016 × 10^1
• 2. Add significands
– 9.999 × 10^1 + 0.016 × 10^1 = 10.015 × 10^1
• 3. Normalize result & check for over/underflow
– 1.0015 × 10^2
• 4. Round and renormalize if necessary
– 1.002 × 10^2
Floating-Point Addition
• Now consider a 4-digit binary example
– 1.000₂ × 2^–1 + –1.110₂ × 2^–2 (0.5 + –0.4375)
• 1. Align binary points
– Shift number with smaller exponent
– 1.000₂ × 2^–1 + –0.111₂ × 2^–1
• 2. Add significands
– 1.000₂ × 2^–1 + –0.111₂ × 2^–1 = 0.001₂ × 2^–1
• 3. Normalize result & check for over/underflow
– 1.000₂ × 2^–4, with no over/underflow
• 4. Round and renormalize if necessary
– 1.000₂ × 2^–4 (no change) = 0.0625
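The four steps can be traced with a toy adder that works on exact (significand, exponent) pairs. This is a sketch under stated assumptions, not how real hardware is built: the name fp_add is hypothetical, shifted-out digits are simply truncated during alignment (as in the decimal slide, where 0.0161 becomes 0.016) rather than kept as guard digits, rounding is round-half-to-even, and the exponent range is unlimited, so overflow and underflow are not checked.

```python
import math
from fractions import Fraction

def fp_add(sig_a, exp_a, sig_b, exp_b, digits=4, base=2):
    """Add sig_a * base^exp_a and sig_b * base^exp_b, keeping `digits` digits."""
    scale = base ** (digits - 1)          # number of fractional digits kept
    sig_a, sig_b = Fraction(sig_a), Fraction(sig_b)

    # Step 1: align points by shifting the number with the smaller exponent.
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    sig_b = sig_b / base ** (exp_a - exp_b)
    sig_b = Fraction(math.trunc(sig_b * scale), scale)   # drop shifted-out digits
    exp = exp_a

    # Step 2: add significands.
    total = sig_a + sig_b
    if total == 0:
        return Fraction(0), 0

    # Step 3: normalize so that 1 <= |significand| < base.
    while abs(total) >= base:
        total /= base
        exp += 1
    while abs(total) < 1:
        total *= base
        exp -= 1

    # Step 4: round to the available digits, renormalize if rounding carried out.
    total = Fraction(round(total * scale), scale)
    if abs(total) >= base:
        total /= base
        exp += 1
    return total, exp

# Decimal slide: 9.999 x 10^1 + 1.610 x 10^-1
print(fp_add(Fraction("9.999"), 1, Fraction("1.610"), -1, base=10))
# Binary slide: 1.000 x 2^-1 + -1.110 x 2^-2, where 1.110 (binary) = 7/4
print(fp_add(Fraction(1), -1, Fraction(-7, 4), -2, base=2))
```

The first call returns significand 501/500 (that is, 1.002) with exponent 2, and the second returns significand 1 with exponent –4, matching the two worked examples.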
FP Adder Hardware
• Much more complex than integer adder
• Doing it in one clock cycle would take too long
– Much longer than integer operations
– Slower clock would penalize all instructions
• FP adder usually takes several cycles
– Can be pipelined
FP Adder Hardware
[Figure: FP adder hardware, with stages labeled Step 1 through Step 4]
FP Arithmetic Hardware
• FP multiplier is of similar complexity to FP adder
– But uses a multiplier for significands instead of an adder
• FP arithmetic hardware usually does
– Addition, subtraction, multiplication, division, reciprocal, square-root
– FP ↔ integer conversion
• Operations usually take several cycles
– Can be pipelined
Floating Point
• Floating point is handled by an FPU (floating-point unit).
Pentium FDIV Bug
• Intel’s Pentium (P5)
– Professor Thomas Nicely noticed inconsistencies in calculations when adding Pentiums to his cluster
– Floating-point division operations didn’t quite come out right. Off by 61 parts per million
Pentium FDIV Bug
• Intel acknowledged the flaw, but claimed it wasn’t serious. Wouldn’t affect most users.
• Byte magazine estimated only 1 in 9 billion floating-point operations would suffer the error.
Pentium FDIV Bug
• Total cost to Intel?
$475 million
WRAP UP
For next time
• Read Chapter 4.1-4.4
For next time
• Read Chapter 3
– Sections 3.1 – 3.5