Birkbeck College, U. London
Introduction to Computer Systems
Lecturer: Steve Maybank
Department of Computer Science and Information Systems
[email protected]
Autumn 2013
Week 4a: Floating Point Notation for Binary Fractions
22 October 2013
Mar 28, 2015
Recap: Binary Numbers
In the standard notation for binary numbers, a string of binary digits such as 1001 stands for a sum of powers of 2:
1×2^3 + 0×2^2 + 0×2^1 + 1×2^0
Binary[1001] and Decimal[9] are different names for the same number
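The sum of powers of 2 can be checked mechanically. The following is a minimal sketch; the function name `binary_to_decimal` is illustrative and not part of the lecture.

```python
def binary_to_decimal(bits):
    """Evaluate a bit string as a sum of powers of 2 (rightmost bit is 2**0)."""
    return sum(int(bit) * 2**power for power, bit in enumerate(reversed(bits)))

# Binary[1001] and Decimal[9] name the same number.
nine = binary_to_decimal("1001")
print(nine)
```

Python's built-in `int("1001", 2)` performs the same conversion.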
Recap: Binary Addition
Brookshear, Section 1.5
column:  3 2 1 0
         1 0 0 1
       +     1 1
         =======
         1 1 0 0
There is a carry from column 0 to column 1 and from column 1 to column 2.
In tests or examinations, always show the carries.
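The column-by-column addition, with the carries recorded, might be sketched as follows. The function `add_binary` and its return format are assumptions made for illustration.

```python
def add_binary(a, b):
    """Add two binary strings column by column; return (sum, carries).

    Each carry is recorded as a pair (from_column, to_column),
    where column 0 is the rightmost bit.
    """
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry, digits, carries = 0, [], []
    for column in range(width):
        total = int(a[-1 - column]) + int(b[-1 - column]) + carry
        digits.append(str(total % 2))
        carry = total // 2
        if carry:
            carries.append((column, column + 1))
    if carry:
        digits.append("1")  # final carry becomes a new leading bit
    return "".join(reversed(digits)), carries

total, carries = add_binary("1001", "11")
print(total, carries)
```

For 1001 + 11 the recorded carries are from column 0 to column 1 and from column 1 to column 2, as on the slide.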
Recap: Two’s Complement
Two’s complement representations can be added as if they were standard binary numbers.
Brookshear, Section 1.6
-3   1101
 2   0010
==   ====
-1   1111
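A small sketch of the same calculation, assuming 4 bit patterns as in the example. The helper names are illustrative only.

```python
def to_twos_complement(n, bits=4):
    """Bit pattern representing n in two's complement with the given width."""
    return format(n % 2**bits, f"0{bits}b")

def from_twos_complement(pattern):
    """Integer represented by a two's complement bit pattern."""
    value = int(pattern, 2)
    return value - 2**len(pattern) if pattern[0] == "1" else value

def add_patterns(p, q):
    """Add the patterns as standard binary numbers, discarding any carry out."""
    bits = len(p)
    return format((int(p, 2) + int(q, 2)) % 2**bits, f"0{bits}b")

result = add_patterns(to_twos_complement(-3), to_twos_complement(2))
print(result, from_twos_complement(result))
```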
Numbers in Computing
[Figure: a table whose cells stand for all numbers. Cells marked * are numbers that can be stored in memory; cells marked # are numbers that can be referred to in a program. A cell may carry either mark or both.]
Example: 0.1 cannot be stored in memory in IEEE double precision floating point, but the following is a correct Java statement:
t = 0.1;
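The same effect can be observed in Python, whose `float` type is an IEEE double on the usual platforms (an assumption of this sketch). The statement `t = 0.1` is accepted, but the number actually stored is only close to 0.1.

```python
from decimal import Decimal
from fractions import Fraction

t = 0.1                   # a legal statement, as in the Java example
stored = Fraction(t)      # the exact value held in memory
print(Decimal(t))         # decimal expansion of the stored value
print(stored == Fraction(1, 10))
```

`Fraction(t)` and `Decimal(t)` convert the stored double exactly, so any difference from 1/10 is the representation error.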
Spacing Between Numbers
Two’s complement: equally spaced numbers.
Floating point: big gaps between big numbers, small gaps between small numbers.
[Figure: two number lines through 0, one for each representation.]
The Key: Exponents
power:     2^-4  2^-3  2^-2  2^-1  2^0  2^1  2^2  2^3
value:     1/16  1/8   1/4   1/2   1    2    4    8
exponent:  -4    -3    -2    -1    0    1    2    3
Big gaps between big numbers, small gaps between small numbers.
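The growth of the gaps with the exponent can be illustrated with IEEE doubles rather than the small format used in this lecture. This is a sketch under that assumption; `gap` is an illustrative helper and it only handles positive numbers in the normal range.

```python
import math

def gap(x):
    """Spacing between x and the next larger double (x > 0, normal range)."""
    _, r = math.frexp(x)          # x = m * 2**r with 0.5 <= m < 1
    return math.ldexp(1.0, r - 53)  # doubles carry 53 significant bits

for x in (1 / 16, 1.0, 8.0, 2.0**53):
    print(x, gap(x))
```

Near 2^53 the gap is 2, so adding 1 to 2^53 leaves it unchanged.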
Brookshear, Section 1.7
Example of a Binary Fraction
-101.11001
The binary fraction has three parts:
The sign: -
The position of the radix point
The bit string: 10111001
Reconstruction of a Binary Fraction
The sign is +
The position of the radix point is just to the right of the second bit from the left
The bit string is 101101
What is the binary fraction?
Summary
To represent a binary fraction, three pieces of information are needed:
Sign
Position of the radix point
Bit string
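The three pieces of information determine the number. A minimal sketch, in which the function `binary_fraction` and its argument conventions are assumptions: the position of the radix point is given as the number of bits to its left.

```python
from fractions import Fraction

def binary_fraction(sign, point, bits):
    """Rebuild a binary fraction from its three parts.

    sign:  "+" or "-"
    point: number of bits to the left of the radix point
    bits:  the bit string, e.g. "10111001"
    """
    magnitude = Fraction(int(bits, 2), 2 ** (len(bits) - point))
    return -magnitude if sign == "-" else magnitude

# -101.11001: sign -, radix point after the third bit, bit string 10111001
value = binary_fraction("-", 3, "10111001")
print(value, float(value))
```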
Standard Form for a Binary Fraction
Any non-zero binary fraction can be written in the form
±2^r × 0.t
where t is a bit string beginning with 1.
Examples
11.001 = +2^2 × 0.11001
-0.011011 = -2^-1 × 0.11011
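Python's `math.frexp` returns a number in essentially this form: a mantissa of magnitude between 1/2 and 1, and an exponent r. The sketch below uses it to recover the sign, r and t; the function name `standard_form` is illustrative, and the input is assumed to be non-zero and exactly representable as a double.

```python
import math

def standard_form(x):
    """Return (sign, r, t) such that x = sign 2**r * 0.t, with t starting with 1."""
    m, r = math.frexp(x)          # x == m * 2**r with 0.5 <= |m| < 1
    sign = "-" if m < 0 else "+"
    m = abs(m)
    t = ""
    while m:                      # peel off the bits of 0.t one at a time
        m *= 2
        bit = int(m)
        t += str(bit)
        m -= bit
    return sign, r, t

print(standard_form(3.125))       # 11.001 in binary
print(standard_form(-0.421875))   # -0.011011 in binary
```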
Floating Point Representation
Write a non-zero binary fraction in the form ±2^r × 0.t
Record the sign: bit string s1
Record r: bit string s2
Record t: bit string s3
Output s1||s2||s3
Floating Point Notation
8 bit floating point:
bits:      s       e1 e2 e3     m1 m2 m3 m4
field:     sign    exponent     mantissa
width:     1 bit   3 bits       4 bits
contents:          radix r      bit string t
The exponent is in 3 bit excess notation (excess 4: the stored pattern is the binary number r + 4).
To Find the Floating Point Notation
Write the non-zero number as ±2^r × 0.t
If sign = -1, then s1 = 1, else s1 = 0.
s2 = 3 bit excess notation for r.
s3 = leftmost four bits of t.
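The procedure can be sketched in code. This is an illustration, not a definitive implementation: the function name `encode` is hypothetical, the excess is taken to be 4 (consistent with the worked examples in these slides), and the mantissa keeps the first four bits of t, discarding the rest.

```python
import math

def encode(x):
    """8 bit pattern s1 || s2 || s3 for a non-zero number x."""
    m, r = math.frexp(x)                  # x == m * 2**r with 0.5 <= |m| < 1
    if not -4 <= r <= 3:
        raise ValueError("exponent does not fit in 3 bit excess notation")
    s1 = "1" if m < 0 else "0"
    s2 = format(r + 4, "03b")             # excess 4 (assumed)
    s3 = format(int(abs(m) * 16), "04b")  # first four bits of t, rest dropped
    return s1 + s2 + s3

# -0.00101011101 in binary equals -349/2048
print(encode(-349 / 2048))
```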
Example
b = -0.00101011101
s1 = 1
b = -2^-2 × 0.101011101
exponent = -2, s2 = 010
s3 = 1010, the first four bits of t
Floating point notation: 10101010
Second Example
Floating point notation: 10111100
s1 = 1, therefore negative.
s2 = 011, exponent = -1
s3 = 1100
Binary fraction: -0.011 = -3/8
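Decoding reverses the steps. A minimal sketch, assuming excess 4 for the exponent field; the function name `decode` is illustrative.

```python
def decode(pattern):
    """Value of an 8 bit floating point pattern (sign, 3 bit exponent, 4 bit mantissa)."""
    sign = -1 if pattern[0] == "1" else 1
    r = int(pattern[1:4], 2) - 4      # undo the excess 4 (assumed)
    t = int(pattern[4:], 2) / 16      # 0.m1m2m3m4 as a fraction
    return sign * t * 2.0**r

print(decode("10111100"))
```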
Class Examples
Find the floating point representation of the decimal number -1 1/8.
Find the decimal number which has the floating point representation 01101101.
Round-Off Error
2 5/8 = 10.101
2 ½ = 10.100
The 8 bit floating point notations for 2 5/8 and 2 ½ are the same: 01101010
The error in approximating 2+5/8 with 10.100 is round-off error or truncation error.
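The size of this error can be computed exactly. The sketch below keeps the first four significant bits of a positive number and drops the rest; the helper `truncate` is hypothetical and ignores the limited exponent range of the 8 bit format.

```python
from fractions import Fraction

def truncate(x, bits=4):
    """Keep the leading `bits` significant bits of x > 0, dropping the rest."""
    x = Fraction(x)
    r = 0
    while x >= 1:                 # scale x into the range 1/2 <= x < 1
        x /= 2
        r += 1
    while x < Fraction(1, 2):
        x *= 2
        r -= 1
    return Fraction(int(x * 2**bits), 2**bits) * Fraction(2) ** r

exact = Fraction(21, 8)           # 2 5/8 = 10.101
stored = truncate(exact)          # 10.100
error = exact - stored
print(stored, error)
```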
Floating Point Addition
Let [x] be the floating point number closest to the number x.
Floating point addition, x @ y, is defined by
x @ y = [[x] + [y]]
Each operation w |-> [w] may introduce round-off.
Examples of Floating Point Addition
2 ½: 01101010
1/8: 00101000
¼: 00111000
2 ¾: 01101011
2 ½ @ (1/8 @ 1/8) = 2 ½ @ ¼ = 2 ¾
(2 ½ @ 1/8) @ 1/8 = 2 ½ @ 1/8 = 2 ½
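These two sums can be reproduced with a small model of the 8 bit format. The sketch assumes that [x] is obtained by keeping the first four significant bits, as in the worked examples, and it ignores the limited exponent range. The names `truncate` and `fp_add` are illustrative.

```python
from fractions import Fraction

def truncate(x, bits=4):
    """[x] for x > 0: keep the leading `bits` significant bits (assumed rule)."""
    x = Fraction(x)
    r = 0
    while x >= 1:
        x /= 2
        r += 1
    while x < Fraction(1, 2):
        x *= 2
        r -= 1
    return Fraction(int(x * 2**bits), 2**bits) * Fraction(2) ** r

def fp_add(x, y):
    """x @ y = [[x] + [y]]"""
    return truncate(truncate(x) + truncate(y))

a, b = Fraction(5, 2), Fraction(1, 8)
right_first = fp_add(a, fp_add(b, b))   # 2 1/2 @ (1/8 @ 1/8)
left_first = fp_add(fp_add(a, b), b)    # (2 1/2 @ 1/8) @ 1/8
print(right_first, left_first)
```

The two results differ, so floating point addition is not associative.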
Round-Off in Decimal and Binary
1/5 = 0.2 exactly in decimal notation.
1/5 = 0.0011001100110011… in binary notation.
1/5 cannot be represented exactly in binary floating point no matter how many bits are used.
Round-off is unavoidable but it is reduced by using more bits.
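The repeating binary expansion of 1/5 can be generated by repeated doubling. A minimal sketch; the function name `binary_digits` is illustrative.

```python
from fractions import Fraction

def binary_digits(x, n):
    """First n binary digits after the radix point of a number 0 <= x < 1."""
    digits = []
    for _ in range(n):
        x *= 2
        bit = int(x)      # the integer part is the next binary digit
        digits.append(str(bit))
        x -= bit
    return "".join(digits)

expansion = binary_digits(Fraction(1, 5), 16)
print("0." + expansion)
```

After four doublings the remainder returns to 1/5, so the block 0011 repeats for ever.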
Size of Round-Off Error E(x)
E(x)/x ≈ α, where α is a constant.
If x > 0, y > 0 and |x-y| < αx, then x-y cannot be found accurately using floating point arithmetic.
Examples
4 ¼: 100.01, fpoint = 01111000
4: 100.00, fpoint = 01111000
4 ¼ - 4 -> fpoint = 00000000
½: 0.1, fpoint = 01001000
¼: 0.01, fpoint = 00111000
½ - ¼ -> fpoint = 00111000
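The same loss of a small difference occurs in IEEE double precision, only at a much smaller scale. In this sketch α is taken to be half of `sys.float_info.epsilon`, which is an assumption about how the constant should be chosen for doubles.

```python
import sys

alpha = sys.float_info.epsilon / 2   # assumed relative round-off bound for doubles
x = 1.0
true_difference = alpha / 2          # smaller than alpha * x
y = x + true_difference              # rounded back to 1.0 when stored
computed_difference = y - x
print(true_difference, computed_difference)
```

The true difference is positive, but the computed difference is zero, just as 4 ¼ - 4 gives zero in the 8 bit format.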
Floating Point Errors
Overflow: number too large to be represented.
Underflow: number is non-zero but too small in magnitude to be represented.
Invalid operation: e.g. SquareRoot[-1].
See http://en.wikipedia.org/wiki/Floating_point
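The three kinds of error can be provoked with Python floats, which are IEEE doubles on the usual platforms. How each error is reported is specific to Python and is shown here only as an illustration: the overflowing product becomes infinity, the underflowing quotient becomes zero, and the invalid square root raises an exception.

```python
import math

overflowed = 1e308 * 10      # too large for a double: result is infinity
underflowed = 5e-324 / 2     # non-zero but too small: result is zero

try:
    math.sqrt(-1)            # invalid operation
    invalid_reported = False
except ValueError:
    invalid_reported = True

print(overflowed, underflowed, invalid_reported)
```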
IEEE Standard for Floating Point Arithmetic
For a general discussion of fp arithmetic see http://www.ee.columbia.edu/~marios/matlab/Fall96Cleve.pdf
bit positions:  0 | 1 … 8 | 9 … 31
Sign s: bit 0
Exponent e: bits 1-8
Mantissa m: bits 9-31
If 0 < e < 255, then value = (-1)^s × 2^(e-127) × 1.m
If e = 0, s = 0, m = 0, then value = 0
If e = 0, s = 1, m = 0, then value = -0
Single precision, 32 bits.
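The three fields of a single precision number can be extracted with Python's `struct` module and the value rebuilt from the formula. This is a sketch: the function name `decode_single` is illustrative, and only the cases listed on the slide are handled. Bit 0 on the slide is the leftmost (most significant) bit.

```python
import struct

def decode_single(x):
    """Return (s, e, m, value) for x stored as a 32 bit IEEE single."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = bits >> 31               # bit 0 on the slide
    e = (bits >> 23) & 0xFF      # bits 1-8
    m = bits & 0x7FFFFF          # bits 9-31
    if 0 < e < 255:
        value = (-1) ** s * 2.0 ** (e - 127) * (1 + m / 2**23)
    elif e == 0 and m == 0:
        value = -0.0 if s else 0.0
    else:
        value = None             # cases not covered on the slide
    return s, e, m, value

print(decode_single(0.15625))
print(decode_single(-2.0))
```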