Birkbeck College, U. London1 Introduction to Computer Systems Lecturer: Steve Maybank Department of Computer Science and Information Systems [email protected].

Birkbeck College, U. London

Introduction to Computer Systems

Lecturer: Steve Maybank

Department of Computer Science and Information Systems
[email protected]

Autumn 2013

Week 4a: Floating Point Notation for Binary Fractions

22 October 2013

Recap: Binary Numbers

In the standard notation for binary numbers, a string of binary digits such as 1001 stands for a sum of powers of 2:

$1 \times 2^{3} + 0 \times 2^{2} + 0 \times 2^{1} + 1 \times 2^{0}$

Binary[1001] and Decimal[9] are different names for the same number.
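The sum of powers of 2 can be checked with a few lines of Python. This is only a sketch; the function name binary_to_int is chosen here for illustration and does not come from the slides.

```python
def binary_to_int(bits: str) -> int:
    """Return the value of a bit string read as a standard binary number."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        # position 0 is the rightmost bit, which is worth 2**0
        total += int(bit) * 2 ** position
    return total


print(binary_to_int("1001"))  # 9
print(int("1001", 2))         # 9, using Python's built-in conversion
```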


Recap: Binary Addition

Brookshear, Section 1.5

column:  3 2 1 0

         1 0 0 1
       +     1 1
         =======
         1 1 0 0

There is a carry from column 0 to column 1 and from column 1 to column 2.

In tests or examinations, always show the carries.
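The column-by-column procedure, including the carries, might be sketched as follows. The function add_binary and its return format are assumptions made for this illustration.

```python
def add_binary(a: str, b: str):
    """Add two bit strings column by column.

    Returns (sum_bits, carries), where carries[i] is the carry passed
    from column i into column i + 1 (column 0 is the rightmost bit).
    """
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    digits = []
    carries = []
    for column in range(width):
        index = width - 1 - column          # string index of this column
        total = int(a[index]) + int(b[index]) + carry
        digits.append(str(total % 2))       # bit written in this column
        carry = total // 2                  # carry into the next column
        carries.append(carry)
    if carry:
        digits.append("1")                  # carry out of the top column
    return "".join(reversed(digits)), carries


result, carries = add_binary("1001", "11")
print(result)   # 1100
print(carries)  # [1, 1, 0, 0]
```

The list of carries shows a 1 for column 0 and for column 1, which matches the two carries in the worked example.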

Recap: Two's Complement

Two's complement representations can be added as if they were standard binary numbers.

  -3   1101
   2   0010
  ==   ====
  -1   1111
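A short Python sketch of the same sum, assuming 4-bit patterns as in the example. The helper names are hypothetical.

```python
BITS = 4  # pattern width used in the example above


def to_twos_complement(n: int, bits: int = BITS) -> str:
    # Masking with 2**bits - 1 maps a negative n to its two's complement pattern.
    return format(n & (2 ** bits - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    value = int(pattern, 2)
    if pattern[0] == "1":            # a leading 1 marks a negative number
        value -= 2 ** len(pattern)
    return value


def add_patterns(p: str, q: str) -> str:
    """Add as standard binary numbers, discarding any carry out of the top bit."""
    bits = len(p)
    return format((int(p, 2) + int(q, 2)) % 2 ** bits, f"0{bits}b")


total = add_patterns(to_twos_complement(-3), to_twos_complement(2))
print(total, from_twos_complement(total))  # 1111 -1
```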

Numbers in Computing


Cells in table: all numbers
*: numbers that can be stored in memory
#: numbers that can be referred to in a program

[Table: grid of cells, each marked with *, #, both or neither]

Example: 0.1 cannot be stored in memory in IEEE double precision floating point, but the following is a correct Java statement:

t = 0.1;
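The number that is actually stored can be inspected. The sketch below uses Python rather than Java, only because its standard library prints the stored value exactly; a Python float uses the same IEEE double precision format as a Java double.

```python
from decimal import Decimal
from fractions import Fraction

t = 0.1

# Decimal(t) shows the exact value of the double nearest to 0.1.
print(Decimal(t))

# The stored number is a fraction with a power of 2 as denominator, not 1/10.
print(Fraction(t))
print(Fraction(t) == Fraction(1, 10))  # False
```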

Spacing Between Numbers


Two's complement: equally spaced numbers.
[Number line with 0 marked]

Floating point: big gaps between big numbers, small gaps between small numbers.
[Number line with 0 marked]

The Key: Exponents


power of 2:  $2^{-4}$  $2^{-3}$  $2^{-2}$  $2^{-1}$  $2^{0}$  $2^{1}$  $2^{2}$  $2^{3}$
value:       1/16      1/8       ¼         ½         1        2        4        8
exponent:    -4        -3        -2        -1        0        1        2        3

Big gaps between big numbers, small gaps between small numbers.
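The growth of the gaps with the exponent can be observed in IEEE double precision. This sketch assumes Python 3.9 or later, where math.ulp(x) returns the gap between x and the next larger double; it illustrates the idea for doubles, not for the 8 bit format used later in these slides.

```python
import math

for r in range(-4, 4):
    x = 2.0 ** r
    # The gap doubles each time the exponent r increases by 1.
    print(r, x, math.ulp(x))
```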

Brookshear, Section 1.7

Example of a Binary Fraction

-101.11001

The binary fraction has three parts:

The sign: -
The position of the radix point
The bit string: 10111001



Reconstruction of a Binary Fraction

The sign is +
The position of the radix point is just to the right of the second bit from the left
The bit string is 101101

What is the binary fraction?



Summary

To represent a binary fraction three pieces of information are needed:

Sign
Position of the radix point
Bit string
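The three pieces of information are enough to rebuild the number. A minimal sketch, assuming the position of the radix point is given as the number of bits to its left; the function name binary_fraction is hypothetical.

```python
from fractions import Fraction


def binary_fraction(sign: str, point: int, bits: str) -> Fraction:
    """Rebuild a binary fraction from sign, radix point position and bit string.

    point is the number of bits to the left of the radix point.
    """
    # Shifting the radix point to the far right gives an integer, which is
    # then divided by 2 once for each bit that was to the right of the point.
    value = Fraction(int(bits, 2), 2 ** (len(bits) - point))
    return -value if sign == "-" else value


x = binary_fraction("-", 3, "10111001")   # -101.11001
print(x, float(x))  # -185/32 -5.78125
```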


Standard Form for a Binary Fraction

Any non-zero binary fraction can be written in the form

$\pm 2^{r} \times 0.t$

where t is a bit string beginning with 1.

Examples

$11.001 = +2^{2} \times 0.11001$

$-0.011011 = -2^{-1} \times 0.11011$
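Finding r and t amounts to locating the first 1 in the bit string. The following is a sketch under the assumption that the input is written as a string such as "11.001"; the name standard_form is chosen for illustration.

```python
def standard_form(binary: str):
    """Return (sign, r, t) such that the value is sign 2**r x 0.t.

    binary is a non-zero binary fraction such as "11.001" or "-0.011011".
    """
    sign = "-" if binary.startswith("-") else "+"
    whole, _, frac = binary.lstrip("+-").partition(".")
    bits = whole + frac
    first_one = bits.index("1")        # raises ValueError if the number is zero
    # Moving the radix point to just before the first 1 shifts it
    # len(whole) - first_one places to the left.
    r = len(whole) - first_one
    t = bits[first_one:].rstrip("0")   # trailing zeros do not change the value
    return sign, r, t


print(standard_form("11.001"))     # ('+', 2, '11001')
print(standard_form("-0.011011"))  # ('-', -1, '11011')
```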


Floating Point Representation

Write a non-zero binary fraction in the form $\pm 2^{r} \times 0.t$
Record the sign: bit string s1
Record r: bit string s2
Record t: bit string s3
Output s1||s2||s3


Floating Point Notation

8 bit floating point:

s       e1 e2 e3    m1 m2 m3 m4
sign    exponent    mantissa
1 bit   3 bits      4 bits
        radix r     bit string t

The exponent is in 3 bit excess notation.


To Find the Floating Point Notation

Write the non-zero number as $\pm 2^{r} \times 0.t$

If the sign is -, then s1 = 1, else s1 = 0.

s2 = 3 bit excess notation for r.

s3 = leftmost four bits of t (padded on the right with 0s if t has fewer than four bits).


Example

$b = -0.00101011101$

s1 = 1

$b = -2^{-2} \times 0.101011101$

exponent = -2, s2 = 010

s3 = 1010

Floating point notation: 10101010
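The encoding steps can be sketched in Python. The sketch assumes that "3 bit excess notation" means excess four (stored pattern = r + 4), which agrees with the exponent patterns in the worked examples, and that s3 keeps the four leading bits of t. The function name encode is hypothetical.

```python
def encode(binary: str) -> str:
    """8 bit floating point pattern for a non-zero binary fraction.

    binary is a string such as "-0.00101011101".
    """
    s1 = "1" if binary.startswith("-") else "0"
    whole, _, frac = binary.lstrip("+-").partition(".")
    bits = whole + frac
    first_one = bits.index("1")
    r = len(whole) - first_one             # exponent in the form 2**r x 0.t
    if not -4 <= r <= 3:
        raise ValueError("exponent does not fit in 3 bit excess notation")
    s2 = format(r + 4, "03b")              # assumption: excess four
    t = bits[first_one:]
    s3 = t[:4].ljust(4, "0")               # leading four bits, padded with 0s
    return s1 + s2 + s3


print(encode("-0.00101011101"))  # 10101010
```

Any bits of t beyond the fourth are simply dropped, which is where round-off enters.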


Second Example

Floating point notation: 10111100
s1 = 1, therefore negative.
s2 = 011, exponent = -1
s3 = 1100
Binary fraction: -0.011 = -3/8
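Decoding reverses the three steps. A minimal sketch, again assuming excess four for the exponent; the function name decode is chosen for illustration.

```python
from fractions import Fraction


def decode(pattern: str) -> Fraction:
    """Value of an 8 bit floating point pattern, as an exact fraction."""
    sign = -1 if pattern[0] == "1" else 1
    r = int(pattern[1:4], 2) - 4                  # assumption: excess four
    mantissa = Fraction(int(pattern[4:], 2), 16)  # 0.m1m2m3m4
    return sign * mantissa * Fraction(2) ** r


print(decode("10111100"))  # -3/8
```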


Class Examples

Find the floating point representation of the decimal number -1 1/8.

Find the decimal number which has the floating point representation 01101101.


Round-Off Error

2 5/8 = 10.101
2 ½ = 10.100

The 8 bit floating point notations for 2 5/8 and 2 ½ are the same: 01101010

The error in approximating 2 5/8 with 10.100 is round-off error or truncation error.

Floating Point Addition

Let [x] be the floating point number closest to the number x.

Floating point addition, x@y, is defined by

x@y = [[x]+[y]]

Each operation w |-> [w] may introduce round-off.



Examples of Floating Point Addition

2 ½: 01101010
1/8: 00101000
¼: 00111000
2 ¾: 01101011

2 ½ @ (1/8 @ 1/8) = 2 ½ @ ¼ = 2 ¾
(2 ½ @ 1/8) @ 1/8 = 2 ½ @ 1/8 = 2 ½
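The two sums can be reproduced with a small model of the 8 bit format. The sketch assumes that [x] is obtained by truncating the mantissa to four bits, as in the round-off example for 2 5/8, and that the exponent is in excess four; the names fl and fp_add are hypothetical.

```python
import math
from fractions import Fraction


def fl(x: Fraction) -> Fraction:
    """[x]: x truncated to a four bit mantissa (assumption: truncation)."""
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    x = abs(x)
    r = 0
    while x >= Fraction(2) ** r:        # find r with 2**(r-1) <= x < 2**r
        r += 1
    while x < Fraction(2) ** (r - 1):
        r -= 1
    if not -4 <= r <= 3:
        raise ValueError("exponent does not fit in 3 bit excess notation")
    kept = math.floor(x / Fraction(2) ** r * 16)   # four leading mantissa bits
    return sign * Fraction(kept, 16) * Fraction(2) ** r


def fp_add(x: Fraction, y: Fraction) -> Fraction:
    """x @ y = [[x] + [y]]."""
    return fl(fl(x) + fl(y))


a, e = Fraction(5, 2), Fraction(1, 8)
print(fp_add(a, fp_add(e, e)))  # 11/4
print(fp_add(fp_add(a, e), e))  # 5/2
```

The two results differ, so floating point addition in this model is not associative.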


Round-Off in Decimal and Binary

1/5 = 0.2 exactly in decimal notation.

1/5 = 0.0011001100110011... in binary notation.

1/5 cannot be represented exactly in binary floating point, no matter how many bits are used.

Round-off is unavoidable but it is reduced by using more bits.
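The repeating pattern can be generated by doubling the fraction and reading off the integer part at each step. A short sketch; the function name binary_digits is chosen for illustration.

```python
from fractions import Fraction


def binary_digits(x: Fraction, count: int) -> str:
    """First count binary digits after the radix point of x, for 0 <= x < 1."""
    digits = []
    for _ in range(count):
        x *= 2                 # shift the radix point one place to the right
        if x >= 1:
            digits.append("1")
            x -= 1
        else:
            digits.append("0")
    return "".join(digits)


print("0." + binary_digits(Fraction(1, 5), 16))  # 0.0011001100110011
```

After four steps the remainder is 1/5 again, so the block 0011 repeats without end.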


Size of Round-Off Error E(x)

$E(x)/x \approx \alpha$, where $\alpha$ is a constant.

If $x > 0$, $y > 0$ and $|x - y| < \alpha x$, then $x - y$ cannot be found accurately using floating point arithmetic.


Examples

4 ¼: 100.01, fpoint = 01111000
4: 100.00, fpoint = 01111000
4 ¼ - 4 -> fpoint = 00000000

½: 0.1, fpoint = 01001000
¼: 0.01, fpoint = 00111000
½ - ¼ -> fpoint = 00111000
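The same effect appears in IEEE double precision, only at a much smaller relative size (roughly 2**-53 instead of the coarse spacing of the 8 bit format). The sketch below uses Python floats, which are IEEE doubles; the variable names are chosen for illustration.

```python
big = 2.0 ** 53

# 2**53 + 1 is not a double, so the sum is rounded back to 2**53.
x = big + 1.0
print(x == big)    # True
print(x - big)     # 0.0, although the true difference is 1

# When both numbers and their difference are representable, nothing is lost.
print(0.5 - 0.25)  # 0.25
```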


Floating Point Errors

Overflow: number too large to be represented.

Underflow: non-zero number too small to be represented.

Invalid operation: e.g. SquareRoot[-1].

See http://en.wikipedia.org/wiki/Floating_point



IEEE Standard for Floating Point Arithmetic

For a general discussion of fp arithmetic see http://www.ee.columbia.edu/~marios/matlab/Fall96Cleve.pdf

bit 0        Sign s
bits 1-8     Exponent e
bits 9-31    Mantissa m

If $0 < e < 255$, then value $= (-1)^{s} \times 2^{e-127} \times 1.m$
If $e = 0$, $s = 0$, $m = 0$, then value $= 0$
If $e = 0$, $s = 1$, $m = 0$, then value $= -0$

Single precision, 32 bits.
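The three fields can be extracted from a stored single precision number with Python's struct module. This is a sketch: it packs the number big-endian so that bit 0 of the layout above is the leftmost bit of the 32 bit word, and it handles only the cases listed above. The function names are hypothetical.

```python
import struct


def single_fields(x: float):
    """Return (s, e, m) for x rounded to IEEE single precision."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    s = word >> 31            # bit 0: sign
    e = (word >> 23) & 0xFF   # bits 1-8: exponent
    m = word & 0x7FFFFF       # bits 9-31: mantissa
    return s, e, m


def value_from_fields(s: int, e: int, m: int) -> float:
    if 0 < e < 255:
        # 1.m means 1 + m / 2**23
        return (-1) ** s * 2.0 ** (e - 127) * (1 + m / 2 ** 23)
    if e == 0 and m == 0:
        return -0.0 if s else 0.0
    raise ValueError("case not covered by the rules above")


fields = single_fields(-0.375)
print(fields)                      # (1, 125, 4194304)
print(value_from_fields(*fields))  # -0.375
```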

