Lecture 3

Bit Operations

Floating Point – 32 bits or 64 bits

DCO2

Bit Operations

AND &

OR |

ONE'S COMPLEMENT ~

EXCLUSIVE OR ^

SHIFT (right) >>

SHIFT (left) <<

DCO3

Operation - examples

AND 1 & 1 = 1; 1 & 0 = 0; 0 & 0 = 0

OR 1 | 1 = 1; 1 | 0 = 1; 0 | 0 = 0

~ ~0 = 1; ~1 = 0 (each bit is inverted)

^ 0 ^ 0 = 0; 1 ^ 1 = 0; 1 ^ 0 = 1; 0 ^ 1 = 1

>> 010 >> 1 = 001 (binary)

<< 001 << 1 = 010 (binary)

DCO4

AND Example

  1111 0010 (0xf2)
& 1111 1110 (0xfe)
------------------
  1111 0010 (0xf2)

char c = 0xf2;
char d = 0xfe;
char e = c & d; // e is 0xf2

DCO5

OR Example

  1111 0010 (0xf2)
| 1111 1110 (0xfe)
------------------
  1111 1110 (0xfe)

char c = 0xf2;
char d = 0xfe;
char e = c | d; // e is 0xfe

DCO6

One’s complement

~ 1111 0010 (0xf2)
------------------
  0000 1101 (0x0d)

char c = 0xf2;
char e = ~c; // e is 0x0d

DCO7

EXCLUSIVE OR

  1111 0010 (0xf2)
^ 1111 1110 (0xfe)
------------------
  0000 1100 (0x0c)

char c = 0xf2;
char d = 0xfe;
char e = c ^ d; // e is 0x0c
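The four logical operators above can be checked with a short C sketch. It assumes 8-bit bytes and uses `unsigned char` instead of plain `char`, because whether plain `char` is signed depends on the compiler. The helper names are hypothetical and exist only for illustration.

```c
#include <stdio.h>

/* Each helper applies one bitwise operator and keeps only the low 8 bits. */
unsigned char bit_and(unsigned char a, unsigned char b) { return (unsigned char)(a & b); }
unsigned char bit_or(unsigned char a, unsigned char b)  { return (unsigned char)(a | b); }
unsigned char bit_xor(unsigned char a, unsigned char b) { return (unsigned char)(a ^ b); }

/* ~a is evaluated as an int after promotion; the cast discards the upper bits. */
unsigned char bit_not(unsigned char a) { return (unsigned char)~a; }

/* Print the results for the two values used on the slides. */
void print_logic_examples(void)
{
    unsigned char c = 0xf2, d = 0xfe;
    printf("c & d = 0x%02x\n", (unsigned)bit_and(c, d));
    printf("c | d = 0x%02x\n", (unsigned)bit_or(c, d));
    printf("c ^ d = 0x%02x\n", (unsigned)bit_xor(c, d));
    printf("~c    = 0x%02x\n", (unsigned)bit_not(c));
}
```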

DCO8

SHIFT >> 1 - (right) by one bit

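1111 0010 (0xf2)
>> 1 (shift right by one bit)
---------------------
0111 1001 (0x79)

unsigned char c = 0xf2;
unsigned char e = c >> 1; // e is 0x79

(With a signed char, right-shifting this negative value is implementation-defined and usually gives 0xf9.)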

DCO9

SHIFT >> 2 - by two bits

1111 0010 (0xf2)
>> 2 (shift right by two bits)
---------------------
0011 1100 (0x3c)

unsigned char c = 0xf2;
unsigned char e = c >> 2; // e is 0x3c

DCO10

SHIFT << 1 - (left) by one bit

1111 0010 (0xf2)
<< 1 (shift left by one bit)
---------------------
1110 0100 (0xe4)

unsigned char c = 0xf2;
unsigned char e = c << 1; // e is 0xe4

DCO11

SHIFT << 2 - by two bits

1111 0010 (0xf2)
<< 2 (shift left by two bits)
---------------------
1100 1000 (0xc8)

unsigned char c = 0xf2;
unsigned char e = c << 2; // e is 0xc8
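The shift results can be reproduced with the sketch below. It assumes 8-bit bytes and an `unsigned char` operand, so that zeros are shifted in from the left on a right shift; the function names are hypothetical.

```c
/* Shift an 8-bit value right by n bits; zeros enter from the left. */
unsigned char shift_right8(unsigned char v, unsigned n)
{
    return (unsigned char)(v >> n);
}

/* Shift an 8-bit value left by n bits; bits pushed past bit 7 are dropped
   by the cast back to unsigned char. Assumes n is smaller than 8. */
unsigned char shift_left8(unsigned char v, unsigned n)
{
    return (unsigned char)(v << n);
}
```

For example, `shift_right8(0xf2, 1)` gives 0x79 and `shift_left8(0xf2, 2)` gives 0xc8, matching the slides.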

DCO12

Results of bit operations (examples)

(1 | 2) == 3
(1 | 3) == 3
(1 & 2) == 0
(1 & 3) == 1
(2 & 3) == 2
(0 ^ 3) == 3
(1 ^ 3) == 2
(2 ^ 3) == 1
(3 ^ 3) == 0
~0 == -1 (signed) or 255 (unsigned, 8 bits)
~23 == -24 (signed) or 232 (unsigned, 8 bits)

DCO13

Integer

Formats, Signs, and Precision

Overflow

DCO14

Data representation

150 (decimal) is

128 (2^7) + 16 (2^4) + 4 (2^2) + 2 (2^1). We can rewrite this as:

1*128 + 0*64 + 0*32 + 1*16 + 0*8 + 1*4 + 1*2 + 0*1 or

1001 0110 (in binary), which is 0x96 (in hex), using 8 bits. With 8 bits there are 2^8 = 256 possible values; with 32 bits, 2^32
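The positional weighting above can be sketched in C as a small parser that rebuilds the value from a string of binary digits. The function name `from_binary` is hypothetical.

```c
/* Convert a string of '0'/'1' characters to its value.
   Any other character (such as a space) is ignored. */
unsigned from_binary(const char *bits)
{
    unsigned value = 0;
    for (; *bits != '\0'; ++bits) {
        if (*bits == '0' || *bits == '1') {
            /* Doubling moves every earlier digit up one binary place. */
            value = value * 2 + (unsigned)(*bits - '0');
        }
    }
    return value;
}
```

Calling `from_binary("1001 0110")` returns 150, the same value as 0x96.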

DCO15

Format

In C, the size of integer types (short, int, long, etc.) is implementation dependent, but on most machines:

chars are 8 bits

ints are 32 bits; longs are 32 or 64 bits, depending on the platform

shorts are 16 bits
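Because these sizes are implementation dependent, it can help to ask the compiler directly. The sketch below is one way to do that with `sizeof` and `CHAR_BIT`; the helper names are hypothetical, and the printed widths will differ between platforms.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Number of bits occupied by an object of the given size in bytes. */
size_t bits_in(size_t bytes)
{
    return bytes * CHAR_BIT;
}

/* Print the width of the basic integer types on the current platform. */
void print_integer_widths(void)
{
    printf("char:  %zu bits\n", bits_in(sizeof(char)));
    printf("short: %zu bits\n", bits_in(sizeof(short)));
    printf("int:   %zu bits\n", bits_in(sizeof(int)));
    printf("long:  %zu bits\n", bits_in(sizeof(long)));
}
```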

DCO16

One’s complement & two’s complement

1001 0010 // original data
0110 1101 // invert every bit (1 to 0, 0 to 1): one's complement

  0110 1101
+ 0000 0001 (add one to the one's complement)
-----------
  0110 1110 // two's complement
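The two steps can be written as a minimal C sketch on 8-bit values. The function names are hypothetical, and `uint8_t` from `<stdint.h>` is assumed to be available.

```c
#include <stdint.h>

/* Invert every bit; the cast keeps only the low 8 bits. */
uint8_t ones_complement8(uint8_t v)
{
    return (uint8_t)~v;
}

/* Two's complement: one's complement plus one, truncated to 8 bits. */
uint8_t twos_complement8(uint8_t v)
{
    return (uint8_t)(ones_complement8(v) + 1u);
}
```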

DCO17

Application of two's complement

1100 0011 – 0011 0001

The value being subtracted, 0011 0001, can be replaced by its two's complement

The operation becomes addition instead of subtraction

The two's complement of 0011 0001 is 1100 1111

1100 0011 – 0011 0001 is

1100 0011 + 1100 1111 = 1 1001 0010; discarding the carry out of the 8th bit leaves 1001 0010 (195 – 49 = 146)

Convert “-” to “+”
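As a rough illustration of subtraction by addition, the sketch below negates the second operand with two's complement and then adds, keeping only 8 bits. The function name is hypothetical.

```c
#include <stdint.h>

/* Compute a - b using only inversion and addition on 8-bit values. */
uint8_t subtract_by_adding(uint8_t a, uint8_t b)
{
    uint8_t neg_b = (uint8_t)(~b + 1u);  /* two's complement of b */
    return (uint8_t)(a + neg_b);         /* carry out of bit 7 is discarded */
}
```

With `a = 0xc3` (195) and `b = 0x31` (49), the result is 0x92 (146).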

DCO18

Signed integer of 100

A negative sign is represented by 1 in the most significant bit

100 (decimal) is 0x0064 (hex)

0000 0000 0110 0100

positive

-100 (decimal) is 0xff9c (hex)

1111 1111 1001 1100

negative

DCO19

Example of –101 (decimal)

101 (decimal) is 0x0065 (hex)

0000 0000 0110 0101 (16-bit integer)

-101 is the two's complement of 101: 0xff9b

Convert 0 to 1, 1 to 0 and then add 1

0000 0000 0110 0101 (101 in decimal)

1111 1111 1001 1010 (convert 1 to 0 and 0 to 1)

1111 1111 1001 1011 (add 1, result)
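The 16-bit patterns above can be checked in C. This sketch assumes the fixed-width types from `<stdint.h>`; the function names are hypothetical. Converting an `int16_t` to `uint16_t` exposes the stored bit pattern, and `negate16` repeats the invert-and-add-one steps by hand.

```c
#include <stdint.h>

/* Return the 16-bit pattern used to store a signed value. */
uint16_t bit_pattern16(int16_t v)
{
    return (uint16_t)v;
}

/* Negate by hand: invert every bit, add one, keep 16 bits. */
uint16_t negate16(uint16_t v)
{
    return (uint16_t)(~v + 1u);
}
```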

DCO20

Left & right Shift

101 (dec) is 0x0065 (hex)

Shift right by one bit >>1

0000 0000 0110 0101 (101 dec)

>> 1 (shift right)

0000 0000 0011 0010 (50 dec), not 50.5, because the lowest bit is discarded
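A one-line sketch shows that shifting an unsigned value right by one bit behaves like integer division by two. The function name `halve` is hypothetical.

```c
/* Halve an unsigned value with a right shift; the lowest bit is dropped. */
unsigned halve(unsigned v)
{
    return v >> 1;
}
```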

DCO21

Overflow

When integers are too big to fit into a word, overflow occurs.

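For a 16-bit unsigned integer, the largest value is 65535 (for a 16-bit signed integer it is 32767)

If you add 65535 + 1 (dec), the 16-bit result is 0

  1111 1111 1111 1111 (binary)
+ 0000 0000 0000 0001
---------------------
1 0000 0000 0000 0000 (overflow: the 17th bit does not fit and is lost)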
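Unsigned overflow in 16 bits can be sketched as follows. The function names are hypothetical; `add16` shows the wrapped result, and `add16_overflows` detects the lost carry by doing the sum in a wider type.

```c
#include <stdint.h>

/* Add two 16-bit unsigned values; the result wraps around modulo 2^16. */
uint16_t add16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}

/* Return 1 if the true sum does not fit in 16 bits, otherwise 0. */
int add16_overflows(uint16_t a, uint16_t b)
{
    return ((uint32_t)a + (uint32_t)b) > (uint32_t)UINT16_MAX;
}
```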

DCO22

Floating

Fixed-Point Representations – 32.45

Floating-Point Representations such as 32.45, 3.245 x 10, 0.3245 x 100

Normalization and Hidden Bits

IEEE Floating Point

Details

DCO23

Fixed-Point Representations

Such as 123.45

1234.78

The decimal point is fixed

DCO24

Floating point

A floating-point number is really two numbers packed into one string of bits. One of the numbers is simply a fixed-point binary number, which is called the mantissa. The other number acts to shift the binary point left or right to keep the mantissa in a useful range. This number is called the exponent.

DCO25

Example of Floating

1.234

78.945

12.56

4.5 E10

4.5 x 10^4

DCO26

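Expression

1 sign bit, 8-bit exponent, and 23-bit mantissa (total 32 bits)

(-1)^Sign * 2^(Exponent - 127) * (1 + Mantissa * 2^-23)

Positive: sign bit is 0. Negative: sign bit is 1

The exponent is stored as an unsigned value, and 127 is subtracted from it. That is, if the stored value is 128, it means 128 – 127 = 1; if the stored value is 255, it means 255 – 127 = 128; if the stored value is zero, it means 0 – 127 = -127. (In IEEE 754, the stored values 0 and 255 are reserved for special cases such as zero, infinity, and NaN, so ordinary numbers use 1 to 254.)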
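The expression can be evaluated directly from the three fields. The sketch below is a simplified illustration that handles only ordinary (normalized) numbers with a stored exponent from 1 to 254; the function name is hypothetical, and the powers of two are built with a loop to avoid depending on the math library.

```c
/* Evaluate (-1)^sign * 2^(exponent - 127) * (1 + mantissa * 2^-23).
   Assumes 1 <= exponent <= 254 and mantissa < 2^23. */
double ieee_single_value(unsigned sign, unsigned exponent, unsigned long mantissa)
{
    double value = 1.0 + (double)mantissa / 8388608.0;  /* 8388608 = 2^23 */
    int e = (int)exponent - 127;                        /* remove the bias */

    for (; e > 0; --e) value *= 2.0;   /* positive exponent: scale up */
    for (; e < 0; ++e) value /= 2.0;   /* negative exponent: scale down */

    return sign ? -value : value;
}
```

For instance, sign 0, exponent 128, and mantissa 0x200000 (one quarter of 2^23) give 1.25 x 2^1 = 2.5.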

DCO27

Example

DCO28

Expression - Mantissa

The mantissa is an unsigned 23-bit field; the expression is 1 + Mantissa * 2^(-23)

If Mantissa is zero, the expression becomes 1 + 0 * 2^(-23) = 1

If Mantissa is at its maximum, 2^23 - 1, the expression becomes 1 + (2^23 - 1) * 2^(-23) = 2 - 2^(-23), just below 2

It ranges from 1 up to, but not including, 2

DCO29

Example

2.5 (floating point)

0100 0000 0010 0000 0000 0000 0000 0000

Sign: 0 (positive)

Exponent: 1000 0000 = 128 (128 – 127 = 1)

Mantissa: 010 0000 0000 0000 0000 0000, giving 1.01 (binary) = 1.25 (decimal)

Result: 1 x 1.25 x 2^1 = 2.5
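To see these fields on a real machine, one option is to copy the bits of a `float` into a 32-bit integer and mask out each field. This sketch assumes that `float` is a 32-bit IEEE 754 single-precision value, which is common but not guaranteed by C; the struct and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

struct float_fields {
    unsigned sign;       /* 1 bit */
    unsigned exponent;   /* 8 bits, still biased by 127 */
    uint32_t mantissa;   /* 23 bits, without the hidden leading 1 */
};

/* Split a float into its sign, exponent, and mantissa fields. */
struct float_fields split_float(float f)
{
    uint32_t bits;
    struct float_fields out;

    memcpy(&bits, &f, sizeof bits);   /* copy the raw bits, no conversion */
    out.sign = (unsigned)(bits >> 31);
    out.exponent = (unsigned)((bits >> 23) & 0xffu);
    out.mantissa = bits & 0x7fffffu;
    return out;
}
```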

DCO30

Example

Determine the values of

1011 1101 0100 0000 0000 0000 0000 0000

Sign = 1, negative

Exponent field = 0111 1010 = 122, so exponent = -5 (122 – 127)

Mantissa: 100 0000 0000 0000 0000 0000, giving 1.1 (binary) = 1.5 (decimal)

So result = –1 x 1.1 (binary) x 2^(-5) = –1.5 x 2^(-5) = -0.046875 (decimal)
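The hand calculation can be cross-checked by loading the bit pattern into a `float`. As before, this is a sketch that assumes a 32-bit IEEE 754 `float`; the function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without changing any bits. */
float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The pattern 1011 1101 0100 0000 0000 0000 0000 0000 is 0xbd400000 in hex, so `float_from_bits(0xbd400000u)` yields -0.046875.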

DCO31

Two representations of zero

When the sign bit is 0 (positive), the exponent is zero and the mantissa is zero

When the sign bit is 1 (negative), the exponent is zero and the mantissa is zero

In short, there are two formats of zero: +0 and -0
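A short sketch makes the two zeros visible. It assumes a 32-bit IEEE 754 `float`, and the function name is hypothetical. The two zeros have different bit patterns but compare as equal.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of a float. */
uint32_t bits_of_float(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Here `bits_of_float(0.0f)` is 0x00000000 and `bits_of_float(-0.0f)` is 0x80000000, yet `0.0f == -0.0f` is true.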

DCO32

IEEE Double Precision – 64 bits

64 bits

with

11 exponent bits

52 mantissa bits

Plus one bit sign bit

In C and C++, float is typically 32 bits (4 bytes) and double is typically 64 bits (8 bytes)
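The field widths can be read from `<float.h>` instead of being memorized. In this sketch, which assumes a binary (radix 2) IEEE 754 platform, `FLT_MANT_DIG` and `DBL_MANT_DIG` report the precision including the hidden leading 1, so one is subtracted to get the stored mantissa bits. The function names are hypothetical.

```c
#include <float.h>

/* Stored mantissa bits = precision in bits minus the hidden leading 1. */
int stored_mantissa_bits_float(void)
{
    return FLT_MANT_DIG - 1;
}

int stored_mantissa_bits_double(void)
{
    return DBL_MANT_DIG - 1;
}
```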

DCO33

Summary

Bit operations: &, |, ~, ^, >> and <<

Representation of data: int, short, long

Floating point – sign, exponent, mantissa; for example, -3.4 x 10^4

- (negative sign), 3.4 (mantissa), 4 (exponent)

0x00341256 can be an integer, a floating-point number, or a statement. That is why we need to declare the type of each variable:

int i;
char a;
