Lecture 3
Bit Operations
Floating Point – 32 bits or 64 bits
DCO2
Bit Operations
AND &
OR |
ONE'S COMPLEMENT ~
EXCLUSIVE OR ^
SHIFT (right) >>
SHIFT (left) <<
DCO3
Operation - examples
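AND (&): 1 & 1 = 1; 1 & 0 = 0; 0 & 0 = 0
OR (|): 1 | 1 = 1; 1 | 0 = 1; 0 | 0 = 0
NOT (~), on a single bit: ~1 = 0; ~0 = 1
XOR (^): 0 ^ 0 = 0; 1 ^ 1 = 0; 1 ^ 0 = 1; 0 ^ 1 = 1
Shift right (>>): binary 010 >> 1 = 001
Shift left (<<): binary 001 << 1 = 010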
DCO4
AND Example
1111 0010 (0xf2)
1111 1110 (0xfe)
---------------- (and) &
1111 0010 (0xf2)
char c = 0xf2;
char d = 0xfe;
char e = c & d; // e is 0xf2
DCO5
OR Example
1111 0010 (0xf2)
1111 1110 (0xfe)
---------------- (or) |
1111 1110 (0xfe)
char c = 0xf2;
char d = 0xfe;
char e = c | d; // e is 0xfe
DCO6
One’s complement
1111 0010 (0xf2)
---------------- (complement) ~
0000 1101 (0x0d)
char c = 0xf2;
char e = ~c; // the low 8 bits of e are 0x0d
DCO7
EXCLUSIVE OR
1111 0010 (0xf2)
1111 1110 (0xfe)
---------------- (xor) ^
0000 1100 (0x0c)
char c = 0xf2;
char d = 0xfe;
char e = c ^ d; // e is 0x0c
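The AND, OR, one's complement, and exclusive OR examples above can be checked with a short C sketch. The helper names (byte_and and so on) are hypothetical, and uint8_t is an assumption used in place of char so that the results do not depend on whether plain char is signed on a given compiler.

```c
#include <stdint.h>

/* Hypothetical helpers: each applies one bitwise operator to 8-bit values. */
uint8_t byte_and(uint8_t a, uint8_t b) { return (uint8_t)(a & b); }
uint8_t byte_or(uint8_t a, uint8_t b)  { return (uint8_t)(a | b); }
uint8_t byte_xor(uint8_t a, uint8_t b) { return (uint8_t)(a ^ b); }

/* The operand of ~ is promoted to int before the bits are inverted,
   so the cast keeps only the low 8 bits of the result. */
uint8_t byte_not(uint8_t a)            { return (uint8_t)~a; }
```

For instance, byte_and(0xf2, 0xfe) gives 0xf2 and byte_xor(0xf2, 0xfe) gives 0x0c, matching the slides.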
DCO8
SHIFT >> 1 - (right) by one bit
1111 0010 (0xf2)
>> 1 (shift right by one bit)
---------------------
0111 1001 (0x79)
unsigned char c = 0xf2;
unsigned char e = c >> 1; // e is 0x79
DCO9
SHIFT >> 2 - by two bits
1111 0010 (0xf2)
>> 2 (shift right by two bits)
---------------------
0011 1100 (0x3c)
unsigned char c = 0xf2;
unsigned char e = c >> 2; // e is 0x3c
DCO10
SHIFT << 1 - (left) by one bit
1111 0010 (0xf2)
<< 1 (shift left by one bit)
---------------------
1110 0100 (0xe4)
unsigned char c = 0xf2;
unsigned char e = c << 1; // e is 0xe4 (the top bit is shifted out)
DCO11
SHIFT << 2 - by two bits
1111 0010 (0xf2)
<< 2 (shift left by two bits)
---------------------
1100 1000 (0xc8)
unsigned char c = 0xf2;
unsigned char e = c << 2; // e is 0xc8 (the top two bits are shifted out)
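The four shift examples can be reproduced with the sketch below. The names shr8 and shl8 are hypothetical, and the sketch assumes an unsigned 8-bit type: if 0xf2 is stored in a signed char, it is a negative value, and a right shift of a negative value typically copies the sign bit in from the left, which would give 0xf9 rather than 0x79.

```c
#include <stdint.h>

/* Shift an 8-bit value right by n bits (n is assumed to be less than 8).
   Zeros enter from the left because the value is unsigned. */
uint8_t shr8(uint8_t v, unsigned n) { return (uint8_t)(v >> n); }

/* Shift an 8-bit value left by n bits (n is assumed to be less than 8).
   The cast discards the bits that move past bit 7. */
uint8_t shl8(uint8_t v, unsigned n) { return (uint8_t)((unsigned)v << n); }
```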
DCO12
Results of bit operations (examples)
(1 | 2) == 3
(1 | 3) == 3
(1 & 2) == 0
(1 & 3) == 1
(2 & 3) == 2
(0 ^ 3) == 3
(1 ^ 3) == 2
(2 ^ 3) == 1
(3 ^ 3) == 0
~0 == -1 (signed) or 255 (unsigned, 8 bits)
~23 == -24 (signed) or 232 (unsigned, 8 bits)
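The signed and unsigned readings of ~ in the last two results can be sketched as follows. The function names are hypothetical, and the unsigned case assumes an 8-bit value, which is what yields 255 and 232.

```c
#include <stdint.h>

/* ~ on a signed int inverts every bit; on two's complement machines
   this equals -x - 1. */
int not_signed(int x) { return ~x; }

/* ~ on an 8-bit unsigned value: the operand is promoted to int first,
   so the cast back to uint8_t keeps only the low 8 bits. */
uint8_t not_unsigned8(uint8_t x) { return (uint8_t)~x; }
```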
DCO13
Integer
Formats, Signs, and Precision
Overflow
DCO14
Data representation
150 (decimal) is
128(2^7) + 16(2^4) + 4(2^2) + 2(2^1). We can rewrite this as:
1*128 + 0*64 + 0*32 + 1*16 + 0*8 + 1*4 + 1*2 + 0*1 or
1001 0110 in binary (0x96 in hex). 8 bits can represent 2^8 = 256 different values; 32 bits can represent 2^32.
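The decomposition of 150 into powers of two can be sketched in C as below. Both function names are hypothetical; the sketch only handles 8-bit values and tests each bit from the most significant (128) down to the least significant (1).

```c
#include <stdint.h>

/* Write the 8 bits of v into out as '0'/'1' characters, most significant
   bit first, followed by a terminating '\0'. out must hold 9 chars. */
void to_binary8(uint8_t v, char out[9]) {
    for (int i = 0; i < 8; i++) {
        out[i] = (v & (0x80u >> i)) ? '1' : '0';
    }
    out[8] = '\0';
}

/* Inverse direction: read up to 8 binary digits and rebuild the value
   by doubling the running total and adding each new digit. */
uint8_t from_binary8(const char *s) {
    uint8_t v = 0;
    for (int i = 0; i < 8 && s[i] != '\0'; i++) {
        v = (uint8_t)((v << 1) | (s[i] == '1'));
    }
    return v;
}
```

Calling to_binary8(150, buf) fills buf with the digits 10010110.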
DCO15
Format
In C, the size of integer types (short, int, long, etc.) is implementation dependent, but on most machines:
chars are 8 bits
ints are 32 bits; longs are 32 or 64 bits, depending on the platform
shorts are 16 bits
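Because the sizes are implementation dependent, they can be queried rather than assumed. The sketch below uses sizeof and CHAR_BIT from limits.h; the BITS_OF macro and the function names are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical macro: number of bits occupied by a type's storage. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

size_t bits_in_char(void)  { return BITS_OF(char); }
size_t bits_in_short(void) { return BITS_OF(short); }
size_t bits_in_int(void)   { return BITS_OF(int); }
size_t bits_in_long(void)  { return BITS_OF(long); }
```

The C standard only guarantees minimum sizes (at least 8, 16, 16, and 32 bits respectively), so the values returned here may differ between machines.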
DCO16
One’s complement & two’s complement
1001 0010 // original data
0110 1101 // invert every bit (1 to 0, 0 to 1): one’s complement
0110 1101
+ 0000 0001 (add one to the one’s complement)
-----------
0110 1110 // two’s complement
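The two steps (invert, then add one) translate directly into C. This is a minimal sketch for 8-bit values; the function names are hypothetical.

```c
#include <stdint.h>

/* One's complement: invert every bit of the 8-bit value. */
uint8_t ones_complement8(uint8_t v) { return (uint8_t)~v; }

/* Two's complement: invert every bit, add one, keep the low 8 bits. */
uint8_t twos_complement8(uint8_t v) { return (uint8_t)(~v + 1u); }
```

With the slide's input 1001 0010 (0x92), the helpers return 0x6d and 0x6e.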
DCO17
Application of two’s complement
1100 0011 - 0011 0001
Subtracting 0011 0001 is the same as adding its two’s complement
The operation becomes addition instead of subtraction
The two’s complement of 0011 0001 is 1100 1111
1100 0011 - 0011 0001 is
1100 0011 + 1100 1111 = 1 1001 0010; discarding the carry out of the 8 bits gives 1001 0010 (195 - 49 = 146 in decimal)
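The idea of replacing subtraction by addition can be sketched for 8-bit values as follows. The function name sub_via_add8 is hypothetical; the final cast plays the role of discarding the carry out of bit 7.

```c
#include <stdint.h>

/* Compute a - b using only inversion and addition:
   a - b == a + (two's complement of b), modulo 2^8. */
uint8_t sub_via_add8(uint8_t a, uint8_t b) {
    uint8_t neg_b = (uint8_t)(~b + 1u);   /* two's complement of b */
    return (uint8_t)(a + neg_b);          /* carry out of bit 7 is dropped */
}
```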
DCO18
Signed integer of 100
A negative sign is represented by 1 in the most significant bit
100 (decimal) 0x0064 (hex)
0000 0000 0110 0100
positive
-100 (decimal) 0xff9c (hex)
1111 1111 1001 1100
negative
DCO19
Example of -101 (decimal)
101 (decimal) is 0x0065 (hex)
0000 0000 0110 0101 (16-bit integer)
-101 is the two’s complement of 101: 0xff9b
Convert 0 to 1 and 1 to 0, and then add 1
0000 0000 0110 0101 (101 in decimal)
1111 1111 1001 1010 (convert 1 to 0 and 0 to 1)
1111 1111 1001 1011 (add 1, result)
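The 16-bit patterns for negative numbers can be reproduced with the sketch below. All three function names are hypothetical. It relies on the fact that converting an int to an unsigned 16-bit type is defined modulo 2^16, which yields the two's complement pattern.

```c
#include <stdint.h>

/* Bit pattern of v when stored in a 16-bit word. */
uint16_t pattern16(int v) { return (uint16_t)v; }

/* Negate by the recipe in the slide: invert every bit, then add 1. */
uint16_t negate16(uint16_t bits) { return (uint16_t)(~bits + 1u); }

/* Most significant bit: 1 for a negative value, 0 otherwise. */
int sign_bit16(uint16_t bits) { return (bits >> 15) & 1; }
```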
DCO20
Left & right Shift
101 (dec) is 0x0065 (hex)
Shift right by one bit >>1
0000 0000 0110 0101 (101 dec)
>> 1 (shift right)
0000 0000 0011 0010 (50 dec); the lowest bit is dropped, so the result is 50, not 50.5
DCO21
Overflow
When integers are too big to fit into a word, overflow occurs.
For a 16-bit unsigned integer, the largest value is 65535 (32767 is the largest 16-bit signed value)
If you add 65535 + 1 (dec), it produces 0
1111 1111 1111 1111 (binary)
+
0000 0000 0000 0001
1 0000 0000 0000 0000 (overflow)
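The wraparound shown above, where adding 1 to the all-ones pattern 0xffff leaves sixteen zero bits, can be sketched in C. The function names are hypothetical; the sum is computed in 32 bits so that the lost carry can be detected before truncation.

```c
#include <stdint.h>

/* Add two 16-bit unsigned values; the cast keeps only the low 16 bits,
   so the carry out of bit 15 is lost. */
uint16_t add16(uint16_t a, uint16_t b) {
    return (uint16_t)((uint32_t)a + b);
}

/* Returns 1 if the true sum does not fit in 16 bits, 0 otherwise. */
int add16_overflows(uint16_t a, uint16_t b) {
    return ((uint32_t)a + b) > UINT16_MAX;
}
```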
DCO22
Floating Point
Fixed-Point Representations – 32.45
Floating-Point Representations such as 32.45, 3.245 x 10, 0.3245 x 100
Normalization and Hidden Bits
IEEE Floating Point
Details
DCO23
Fixed-Point Representations
Such as 123.45
1234.78
The decimal point is fixed
DCO24
Floating point
A floating-point number is really two numbers packed into one string of bits. One of the numbers is simply a fixed-point binary number, which is called the mantissa. The other number acts to shift the binary point left or right to keep the mantissa in a useful range. This number is called the exponent.
DCO25
Examples of Floating-Point Numbers
1.234
78.945
12.56
4.5 E10
4.5 x 10^4
DCO26
Expression
1 sign bit, 8-bit exponent, and 23-bit mantissa (total 32 bits)
(-1)^Sign * 2^(Exponent - 127) * (1 + Mantissa * 2^-23)
Positive: sign bit is 0. Negative: sign bit is 1.
The exponent is stored as an unsigned value, and 127 is subtracted from it. If the stored value is 128, the exponent is 128 - 127 = 1; if it is 255, the exponent is 255 - 127 = 128; if it is zero, the exponent is 0 - 127 = -127. (In IEEE 754, the stored values 0 and 255 are reserved for special cases such as zero, infinity, and NaN.)
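The three fields can be pulled out of a float with shifts and masks. The sketch below assumes that float is the 32-bit IEEE 754 single-precision format, which is common but not guaranteed by the C standard; the struct and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three fields of a 32-bit float. */
typedef struct {
    uint32_t sign;      /* 1 bit                       */
    uint32_t exponent;  /* 8 bits, stored with +127    */
    uint32_t mantissa;  /* 23 bits, hidden 1 not stored */
} float_fields;

float_fields split_float(float f) {
    uint32_t bits;
    float_fields out;
    memcpy(&bits, &f, sizeof bits);       /* copy the raw bit pattern */
    out.sign     = bits >> 31;            /* top bit                  */
    out.exponent = (bits >> 23) & 0xffu;  /* next 8 bits              */
    out.mantissa = bits & 0x7fffffu;      /* low 23 bits              */
    return out;
}
```

For example, split_float(2.5f) yields sign 0, stored exponent 128, and mantissa 0x200000.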
DCO27
Example
DCO28
Expression - Mantissa
The mantissa is an unsigned 23-bit value; the expression is 1 + Mantissa * 2^(-23)
If Mantissa is zero, the expression becomes 1 + 0 * 2^(-23) = 1
If Mantissa is at its maximum, 2^23 - 1, the expression becomes 1 + (2^23 - 1) * 2^(-23), which is just below 2
It ranges from 1 (inclusive) to 2 (exclusive)
DCO29
Example
2.5 (floating point)
0100 0000 0010 0000 0000 0000 0000 0000
Sign: 0 (positive)
Exponent: 1000 0000 = 128 (128 - 127 = 1)
Mantissa: 1.010 0000 0000 0000 0000 0000 (binary) = 1.25 (decimal)
Result: 1 x 1.25 x 2^1 = 2.5
DCO30
Example
Determine the values of
1011 1101 0100 0000 0000 0000 0000 0000
Sign = 1, negative
Exponent field = 0111 1010 = 122, so exponent = 122 - 127 = -5
Mantissa: 100 0000 0000 0000 0000 0000, which with the hidden bit is 1.1 (binary) = 1.5 (decimal)
So the result is -1 x 1.1 (binary) x 2^(-5) = -1.5 x 2^(-5) = -0.046875 (decimal)
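The decoding steps in the two examples can be sketched as a small function that applies the sign, exponent, and mantissa expression to a raw 32-bit pattern. The function name is hypothetical, and the sketch only covers normalized numbers (stored exponent 1 to 254); it does not handle zero, infinity, or NaN.

```c
#include <stdint.h>

/* Evaluate (-1)^sign * 2^(exponent - 127) * (1 + mantissa * 2^-23)
   for a normalized single-precision bit pattern. */
double decode_float_bits(uint32_t bits) {
    uint32_t sign     = bits >> 31;
    int      exponent = (int)((bits >> 23) & 0xffu) - 127;
    uint32_t mantissa = bits & 0x7fffffu;

    /* 1 + mantissa * 2^-23, where 8388608 is 2^23 */
    double value = 1.0 + (double)mantissa / 8388608.0;

    /* Multiply or divide by 2 once per step of the exponent. */
    for (int i = 0; i < exponent; i++) value *= 2.0;
    for (int i = 0; i > exponent; i--) value /= 2.0;

    return sign ? -value : value;
}
```

The patterns from the slides are 0x40200000 and 0xbd400000 in hex.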
DCO31
Two representations of zero
Positive zero: the sign bit is 0, the exponent is zero, and the mantissa is zero
Negative zero: the sign bit is 1, the exponent is zero, and the mantissa is zero
In short, there are two bit patterns for ZERO
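The two zeros can be observed from C. The sketch below assumes that float is the 32-bit IEEE 754 format; the function names are hypothetical. The two patterns differ only in the sign bit, yet they compare equal as numbers.

```c
#include <stdint.h>
#include <string.h>

/* Raw 32-bit pattern of a float (assumes a 32-bit IEEE 754 float). */
uint32_t float_to_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Returns 1: +0 and -0 compare equal even though their bits differ. */
int zeros_compare_equal(void) {
    float pos_zero = 0.0f;
    float neg_zero = -0.0f;
    return pos_zero == neg_zero;
}
```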
DCO32
IEEE Double Precision – 64 bits
64 bits
with
11 exponent bits
52 mantissa bits
Plus one bit sign bit
In C and C++, float is typically the 32-bit format (4 bytes) and double is typically the 64-bit format (8 bytes)
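The widths of the two formats can be checked on a given compiler with sizeof and the constants in float.h. This is a sketch under the assumption of an IEEE 754 platform; the function names are hypothetical.

```c
#include <float.h>
#include <limits.h>
#include <stddef.h>

/* Total storage width of each type, in bits. */
size_t float_width_bits(void)  { return sizeof(float) * CHAR_BIT; }
size_t double_width_bits(void) { return sizeof(double) * CHAR_BIT; }

/* FLT_MANT_DIG and DBL_MANT_DIG count the hidden leading 1 as well,
   so subtract 1 to get the number of stored mantissa bits. */
int float_stored_mantissa_bits(void)  { return FLT_MANT_DIG - 1; }
int double_stored_mantissa_bits(void) { return DBL_MANT_DIG - 1; }
```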
DCO33
Summary
Bit operations: &, |, ~, ^, >> and <<
Representation of data: int, short, long
Floating point: sign, mantissa, and exponent; for example, -3.4 x 10^4
- (negative sign), 3.4 (mantissa), 4 (exponent)
0x00341256 can be an integer, a floating-point number, or a statement. That is why we need to define the type:
int i;
char a;
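The point that the same bits mean different things under different types can be sketched by reading the pattern 0x00341256 two ways. The function names are hypothetical, and the float reading assumes the 32-bit IEEE 754 format.

```c
#include <stdint.h>
#include <string.h>

/* Read the 32-bit pattern as a signed integer. */
int32_t read_as_int(uint32_t bits) {
    int32_t v;
    memcpy(&v, &bits, sizeof v);
    return v;
}

/* Read the same 32-bit pattern as a float. */
float read_as_float(uint32_t bits) {
    float v;
    memcpy(&v, &bits, sizeof v);
    return v;
}
```

As an integer the pattern is 3412566; as a float its exponent field is zero, so it is an extremely small number close to zero.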