Computer Organization and Design Information Encoding - I Montek Singh Mon, Aug 27, 2012 Lecture 2.


Representing Information: “Bit Juggling”

    Representing information using bits

    Number representations

Reading: Chapter 2.2-2.3


Motivations

Computers process information
    information is measured in bits

Computers use binary representation
    a wire is “hot” or “cold”
    a switch is “on” or “off”

How do we use/interpret bits?

We need standards of representation for
    Letters
    Numbers
    Colors/pixels
    Music
    Etc.


Encoding

Encoding = assigning a representation to information

Examples:
    suppose you have two “things” (symbols) to encode
        one is ☞ and the other is ☜
        what would you do?

    now suppose you have 4 symbols to encode
        😄 (smiley), 😱 (screamie), 😖 (confusie), 😪 (sleepy)
        what would you do?

    now suppose you have the following numbers to encode
        1, 3, 5 and 7
        what would you do?
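One possible answer to the three questions above, sketched in Python: give every symbol a distinct bit string of the same width. The helper name `fixed_length_code` and the order in which codes are handed out are illustrative assumptions; any one-to-one assignment would work equally well.

```python
def fixed_length_code(symbols):
    """Assign each symbol a distinct bit string of the same width."""
    # Smallest width w with 2**w >= len(symbols), and at least 1 bit.
    width = max(1, (len(symbols) - 1).bit_length())
    # The i-th symbol simply gets the number i written in binary.
    return {sym: format(i, f"0{width}b") for i, sym in enumerate(symbols)}


pointers = fixed_length_code(["☞", "☜"])
faces = fixed_length_code(["smiley", "screamie", "confusie", "sleepy"])
odds = fixed_length_code([1, 3, 5, 7])

print(pointers)  # two symbols: 1 bit each
print(faces)     # four symbols: 2 bits each
print(odds)      # four numbers: 2 bits each, not 3
```

Note that 1, 3, 5 and 7 need only 2 bits here, because the code only has to distinguish four values; it does not have to be the ordinary binary representation of each number.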

Encoding is an art

Choosing an appropriate and efficient encoding is a real engineering challenge (and an art)

Impacts design at many levels
    Mechanism (devices, # of components used)
    Efficiency (bits used)
    Reliability (noise)
    Security (encryption)

Fixed-Length Encodings

What is fixed-length encoding?
    all symbols are encoded using the same number of bits

When to use it?
    if all symbols are equally likely (or we have no reason to expect otherwise)

When not to use it?
    when some symbols are more likely, while some are rare
    what to use then: variable-length encoding
    example:
        suppose X is twice as likely as Y or Z
        how would we encode them?
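A minimal sketch of one possible answer to the X, Y, Z question, assuming probabilities 1/2, 1/4, 1/4 (which is what "twice as likely" implies when there are only these three symbols). The code table `CODE` is an illustrative choice, not taken from the slides: the frequent symbol gets the short code, and no code is a prefix of another, so a bit stream can be decoded without separators.

```python
# Hypothetical variable-length (prefix-free) code: shorter code for the likelier symbol.
CODE = {"X": "0", "Y": "10", "Z": "11"}
PROB = {"X": 0.5, "Y": 0.25, "Z": 0.25}


def encode(message):
    return "".join(CODE[sym] for sym in message)


def decode(bits):
    inverse = {code: sym for sym, code in CODE.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:  # a complete code word has been read
            out.append(inverse[current])
            current = ""
    return "".join(out)


# Expected number of bits per symbol under the assumed probabilities.
average_bits = sum(PROB[s] * len(CODE[s]) for s in CODE)

print(encode("XYXZ"))
print(decode(encode("XYXZ")))
print(average_bits)
```

Under these assumptions the average is 1.5 bits per symbol, compared with 2 bits per symbol for a fixed-length code over three symbols.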

Fixed-Length Encodings

Length of a fixed-length code
    use as many bits as needed to unambiguously represent all symbols
        1 bit suffices for 2 symbols
        2 bits suffice for …?
        n bits suffice for …?
        how many bits needed for M symbols?

    ex. Decimal digits 10 = {0,1,2,3,4,5,6,7,8,9}
        4-bit binary code: 0000 to 1001
        $\lceil \log_2(10) \rceil = \lceil 3.322 \rceil = 4$ bits

    ex. ~84 English characters = {A-Z (26), a-z (26), 0-9 (10), punctuation (8), math (9), financial (5)}
        7-bit ASCII (American Standard Code for Information Interchange)
        $\lceil \log_2(84) \rceil = \lceil 6.392 \rceil = 7$ bits
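The two calculations above can be checked with a short Python sketch. The function name `bits_needed` is an assumption for illustration; it uses integer arithmetic rather than a floating-point logarithm so that exact powers of two are handled without rounding concerns.

```python
def bits_needed(m):
    """Smallest n such that 2**n >= m, i.e. ceil(log2(m)) for m >= 1."""
    if m < 1:
        raise ValueError("need at least one symbol")
    # (m - 1) written in binary has exactly ceil(log2(m)) digits.
    return (m - 1).bit_length()


for m in (2, 4, 10, 84, 256):
    print(m, "symbols ->", bits_needed(m), "bits")
```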

ASCII Table

Unicode

ASCII is biased towards western languages, esp. English

In fact, many more than 256 chars in common use:
    â, m, ö, ñ, è, ¥, 揗, 敇, 횝, カ, ℵ, ℷ, ж, ค

Unicode is a worldwide standard that supports all languages, special characters, classic, and arcane
    Several encoding variants, e.g. UTF-8 (8-bit code units, 1 to 4 bytes per character) and UTF-16 (16-bit code units)

UTF-8 byte layouts (first byte on the left):

    ASCII equiv range:   0xxxxxxx
    16-bit Unicode:      110yyyyx 10xxxxxx
    24-bit Unicode:      1110zzzz 10zyyyyx 10xxxxxx
    32-bit Unicode:      11110www 10wwzzzz 10zyyyyx 10xxxxxx
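The byte patterns can be observed directly with Python's built-in `str.encode`. This is a small sketch; the helper name `utf8_bits` and the four sample characters are chosen only for illustration. The leading bits of the first byte (0, 110, 1110, 11110) show how many bytes the character occupies, and every continuation byte starts with 10.

```python
def utf8_bits(ch):
    """Return the UTF-8 encoding of one character as a list of 8-bit strings."""
    return [format(byte, "08b") for byte in ch.encode("utf-8")]


for ch in ["A", "ñ", "カ", "😄"]:
    print(ch, utf8_bits(ch))
```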

Encoding Positive Integers

How to encode positive numbers in binary?
    Each number becomes a sequence of 0s and 1s
    Each bit is assigned a weight
    Weights are increasing powers of 2, right to left
    The value of an n-bit number encoded in this fashion is given by the following formula:

    $v = \sum_{i=0}^{n-1} 2^i b_i$

    where $b_i$ is the bit in position $i$, counting from 0 at the right

Example:

    weight:  2^11  2^10  2^9  2^8  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
    bit:       0     1    1    1    1    1    0    1    0    0    0    0

      2^4  =   16
    + 2^6  =   64
    + 2^7  =  128
    + 2^8  =  256
    + 2^9  =  512
    + 2^10 = 1024
    -------------
             $2000_{10}$
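The weighted sum can be written out as a short Python sketch (the function name `binary_to_int` is illustrative). Each bit is multiplied by the power of 2 for its position, counting from 0 at the right, and the products are added; Python's built-in `int(bits, 2)` computes the same value.

```python
def binary_to_int(bits):
    """Sum b_i * 2**i over all bit positions i, with position 0 at the right."""
    return sum(int(b) << i for i, b in enumerate(reversed(bits)))


value = binary_to_int("011111010000")
print(value)
print(value == int("011111010000", 2))
```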

Some Bit Tricks

Get used to working in binary
    Specifically for Comp 411, but it will be helpful throughout your career as a computer scientist

Here are some helpful guides

1. Memorize the first 10 powers of 2

    2^0 = 1      2^5 = 32
    2^1 = 2      2^6 = 64
    2^2 = 4      2^7 = 128
    2^3 = 8      2^8 = 256
    2^4 = 16     2^9 = 512

More Tricks with Bits

2. Memorize the prefixes for powers of 2 whose exponents are multiples of 10

    2^10 = Kilo (1024)
    2^20 = Mega (1024*1024)
    2^30 = Giga (1024*1024*1024)
    2^40 = Tera (1024*1024*1024*1024)
    2^50 = Peta (1024*1024*1024*1024*1024)
    2^60 = Exa  (1024*1024*1024*1024*1024*1024)

Even More Tricks with Bits

3. When you convert a binary number to decimal, first break it down into clusters of 10 bits, starting from the right.

4. Then compute the value of the leftmost remaining bits (here, 1) and find the appropriate prefix (here, Giga). Often this is sufficient.

5. Compute the value of each remaining 10-bit cluster and add it in.

Example:

    01   0000000011   0000001100   0000101000
     1            3           12           40

    = 1 Giga + 3 Mega + 12 Kilo + 40
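Steps 3 to 5 can be sketched in Python as follows. The function name `ten_bit_clusters` is illustrative, and the sample number is assembled from the clusters shown in the example, on the assumption that `01` is the leftmost group.

```python
def ten_bit_clusters(bits):
    """Split a bit string into 10-bit clusters from the right and return their values."""
    clusters = []
    while bits:
        clusters.append(int(bits[-10:], 2))  # value of the rightmost cluster
        bits = bits[:-10]                    # drop it and continue leftwards
    return clusters[::-1]                    # most significant cluster first


number = "01" + "0000000011" + "0000001100" + "0000101000"
clusters = ten_bit_clusters(number)
print(clusters)

# Recombine: each step to the left multiplies the weight by 2**10 = 1024.
total = 0
for c in clusters:
    total = total * 1024 + c
print(total, total == int(number, 2))
```

For a quick estimate, the leading cluster and its prefix (about 1 Giga here) are usually enough; the remaining clusters only refine the answer.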

Other Helpful Clusterings

Sometimes convenient to use other number “bases”
    often bases are powers of 2: e.g., 8, 16
    allows bits to be clustered into groups
    base 8 is called octal: groups of 3 bits
    Convention: lead the number with a 0

    bits:    011  111  010  000
    octal:    3    7    2    0

    03720 = $2000_{10}$

Octal - base 8

    000 - 0     100 - 4
    001 - 1     101 - 5
    010 - 2     110 - 6
    011 - 3     111 - 7

      0*8^0 =    0
    + 2*8^1 =   16
    + 7*8^2 =  448
    + 3*8^3 = 1536
    --------------
              $2000_{10}$
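The grouping into 3-bit clusters can be sketched in Python; the helper name `to_octal` is illustrative. One caveat: the leading-0 convention used here is the C-style one, whereas Python writes octal literals with a `0o` prefix, as the built-in `oct` shows.

```python
def to_octal(bits):
    """Convert a bit string to octal digits by grouping bits in threes."""
    # Pad on the left so the length is a multiple of 3.
    padded = bits.zfill(-(-len(bits) // 3) * 3)
    groups = [padded[i:i + 3] for i in range(0, len(padded), 3)]
    return "".join(str(int(g, 2)) for g in groups)


digits = to_octal("011111010000")
print(digits)          # octal digits, without any prefix
print(int(digits, 8))  # back to decimal
print(oct(2000))       # Python's own notation, with the 0o prefix
```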

One Last Clustering

Base 16 is most common!
    called hexadecimal or hex: groups of 4 bits
    hex ‘digits’ (“hexits”): 0-9, and A-F
    each hexit position represents a power of 16
    Convention: lead with 0x

    bits:   0111  1101  0000
    hex:      7     d     0

    0x7d0 = $2000_{10}$

Hexadecimal - base 16

    0000 - 0     1000 - 8
    0001 - 1     1001 - 9
    0010 - 2     1010 - a
    0011 - 3     1011 - b
    0100 - 4     1100 - c
    0101 - 5     1101 - d
    0110 - 6     1110 - e
    0111 - 7     1111 - f

       0*16^0 =    0
    + 13*16^1 =  208
    +  7*16^2 = 1792
    ----------------
                $2000_{10}$
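The positional expansion for hex can be sketched the same way. The function name `hex_to_int` is illustrative; it accumulates the value digit by digit from the left, which gives the same result as summing each hexit times its power of 16. The built-ins `hex` and `format` are shown for comparison.

```python
def hex_to_int(text):
    """Evaluate a string of hexits (no 0x prefix) as a base-16 number."""
    digits = "0123456789abcdef"
    value = 0
    for ch in text.lower():
        # Shift everything read so far up by one hexit position, then add the new hexit.
        value = value * 16 + digits.index(ch)
    return value


print(hex_to_int("7d0"))
print(hex(2000))
print(format(0b011111010000, "x"))
```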

Signed-Number Representations

What about signed numbers?
    one obvious idea: use an extra bit to encode the sign
    convention: the most significant bit (leftmost) is used for the sign
    called the SIGNED MAGNITUDE representation

    S  2^10 2^9 2^8 2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0

    0  11111010000  =   2000
    1  11111010000  =  -2000

Signed-Number Representations

The Good: Easy to negate, find absolute value

The Bad:
    add/subtract is complicated
        depends on the signs
        4 different cases!
    two different ways of representing a 0
    it is not used that frequently in practice
        except in floating-point numbers
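A minimal sketch of signed magnitude in Python, assuming a 12-bit word (1 sign bit plus 11 magnitude bits) to match the 2000 example. The names `sm_encode` and `sm_decode` are illustrative. The sketch shows both the good (negation only flips the leftmost bit) and the bad (two bit patterns for zero).

```python
def sm_encode(value, width=12):
    """Encode an integer as sign bit + magnitude, `width` bits in total."""
    magnitude = abs(value)
    if magnitude >= 1 << (width - 1):
        raise ValueError("magnitude does not fit in the available bits")
    sign = "1" if value < 0 else "0"
    return sign + format(magnitude, f"0{width - 1}b")


def sm_decode(bits):
    """Decode a sign-magnitude bit string back to an integer."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude


print(sm_encode(2000))
print(sm_encode(-2000))
# Two different patterns both decode to zero:
print(sm_decode("000000000000"), sm_decode("100000000000"))
```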
