
Office hours shifted: 2-4 Tuesdays
Summary & Discussion of Gell-Mann
Definitions & Examples

Feb 23, 2016

Transcript
Page 1

• Office hours shifted: 2-4 Tuesdays
• Summary & Discussion of Gell-Mann
• Definitions & Examples
  – Shannon Information
  – Mutual Information
  – Kolmogorov Complexity (AIC)
• Turing Machine
  – Effective Complexity

Page 2

Shannon Information

• Shannon entropy H measures basic information capacity:
  – For a random variable X with a probability mass function p(x), the entropy of X is defined as:

    $H(X) = -\sum_{x} p(x)\,\log_2 p(x)$

  – Entropy is measured in bits.
  – H measures the average uncertainty in the random variable.
• Example 1:
  – Consider a random variable with a uniform distribution over 32 outcomes.
  – To identify an outcome, we need a label that takes on 32 different values, e.g., 5-bit strings.

    $H(X) = -\sum_{i=1}^{32} p(i)\log p(i) = -\sum_{i=1}^{32} \tfrac{1}{32}\log_2\tfrac{1}{32} = \log_2 32 = 5 \text{ bits}$
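A minimal sketch of the entropy formula in C (my own illustration, not from the slides; the helper name `entropy` is hypothetical), reproducing Example 1:

```c
#include <math.h>
#include <stdio.h>

/* Shannon entropy H(X) = -sum_x p(x) log2 p(x), in bits. */
double entropy(const double *p, int n) {
    double h = 0.0;
    for (int i = 0; i < n; i++)
        if (p[i] > 0.0)            /* treat 0 * log 0 as 0 */
            h -= p[i] * log2(p[i]);
    return h;
}

int main(void) {
    /* Example 1: uniform distribution over 32 outcomes. */
    double p[32];
    for (int i = 0; i < 32; i++)
        p[i] = 1.0 / 32.0;
    printf("H = %g bits\n", entropy(p, 32));  /* prints: H = 5 bits */
    return 0;
}
```

(Compile with `-lm`.)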

Page 3

What is a Random Variable?

• A function defined on a sample space.
  – Should be called a "random function."
  – The independent variable is a point in the sample space (e.g., the outcome of an experiment).
• A function of outcomes, rather than a single given outcome.
• Probability distribution of the random variable X:

  $P\{X = x_j\} = f(x_j), \quad j = 1, 2, \ldots$

• Example:
  – Toss 3 fair coins.
  – Let X denote the number of heads appearing.
  – X is a random variable taking on one of the values (0, 1, 2, 3).
  – P{X=0} = 1/8; P{X=1} = 3/8; P{X=2} = 3/8; P{X=3} = 1/8.
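As a small sketch (my own, not from the slides), the coin-toss distribution can be checked by enumerating the sample space directly:

```c
#include <stdio.h>

int main(void) {
    int count[4] = {0, 0, 0, 0};

    /* The sample space: all 2^3 = 8 outcomes of 3 fair coin tosses,
       encoded as 3-bit patterns (bit = 1 means heads). */
    for (int outcome = 0; outcome < 8; outcome++) {
        int heads = 0;
        for (int coin = 0; coin < 3; coin++)
            heads += (outcome >> coin) & 1;
        count[heads]++;   /* X maps each outcome to its head count */
    }

    /* Prints P{X=0} = 1/8, P{X=1} = 3/8, P{X=2} = 3/8, P{X=3} = 1/8. */
    for (int j = 0; j <= 3; j++)
        printf("P{X=%d} = %d/8\n", j, count[j]);
    return 0;
}
```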

Page 4

• Example 2:
  – A horse race with 8 horses competing.
  – The win probabilities of the 8 horses are:

    $\left(\tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{8}, \tfrac{1}{16}, \tfrac{1}{64}, \tfrac{1}{64}, \tfrac{1}{64}, \tfrac{1}{64}\right)$

  – Calculate the entropy H of the horse race:

    $H(X) = -\tfrac{1}{2}\log\tfrac{1}{2} - \tfrac{1}{4}\log\tfrac{1}{4} - \tfrac{1}{8}\log\tfrac{1}{8} - \tfrac{1}{16}\log\tfrac{1}{16} - 4\cdot\tfrac{1}{64}\log\tfrac{1}{64} = 2 \text{ bits}$

  – Suppose we wish to send a (short) message to another person indicating which horse won the race.
  – We could send the index of the winning horse (3 bits).
  – Alternatively, we could use the following set of labels:
    • 0, 10, 110, 1110, 111100, 111101, 111110, 111111.
    • The average description length is 2 bits (instead of 3).
    • This is a Huffman code: a variable-length "prefix-free" code for maximum lossless compression.
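A short sketch (my own illustration) verifying both numbers from Example 2: the entropy of the race and the average length of the prefix-free labels both come out to 2 bits:

```c
#include <math.h>
#include <stdio.h>

int main(void) {
    /* Win probabilities of the 8 horses. */
    double p[8]   = {1.0/2, 1.0/4, 1.0/8, 1.0/16,
                     1.0/64, 1.0/64, 1.0/64, 1.0/64};
    /* Bit lengths of the labels 0, 10, 110, 1110, 111100, ... */
    int    len[8] = {1, 2, 3, 4, 6, 6, 6, 6};

    double h = 0.0, avg = 0.0;
    for (int i = 0; i < 8; i++) {
        h   -= p[i] * log2(p[i]);   /* entropy contribution */
        avg += p[i] * len[i];       /* expected label length */
    }
    printf("H = %g bits, average label length = %g bits\n", h, avg);
    return 0;
}
```

The label lengths match the negative log-probabilities exactly, which is why this code achieves the entropy lower bound.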

Page 5

• More generally, the entropy of a random variable is a lower bound on the average number of bits required to represent the random variable.
• Shannon information is the amount of "surprise" when you read a message.
• It is the amount of uncertainty you have before you read the message: information measures how much you've reduced that uncertainty by reading the message.
• The uncertainty (complexity) of a random variable can be extended to define the descriptive complexity of a single string.
  – E.g., Kolmogorov (or algorithmic) complexity is the length of the shortest computer program that prints out the string.
• Entropy is the uncertainty of a single random variable.
• Conditional entropy is the entropy of a random variable given another random variable.

Page 6

Mutual Information

• Measures the amount of information that one random variable contains about another random variable.
  – Mutual information is a measure of the reduction of uncertainty due to another random variable.
  – That is, mutual information measures the dependence between two random variables.
  – It is symmetric in X and Y, and is always non-negative.
• Recall: the entropy of a random variable X is H(X).
• The conditional entropy of a random variable X given another random variable Y is H(X | Y).
• The mutual information of two random variables X and Y is:

  $I(X;Y) = H(X) - H(X \mid Y) = \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}$

[Figure: Venn diagram relating H(X), H(Y), and H(X|Y).]
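A small sketch (my own example values, not from the slides) that computes I(X;Y) from a joint distribution by forming the marginals and applying the sum above:

```c
#include <math.h>
#include <stdio.h>

#define NX 2
#define NY 2

int main(void) {
    /* An illustrative joint distribution p(x,y). */
    double pxy[NX][NY] = {{0.4, 0.1},
                          {0.1, 0.4}};

    /* Marginals p(x) and p(y). */
    double px[NX] = {0}, py[NY] = {0};
    for (int x = 0; x < NX; x++)
        for (int y = 0; y < NY; y++) {
            px[x] += pxy[x][y];
            py[y] += pxy[x][y];
        }

    /* I(X;Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) ). */
    double mi = 0.0;
    for (int x = 0; x < NX; x++)
        for (int y = 0; y < NY; y++)
            if (pxy[x][y] > 0.0)
                mi += pxy[x][y] * log2(pxy[x][y] / (px[x] * py[y]));

    printf("I(X;Y) = %.4f bits\n", mi);   /* about 0.278 bits here */
    return 0;
}
```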

Page 7

Algorithmic Complexity (AIC)
(also known as Kolmogorov-Chaitin complexity)

• Kolmogorov-Chaitin complexity, or Algorithmic Information Content, K(x) or K_U(x), is the length, in bits, of the smallest program that, when run on a Universal Turing Machine, outputs (prints) x and then halts.
• Example: What is K(x) where x is the first 10 even natural numbers? Where x is the first 5 million even natural numbers?
• Possible representations:
  – 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, ... (2n - 2): K(x) = O(n log n) bits
  – for (j = 0; j < n; j++) printf("%d\n", j * 2); : K(x) = O(log n) bits
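The slide's one-liner, filled out as a complete program (a sketch; the point is that the program text grows only with the digits of n, so K(x) = O(log n)):

```c
#include <stdio.h>

int main(void) {
    /* Prints the first n even natural numbers: 0, 2, ..., 2n - 2.
       Changing n from 10 to 5000000 adds only a few characters,
       so the description length grows as O(log n). */
    long n = 10;
    for (long j = 0; j < n; j++)
        printf("%ld\n", j * 2);
    return 0;
}
```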

Page 8

Two problems with AIC

– Calculation of K(x) depends on the machine we have available (e.g., what if we have a machine with an instruction “print the first 10 even natural numbers”?)

– Determining K(x) for arbitrary x is uncomputable

Page 9

AIC cont.

• AIC formalizes what it means for a set of numbers to be compressible or incompressible.
  – Data that are redundant can be described more easily and have lower AIC.
  – Data that have no clear pattern and no easy algorithmic description have high AIC.
• Random strings are incompressible, and therefore contain no regularities to compress:
  – K(x) = | Print(x) |
• Implication: the more random a system, the greater its AIC.
• Contrast with statistical simplicity:
  – Random strings are simple because you can approximate them statistically.
  – Coin tosses, random walks, Gaussian (normal) distributions.
  – You can compress random numbers with statistical descriptions and only a few parameters.

Page 10

Gell-Mann's Effective Complexity

• The length of the shortest description of a set's regularities.
• EC(x) = K(r), where r is the set of regularities in x and the Kolmogorov complexity K(r) is the length of a concise description of that set.
• Highest for entities that are not strictly regular or random.

[Figure: Effective Complexity and Algorithmic Complexity vs. randomness. Algorithmic complexity rises monotonically with randomness, while effective complexity peaks in between. Left end: low Shannon info, high compressibility, orderly. Right end: high Shannon info, low compressibility, random.]

Page 11

– Gell-Mann suggests one formal way to identify regularities:
  • Determine the mutual AIC between parts of the string.
  • If x = [x1, x2], the mutual AIC is K(x1) + K(x2) - K(x).
  • That is, the sum of the AICs of the parts minus the AIC of the whole.
  • E.g., 10010 10011 10010: the whole has more regularity than the sum of the regularities in the parts.
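K(x) itself is uncomputable, but any real compressor gives a computable upper bound, so the mutual-AIC idea can at least be illustrated. The sketch below is my own (not from the slides), uses zlib, and claims nothing rigorous about K: it approximates K by compressed length and evaluates C(x1) + C(x2) - C(x):

```c
#include <stdio.h>
#include <string.h>
#include <zlib.h>

/* Approximate K(s) from above by the zlib-compressed length of s. */
static long clen(const char *s) {
    uLong  srcLen = (uLong)strlen(s);
    uLongf dstLen = compressBound(srcLen);
    Bytef  dst[1024];                 /* ample for these short strings */
    compress(dst, &dstLen, (const Bytef *)s, srcLen);
    return (long)dstLen;
}

int main(void) {
    /* Illustrative parts (my own example strings). */
    const char *x1 = "10010 10011 10010";
    const char *x2 = "10010 10010 10011";
    char x[64];
    snprintf(x, sizeof x, "%s%s", x1, x2);   /* the whole, x = [x1,x2] */

    /* Estimated mutual AIC: positive when the parts share regularities
       that compressing the whole can exploit. */
    printf("C(x1) + C(x2) - C(x) ~ %ld bytes\n",
           clen(x1) + clen(x2) - clen(x));
    return 0;
}
```

For strings this short the compressor's fixed overhead dominates, so the estimate is only suggestive; the approach becomes meaningful on longer data. (Link with `-lz`.)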

Page 12

Total Information

– An alternative approach appears in Gell-Mann & Lloyd 1998.
– EC(x) = K(E), where E is the set of entities of which x is a typical member.
– Then K(E) is the length of the shortest program required to specify the members of the set of which x is a typical member.
– Effective complexity measures knowledge: the extent to which the entity is nonrandom and predictable.
– Total Information is the effective complexity, K(E), plus the Shannon information (peculiarities, randomness):

  TI(x) = K(E) + H(x)

– There is a tradeoff between the effective complexity (how complete a description of the regularities) and the remaining randomness.
– Ex: 10010 10011 10010 10010 10010 (the repeated block 10010 is the regularity; the lone 10011 is the residual randomness).

Page 13

Language Complexity

• How complex is the English (or Spanish or Chinese or Thai or C programming or Java...) language?
• What is the Shannon information content of a message in this language, as a function of the message length?
• What is the effective complexity (the AIC of the regularities)?
• What is the total information content?
• How well do these measures approximate the complexity you perceive in these languages?

Page 14

What is the information content, AIC, Effective Complexity & logical depth of this string?

04134330211314737138689744023948013817165984855189815134408627142027932522312442988890890859944935463236713411532481714219947455 6443658237932020095610583305754586176522220703854106467494942849 8145339172620056875566595233987560382563722564800409510712838906 1184470277585428541980111344017500242858538249833571552205223608 7250291678860362674527213399057131606875345083433934446103706309 4520191158769724322735898389037949462572512890979489867683346116 2688911656312347446057517953912204556247280709520219819909455858 1946136877445617396074115614074243754435499204869180982648652368 4387027996490173977934251347238087371362116018601281861020563818 1835409759847796417390032893617143215987824078977661439139576403 7760537119096932066998361984288981837003229412030210655743295550 3888458497370347275321219257069584140746618419819610061296401614 8771294441590140546794180019813325337859249336588307045999993837 5411726563553016862529032210862320550634510679399023341675

Page 15

Summary of Complexity Measures

• Information-theoretic methods:
  – Shannon entropy
  – Algorithmic complexity
  – Mutual information
• Effective complexity:
  – Neither regular nor random
• Computational complexity:
  – How many resources does it take to compute a function?
• The language/machine hierarchy:
  – How complex a machine is needed to compute a function?
• Logical depth:
  – Run-time of the shortest program that generates the phenomenon and halts.
• Asymptotic behavior of dynamical systems:
  – Fixed points, limit cycles, chaos.
  – Wolfram's CA classification: the outcome of a complex CA cannot be predicted any faster than it can be simulated.

Page 16

Frozen Accidents & Sensitive Dependence on Initial Conditions

For Want of a Nail

For want of a nail the shoe was lost.
For want of a shoe the horse was lost.
For want of a horse the rider was lost.
For want of a rider the battle was lost.
For want of a battle the kingdom was lost.
And all for the want of a horseshoe nail.

Page 17

Hierarchy, Interactions, Frozen Accidents

HeLa: An exponential that has lasted forever

Under the microscope, a cell looks a lot like a fried egg: It has a white (the cytoplasm) that's full of water and proteins to keep it fed, and a yolk (the nucleus) that holds all the genetic information that makes you you. The cytoplasm buzzes like a New York City street. It's crammed full of molecules and vessels endlessly shuttling enzymes and sugars from one part of the cell to another, pumping water, nutrients, and oxygen in and out of the cell. All the while, little cytoplasmic factories work 24/7, cranking out sugars, fats, proteins, and energy to keep the whole thing running and feed the nucleus. The nucleus is the brains of the operation; inside every nucleus within each cell in your body, there's an identical copy of your entire genome. That genome tells cells when to grow and divide and makes sure they do their jobs, whether that's controlling your heartbeat or helping your brain understand the words on this page.

Defler paced the front of the classroom telling us how mitosis — the process of cell division — makes it possible for embryos to grow into babies, and for our bodies to create new cells for healing wounds or replenishing blood we've lost. It was beautiful, he said, like a perfectly choreographed dance.

All it takes is one small mistake anywhere in the division process for cells to start growing out of control, he told us. Just one enzyme misfiring, just one wrong protein activation, and you could have cancer. Mitosis goes haywire, which is how it spreads.

"We learned that by studying cancer cells in culture," Defler said. He grinned and spun to face the board, where he wrote two words in enormous print: HENRIETTA LACKS.

I had the idea that I'd write a book that was a biography of both the cells and the woman they came from — someone's daughter, wife, and mother.

From The Immortal Life of Henrietta Lacks. http://www.npr.org/templates/story/story.php?storyId=123232331

Page 18

Defining Complexity
Suggested References

• Computational Complexity by Papadimitriou. Addison-Wesley (1994).
• Elements of Information Theory by Cover and Thomas. Wiley (1991).
• Kauffman, At Home in the Universe (1996) and Investigations (2002).
• Per Bak, How Nature Works: The Science of Self-Organized Criticality (1996).
• Gell-Mann, The Quark and the Jaguar (1994).
• Ay, Müller & Szkoła, Effective Complexity and its Relation to Logical Depth, arXiv (2008).

Page 19

Mitchell Ch. 3: Information

• CAS process information
  – CAS are computers
• Energy is conserved
• Entropy increases: the arrow of time
• Statistical mechanics bridges classical Newtonian physics to thermodynamics: S = k log W
• Maxwell's Demon
  – Information costs energy
• Real-world information is analyzed for meaning, processed for some outcome, and dependent on context

Page 20

Mitchell Ch. 4: Gödel & Turing

• Gödel: mathematics is incomplete
  – "This statement is false."
• Turing: mathematics is undecidable
• A Turing Machine defines a definite procedure
• UTM: the tape contains both the input I and the machine (or program) M
• I could be the M of another machine (it could even be the same M)

Page 21

Turing Machines

1. A tape
2. A head that can read/write and move left/right
3. An instruction table
4. A state register

The tape is infinite; everything else is finite and discrete.
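A minimal sketch in C (my own illustration, not from the slides) wiring those four parts together; the hard-coded machine writes 1s and moves right until it reads a pre-existing 1, then halts:

```c
#include <stdio.h>

#define TAPE 64   /* "infinite" tape, truncated for the sketch */

/* One instruction: (state, symbol read) -> write, move, next state. */
struct rule { int write; int move; int next; };  /* move: +1 right, -1 left */

int main(void) {
    int tape[TAPE] = {0};   /* 1. the tape */
    int head = 0;           /* 2. the head */
    int state = 0;          /* 4. the state register (0 = run, 1 = halt) */

    /* 3. the instruction table, indexed by [state][symbol read]. */
    struct rule table[1][2] = {{
        {1, +1, 0},   /* state 0, reading 0: write 1, move right, stay */
        {1, +1, 1},   /* state 0, reading 1: write 1, move right, halt */
    }};

    tape[5] = 1;  /* pre-mark a cell so the machine eventually halts */

    while (state != 1 && head >= 0 && head < TAPE) {
        struct rule r = table[state][tape[head]];
        tape[head] = r.write;
        head += r.move;
        state = r.next;
    }

    for (int i = 0; i < 8; i++)
        printf("%d", tape[i]);   /* prints 11111100 */
    printf("\n");
    return 0;
}
```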

Page 22

Proof by Contradiction that Math/Logic/Computation Is Undecidable

• Turing statement (to be contradicted): a TM, H, given input I and machine M, will halt in finite time and return Yes if M will halt on I, or No if M will not halt on I.

  H(M, I) = 1 if M halts on I
  H(M, I) = 0 if M does not halt on I

• This is equivalent to saying we can design an infinite-loop detector H.
• Why can't H just run M on I? (If M never halts, H would wait forever and never get to return No.)

Page 23

Inspiration from Gödel

• Create H' that computes H(M, M), except:
  – H'(M, M) does not halt if M halts on M
  – H'(M, M) halts if M does not halt on M
• What does H' do when it is its own input?
  – H'(H', H') halts if H' doesn't halt on its own input;
  – H' doesn't halt if H' halts on its own input. CONTRADICTION
• Gödel encodes logical/mathematical statements so they talk about themselves.
• Turing encodes logic in TMs and runs TMs on themselves.
• Both demonstrate that math/logic is incomplete and undecidable.
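The diagonal argument reads naturally as C-flavored pseudocode (a sketch only: the oracle H is hypothetical and provably cannot be implemented, which is exactly the contradiction):

```c
/* Hypothetical oracle (cannot exist): returns 1 if program M halts
   on input I, and 0 otherwise. */
int H(const char *M, const char *I);

/* H': run H on (M, M), then do the opposite of what H predicts. */
void H_prime(const char *M) {
    if (H(M, M))
        for (;;) ;   /* M halts on itself -> H' loops forever */
    /* else: M loops on itself -> H' halts */
}

/* Feed H' its own source text, HP:
   - if H_prime(HP) halts, then H(HP, HP) == 0, i.e., H predicted it
     would not halt;
   - if H_prime(HP) loops, then H(HP, HP) == 1, i.e., H predicted it
     would halt.
   Either way H is wrong, so no such H can exist. */
```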

Page 24

Page 25

My View of CAS

• Incomplete, undecidable
  – We know it when we see it
• Process information
• Interactions between components
• Hierarchy of components
• Result from frozen accidents & the arrow of time
  – Largely from evolution
  – Exception: climate & weather (memory at different scales)
• Predictable at some scales, sometimes

Page 26

Logical Depth

• Bennett 1986; 1990:
  – The logical depth of x is the run time of the shortest program that will cause a UTM to produce x and then halt.
  – Logical depth is not a measure of randomness; it is small both for trivially ordered strings and for random strings.
• Drawbacks:
  – Uncomputable.
  – Loses the ability to distinguish between systems that can be described by computational models less powerful than Turing Machines (e.g., finite-state machines).
• Ay et al. 2008: a recently proposed proof that strings with high effective complexity also have high logical depth, and strings with low effective complexity have small logical depth.