A Numerical Method for the Evaluation of Kolmogorov Complexity, An alternative to lossless compression algorithms

Jan 12, 2015

Hector Zenil

We present a novel alternative method (other than using compression algorithms) to approximate the algorithmic complexity of a string by calculating its algorithmic probability and applying Chaitin-Levin's coding theorem.
Page 1: A Numerical Method for the Evaluation of Kolmogorov Complexity, An alternative to lossless compression algorithms

A Numerical Method for the Evaluation of Kolmogorov Complexity

Hector Zenil

Amphithéâtre Alan M. Turing
Laboratoire d’Informatique Fondamentale de Lille

(UMR CNRS 8022)

Hector Zenil (LIFL) A Numerical Method for the Evaluation of Kolmogorov Complexity 1 / 39

Page 2

Foundational Axis

As pointed out by Greg Chaitin (in his report on H. Zenil’s thesis):

The theory of algorithmic complexity is of course now widely accepted, but was initially rejected by many because of the fact that algorithmic complexity is on the one hand uncomputable and on the other hand dependent on the choice of universal Turing machine.

This last drawback is especially restrictive for real-world applications because the dependency is especially strong for short strings, and a solution to this problem is at the core of this work.


Page 3

Foundational Axis (cont.)

The foundational point of departure of the thesis is an apparent contradiction, pointed out by Greg Chaitin (same thesis report):

... the fact that algorithmic complexity is extremely, dare I say violently, uncomputable, but nevertheless often irresistible to apply ...


Page 4

Algorithmic Complexity

Foundational Notion

A string is random if it is hard to describe.
A string is not random if it is easy to describe.

Main Idea

The theory of computation replaces descriptions with programs. It constitutes the framework of algorithmic complexity:

description ⇐⇒ computer program


Page 5

Algorithmic Complexity (cont.)

Definition

[Kolmogorov(1965), Chaitin(1966)]

K(s) = min{|p| : U(p) = s}

The algorithmic complexity K(s) of a string s is the length of the shortest program p that produces s when run on a universal Turing machine U.

The formula conveys the following idea: a string with low algorithmic complexity is highly compressible, as the information it contains can be encoded in a program much shorter than the string itself.
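In practice, this compressibility reading of K(s) is exactly what lossless-compression approximations exploit: the compressed size of s is an upper bound (up to format overhead) on a description length for s. A minimal sketch in Python, using zlib purely as an illustrative stand-in for a shortest program:

```python
import random
import zlib

def compressed_size(s: bytes) -> int:
    # Size of the zlib-compressed form of s: an upper-bound proxy
    # (up to the compressor's format overhead) for K(s).
    return len(zlib.compress(s, 9))

structured = b"01" * 500                  # low K: "500 times 01"
rng = random.Random(0)                    # fixed seed for reproducibility
random_like = bytes(rng.getrandbits(8) for _ in range(1000))

# The regular string compresses far below its length;
# the pseudo-random one barely compresses at all.
```

Note that compression can only ever certify non-randomness (an upper bound on K); it can never certify that a string is random, in line with the noncomputability results below.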


Page 6

Algorithmic Randomness

Example

The string 010101010101010101 has low algorithmic complexity because it can be described as 9 times 01, and no matter how long it grows, if the pattern repeats, the description (k times 01) grows only by about log(k), remaining much shorter than the string itself.

Example

The string 010010110110001010 has high algorithmic complexity because it doesn’t seem to allow a (much) shorter description than the string itself, so a shorter description may not exist.


Page 7

Example of an evaluation of K

The string 01010101...01 can be produced by the following program:

Program A:
1: n := 0
2: Print n
3: n := n + 1 mod 2
4: Goto 2

The length of A (in bits) is an upper bound of K(010101...01).

Connections to predictability: program A trivially allows a shortcut to the value of an arbitrary digit through the following function f(n):

if n = 2m then f(n) = 1, f(n) = 0 otherwise.
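The shortcut f(n) can be written out directly (a trivial sketch; here n is taken as the 1-indexed position in the printed sequence 0101...):

```python
def f(n: int) -> int:
    # Digit at position n of the sequence 0101...: positions of the
    # form n = 2m (even positions) hold a 1, odd positions hold a 0.
    return 1 if n % 2 == 0 else 0
```

So the n-th digit is predictable in constant time, without running program A for n steps, which is the sense of the simple/predictable correspondence below.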

Predictability characterization (Schnorr) [Downey(2010)]

simple ⇐⇒ predictable
random ⇐⇒ unpredictable


Page 8

Noncomputability of K

The main drawback of K is that it is not computable and thus can only be approximated in practice.

Important

No algorithm can tell whether a program p generating s is the shortest (due to the undecidability of the halting problem for Turing machines).

No absolute notion of randomness

It is impossible to prove that a program p generating s is the shortest possible, which also implies that if a program is about the length of the original string, one cannot tell whether a shorter program producing s exists. Hence, there is no way to declare a string truly algorithmically random.


Page 9

Structure vs. randomness

Formal notion of structure

One can, however, exhibit a program generating s that is (much) shorter than s itself. So even though one cannot tell whether a string is random, one can declare s not random if a program generating s is (much) shorter than the length of s.

As a result, one can only find upper bounds of K: s cannot be more complex than the length of the shortest known program producing s.


Page 10

Most strings have maximal algorithmic complexity

Even if one cannot tell when a string is truly random, it is known that most strings cannot have much shorter generating programs, by a simple combinatorial argument:

There are exactly 2^n bit strings of length n,

But there are only 2^0 + 2^1 + 2^2 + ... + 2^(n−1) = 2^n − 1 bit strings of fewer bits. (In fact, there is at least one n-bit string that cannot be compressed even by a single bit.)

Hence, there are considerably fewer short programs than long strings.

Basic notion

One can’t pair up all n-length strings with programs of much shorter length (there simply aren’t enough short programs to encode all longer strings).
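The counting identity behind this argument can be checked directly:

```python
n = 16
strings_of_length_n = 2 ** n
strictly_shorter = sum(2 ** k for k in range(n))   # lengths 0 .. n-1

# There is always one more n-bit string than there are strictly
# shorter candidate descriptions, so by pigeonhole some n-bit
# string has no shorter description at all.
assert strictly_shorter == strings_of_length_n - 1
```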


Page 11

The choice of U matters

A major criticism brought against K is its dependence on the universal Turing machine U. From the definition:

K(s) = min{|p| : U(p) = s}

It may turn out that:

KU1(s) ≠ KU2(s) when evaluated using U1 and U2 respectively.

Basic notion

This dependency is particularly troubling for short strings, shorter for example than the length of the universal Turing machine on which K of the string is evaluated (typically on the order of hundreds of bits, as originally suggested by Kolmogorov himself).


Page 12

The Invariance theorem

A theorem guarantees that, in the long term, different algorithmic complexity evaluations converge to the same values as the length of the strings grows.

Theorem

Invariance theorem: If U1 and U2 are two (universal) Turing machines and KU1(s) and KU2(s) the algorithmic complexity of a binary string s when U1 or U2 is used respectively, there exists a constant c such that for all binary strings s:

|KU1(s) − KU2(s)| < c

(think of a compiler between two programming languages)

Yet the additive constant can be arbitrarily large, making it unstable (if not impossible) to evaluate K(s) for short strings.


Page 13

Theoretical holes

1 Finding a stable framework for calculating the complexity of short strings (one wants short strings like 000...0 to always be among the least algorithmically random, despite any choice of machine).

2 Pathological cases: theory says that a single bit has maximal algorithmic complexity, because the greatest possible compression is evidently the bit itself (paradoxically, it is the only finite string for which one can be sure it cannot be compressed further), yet one would intuitively say that a single bit is among the simplest strings.

We try to fill these holes by introducing the concept of algorithmic probability as an alternative tool for evaluating K(s).


Page 14

Algorithmic Probability

There is a measure that describes the expected output of a random program running on a universal Turing machine.

Definition

[Levin(1977)]
m(s) = Σ_{p : U(p)=s} 1/2^|p|, i.e. the sum over all programs p for which U (a prefix-free universal Turing machine) outputs the string s and halts.

m is traditionally called Levin’s semi-measure, the Solomonoff-Levin semi-measure, or the Universal Distribution [Kirchherr and Li(1997)].


Page 15

The motivation for Solomonoff-Levin’s m(s)

Borel’s typewriting monkey metaphor¹ is useful to explain the intuition behind m(s):

If you were to produce the digits of a mathematical constant like π by throwing digits at random, you would have to produce every digit of its infinite irrational decimal expansion.

If you place a monkey at a typewriter (with, say, 50 keys), the probability of the monkey typing an initial segment of 2400 digits of π by chance is 1/50^2400.

¹ Émile Borel (1913) “Mécanique Statistique et Irréversibilité” and (1914) “Le hasard”.


Page 16

The motivation for Solomonoff-Levin’s m(s) (cont.)

But if instead the monkey is placed at a computer, the chances of producing a program generating the digits of π are only 1/50^158, because it would take the monkey only 158 characters to produce the first 2400 digits of π using, for example, this C language code:

int a = 10000, b, c = 8400, d, e, f[8401], g;
main() {
    for (; b - c;) f[b++] = a / 5;
    for (; d = 0, g = c * 2; c -= 14, printf("%.4d", e + d / a), e = d % a)
        for (b = c; d += f[b] * a, f[b] = d % --g, d /= g--, --b; d *= b);
}

Implementations, in any programming language, of any of the many known formulae for π are shorter than the expansion of π, and therefore have a greater chance of being produced at random than the digits of π one by one.


Page 17

More formally said

Randomly picking a binary string s of length k among all (uniformly distributed) strings of the same length has probability 1/2^k.

But the probability of finding a binary program p producing s (upon halting), among binary programs running on a Turing machine U, is at least 1/2^|p|, such that U(p) = s (we know that such a program exists because U is a universal Turing machine).

Because |p| ≤ k (e.g. the example for π described before), a string s with a short generating program has a greater chance of being produced by p than by writing down all k bits of s one by one.

The less random a string, the more likely it is to be produced by a short program.


Page 18

Towards a semi-measure

However, there is an infinite number of programs producing s, so the probability of picking a program producing s among all possible programs is Σ_{U(p)=s} 1/2^|p|, the sum over all programs producing s running on the universal Turing machine U.

Nevertheless, for a measure to be a probability measure, the sum over all possible events should add up to 1. So Σ_{U(p)=s} 1/2^|p| cannot be a probability measure, given that an infinite number of programs contribute to the overall sum. For example, the following two programs, 1 and 2, both produce the string 0.

Program 1:
1: Print 0

and:

Program 2:
1: Print 0
2: Print 1
3: Erase the previous 1

and there are (countably) infinitely many more.


Page 19

Towards a semi-measure (cont.)

So for m(s) to be a probability measure, the universal Turing machine U has to be a prefix-free Turing machine, that is, a machine that does not accept as a valid program one that has another valid program at its beginning; e.g. program 2 starts with program 1, so if program 1 is a valid program then program 2 cannot be one.

The set of valid programs is said to form a prefix-free set, that is, no element is a prefix of any other, a property necessary to keep 0 < m(s) < 1. For more details see Kraft’s inequality [Calude(2002)].
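Kraft’s inequality is easy to check on a small prefix-free set (a toy example; these codewords are illustrative and not drawn from the slides):

```python
codewords = ["0", "10", "110", "111"]   # no codeword is a prefix of another

# Prefix-freeness check: no element starts with a different element.
assert all(not b.startswith(a)
           for a in codewords for b in codewords if a != b)

# Kraft's inequality: the implied program probabilities 2^-|w|
# sum to at most 1, which is what keeps m(s) bounded by 1.
assert sum(2 ** -len(w) for w in codewords) <= 1
```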

However, some programs halt and others don’t (actually, most do not halt), so one can only run U and see which programs produce s, contributing to the sum. It is said then that m(s) is semi-computable from below, and it is therefore considered a probability semi-measure (as opposed to a full measure).


Page 20

Some properties of m(s)

Solomonoff and Levin proved that, in the absence of any other information, m(s) dominates any other semi-measure and is therefore optimal in this sense (hence also its adjective “universal”).

On the other hand, the greatest contributor to the sum Σ_{U(p)=s} 1/2^|p| is the shortest program p, since that is where the denominator 2^|p| reaches its smallest value and therefore 1/2^|p| its greatest value. The length of the shortest program p producing s is nothing but K(s), the algorithmic complexity of s.


Page 21

The coding theorem

The greatest contributor to the sum Σ_{U(p)=s} 1/2^|p| is the shortest program p, since that is where the denominator 2^|p| reaches its smallest value and therefore 1/2^|p| its greatest value. The length of the shortest program p producing s is nothing but K(s), the algorithmic complexity of s. The coding theorem [Levin(1977), Calude(2002)] describes this connection between m(s) and K(s):

Theorem

K(s) = −log2(m(s)) + c

Notice that the coding theorem reintroduces an additive constant! One may not get rid of it, but the choices related to m(s) are much less arbitrary than picking a universal Turing machine directly for K(s).
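Plugging output frequencies into the coding theorem (and dropping the constant c) converts a distribution into complexity estimates in bits; a sketch using a few of the D(2) frequencies reported in these slides:

```python
import math

# A few frequencies from the D(2) table of halting (2,2) machines
d2 = {"0": 0.328, "1": 0.328, "00": 0.0834, "001": 0.00098}

# Coding theorem, ignoring the additive constant: K(s) ~ -log2(m(s))
approx_K = {s: -math.log2(p) for s, p in d2.items()}
```

This gives roughly 1.6 bits for “0” and about 10 bits for “001”: it is the ranking, rather than the absolute values, that the method provides.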


Page 22

An additive constant in exchange for a massive computation

The trade-off, however, is that the calculation of m(s) requires an extraordinary amount of computation.

As pointed out by J.-P. Delahaye concerning our method (Pour La Science, No. 405, July 2011 issue):

Like very small durations or lengths, low complexities are delicate to evaluate. Paradoxically, the evaluation methods demand colossal computations.

The first description of our approach was published in Greg Chaitin’s festschrift volume for his 60th birthday: J-P. Delahaye & H. Zenil, “On the Kolmogorov-Chaitin complexity for short sequences,” in C.S. Calude (ed.), Randomness and Complexity: From Leibniz to Chaitin, World Scientific, 2007.


Page 23

Calculating an experimental m

Main idea

To evaluate K(s) one can calculate m(s). m(s) is more stable than K(s) because one makes fewer arbitrary choices of a Turing machine U.

Definition

D(n) = the function that assigns to every finite binary string s the quotient:
(# of times that a machine in (n,2) produces s) / (# of machines in (n,2)).

D(n) is the probability distribution of the strings produced by all n-state 2-symbol Turing machines (denoted by (n,2)).

Examples for n = 1, n = 2 (normalized by the # of machines that halt)

D(1) = 0 → 0.5; 1 → 0.5
D(2) = 0 → 0.328; 1 → 0.328; 00 → 0.0834; . . .
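This computation can be sketched by brute force for tiny n. The machine encoding below (states 1..n plus a halt state 0, blank-0 tape, output taken as the scanned tape portion on halting, and a fixed step cap in place of the exact Busy Beaver bound) is an assumption of this sketch, not necessarily the thesis’s exact formalism, so values for n ≥ 2 may differ in detail from the slides:

```python
from collections import Counter
from itertools import product

def run(machine, max_steps):
    """Run an (n,2) machine on a blank (all-0) tape. Return the scanned
    tape portion as a string, or None if it does not halt in time."""
    tape, pos, state, scanned = {}, 0, 1, set()
    for _ in range(max_steps):
        scanned.add(pos)
        write, move, nxt = machine[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += move
        state = nxt
        if state == 0:  # halt state reached
            lo, hi = min(scanned), max(scanned)
            return "".join(str(tape.get(i, 0)) for i in range(lo, hi + 1))
    return None

def D(n, max_steps):
    """Empirical output distribution over all halting (n,2) machines."""
    keys = [(s, b) for s in range(1, n + 1) for b in (0, 1)]
    # Each transition writes 0/1, moves left/right, and picks a next
    # state (0 = halt): 4(n+1) choices per (state, symbol) pair.
    entries = list(product((0, 1), (-1, 1), range(n + 1)))
    counts = Counter()
    for rules in product(entries, repeat=len(keys)):
        out = run(dict(zip(keys, rules)), max_steps)
        if out is not None:
            counts[out] += 1
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}

d1 = D(1, max_steps=4)   # reproduces the slide's D(1): 0 -> 0.5, 1 -> 0.5
```

Under this encoding (2,2) already has 20 736 machines, and the space grows as (4(n+1))^(2n); the authors’ own enumeration of (4,2), which differs in detail, comprises the 22 039 921 152 machines reported later.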


Page 24

Calculating an experimental m (cont.)

Definition

[T. Rado(1962)]
A busy beaver is an n-state, 2-color Turing machine that writes a maximum number of 1s, or performs a maximum number of steps, before halting when started on an initially blank tape.

Given that the Busy Beaver function values are known for n-state 2-symbol Turing machines for n = 2, 3, 4, we could compute D(n) for n = 2, 3, 4.

We ran all 22 039 921 152 two-way-tape Turing machines, starting with a tape filled with 0s and 1s, in order to calculate D(4).²

Theorem

D(n) is noncomputable (by reduction to Rado’s Busy Beaver problem).

² A 9-day calculation on a single 2.26 GHz Intel Core Duo CPU.

Page 25

Complexity Tables

Table: The 22 bit strings in D(2) from the 6 088 (2,2)-Turing machines that halt. [Delahaye and Zenil(2011)]

0 → .328      010 → .00065
1 → .328      101 → .00065
00 → .0834    111 → .00065
01 → .0834    0000 → .00032
10 → .0834    0010 → .00032
11 → .0834    0100 → .00032
001 → .00098  0110 → .00032
011 → .00098  1001 → .00032
100 → .00098  1011 → .00032
110 → .00098  1101 → .00032
000 → .00065  1111 → .00032

Solving degenerate cases

“0” is the simplest string (together with “1”) according to D.


Page 26

Partial D(4) (top strings)


Page 27

From a Prior to an Empirical Distribution

We see algorithmic complexity emerging:

1 The classification accords with our intuition of what complexity should be.

2 Strings are almost always classified by length, except in cases where intuition justifies that they should not be. For example, even though 0101010 is of length 7, it was ranked higher than some strings shorter than 7. One sees the low algorithmic complexity of 010101... emerging, marking it as a simple string.

From m to D

Unlike m, D is an empirical distribution and no longer a prior. D experimentally confirms the intuition behind Solomonoff and Levin’s measure.

Full tables are available online: www.algorithmicnature.org


Page 28

Miscellaneous facts from D(3) and D(4)

There are 5 970 768 960 machines that halt among the 22 039 921 152 in (4,2); that is, a fraction of 0.27 halt.

Among the least random looking strings from D(4) are: 0, 00, 000..., 01, 010, 0101, etc.

Among the most random looking strings one can find: 1101010101010101, 1101010100010101, 1010101010101011 and 1010100010101011, each with a frequency of 5.4447 × 10^−10.

As in D(3), where we reported that one string group (0101010 and its reversal) climbed positions, in D(4) 399 strings climbed to the top and were not sorted among their length groups.

In D(4), string length was no longer a classification determinant. For example, between positions 780 and 790, string lengths are: 11, 10, 10, 11, 9, 10, 9, 9, 9, 10 and 9 bits.

D(4) preserves the string order of D(3), except in 17 places out of the 128 strings of D(3) ordered from highest to lowest frequency.


Page 29

Connecting D back to m

To get m we replaced the uniform distribution over the bits composing strings with a uniform distribution over the bits composing programs. Imagine that your (Turing-complete) programming language allows a monkey to produce Turing machine rules at random; every time the monkey types a valid program, it is executed.

In the limit, the monkey (which is just a random source of programs) will end up covering a sample of the space of all possible Turing machine rules.


Page 30

Connecting D back to m

On the other hand, D(n) for a fixed n is the result of running all n-state 2-symbol Turing machines according to an enumeration.

An enumeration is just a thorough sample of the space of all n-state 2-symbol Turing machines, each with fixed probability 1/(# of Turing machines in (n,2)) (by definition of enumeration).

D(n) is therefore a legitimate programmer-monkey experiment. The additional advantage of performing a thorough sample of Turing machines by following an enumeration is that the order in which the machines are traversed is irrelevant, as long as one covers all the elements of the (n,2) space.


Page 31

Connecting D back to m (cont.)

One may ask why shorter programs are favored.

The answer, in analogy to the monkey experiment, is based on the uniform random distribution of keystrokes: programs cannot be that long without eventually containing the end-of-program keystroke. One can still think of imposing a different distribution on the program instructions, for example by changing the keyboard distribution, repeating certain keys.

Choices other than the uniform distribution are more arbitrary than just assuming no additional information: a keyboard with two or more “a” keys rather than the usual one seems more arbitrary than having one key per letter.


Page 32

Connecting D back to m (cont.)

Every D(n) is a sample of D(n + 1), because (n + 1, 2) contains all machines in (n, 2). We have empirically verified that strings sorted by frequency in D(4) preserve the order of D(3), which in turn preserves the order of D(2), meaning that longer programs do not produce completely different classifications. One can think of the sequence D(1), D(2), D(3), D(4), . . . as samples whose values are approximations to m.

One may also ask how we can know whether a monkey provided with a different programming language would produce a completely different D, and therefore yet another experimental version of m. That may be the case, but we have also shown that reasonable programming languages (e.g. based on cellular automata and Post tag systems) produce reasonable (correlated) distributions.


Page 33

Connecting D back to m (cont.)


Page 34

m(s) provides a formalization for Occam’s razor

The immediate consequence of algorithmic probability is simple but powerful (and surprising):

Basic notion

Typewriting monkeys (Borel):
garbage in → garbage out

Programmer monkeys (Bennett, Chaitin):
garbage in → structure out


Page 35

What may m(s) tell us about the physical world?

Basic notion

m(s) tells us that it is unlikely that a Rube Goldberg machine produced a string if the string can be produced by a much simpler process.

Physical hypothesis

m(s) would tell us that, if processes in the world are computer-like, it is unlikely that structures are the result of the computation of a Rube Goldberg machine. Instead, they would be the result of the shortest programs producing those structures, and patterns would follow the distribution suggested by m(s).


Page 36

On the algorithmic nature of the world

Could it be that m(s) tells us how structure in the world has come to be and how it is distributed all around? Could m(s) reveal the machinery behind it?

What happens in the world is often the result of an ongoing (mechanical) process (e.g. the Sun rising due to the mechanical celestial dynamics of the solar system).

Can m(s) tell us something about the distribution of patterns in the world? To find out, we took some empirical datasets from the physical world and compared them against data produced by pure computation, which by definition should follow m(s).

The results were published in H. Zenil & J-P. Delahaye, “On the Algorithmic Nature of the World,” in G. Dodig-Crnkovic and M. Burgin (eds), Information and Computation, World Scientific, 2010.


Page 37

On the algorithmic nature of the world


Page 38

Conclusions

Our method aimed to show that reasonable choices of formalisms for evaluating the complexity of short strings through m(s) give consistent measures of algorithmic complexity.

[Greg Chaitin (w.r.t. our method)] ...the dreaded theoretical hole in the foundations of algorithmic complexity turns out, in practice, not to be as serious as was previously assumed.

Our method also seems notable in that it is an experimental approach that comes to the rescue of the apparent holes left by the theory.


Page 39

Bibliography

C.S. Calude, Information and Randomness: An Algorithmic Perspective (Texts in Theoretical Computer Science, An EATCS Series), Springer, 2nd edition, 2002.

G.J. Chaitin, On the length of programs for computing finite binary sequences, Journal of the ACM, 13(4):547–569, 1966.

G. Chaitin, Meta Math!, Pantheon, 2005.

R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and Complexity, Springer Verlag, 2010.

J.P. Delahaye and H. Zenil, On the Kolmogorov-Chaitin complexity for short sequences, in C.S. Calude (ed.), Randomness and Complexity: From Leibniz to Chaitin, World Scientific, 2007.

J.P. Delahaye and H. Zenil, Numerical Evaluation of Algorithmic Complexity for Short Strings: A Glance into the Innermost Structure of Randomness, arXiv:1101.4795v4 [cs.IT].


Page 40

C.S. Calude and M.A. Stay, Most Programs Stop Quickly or Never Halt, 2007.

W. Kirchherr and M. Li, The miraculous universal distribution, Mathematical Intelligencer, 1997.

A.N. Kolmogorov, Three approaches to the quantitative definition of information, Problems of Information Transmission, 1(1):1–7, 1965.

P. Martin-Löf, The definition of random sequences, Information and Control, 9:602–619, 1966.

L. Levin, On a concrete method of assigning complexity measures, Doklady Akademii Nauk SSSR, 18(3):727–731, 1977.

L. Levin, Universal Search Problems, 9(3):265–266, 1973 (submitted 1972, reported in talks 1971). English translation in: B.A. Trakhtenbrot, A Survey of Russian Approaches to Perebor (Brute-force Search) Algorithms, Annals of the History of Computing, 6(4):384–400, 1984.


Page 41

M. Li and P. Vitányi, An Introduction to Kolmogorov Complexity and Its Applications, Springer, 3rd revised edition, 2008.

S. Lloyd, Programming the Universe: A Quantum Computer Scientist Takes On the Cosmos, Knopf Publishing Group, 2006.

T. Rado, On non-computable functions, Bell System Technical Journal, 41(3), 1962.

R.J. Solomonoff, A formal theory of inductive inference: Parts 1 and 2, Information and Control, 7:1–22 and 224–254, 1964.

H. Zenil and J.P. Delahaye, On the Algorithmic Nature of the World, in G. Dodig-Crnkovic and M. Burgin (eds), Information and Computation, World Scientific, 2010.

S. Wolfram, A New Kind of Science, Wolfram Media, 2002.
