Dec 22, 2015

Page 1: CDB Exploring Science and Society Seminar Thursday 19 November 2009 at 5.30pm Host: Prof Giorgio Gabella The Bayesian brain, surprise and free-energy.

CDB Exploring Science and Society Seminar
Thursday 19 November 2009 at 5.30pm

Host: Prof Giorgio Gabella

The Bayesian brain, surprise and free-energy

Abstract

Value-learning and perceptual learning have been an important focus over the past decade, attracting the concerted attention of experimental psychologists, neurobiologists and the machine learning community. Despite some formal connections (e.g., the role of prediction error in optimizing some function of sensory states), both fields have developed their own rhetoric and postulates. In this work, we show that perceptual learning is, literally, an integral part of value learning, in the sense that perception is necessary to integrate out dependencies on the inferred causes of sensory information. This enables the value of sensory trajectories to be optimized through action. Furthermore, we show that acting to optimize value and perception are two aspects of exactly the same principle; namely, the minimization of a quantity (free energy) that bounds the probability of sensory input, given a particular agent or phenotype. This principle can be derived, in a straightforward way, from the very existence of agents, by considering the probabilistic behavior of an ensemble of agents belonging to the same class.

Page 2

“Objects are always imagined as being present in the field of vision as would have to be there in order to produce the same impression on the nervous mechanism” - Hermann Ludwig Ferdinand von Helmholtz

Thomas Bayes

Geoffrey Hinton

Richard Feynman

From the Helmholtz machine and the Bayesian brain to action and self-organization

Hermann Haken

Page 3

Overview

Ensemble dynamics: Entropy and equilibria; Free-energy and surprise
The free-energy principle: Action and perception; Generative models
Perception: Birdsong and categorization; Simulated lesions
Action: Active inference; Reaching
Policies: Control and attractors; The mountain-car problem

Page 4

Particle density contours showing Kelvin-Helmholtz instability, forming beautiful breaking waves. In the self-sustained state of Kelvin-Helmholtz turbulence the particles are transported away from the mid-plane at the same rate as they fall, but the particle density is nevertheless very clumpy because of a clumping instability that is caused by the dependence of the particle velocity on the local solids-to-gas ratio (Johansen, Henning, & Klahr 2006)

[Figure: two numbered panels showing ensemble densities $p(x \mid m)$; labels: temperature, pH, falling, transport]

Self-organization that minimises the entropy of an ensemble density, ensuring that a limited repertoire of states is occupied (i.e., that states have a random attracting set $A$):

$H = -\int p(x \mid m) \ln p(x \mid m)\, dx$

Page 5

How can an active agent minimise its equilibrium entropy? This entropy is bounded by the entropy of sensory signals (under simplifying assumptions):

$s = g(\vartheta) + z$

$H_s \geq H_\vartheta + \int p(\vartheta) \ln |\partial_\vartheta g|\, d\vartheta$

Crucially, because the density on sensory signals is at equilibrium, it can be interpreted as the proportion of time each agent entertains them (the sojourn time). This ergodic argument means that entropy is the path integral of surprise experienced by a particular agent:

$H_s = -\int p(s \mid m) \ln p(s \mid m)\, ds = \lim_{T \to \infty} -\frac{1}{T} \int_0^T \ln p(s(t) \mid m)\, dt$

This means agents minimise surprise at all times. But there is one small problem: agents cannot access surprise. However, they can evaluate a free-energy bound on surprise, which is induced with a recognition density $q$:

$F(s(t), q(\vartheta)) \geq -\ln p(s(t) \mid m)$
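The bound and the ergodic argument on this page can be checked numerically on a toy discrete model. The sketch below is an illustration only: the two-cause, three-outcome joint density and the names `joint`, `surprise` and `free_energy` are assumptions made for the example, not part of the talk.

```python
import numpy as np

# Hypothetical joint density p(s, cause | m): rows are 2 causes, columns 3 sensory outcomes.
joint = np.array([[0.30, 0.10, 0.10],
                  [0.05, 0.15, 0.30]])

def surprise(s):
    """-ln p(s | m), obtained by marginalising over causes."""
    return -np.log(joint[:, s].sum())

def free_energy(s, q):
    """F = E_q[ln q(cause) - ln p(s, cause | m)] for a recognition density q."""
    q = np.asarray(q, dtype=float)
    return float(np.sum(q * (np.log(q) - np.log(joint[:, s]))))

s = 2
posterior = joint[:, s] / joint[:, s].sum()
f_exact = free_energy(s, posterior)    # the bound is tight at the true posterior
f_loose = free_energy(s, [0.5, 0.5])   # any other q gives a larger value

# Ergodic argument: the long-run average of surprise approaches the sensory entropy.
rng = np.random.default_rng(0)
p_s = joint.sum(axis=0)
samples = rng.choice(3, size=200_000, p=p_s)
time_avg_surprise = -np.log(p_s[samples]).mean()
entropy = -np.sum(p_s * np.log(p_s))
```

Here sampling independent outcomes stands in for a long sensory trajectory, which is the simplest setting in which the time average and the ensemble average agree.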

Page 6

Overview

Ensemble dynamics: Entropy and equilibria; Free-energy and surprise
The free-energy principle: Action and perception; Generative models
Perception: Birdsong; Simulated lesions
Action: Active inference; Reaching
Policies: Control and attractors; The mountain-car problem

Page 7

The free-energy principle

External states in the world: $\dot{x} = f(x, a, \theta) + w$
Sensations: $s = g(x, \theta) + z$
Internal states of the agent ($m$): $\mu = \arg\min_{\mu} F(s, \mu)$
Action: $a = \arg\min_{a} F(s, \mu)$

$F = \text{Energy} - \text{Entropy} = -\langle \ln p(s, \vartheta \mid m) \rangle_q + \langle \ln q(\vartheta) \rangle_q$

Action to minimise a bound on surprise:
$F = \text{Complexity} - \text{Accuracy} = D(q(\vartheta) \,\|\, p(\vartheta)) - \langle \ln p(s(a) \mid \vartheta, m) \rangle_q$
$a = \arg\max_{a} \text{Accuracy}$

Perception to optimise the bound:
$F = \text{Divergence} + \text{Surprise} = D(q(\vartheta \mid \mu) \,\|\, p(\vartheta \mid s)) - \ln p(s \mid m)$
$\mu = \arg\min_{\mu} \text{Divergence}$

Causes: $\vartheta = \{x(t), \theta, \gamma\}$
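The three ways of writing free energy on this slide (energy minus entropy, complexity minus accuracy, divergence plus surprise) are algebraically the same quantity. A minimal numerical sketch, assuming a hypothetical two-cause discrete model (the numbers in `prior`, `likelihood` and `q` are made up for illustration):

```python
import numpy as np

prior = np.array([0.5, 0.5])               # p(cause | m), hypothetical
likelihood = np.array([[0.6, 0.2, 0.2],    # p(s | cause, m), rows are causes
                       [0.1, 0.3, 0.6]])
s = 2                                      # observed sensory outcome
q = np.array([0.3, 0.7])                   # an arbitrary recognition density

def kl(a, b):
    """Kullback-Leibler divergence D(a || b) between discrete densities."""
    return float(np.sum(a * np.log(a / b)))

joint_s = prior * likelihood[:, s]         # p(s, cause | m) at the observed s
evidence = joint_s.sum()                   # p(s | m)
posterior = joint_s / evidence             # p(cause | s, m)

energy = -np.sum(q * np.log(joint_s))
entropy = -np.sum(q * np.log(q))
f_energy_entropy = energy - entropy
f_complexity_accuracy = kl(q, prior) - np.sum(q * np.log(likelihood[:, s]))
f_divergence_surprise = kl(q, posterior) - np.log(evidence)
```

Because the divergence term is non-negative, the last form also shows directly why free energy is an upper bound on surprise.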

Page 8

The generative model

The free-energy rests on expected Gibbs energy:

$F(s, q) = \text{Energy} - \text{Entropy} = \langle U \rangle_q + \langle \ln q \rangle_q$

and can be evaluated, given a generative model comprising a likelihood and prior:

$U(s, \vartheta) = -\ln p(s, \vartheta \mid m) = -\ln p(s \mid \vartheta, m) - \ln p(\vartheta \mid m)$

$\vartheta = \{x(t), \theta, \gamma\}$

So what models might the brain use?

Page 9

Processing hierarchy: Backward (nonlinear); Forward (linear); Lateral

Ensemble dynamics: Entropy and equilibria; Free-energy and surprise
The free-energy principle: Action and perception; Generative models
Perception: Birdsong; Simulated lesions
Action: Active inference; Reaching
Policies: Control and attractors; The mountain-car problem

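Page 10

Hierarchical (deep) dynamic models

[Figure: graphical model linking sensations $s$ to hidden states $\tilde{x}^{(1)}, \tilde{x}^{(2)}$ and causes $\tilde{v}^{(1)}, \tilde{v}^{(2)}$ across levels (1) and (2)]

A single level:
$s = g(x, v, \theta) + z$, or $\tilde{s} = \tilde{g} + \tilde{z}$ in generalised coordinates
$\dot{x} = f(x, v, \theta) + w$, or $D\tilde{x} = \tilde{f} + \tilde{w}$

$p(\tilde{s} \mid \tilde{x}, \tilde{v}, \theta, \gamma) = N(\tilde{g}, \Sigma(\gamma^{s}))$
$p(\dot{\tilde{x}} \mid \tilde{x}, \tilde{v}, \theta, \gamma) = N(\tilde{f}, \Sigma(\gamma^{x}))$
$p(\tilde{v}) = N(\eta, \Sigma^{v})$

Hierarchical form:
$v^{(i-1)} = g(x^{(i)}, v^{(i)}, \theta^{(i)}) + z^{(i)}$, or $\tilde{v}^{(i-1)} = \tilde{g}^{(i)} + \tilde{z}^{(i)}$
$\dot{x}^{(i)} = f(x^{(i)}, v^{(i)}, \theta^{(i)}) + w^{(i)}$, or $D\tilde{x}^{(i)} = \tilde{f}^{(i)} + \tilde{w}^{(i)}$

Causes: $\vartheta = \{\tilde{x}(t), \tilde{v}(t), \theta, \gamma\}$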

Page 11

Hierarchical form

$s = g(x^{(1)}, v^{(1)}) + z^{(1)}$
$\dot{x}^{(1)} = f(x^{(1)}, v^{(1)}) + w^{(1)}$
$\vdots$
$v^{(i-1)} = g(x^{(i)}, v^{(i)}) + z^{(i)}$
$\dot{x}^{(i)} = f(x^{(i)}, v^{(i)}) + w^{(i)}$

Likelihood and empirical priors:
$p(\tilde{s}, \tilde{x}, \tilde{v} \mid m) = p(\tilde{s} \mid \tilde{x}, \tilde{v})\, p(\tilde{x}, \tilde{v})$
$p(\tilde{x}, \tilde{v}) = p(\tilde{v}^{(m)}) \prod_{i=1}^{m-1} p(\tilde{x}^{(i)})\, p(D\tilde{x}^{(i)} \mid \tilde{v}^{(i)})\, p(\tilde{v}^{(i)} \mid \tilde{x}^{(i+1)}, \tilde{v}^{(i+1)})$

Dynamical priors:
$p(D\tilde{x}^{(i)} \mid \tilde{v}^{(i)}) = N(\tilde{f}^{(i)}, \Sigma^{(i)x})$

Structural priors:
$p(\tilde{v}^{(i)} \mid \tilde{x}^{(i+1)}, \tilde{v}^{(i+1)}) = N(\tilde{g}^{(i+1)}, \Sigma^{(i+1)v})$

Gibbs energy: a simple function of prediction error

$U(s, \vartheta) = -\ln p(\tilde{s}, \tilde{x}, \tilde{v} \mid m) = \tfrac{1}{2} \tilde{\varepsilon}^T \Pi \tilde{\varepsilon} - \tfrac{1}{2} \ln |\Pi|$

Prediction errors:
$\tilde{\varepsilon}^{(i)v} = \tilde{v}^{(i-1)} - \tilde{g}^{(i)}$, with $\tilde{v}^{(0)} = \tilde{s}$
$\tilde{\varepsilon}^{(i)x} = D\tilde{x}^{(i)} - \tilde{f}^{(i)}$
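To make the link between Gibbs energy and prediction error concrete, the sketch below evaluates the precision-weighted quadratic form for a single Gaussian level and checks it against a sum of univariate terms. The function name `gibbs_energy` and all numerical values are assumptions for illustration; the additive constant $(n/2)\ln 2\pi$ is dropped.

```python
import numpy as np

def gibbs_energy(eps, precision):
    """U = 1/2 eps^T Pi eps - 1/2 ln|Pi| for prediction error eps and precision Pi."""
    eps = np.asarray(eps, dtype=float)
    _, logdet = np.linalg.slogdet(precision)
    return 0.5 * eps @ precision @ eps - 0.5 * logdet

s = np.array([1.0, -0.5])          # hypothetical sensory data
g = np.array([0.8, 0.1])           # hypothetical prediction g(x, v)
variances = np.array([0.25, 4.0])  # noise variances, so Pi = diag(1 / variances)
Pi = np.diag(1.0 / variances)

U = gibbs_energy(s - g, Pi)

# Independent check: with diagonal precision the energy is a sum of univariate terms.
U_check = 0.5 * np.sum((s - g) ** 2 / variances + np.log(variances))
```

High precision (small variance) amplifies the corresponding prediction error, which is the sense in which precision acts as a gain.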

Page 12

The recognition density and its sufficient statistics

Mean-field approximation: $q(\vartheta) = \prod_i q(\vartheta^i) = q(\tilde{x}, \tilde{v})\, q(\theta)\, q(\gamma)$
Laplace approximation: $q(\vartheta^i) = N(\mu^i, \Sigma^i)$

Each set of sufficient statistics is optimised with respect to free energy $F$:
Synaptic activity, $\tilde{\mu}^{x,v}$: perception and inference
Synaptic efficacy, $\mu^{\theta}$: learning and memory (activity-dependent plasticity; functional specialization)
Synaptic gain, $\mu^{\gamma}$: attention and salience (attentional gain; enabling of plasticity)

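Page 13

Perception and message-passing

[Figure: hierarchical message passing between error units $\xi^{(i)v}, \xi^{(i)x}$ and state units $\mu^{(i)v}, \mu^{(i)x}$ over levels $i-1$ to $i+2$, driven by $s(t)$; forward prediction error, backward predictions]

$\dot{\tilde{\mu}}^{(i)v} = D\tilde{\mu}^{(i)v} - \partial_v \tilde{\varepsilon}^{(i)T} \xi^{(i)} - \xi^{(i+1)v}$
$\dot{\tilde{\mu}}^{(i)x} = D\tilde{\mu}^{(i)x} - \partial_x \tilde{\varepsilon}^{(i)T} \xi^{(i)}$

$\xi^{(i)v} = \Pi^{(i)v} \tilde{\varepsilon}^{(i)v} = \Pi^{(i)v} (\tilde{\mu}^{(i-1)v} - \tilde{g}(\tilde{\mu}^{(i)}))$
$\xi^{(i)x} = \Pi^{(i)x} \tilde{\varepsilon}^{(i)x} = \Pi^{(i)x} (D\tilde{\mu}^{(i)x} - \tilde{f}(\tilde{\mu}^{(i)}))$

Synaptic plasticity: $\dot{\mu}^{\theta}_{ij} = -\partial_{\theta_{ij}} \tilde{\varepsilon}^T \xi$
Synaptic gain: $\dot{\mu}^{\gamma}_{i} = \tfrac{1}{2} \mathrm{tr}(R_i (\xi \xi^T - \Pi(\mu^{\gamma})))$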
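The message-passing scheme can be illustrated with a deliberately reduced case: one static level, a linear mapping and Gaussian noise, so that there are no generalised motion terms. This is a sketch under those assumptions, not the scheme used in the talk; the function `infer_cause` and its parameter values are hypothetical. The conditional expectation is updated by a gradient descent on precision-weighted prediction errors, and for this linear model it should settle on the exact posterior mean.

```python
def infer_cause(s, theta=2.0, eta=0.0, pi_s=4.0, pi_v=1.0, dt=0.01, n_steps=2000):
    """Recognition dynamics for s = theta * v + z with prior v ~ N(eta, 1 / pi_v)."""
    mu = eta                              # start at the prior expectation
    for _ in range(n_steps):
        xi_s = pi_s * (s - theta * mu)    # forward (sensory) prediction error
        xi_v = pi_v * (mu - eta)          # error on the prior expectation
        mu += dt * (theta * xi_s - xi_v)  # descend the free-energy gradient
    return mu

s = 3.0
mu = infer_cause(s)

# Closed-form posterior mean for the same linear-Gaussian model.
exact = (2.0 * 4.0 * s + 1.0 * 0.0) / (2.0 ** 2 * 4.0 + 1.0)
```

The two error terms play the roles of bottom-up and top-down messages: the estimate moves until sensory evidence and prior belief, each weighted by its precision, balance.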

Page 14

Overview

Ensemble dynamics: Entropy and equilibria; Free-energy and surprise
The free-energy principle: Action and perception; Generative models
Perception: Birdsong and categorization; Simulated lesions
Action: Active inference; Reaching
Policies: Control and attractors; The mountain-car problem

Page 15

Synthetic song-birds

[Figure: vocal centre driving the syrinx through causes $v_1$, $v_2$; sonogram, axes: Time (sec) vs Frequency]

$f(x, v) = \begin{bmatrix} 18x_2 - 18x_1 \\ v_1 x_1 - 2x_3 x_1 - x_2 \\ 2x_1 x_2 - v_2 x_3 \end{bmatrix}$
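The equations of motion on this slide have the form of a Lorenz system whose control parameters are the causes $v_1$ and $v_2$. A minimal simulation sketch follows; the fourth-order Runge-Kutta integrator, the step size, the initial state and the particular cause values are all assumptions chosen for illustration.

```python
import numpy as np

def f(x, v):
    """Flow of the song-generating attractor for hidden states x and causes v."""
    x1, x2, x3 = x
    v1, v2 = v
    return np.array([18.0 * x2 - 18.0 * x1,
                     v1 * x1 - 2.0 * x3 * x1 - x2,
                     2.0 * x1 * x2 - v2 * x3])

def simulate(v, x0=(1.0, 1.0, 8.0), dt=0.005, n_steps=2000):
    """Integrate dx/dt = f(x, v) with a fixed-step RK4 scheme."""
    x = np.array(x0, dtype=float)
    path = np.empty((n_steps, 3))
    for k in range(n_steps):
        k1 = f(x, v)
        k2 = f(x + 0.5 * dt * k1, v)
        k3 = f(x + 0.5 * dt * k2, v)
        k4 = f(x + dt * k3, v)
        x = x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        path[k] = x
    return path

# Hypothetical cause values; changing them changes the shape of the attractor.
path = simulate(v=(32.0, 8.0 / 3.0))
```

Two of the hidden states could then be read out as, for example, the amplitude and frequency of a synthetic chirp, which is one way to turn such a trajectory into a sonogram.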

Page 16

Recognition and message passing

$f(x, v) = \begin{bmatrix} 18x_2 - 18x_1 \\ v_1 x_1 - 2x_3 x_1 - x_2 \\ 2x_1 x_2 - v_2 x_3 \end{bmatrix}$

[Figure: stimulus sonogram (frequency, 2000 to 5000, vs time in seconds) and recognition panels: prediction and error; hidden states; causal states, each against time (bins). Diagram labels: backward predictions, forward prediction error, $s(t)$, $x$, $v$]

Page 17

Perceptual categorization

[Figure: sonograms of Song A, Song B and Song C, axes: time (seconds) vs Frequency (Hz), 2000 to 5000; estimates of the causes $v_1$ and $v_2$ over time (seconds); songs A, B and C located in the space of $v_1$ and $v_2$]

Page 18

Generative models of birdsong: sequences of sequences

[Figure: neuronal hierarchy driving the syrinx; sonogram, axes: Time (sec) vs Frequency (KHz)]

Second level:
$f^{(2)} = \begin{bmatrix} 18x_2^{(2)} - 18x_1^{(2)} \\ 32x_1^{(2)} - 2x_3^{(2)} x_1^{(2)} - x_2^{(2)} \\ 2x_1^{(2)} x_2^{(2)} - \tfrac{8}{3} x_3^{(2)} \end{bmatrix}$, $\begin{bmatrix} v_1^{(1)} \\ v_2^{(1)} \end{bmatrix} = g^{(2)} = \begin{bmatrix} x_2^{(2)} \\ x_3^{(2)} \end{bmatrix}$

First level:
$f^{(1)} = \begin{bmatrix} 18x_2^{(1)} - 18x_1^{(1)} \\ v_1^{(1)} x_1^{(1)} - 2x_3^{(1)} x_1^{(1)} - x_2^{(1)} \\ 2x_1^{(1)} x_2^{(1)} - v_2^{(1)} x_3^{(1)} \end{bmatrix}$, $\begin{bmatrix} s_1 \\ s_2 \end{bmatrix} = g^{(1)} = \begin{bmatrix} x_2^{(1)} \\ x_3^{(1)} \end{bmatrix}$

Kiebel et al

Page 19

Simulated lesion studies: a model for false inference in psychopathology?

[Figure: three conditions (percept; no structural priors; no dynamical priors). For each: sonogram, axes: time (seconds) vs Frequency (Hz), 2000 to 5000; and simulated LFP (micro-volts) vs peristimulus time (ms), 0 to 2000]

Page 20

Ensemble dynamics: Entropy and equilibria; Free-energy and surprise
The free-energy principle: Action and perception; Generative models
Perception: Birdsong; Simulated lesions
Action: Active inference; Reaching
Policies: Control and attractors; The mountain-car problem

Page 21

From reflexes to action

[Figure: reflex arc (dorsal root, ventral horn): the descending prediction $g(\mu)$ is compared with sensation $s(a)$, and the resulting error drives action]

$\dot{a} = -\partial_a \varepsilon_s^T \xi_s$, with $\xi_s = \Pi_s (s(a) - g(\mu))$

True dynamics:
$s = \mathbf{g}(x, v, \theta) + z$
$\dot{x} = \mathbf{f}(x, v, a, \theta) + w$

Generative model:
$s = g(x, v, \theta) + z$
$\dot{x} = f(x, v, \theta) + w$
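The idea that action suppresses sensory prediction error, much as a reflex arc fulfils a descending prediction, can be sketched with a one-dimensional toy system. Everything specific here is an assumption for illustration: the leaky world dynamics `dx/dt = a - x`, the noiseless sensation, the fixed prediction `target`, and the agent's belief that sensations increase with action (`ds/da = 1`).

```python
def simulate_reflex(target=1.0, precision=1.0, dt=0.01, n_steps=4000):
    """Action descends the gradient of precision-weighted sensory prediction error."""
    x, a = 0.0, 0.0
    for _ in range(n_steps):
        s = x                           # noiseless sensation of the hidden state
        xi = precision * (s - target)   # sensory prediction error (sensed minus predicted)
        a += dt * (-xi)                 # da/dt = -(ds/da) * xi, assuming ds/da = 1
        x += dt * (a - x)               # hypothetical world dynamics driven by action
    return x, a

x_final, a_final = simulate_reflex()
```

Note that the agent never needs an explicit command signal: the prediction itself acts as the set point, and action continues until sensations match it.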

Page 22

From reflexes to action

[Figure: jointed arm with origin at (0,0), joint angles $x_1$, $x_2$, joint positions $J_1$, $J_2$ and target $V = (v_1, v_2, v_3)$; hierarchical states $v^{(1)}$, $x^{(1)}$, $v^{(2)}$; movement trajectory; descending sensory prediction error driving action $a$]

visual input: $s = \begin{bmatrix} V \\ J \end{bmatrix} + w$
proprioceptive input: $s = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + w$

Page 23

Overview

Ensemble dynamics: Entropy and equilibria; Free-energy and surprise
The free-energy principle: Action and perception; Generative models
Perception: Birdsong; Simulated lesions
Action: Active inference; Reaching
Policies: Control and attractors; The mountain-car problem

Page 24

How do policies minimise entropy?

Energies:
sensory prediction error: $-\langle \ln p(s(a) \mid \vartheta) \rangle_q$
free-energy: $F(s, q(\vartheta \mid \mu))$
sensory surprise: $-\ln p(s \mid m)$
surprise: $-\ln p(x \mid m)$

Non-negative terms separating them:
complexity: $D(q(\vartheta) \,\|\, p(\vartheta)) \geq 0$
perceptual divergence: $D(q(\vartheta) \,\|\, p(\vartheta \mid s)) \geq 0$
$D(p(s, z) \,\|\, p(s)\,p(z)) + \int p(x) \ln |\partial_x g|\, dx$

Path integrals (under ergodic assumptions):
entropy: $H_x \propto -\int dt\, \ln p(x \mid m)$
sensory entropy: $H_s \propto -\int dt\, \ln p(s \mid m)$
free-action: $A = \int dt\, F(s, q)$

Optimisation:
perception: $\mu = \arg\min_{\mu} F$
action: $a = \arg\min_{a} F$
policy (model): $m = \arg\min_{m} A$

Page 25

Cost-functions, value and optimal control (policies that lead to sparse distal goals)

Richard Bellman

Using the Helmholtz decomposition, flow (i.e., policy) can be expressed in terms of scalar and vector potentials:

$f = \nabla V + \nabla \times W$, with $\nabla V \cdot (\nabla \times W) = 0$

Here value $V$ is proportional to negative surprise, $V = \Gamma \ln p$ with $p := p(x \mid m)$, and can be defined as expected (negative) cost:

$V(x) = -\int_0^\infty \int c(\chi)\, p(\chi, t \mid x, m)\, d\chi\, dt$

$c(x) = f \cdot \nabla V(x) + \Gamma \nabla^2 V(x)$

This means the cost-function is defined by the equilibrium density but not vice versa; this is the problem addressed by dynamic programming and reinforcement learning.

[Figure: four panels on axes from -30 to 30: Surprise (negative value), $-V \propto -\ln p$; Cost-function, $c(x)$; Equilibrium density, $p := p(x \mid m)$; Flow (policy)]
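One way to see why a flow built from scalar and vector potentials preserves the equilibrium density is to check the Fokker-Planck equation numerically. The sketch below is an illustration under stated assumptions: a standard two-dimensional Gaussian as the equilibrium density, unit amplitude of random fluctuations (so value is simply $\ln p$), and a divergence-free component obtained by rotating the value gradient by ninety degrees. None of these choices come from the talk.

```python
import numpy as np

h = 0.05
axis = np.arange(-4.0, 4.0 + h / 2, h)
X, Y = np.meshgrid(axis, axis, indexing="ij")

p = np.exp(-(X ** 2 + Y ** 2) / 2) / (2 * np.pi)  # hypothetical equilibrium density
V = np.log(p)                                     # value, taking unit noise amplitude

dVdx, dVdy = np.gradient(V, h, h)
# Flow = gradient of value (curl-free) + rotated gradient (divergence-free).
fx = dVdx + dVdy
fy = dVdy - dVdx

def divergence(ux, uy):
    """Finite-difference divergence of a 2-D vector field on the grid."""
    return np.gradient(ux, h, axis=0) + np.gradient(uy, h, axis=1)

dpdx, dpdy = np.gradient(p, h, h)
laplacian_p = divergence(dpdx, dpdy)

# Fokker-Planck: dp/dt = laplacian(p) - div(f p); this should vanish at equilibrium.
residual = laplacian_p - divergence(fx * p, fy * p)
max_residual = np.abs(residual[5:-5, 5:-5]).max()
```

The rotational part circulates probability along the contours of the density without changing it, while the gradient part exactly balances diffusion; the residual is therefore only discretisation error.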

Page 26

Cost-functions and attracting sets (policies with attractors)

Adriaan Fokker, Max Planck

At equilibrium we have:

$\dot{p} = \Gamma \nabla^2 p - \nabla \cdot (f p) = 0 \;\Rightarrow\; p\, \nabla \cdot f = \Gamma \nabla^2 p - f \cdot \nabla p$, where $p := p(x \mid m)$

This means maxima of the equilibrium density must have negative divergence: where $\nabla p = 0$ and $\nabla^2 p < 0$, it follows that $\nabla \cdot f < 0$. We can exploit this to ensure maxima lie in $A$, using a Langevin-based policy, where cost plays the role of dissipation:

$\nabla \cdot f = c(x)$, with $c(x \mid m) < 0 : x \in A$ and $c(x \mid m) > 0 : x \notin A$

[Figure: cost-function $c(x)$ plotted against a zero line, with the attracting set $A$ marked]

Page 27

Exploration and exploitation under Langevin dynamics

equations of motion: $f = \begin{bmatrix} \dot{x} \\ \dot{x}' \end{bmatrix} = \begin{bmatrix} x' \\ c\,x' - \partial_x \varphi \end{bmatrix}$, so that $\nabla \cdot f = c$

Exploitation: $c(x) < 0 \Rightarrow \nabla \cdot f < 0$
Exploration: $c(x) > 0 \Rightarrow \nabla \cdot f > 0$

[Figure: cost-function $c(x)$ relative to 0 over the landscape $\varphi(x)$]

Page 28

The mountain car problem

The environment

[Figure: landscape $\varphi(x)$, axes: position (-2 to 2) vs height (0 to 0.7)]

True equations of motion: $\mathbf{f} = \begin{bmatrix} \dot{x} \\ \dot{x}' \end{bmatrix} = \begin{bmatrix} x' \\ a - \partial_x \varphi(x) - \tfrac{1}{8} x' \end{bmatrix}$

The cost-function

[Figure: $c(x, h)$ over position $x$ and satiety $h$]

Policy (expected equations of motion): $f = \begin{bmatrix} \dot{x} \\ \dot{x}' \end{bmatrix} = \begin{bmatrix} x' \\ c\,x' - \partial_x \varphi \end{bmatrix}$
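The sketch below simulates the passive car (no action) under equations of motion of the kind shown on this slide: acceleration is action, minus the slope of the landscape, minus friction with coefficient 1/8. The landscape is not recoverable from the transcript, so a double-well potential is substituted purely as a stand-in; the function names, initial conditions and integration settings are likewise assumptions.

```python
def dphi(x):
    """Slope of a hypothetical double-well landscape phi(x) = x**4 / 4 - x**2 / 2."""
    return x ** 3 - x

def simulate_car(x=-0.5, v=0.0, action=lambda t: 0.0, dt=0.01, n_steps=20000):
    """Semi-implicit Euler integration of position x and velocity v."""
    for k in range(n_steps):
        acc = action(k * dt) - dphi(x) - v / 8.0   # action - slope - friction
        v += dt * acc
        x += dt * v
    return x, v

x_final, v_final = simulate_car()
```

With no action the car loses energy to friction and settles at the bottom of the valley it started in, which is why reaching a goal on the far side of a hill requires a policy that first moves away from the goal to gain momentum.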

Page 29

Learning the environment

With no cost (i.e., Hamiltonian dynamics): $h = 2 \Rightarrow c(x, h) = 0$

[Figure: prediction and error vs time; hidden states vs time; trajectory of one trial (position vs velocity); learned (after 16 trials) and true potential (position vs height)]

Page 30

Exploring & exploiting the environment

With cost (i.e., exploratory dynamics): $h = 0$

[Figure: conditional expectations vs time; action $a(t)$ vs time; trajectories (position vs velocity); cost-function (priors) $c(x, 0)$ (position vs force)]

Page 31

Adaptive policies and trajectories

Using just the free-energy principle and a simple gradient ascent scheme, we have solved a benchmark problem in optimal control theory using a handful of learning trials. Note that we did not use reinforcement learning or dynamic programming.

[Figure: adaptive policies and trajectories]

Page 32

Self-organisation with (happiness) dynamics on cost

policy (expected flow), with cost $c(x, h(t))$: $\dot{x} = x'$, $\dot{x}' = c\,x' - \partial_x \varphi$, and $\dot{h}$ a function of $c$ and $h$
true flow: $\dot{x} = x'$, $\dot{x}' = a - \partial_x \varphi(x) - \tfrac{1}{8} x'$, and $\dot{h}$ a function of $c$ and $h$

[Figure: action $a(t)$, cost $c(t)$ and satiety $h(t)$ vs time; trajectories over position, velocity and satiety; prediction and error vs time; expected hidden states vs time]

Page 33

Thank you

And thanks to collaborators: Jean Daunizeau, Stefan Kiebel, James Kilner, Klaas Stephan

And colleagues: Peter Dayan, Jörn Diedrichsen, Paul Verschure, Florentin Wörgötter