BCS547 Neural Encoding

Dec 18, 2015

Rebecca Tyler

Transcript
Page 1: BCS547 Neural Encoding. Introduction to computational neuroscience: 10/01 Neural encoding, 17/01 Neural decoding, 24/01 Low level vision, 31/01 Object recognition.

BCS547

Neural Encoding

Page 2:

Introduction to computational neuroscience

• 10/01 Neural encoding

• 17/01 Neural decoding  

• 24/01 Low level vision

• 31/01 Object recognition

• 7/02 Bayesian Perception

• 21/02 Sensorimotor transformations

Page 3:

Neural Encoding

Page 4:

What’s a code? Example

Deterministic code:

A -> 10

B -> 01

If you see the string 01, recovering the encoded letter (B) is easy.

Page 5:

Noise and coding

Two of the hardest problems with coding come from:

1. Non-invertible codes (e.g., two values get mapped onto the same code)

2. Noise

Page 6:

Example

Noisy code:

A -> 01 with p=0.8
  -> 10 with p=0.2

B -> 01 with p=0.3
  -> 10 with p=0.7

Now, given the string 01, it’s no longer obvious what the encoded letter is…
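Recovering the letter now requires Bayes' rule. A minimal sketch, assuming equal priors on A and B (the slides do not specify priors):

```python
# Decoding the noisy code above with Bayes' rule, assuming equal
# prior probabilities for A and B (an assumption, not from the slide).
p_code = {"A": {"01": 0.8, "10": 0.2},
          "B": {"01": 0.3, "10": 0.7}}

def posterior(observed, prior=None):
    """Return P(letter | observed string)."""
    letters = list(p_code)
    if prior is None:
        prior = {l: 1.0 / len(letters) for l in letters}
    joint = {l: prior[l] * p_code[l][observed] for l in letters}
    total = sum(joint.values())
    return {l: joint[l] / total for l in letters}

post = posterior("01")
# With equal priors, P(A|01) = 0.8 / (0.8 + 0.3), about 0.73:
# seeing "01" makes A the more likely letter, but not a certainty.
```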

Page 7:

What types of codes and noises are found in the nervous system?

Page 8:

[Figure: a stimulus (direction of motion) presented in a neuron's receptive field, and the resulting spike-train response. Code: number of spikes, here 10.]

Page 9:

[Figure: the same stimulus presented in the receptive field on four repeated trials. The spike count varies from trial to trial: Trial 1: 10 spikes, Trial 2: 7, Trial 3: 8, Trial 4: 4.]

Page 10:

[Figure: tuning curve f_i(θ), the mean activity as a function of the encoded variable θ, with error bars showing the variance of the noise, σ_i(θ)². The variance can depend on the input.]

Page 11:

Tuning curves and noise

The activity (number of spikes per second) of a neuron can be written as:

a_i = f_i(θ) + n_i

where f_i(θ) is the mean activity of the neuron (the tuning curve) and n_i is noise with zero mean. If the noise is Gaussian, then:

n_i ~ N(0, σ_i)
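The additive model a_i = f_i(θ) + n_i can be sketched in a few lines; the bell-shaped tuning curve and its parameters below are illustrative assumptions, not values from the lecture:

```python
import math
import random

# Sketch of the encoding model a_i = f_i(theta) + n_i with Gaussian
# noise. The bell-shaped tuning curve and its parameters (preferred
# direction 0 deg, width 30 deg, peak 50 spikes/s, sigma 5) are
# illustrative assumptions.
random.seed(0)

def f(theta, pref=0.0, width=30.0, peak=50.0):
    """Tuning curve: mean activity for stimulus value theta."""
    return peak * math.exp(-0.5 * ((theta - pref) / width) ** 2)

def activity(theta, sigma=5.0):
    """One noisy trial: a_i = f_i(theta) + n_i with n_i ~ N(0, sigma)."""
    return f(theta) + random.gauss(0.0, sigma)

trials = [activity(10.0) for _ in range(20000)]
mean_a = sum(trials) / len(trials)
# The trial average approaches the tuning-curve value f(10).
```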

Page 12:

Probability distributions and activity

• The noise is a random variable which can be characterized by a conditional probability distribution, P(n_i|θ).

• Since the activity of a neuron is the sum of a deterministic term, f_i(θ), and the noise, it is also a random variable with a conditional probability distribution, P(a_i|θ).

• The distributions of the activity and the noise differ only by their means (E[n_i] = 0, E[a_i] = f_i(θ)).

Page 13:

Activity distribution

[Figure: activity distributions P(a_i|θ) for several stimulus values, e.g. P(a_i|θ=-60) and P(a_i|θ=0).]

Page 14:

Examples of activity distributions

Gaussian noise with fixed variance:

P(a_i|θ) = 1/(σ √(2π)) exp( -(a_i - f_i(θ))² / (2σ²) )

Gaussian noise with variance equal to the mean:

P(a_i|θ) = 1/√(2π f_i(θ)) exp( -(a_i - f_i(θ))² / (2 f_i(θ)) )

Page 15:

Poisson activity (or noise):

The Poisson distribution works only for discrete random variables (spike counts k = 0, 1, 2, …). However, the mean, f_i(θ), does not have to be an integer.

The variance of a Poisson distribution is equal to its mean.

P(a_i = k|θ) = e^(-f_i(θ)) f_i(θ)^k / k!
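As a quick numerical check of these properties, the Poisson distribution can be evaluated directly:

```python
import math

# Sketch: evaluating the Poisson activity distribution
# P(a_i = k | theta) = exp(-f) * f**k / k!, where f = f_i(theta)
# is the mean rate (here f = 10, which need not be an integer).
def poisson_pmf(k, f):
    return math.exp(-f) * f ** k / math.factorial(k)

f = 10.0
ks = range(100)   # the tail beyond k = 100 is negligible for f = 10
total = sum(poisson_pmf(k, f) for k in ks)
mean = sum(k * poisson_pmf(k, f) for k in ks)
var = sum((k - mean) ** 2 * poisson_pmf(k, f) for k in ks)
# The probabilities sum to 1, and the variance equals the mean (= f).
```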

Page 16:

Comparison of Poisson vs Gaussian noise with variance equal to the mean

[Figure: probability as a function of activity (spikes/sec), over 0–140, for a Poisson distribution and a Gaussian with variance equal to the mean.]

Page 17:

Poisson noise and renewal process

We bin time into small intervals, Δt. Then, for each interval, we toss a coin with probability P(head) = p. If we get a head, we record a spike.

For small p, the number of spikes per second follows a Poisson distribution with mean p/Δt spikes/second (e.g., p = 0.01, Δt = 1 ms, mean = 10 spikes/sec).
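The coin-toss construction can be simulated directly to check that the spike count behaves as described (bin size and rate taken from the example above):

```python
import random

# Sketch: the coin-toss construction of a (near-)Poisson spike train.
# Bin time into intervals of dt seconds and spike with probability p
# in each bin, as in the example above (p = 0.01, dt = 1 ms).
random.seed(0)

def spike_counts(p=0.01, dt=0.001, duration=1.0, n_trials=1000):
    """Spike count per `duration` seconds, one entry per trial."""
    n_bins = round(duration / dt)
    return [sum(random.random() < p for _ in range(n_bins))
            for _ in range(n_trials)]

counts = spike_counts()
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
# For small p the count is approximately Poisson with mean
# p/dt = 10 spikes per second, and the variance is close to the mean.
```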

Page 18:

Properties of a Poisson process

• The number of events follows a Poisson distribution (in particular the variance should be equal to the mean)

• A Poisson process does not care about the past, i.e., at a given time step, the outcome of the coin toss is independent of the past.

• As a result, the inter-event intervals follow an exponential distribution (Caution: this is not a good marker of a Poisson process)

Page 19:

Poisson process and spiking

• The inter-spike interval (ISI) distribution is indeed close to an exponential, except for short intervals (refractory period) and for bursting neurons (CV close to 1; Softky and Koch, figs. 1 and 3).

• The variance in the spike count is proportional to the mean, but the constant of proportionality is 1.2 instead of 1, and there is spontaneous activity (Softky and Koch, fig. 5).

Page 20:

Open Questions

• Is this Poisson variability really noise?

• Where does it come from?

• This is a hard question, because dendrites integrate their inputs and average out the noise (Softky and Koch)

Page 21:

Non-Answers

• It’s probably not in the sensory inputs

• It’s not the spike initiation mechanism (Mainen and Sejnowski)

• It’s not the stochastic nature of ionic channels

• It’s probably not the unreliable synapses.

Page 22:

Possible Answers

• Neurons embedded in a recurrent network with sparse connectivity tend to fire with statistics close to Poisson (van Vreeswijk and Sompolinsky)

• Random walk model (Shadlen and Newsome; Troyer and Miller)

Page 23:

Problems with the random walk model

• The ratio of variance to mean is still smaller than the one measured in vivo (0.8 vs. 1.2)

• It's unstable over several layers! (this is likely to be a problem for any mechanism…)

• Noise injection in real neurons fails to produce the predicted variability (Stevens and Zador)

Page 24:

Other sources of noise and uncertainty

• Shot noise in the retina

• Physical noise in the stimulus

• Uncertainty inherent to the stimulus (e.g. the aperture problem)

Page 25:

Beyond tuning curves

• Tuning curves are often not invariant under stimulus changes (e.g., motion tuning curves for blobs vs. bars)

• They deal poorly with time-varying stimuli

• They assume a rate code

An alternative: information theory

Page 26:

Information Theory (Shannon)

Page 27:

Definitions: Entropy

• Entropy:

• Measures the degree of uncertainty

• Minimum number of bits required to encode a random variable

H(X) = -∑_{i=1}^{N} P(X=x_i) log₂ P(X=x_i) = -E[ log₂ P(X) ]
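A minimal sketch of this definition, computing the entropy of a discrete distribution in bits:

```python
import math

# Sketch: Shannon entropy H(X) = -sum_i p_i * log2(p_i) of a discrete
# distribution given as a list of probabilities.
def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

h_flat = entropy([0.25] * 4)   # flat over 4 events: log2(4) = 2 bits
h_certain = entropy([1.0])     # a certain outcome carries 0 bits
```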

Page 28:

P(X=1) = p
P(X=0) = 1 - p

[Figure: entropy (bits) as a function of the probability p, rising from 0 at p = 0 to a maximum of 1 bit at p = 0.5 and falling back to 0 at p = 1.]

Page 29:

• Maximum entropy is achieved for flat probability distributions, i.e., for distributions in which events are equally likely.

• For a given variance, the normal distribution is the one with maximum entropy

Page 30:

Entropy of Spike Trains

• A spike train can be turned into a binary vector by discretizing time into small bins (1ms or so).

• Computing the entropy of the spike train amounts to computing the entropy of the binary vector.

[Figure: a spike train above its binary vector, one 0 or 1 per time bin.]
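A sketch of this discretization, with an independent-bins entropy bound on top. The spike times and bin size below are illustrative, and treating bins as independent overestimates the true entropy when bins are correlated:

```python
import math

# Sketch: discretize spike times (in seconds) into 1 ms bins to get a
# binary vector, then bound the spike-train entropy by treating bins
# as independent Bernoulli variables. Spike times are made up.
def binarize(spike_times, duration=0.1, dt=0.001):
    vec = [0] * round(duration / dt)
    for t in spike_times:
        vec[int(t / dt)] = 1
    return vec

def independent_bins_entropy(vec):
    """Entropy in bits if every bin were an independent coin flip."""
    p = sum(vec) / len(vec)
    if p in (0.0, 1.0):
        return 0.0
    per_bin = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return per_bin * len(vec)

vec = binarize([0.003, 0.027, 0.081])   # 3 spikes in 100 ms
```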

Page 31:

Definition: Conditional Entropy

• Conditional entropy:

• Uncertainty due to noise: How uncertain is X knowing Y?

H(X|Y) = -∑_{j=1}^{M} P(Y=y_j) ∑_{i=1}^{N} P(X=x_i|Y=y_j) log₂ P(X=x_i|Y=y_j)
       = -∑_{i,j} P(X=x_i, Y=y_j) log₂ P(X=x_i|Y=y_j)
       = -E[ log₂ P(X|Y) ]
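A sketch of this definition for a joint distribution stored as a dictionary; the two-state example mirrors a deterministic many-to-one mapping:

```python
import math

# Sketch: conditional entropy H(X|Y) from a joint distribution given
# as a dict {(x, y): probability}.
def conditional_entropy(joint):
    p_y = {}
    for (x, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    h = 0.0
    for (x, y), p in joint.items():
        if p > 0:
            h -= p * math.log2(p / p_y[y])   # p(x,y) * log2 p(x|y)
    return h

# X determines Y exactly (deterministic map), so H(Y|X) = 0,
# but both X values share Y = 1, so H(X|Y) > 0.
joint_xy = {(0, 1): 0.5, (1, 1): 0.5}
swapped = {(y, x): p for (x, y), p in joint_xy.items()}
h_x_given_y = conditional_entropy(joint_xy)   # 1 bit
h_y_given_x = conditional_entropy(swapped)    # 0 bits
```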

Page 32:

Example

H(Y|X) is equal to zero if the mapping from X to Y is deterministic and many to one.

Ex: Y = X+1 for odd X

Y = X for even X

• X=1, Y is equal to 2. H(Y|X)=0
• Y=4, X is either 4 or 3. H(X|Y)>0

Page 33:

Example

In general, H(X|Y) ≠ H(Y|X), except for an invertible and deterministic mapping, in which case H(X|Y) = H(Y|X) = 0

Ex: Y= X+1 for all X

• Y=2, X is equal to 1. H(X|Y)=0• X=1, Y is equal to 2. H(Y|X)=0

Page 34:

Example

If Y = f(X) + noise, H(Y|X) and H(X|Y) are strictly greater than zero.

Ex: Y is the firing rate of a noisy neuron and X is the orientation of a line: a_i = f_i(θ) + n_i. Knowing the firing rate does not tell you for sure what the orientation is: H(X|Y) = H(θ|a_i) > 0.

Page 35:

Page 36:

Definition: Joint Entropy

• Joint entropy:

H(X,Y) = -∑_{x,y} P(X,Y) log₂ P(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y)

• Special case, X and Y independent:

H(X,Y) = H(X) + H(Y)
Page 37:

Independent Variables

• If X and Y are independent, then knowing Y tells you nothing about X. In other words, knowing Y does not reduce the uncertainty about X, i.e., H(X)=H(X|Y). It follows that:

H(X,Y) = H(X) + H(Y|X) = H(X) + H(Y)
Page 38:

Entropy of Spike Trains

• For a given firing rate, the maximum entropy is achieved by a Poisson process, because it generates the most unpredictable sequence of spikes.

Page 39:

Definition: Mutual Information

• Mutual information

I(X,Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)

• Independent variables: H(Y|X) = H(Y), so

I(X,Y) = H(Y) - H(Y|X) = H(Y) - H(Y) = 0
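A sketch of this definition, using the identity H(X|Y) = H(X,Y) - H(Y) so everything can be computed from the joint distribution:

```python
import math

# Sketch: I(X,Y) = H(X) - H(X|Y). Since H(X|Y) = H(X,Y) - H(Y), this
# equals H(X) + H(Y) - H(X,Y), computable from {(x, y): probability}.
def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return (entropy(p_x.values()) + entropy(p_y.values())
            - entropy(joint.values()))

indep = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}  # I = 0 bits
copy = {(0, 0): 0.5, (1, 1): 0.5}                       # I = 1 bit
```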

Page 40:

Data Processing Inequality

• Computation and information transmission can only decrease mutual information:

If Z = f(Y), I(Z,X) ≤ I(X,Y)

In other words, computation can only decrease information or change its format.

Page 41:

KL distance

Mutual information can be rewritten as:

I(X,Y) = ∑_{x,y} P(X,Y) log₂ [ P(X,Y) / (P(X)P(Y)) ] = KL( P(X,Y) || P(X)P(Y) )

This distance is zero when P(X,Y) = P(X)P(Y), i.e., when X and Y are independent.
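The KL form can be checked numerically on a small joint distribution (the example distributions below are made up for illustration):

```python
import math

# Sketch: the KL form of mutual information,
# I(X,Y) = sum P(x,y) * log2[ P(x,y) / (P(x) P(y)) ].
def mi_kl(joint):
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return sum(p * math.log2(p / (p_x[x] * p_y[y]))
               for (x, y), p in joint.items() if p > 0)

# A correlated pair: X and Y agree 80% of the time, so I > 0.
corr = {(0, 0): 0.4, (1, 1): 0.4, (0, 1): 0.1, (1, 0): 0.1}
i_corr = mi_kl(corr)
# When P(X,Y) = P(X)P(Y), every log term is zero and I = 0.
factored = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
```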

Page 42:

Measuring entropy from data

Consider a population of 100 neurons firing for 100 ms with 1 ms time bins. Each data point is a 100×100 binary matrix, so there are 2^(100×100) possible data points. To compute the entropy, we would need to estimate a probability distribution over all these states… Hopeless?…

Page 43:

Direct Method

• Fortunately, in general, only a fraction of all possible states actually occurs

• Direct method: evaluate P(A) and P(A|θ) directly from the data. This still requires tons of data, but not 2^(100×100)…
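A sketch of the plug-in version of this estimate, counting how often each response pattern occurs; the patterns below are made up for illustration:

```python
import math
from collections import Counter

# Sketch of the "direct method": estimate P(pattern) from the observed
# frequencies of response patterns and plug it into the entropy
# formula. The example patterns are illustrative, not real data.
def plugin_entropy(observations):
    counts = Counter(observations)
    n = len(observations)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

responses = ["0010", "0010", "0100", "0010", "0100", "1000",
             "0010", "0100", "0010", "1000"]
h_hat = plugin_entropy(responses)
# Only 3 of the 2**4 possible states occur, so the estimate needs far
# fewer samples than the full state space would suggest.
```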

Page 44:

Upper Bound

• Assume all distributions are Gaussian

• Recover the SNR in the Fourier domain using simple averaging

• Compute the information from the SNR (box 3)

• There is no need to recover the full P(R) and P(R|S), because Gaussian distributions are fully characterized by their mean and variance.

Page 45:

Lower Bound

• Estimate a variable from the neuronal responses and compute the mutual information between the estimate and the stimulus (easy if the estimate follows a Gaussian distribution)

• The data processing inequality guarantees that this is a lower bound on information.

• It gives an idea of how well the estimated variable is encoded.

Page 46:

Mutual information in spikes

• Among temporal processes, Poisson processes are the ones with the highest entropy, because time bins are independent of one another. The entropy of a Poisson spike train is the sum of the entropies of the individual time bins, which is the best you can achieve.

Page 47:

Mutual information in spikes

• A deterministic Poisson process would be the best way to transmit information with spikes.

• Spike trains are indeed close to Poisson, BUT they are not deterministic, i.e., they vary from trial to trial even for a fixed input.

• Even worse, the conditional entropy is huge because the noise follows a Poisson distribution.

Page 48:

The choice of stimulus

• Neurons are known to be selective to particular features.

• In information theory terms, this means that two sets of stimuli with the same entropy do not necessarily lead to the same amount of mutual information in the response of a neuron.

• Natural stimuli often lead to larger mutual information (which makes sense since they are more likely)

Page 49:

Information Theory: Pro

• Assumption free: does not assume any particular code

• Read-out free: does not depend on a read-out method (direct method)

• It can be used to identify the features best encoded by a neuron

Page 50:

Information theory: Cons

• Does not tell you how to read out the code: the code might be unreadable by the rest of the nervous system.

• Data intensive: needs TONS of data

Page 51:

nature neuroscience • volume 2 no 11 • november 1999

Animal system (neuron), stimulus | Method | Bits per second | Bits per spike (efficiency) | High-freq. cutoff or limiting spike timing
Fly visual (H1), motion [10] | Lower | 64 | 1 | 2 ms
Fly visual (H1), motion [15] | Direct | 81 | — | 0.7 ms
Fly visual (HS, graded potential), motion [37] | Lower and upper | 36–104 | — | —
Monkey visual (area MT), motion [16] | Lower and direct | 5.5–12 | 0.6–1.5 | 100 ms
Frog auditory (auditory nerve), noise and call [38] | Lower | Noise 46, Call 133 | Noise 1.4 (~20%), Call 7.8 (~90%) | 750 Hz
Salamander visual (ganglion cells), random spots [50] | Lower | 3.2 | 1.6 (22%) | 10 Hz
Cricket cercal (sensory afferent), mechanical motion [40] | Lower | 294 | 3.2 (~50%) | >500 Hz
Cricket cercal (sensory afferent), wind noise [51] | Lower | 75–220 | 0.6–3.1 | 500–1000 Hz
Cricket cercal (10-2 and 10-3), wind noise [11,38] | Lower | 8–80 | Avg = 1 | 100–400 Hz
Electric fish (P-afferent), amplitude modulation [12] | Absolute lower | 0–200 | 0–1.2 (~50%) | 200 Hz