Improved Models for Analysis of Motor-Cortical
Signals
Alex L. Rojas∗, Anthony E. Brockwell† and Andrew B. Schwartz‡
Abstract
In recent years, devices capable of linking the brain to the external world have been developed.
Such devices directly measure the output of multiple neurons simultaneously. One obvious
application of this technology is in helping those who are movement impaired; in theory it would
be possible to implant such a device in the brain, and use its output to control movement of
a robotic prosthetic limb, for instance. However, to achieve this goal, it is necessary to first
understand the relationship between movement and neural signals. We consider data collected
from rhesus monkeys in experiments, and propose a model for describing this relationship. The
model generalizes several previously considered models from the neuroscience literature, and
allows individual neurons (1) to encode different kinematic variables, and (2) to have more general
spike count distributions. The proposed model is used to decode cortical signals recorded for 258
neurons in the ventral premotor cortex of rhesus monkeys during an ellipse-drawing task, and
we demonstrate that, relative to the existing models, a substantial reduction in mean squared
error is achieved.
Keywords: Bayesian decoding, neuron, spike, model selection
1 Introduction
The search for accurate descriptions of the relationship between spike trains and movement is a
fundamentally important problem in neuroscience. Not only could these descriptions, formalized
as probabilistic models, give us a better understanding of the underlying processes in the neural
system, but they also have potential applications in areas including the development of
neurally-controlled prosthetic devices and the restoration of functionality lost due to neural injury.
∗ Department of Statistics, and Center for Automatic Learning and Discovery, Carnegie Mellon University.
† Department of Statistics, Carnegie Mellon University.
‡ Department of Neurobiology, University of Pittsburgh.
The relationship between movement and spike trains is often referred to as neural coding, and
is typically specified by a model which gives the probabilistic (spiking) behavior of a neuron as
a function of current, past, and potentially, future movement. The use of probability is natural
since cortical signals are typically described by stochastic models due to their inherent variability.
The converse problem of determining motion, given observed spike trains, is referred to as neural
decoding, and has been heavily studied in the literature. There are basically two groups of decoding
algorithms in the literature: the “population vector” (PV) approach (Georgopoulos et al., 1983;
Taylor et al., 2002; Salinas and Abbott, 1994) and Bayesian decoding algorithms (Brockwell et al.,
2004; Brown et al., 1998; Gao et al., 2002; Shoham et al., 2003; Wu et al., 2002). The former
algorithms are easily implemented, but they are unable to account for dynamic behavior of the
underlying signal of interest (Brockwell et al., 2004), they lack a clear probabilistic model, and
they provide no estimate of uncertainty. On the other hand, Bayesian decoding offers a broad and
flexible model-based approach. It starts with an encoding model, and then “turns the problem
around,” obtaining (via standard recursions involving Bayes’ rule) distributions of kinematic
variables conditional on observations of spike trains. In the past, this approach has not been widely
used, due to the complexity of its implementation. Brockwell et al. (2004) and Gao et al. (2002)
considered the use of a recently developed numerical method known as the “particle filter” (Doucet
et al., 2001) to overcome the complexity of implementing Bayesian decoding methods when
the models of cortical activity used to encode movement are non-linear or non-Gaussian
(see Section 2). Brockwell et al. (2004) performed a simulation study in which they showed how
the particle filter algorithm can lead to more accurate decoding than the decoding obtained by PV
and “optimal linear estimation” (OLE, Salinas and Abbott, 1994) methods. When applied to real
data, the particle filter also outperformed the PV and OLE methods.
The reliance of the recursive Bayesian approach on the model can be either a strength or a
weakness. When the model is correct, results are optimal (in the sense that conditional means
minimize mean-squared-error), but when the model is not correct, bias and/or noise may be
introduced into decoded trajectories. Therefore, assessment of model goodness-of-fit is critical in such
a decoding scheme. (This point is recognized by others, such as Paninski et al., 2004, who are also
working to develop better encoding models.) In this paper we study the appropriateness of some of
the models that have been used in the literature, with particular reference to factors that have been
recently included in neuron firing models, such as position, acceleration (Wu et al., 2002) and speed
(Brockwell et al., 2004). In addition, we describe and analyze the conditional distributions of the
cortical signals. This leads us to propose a general model, which is more flexible than many of those
previously considered. The model we propose differs from that of Paninski et al. (2004) in that
ours allows neuron firing rates to depend on arbitrary subsets of kinematic variables, with a more
flexible functional form. On the other hand, the model of Paninski et al. (2004) has the advantage
of being able to capture more correlation between spiking in different neurons. We also propose
several diagnostics and tests which can be used for model-fitting, and finally, we examine whether
or not these new factors help us to decode hand-velocity from cortical signals more accurately.
2 Motor Cortex Data
To study the model specification, we use a center→out task (Moran and Schwartz, 1999a; Taylor
et al., 2002) and an ellipse-drawing task (Reina and Schwartz, 2003).
2.1 Description of Experimental Setup and Data
The basic methodology for the center→out task in two dimensions is described by Moran and
Schwartz (1999a). In the three-dimensional (3D) case, three rhesus monkeys (Macaca mulatta)
were trained to perform hand movements in a computer-generated 3D virtual environment. The
monkeys could not see their actual hand movements, but rather saw two spheres: the motion of the
first sphere was controlled by the monkey’s hand position, and the second sphere was a stationary
target sitting on a vertex of an imaginary cube, creating a total of eight possible targets. Each
monkey performed the same task five times, for each randomly selected target. The cortical activity
(spikes) and the hand position, as functions of time, were recorded for 258 neurons in the ventral
premotor cortex (area 6V), for five repetitions of the experiment. These firing counts were recorded
one neuron at a time as opposed to hundreds of neurons at a time (Black et al., 2003; Paninski et al.,
2004). The total times recorded were divided into 100 equal-sized bins, in such a way that the actual
reaching task was performed between time bins 31 and 71. Two of the three monkeys carried out the
experiment with both arms (one after the other), and the third monkey did it only with its left arm.
After conducting the center→out task, the ellipse-drawing task was conducted by the same
three monkeys using the same arms. Specifically, each monkey continually traced five elliptical
loops in the x-y plane, with the z-component capturing small deviations of the hand from the x-y
plane. During the task, cortical activity and hand position were recorded for the same 258 neurons
recorded in the center→out task. As before, the duration of each of the five loops was divided into
100 equal-size bins and the whole experiment was conducted five times. A more detailed description
of the ellipse-drawing task can be found in Reina and Schwartz (2003).
Let $N^{(i)}_t$ be the spike count for the $i$th neuron in the $t$th time bin and
$N^{(i)} = \sum_{t=1}^{T} N^{(i)}_t$ be the total spike count recorded for the $i$th neuron, where
$T$ is the number of time bins. Let $p^{(i)}_t = (p^{(i)}_{x,t}, p^{(i)}_{y,t}, p^{(i)}_{z,t})'$ be
the smoothed vector of hand positions in the $(x, y, z)$ directions for the $i$th neuron at time $t$.
This smoothing was performed using a cubic spline, and it was done to facilitate calculating hand
velocity and acceleration. Let $v^{(i)}_t = (v^{(i)}_{x,t}, v^{(i)}_{y,t}, v^{(i)}_{z,t})'$ be the
vector of velocities at time $t$ in the $(x, y, z)$ directions for the $i$th neuron, calculated by taking
the first derivative of the smoothed hand positions. Let
$a^{(i)}_t = (a^{(i)}_{x,t}, a^{(i)}_{y,t}, a^{(i)}_{z,t})'$ be the vector of accelerations at time $t$
in the $(x, y, z)$ directions for the $i$th neuron, calculated by taking the second derivative of the
smoothed hand positions.
Define the vector of kinematic variables for the $i$th neuron as
$$x^{(i)}_{t,k} = \left( p'^{(i)}_t, v'^{(i)}_t, a'^{(i)}_t, s^{(i)}_t \right)$$
with $s^{(i)}_t$ the speed (or tangential velocity) defined as
$$s^{(i)}_t = \| v^{(i)}_t \|_2 = \sqrt{ \left(v^{(i)}_{x,t}\right)^2 + \left(v^{(i)}_{y,t}\right)^2 + \left(v^{(i)}_{z,t}\right)^2 }.$$
Finally, we define $\lambda^{(i)}(\cdot)$ as the observed spike intensity function for the $i$th neuron.
This function is obtained by smoothing the observed spike counts using loess (Cleveland and Devlin,
1988). (Note that $\lambda^{(i)}(\cdot)$ represents the real-valued rate of firing of the $i$th neuron,
which is different from the integer-valued count process $N^{(i)}_t$.)
2.2 Features of the data
Our first step toward a better encoding of cortical signals is to characterize the variability
of spike counts. We start by checking whether or not the spiking of each neuron follows
an inhomogeneous Poisson process, since this has been the main assumption in earlier studies
(Brockwell et al., 2004; Gao et al., 2003). We design a goodness-of-fit test based on the time-rescaling
theorem (see Appendix A and Brown et al., 2001), the probability integral transform (Casella and
Berger, 2002), the Kolmogorov-Smirnov (KS) test and the Quantile-Quantile (QQ) plot (Johnson
and Kotz, 1970). The idea is to examine whether or not, after time-rescaling so that the process is
homogeneous, the inter-spike times have exponential distributions. Details are given in Appendix B.
The KS test and the QQ plot for neuron #156 are displayed in Figure 1. If the spike counts
for neuron #156 were a realization of an inhomogeneous Poisson process with intensity function
$\lambda^{(i)}(\cdot)$, then we would expect the solid line in Figure 1 to be close to a 45-degree line
and stay within the dashed and dotted lines, according to the KS test and the QQ plot. As can be seen
in Figure 1, there is lack of fit for lower quantiles (below 0.42 for the QQ plot and 0.30 for the
KS test); therefore, the Poisson assumption appears not to be valid for this neuron. The behavior
found in neuron #156 is also found in many other neurons, but not all of them; hence, based on
the KS test and the QQ plot, we conclude that the Poisson assumption is not reasonable for all
neurons and a new model should be proposed. We propose to use the Bernoulli distribution as an
alternative, since the neurons that show lack of fit under the KS test and the QQ plot typically
have at most one spike in each bin.
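The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of Appendix B (which is not reproduced here): the intensity is taken to be piecewise constant over equal-width bins, the band is the usual large-sample 95% KS approximation $1.36/\sqrt{n}$, and the function names `cumulative_intensity` and `ks_time_rescaling` are hypothetical.

```python
import math

def cumulative_intensity(t, rates, bin_width):
    """Integral from 0 to t of a piecewise-constant intensity (assumed form)."""
    full_bins = min(int(t // bin_width), len(rates))
    total = sum(rates[:full_bins]) * bin_width
    if full_bins < len(rates):
        total += rates[full_bins] * (t - full_bins * bin_width)
    return total

def ks_time_rescaling(spike_times, rates, bin_width):
    """Return (KS statistic, approximate 95% critical value).

    Under an inhomogeneous Poisson process with the given intensity, the
    rescaled inter-spike intervals z are i.i.d. Exponential(1), so
    u = 1 - exp(-z) should look Uniform(0, 1).
    """
    rescaled = [cumulative_intensity(t, rates, bin_width)
                for t in sorted(spike_times)]
    z = [b - a for a, b in zip([0.0] + rescaled[:-1], rescaled)]
    u = sorted(1.0 - math.exp(-zi) for zi in z)
    n = len(u)
    # Largest gap between the empirical CDF of u and the uniform CDF.
    ks = max(max((k + 1) / n - uk, uk - k / n) for k, uk in enumerate(u))
    return ks, 1.36 / math.sqrt(n)
```

A KS statistic above the returned critical value would correspond to the curve leaving the band in a plot such as Figure 1.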
3 Modeling the Data
There have been a variety of proposed models to describe the variability of neuron firing and the
neuron’s expected firing rate. They mainly differ in the set of kinematic variables included, and
the way the response variable is handled. The first proposed model (Ashe and Georgopoulos,
1994) uses only velocity and avoids the use of counts by using the second order variance stabilizing
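transformation for a Poisson variable (Anscombe, 1948). In other words, if $N_t \sim \mathrm{Poisson}(\lambda)$ then
$$\mathrm{Var}_\lambda\left( \sqrt{N_t + \tfrac{3}{8}} \right) \approx \frac{1}{4} \cdot \left( 1 + \frac{1}{16 \cdot \lambda^2} \right)$$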
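The variance-stabilizing property above can be checked numerically. The following is a small Monte Carlo sketch, not part of the original analysis; the sampler (Knuth's multiplication method), the number of draws, and the function names are assumptions made for illustration.

```python
import math
import random

def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def anscombe_variance(lam, n_draws=20000, seed=0):
    """Sample variance of sqrt(N + 3/8) for N ~ Poisson(lam)."""
    rng = random.Random(seed)
    values = [math.sqrt(poisson_sample(rng, lam) + 0.375)
              for _ in range(n_draws)]
    mean = sum(values) / n_draws
    return sum((v - mean) ** 2 for v in values) / (n_draws - 1)

def anscombe_approximation(lam):
    """Right-hand side of the approximation: (1/4) * (1 + 1 / (16 * lam^2))."""
    return 0.25 * (1.0 + 1.0 / (16.0 * lam ** 2))
```

For moderately large $\lambda$ both quantities should be close to 1/4, which is what makes the transformed counts convenient for Gaussian-style models.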
[Figure 1 plot: x-axis "Model Quantiles", y-axis "Empirical Quantiles", both from 0 to 1.]
Figure 1: KS test (dotted line) and QQ plot (dashed line) for the interspike times of neuron #156.
The fact that the curve exceeds the dashed line indicates that a test of size 0.05 would reject the
null hypothesis of a Poisson distributed spike count.
Brockwell et al. (2004) used a different approach; they worked directly with spike counts and
made use of generalized linear models (McCullagh and Nelder, 1989) to estimate the parameters in
the model. Their model includes velocities and speed as covariates, as well as time lags. Shoham
et al. (2003) define their own distribution to describe the variability of the spike counts. They
called it the “Normalized-Gaussian discrete” distribution. Their proposal includes the use of linear
combinations of position, velocity, acceleration plus the use of a 4th degree polynomial to model
the expected firing rate. Their model does not include time lags. Gao et al. (2003) proposed the use
of generalized additive models (Hastie and Tibshirani, 1990), allowing a more flexible description
of the expected firing rate as a function of position, velocity and acceleration.
We can generalize the models mentioned above as follows. For modeling the expected firing rate
we adopt the approach of Moran and Schwartz (1999a), Wu et al. (2003) and Shoham et al. (2003),
including in the model other kinematic variables such as position and acceleration, as well as speed.
These inclusions are made because different neurons encode different kinematic variables.
For example, some encode primarily velocity while others encode position (Shoham et al., 2003).
Given that following neural discharge there is a time lag before movement occurs (or in some cases,
the movement occurs before associated neural discharge), we also include a time lag term
$\tau^{(i)}$. In summary, for the $i$th neuron we propose the model
$$N^{(i)}_t \sim p\left(n^{(i)}_t \mid r^{(i)}_t\right), \tag{1}$$
$$r^{(i)}_t = f^{(i)}\left(\beta_i \cdot x^{(i)}_{t+\tau^{(i)}}\right), \tag{2}$$
where $p(n^{(i)}_t \mid r^{(i)}_t)$ denotes the conditional distribution of $n^{(i)}_t$ given
$r^{(i)}_t$, with $r^{(i)}_t$ the expected firing rate. The function $f^{(i)}$ (namely the inverse of
what is called the ‘link’ function in the generalized linear model literature) represents the
non-linear relationship between the expected firing rate $r^{(i)}_t$ and a linear combination of
kinematic variables. Finally, $x^{(i)}_t$ is a subset of
$[p^{(i)}_t, v^{(i)}_t, a^{(i)}_t, s^{(i)}_t]^T$. As a special case, for example, if
$p(\cdot \mid r^{(i)}_t)$ is Poisson, $f(\cdot) = \exp(\cdot)$, and
$\log r^{(i)}_{t-\tau} = \beta_0 + \beta_1 v_{x,t} + \beta_2 v_{y,t} + \beta_3 v_{z,t} + \beta_4 s_t$,
we obtain the model used by Brockwell et al. (2004).
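To make Equations (1) and (2) concrete, here is a minimal Python sketch of how an expected firing rate could be computed for one neuron and one time bin. The function names and argument layout are hypothetical; the two choices of $f$ (exponential for Poisson, inverse logit for Bernoulli) are the ones the paper adopts in Section 3.1, and the covariates are assumed to have already been restricted to the neuron's chosen subset.

```python
import math

def lagged_covariates(kinematics, t, tau):
    """Return the covariate vector at time bin t + tau, or None if out of range."""
    idx = t + tau
    return kinematics[idx] if 0 <= idx < len(kinematics) else None

def expected_rate(beta, x, family):
    """Expected firing rate r = f(beta . x) for one neuron and one time bin.

    beta   : coefficients, with beta[0] the intercept
    x      : kinematic covariates, already shifted by the neuron's lag tau
    family : "poisson" (f = exp) or "bernoulli" (f = inverse logit)
    """
    eta = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    if family == "poisson":
        return math.exp(eta)
    if family == "bernoulli":
        return 1.0 / (1.0 + math.exp(-eta))
    raise ValueError("family must be 'poisson' or 'bernoulli'")
```

With `family="poisson"` and covariates (velocity in $x$, $y$, $z$, and speed), this reduces to the special case attributed above to Brockwell et al. (2004).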
As mentioned before, some neurons seem to have a variability described by the Poisson
distribution, while other neurons appear to have more Bernoulli-like distributions. Therefore we include
these two possibilities for $p(\cdot \mid r^{(i)}_t)$, but in order to use the Bernoulli distribution,
we need, in a few cases, to transform the spike counts $N^{(i)}_t$ as follows:
$$N^{(i)*}_t \equiv \min\left(1, N^{(i)}_t\right). \tag{3}$$
Assuming that the conditional mean $r^{(i)}_t$, or its transformation through $(f^{(i)})^{-1}(\cdot)$,
is a linear function of the hand kinematics, we use generalized linear models (GLMs, McCullagh and
Nelder, 1989) to estimate the parameters $\beta_i$ for each neuron. The parameters in a GLM can be
estimated by the maximum likelihood method using iteratively re-weighted least squares (IRLS). To
carry out our analysis we make use of the glm function in S-PLUS.
It is important to mention that the IRLS method is not guaranteed to converge to the global
maximum of the likelihood function, and it failed to converge for some of the models we fitted in
this study. This situation may be due to the sparsity of spike counts for some neurons; therefore,
any neuron with this problem was not included in our study.
3.1 Automatic per-Neuron Model Choice
The main goal of this study is to find a “better” model, or set of models, for encoding cortical
signals. We do so by first taking the generalization given in Equations (1) and (2), and exploring
different selections for the conditional distribution of spike counts given the expected firing rate,
$p(\cdot \mid r_t)$; the link function $f^{-1}$; the time lag, $\tau$; and the set of kinematic
variables, $x$. These selections have to be made for all available neurons, so we need to define an
automatic procedure to select each component of our model.
Based on the preliminary diagnostics considered in Section 2, our choices of allowable
distributions $p(\cdot \mid r^{(i)}_t)$ in Equation (1) are the Bernoulli distribution and the Poisson
distribution, with $f(\cdot)$ the inverse of the corresponding link function, that is,
$f(x) = \exp(x)$ for the Poisson model and $f^{-1}(x) = \mathrm{logit}(x)$ for the Bernoulli model.
Choice of $p(\cdot \mid r^{(i)}_t)$ for each neuron is described in Section 3.1.1, while choice of
$\tau$ and $x$ is explained in Sections 3.1.2 and 3.1.3, respectively.
3.1.1 Distribution Selection
To determine whether the Bernoulli or Poisson distribution is more appropriate, in a manner which
is easily automated, we use the following approach.
First, for the $i$th neuron, let $\lambda^{(i)}_l$ and $\lambda^{(i)}_u$ be the minimum and the
maximum, respectively, of $\lambda^{(i)}(\cdot)$ in the interval $(0, b^{(i)} \cdot T)$. Let us divide
the interval $[\lambda^{(i)}_l, \lambda^{(i)}_u]$ into $M$ bins, $\{B_1, \ldots, B_M\}$, each bin with
approximately the same number of elements. We define $\mu^{(i)}_k$ as the mean of the smoothed values
in the $k$th bin, $k = 1, \ldots, M$, and $S^{(i)2}_k$ as the conditional sample variance of the
spike counts for all time bins $t$ such that $\lambda^{(i)}_t \in B_k$. Based on $\mu^{(i)}_k$ and
$S^{(i)2}_k$ we define the statistic $C^{(i)}$ as
$$C^{(i)} = \sum_{k=1}^{M} w^{(i)}_k \left\{ \left( \mu^{(i)}_k \left(1 - \mu^{(i)}_k\right) - S^{(i)2}_k \right)^2 - \left( \mu^{(i)}_k - S^{(i)2}_k \right)^2 \right\} \tag{4}$$
where
$$w^{(i)}_k = \frac{ \sqrt{\mu^{(i)}_k} + 1/M }{ \sum_{j=1}^{M} \sqrt{\mu^{(i)}_j} + 1 },$$
for $k = 1, \ldots, M$.
The statistic $C^{(i)}$ compares the squared deviations of the conditional sample variance from the
conditional variance expected under the Bernoulli model, $\mu(1 - \mu)$, and under the Poisson model,
$\mu$, giving more weight to deviations at high estimated firing rates. The reason for this weighting
is that for small values of $\mu$, the Poisson and Bernoulli conditional variance functions are
similar. From the definition of $C^{(i)}$, we would expect positive values of $C^{(i)}$ for
Poisson-like neurons and negative values of $C^{(i)}$ for Bernoulli-like neurons. A formal distribution
selection procedure using this statistic is given in Figure 2.
Input: an integer $K > 0$; the intensity function for the $i$th neuron, $\lambda^{(i)}(\cdot)$; and
the $T$ bins where the cortical activity was recorded.
(i) For each of the $T$ bins, generate a Poisson sample of size one, with rate $\lambda^{(i)}_t$,
$t = 1, \ldots, T$.
(ii) Compute the statistic $C^{(i)}$ (see Equation (4)) based on the sample generated in (i), and
name it $C^{(i)}_P$.
(iii) For each of the $T$ bins, generate a Bernoulli sample of size one, with probability
$\lambda^{(i)}_t$, $t = 1, \ldots, T$.
(iv) Compute the statistic $C^{(i)}$ based on the sample generated in (iii), and name it $C^{(i)}_B$.
(v) Repeat (i)-(iv) $K$ times, obtaining $C^{(i)}_{P,1}, \ldots, C^{(i)}_{P,K}$ and
$C^{(i)}_{B,1}, \ldots, C^{(i)}_{B,K}$.
(vi) Choose the Poisson distribution if $\hat{f}_p(C^{(i)}_*) > \hat{f}_b(C^{(i)}_*)$, where
$C^{(i)}_*$ is the value of the statistic for the observed spike counts, $\hat{f}_b$ is the kernel
density estimate (Silverman, 1986) of $f(C^{(i)}_*)$ based on
$C^{(i)}_{B,1}, \ldots, C^{(i)}_{B,K}$, and $\hat{f}_p$ is the kernel density estimate of
$f(C^{(i)}_*)$ based on $C^{(i)}_{P,1}, \ldots, C^{(i)}_{P,K}$. Otherwise choose the Bernoulli
distribution.
Output: selected density function
Figure 2: Procedure to select $p(\cdot \mid r^{(i)}_t)$
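The procedure in Figure 2 can be sketched in Python as follows. This is an illustrative reading, not the authors' code, and it makes several assumptions that should be checked against the paper: bins are formed by sorting time bins on the smoothed rate and cutting them into $M$ equal-sized groups, the weight is read as $(\sqrt{\mu_k} + 1/M) / (\sum_j \sqrt{\mu_j} + 1)$, the kernel is Gaussian with Silverman's rule-of-thumb bandwidth, and the smoothed rates lie in $(0, 1]$ so that they can serve as Bernoulli probabilities. All function names are hypothetical.

```python
import math
import random

def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def c_statistic(counts, rates, n_bins):
    """Statistic C of Equation (4); each bin needs at least two time bins."""
    order = sorted(range(len(rates)), key=lambda t: rates[t])
    size = len(order) / n_bins
    mus, variances = [], []
    for k in range(n_bins):
        idx = order[int(round(k * size)):int(round((k + 1) * size))]
        mus.append(sum(rates[t] for t in idx) / len(idx))
        mean_count = sum(counts[t] for t in idx) / len(idx)
        variances.append(
            sum((counts[t] - mean_count) ** 2 for t in idx) / (len(idx) - 1))
    norm = sum(math.sqrt(m) for m in mus) + 1.0
    total = 0.0
    for mu, var in zip(mus, variances):
        weight = (math.sqrt(mu) + 1.0 / n_bins) / norm
        total += weight * ((mu * (1.0 - mu) - var) ** 2 - (mu - var) ** 2)
    return total

def gaussian_kde(samples, x):
    """Gaussian kernel density estimate at x (rule-of-thumb bandwidth)."""
    n = len(samples)
    mean = sum(samples) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    h = 1.06 * sd * n ** (-0.2)
    kernel_sum = sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)
    return kernel_sum / (n * h * math.sqrt(2.0 * math.pi))

def select_distribution(counts, rates, n_bins=10, n_sim=200, seed=0):
    """Return "poisson" or "bernoulli" following steps (i)-(vi) of Figure 2."""
    rng = random.Random(seed)
    c_obs = c_statistic(counts, rates, n_bins)
    c_pois, c_bern = [], []
    for _ in range(n_sim):
        pois = [poisson_sample(rng, r) for r in rates]
        bern = [1 if rng.random() < r else 0 for r in rates]
        c_pois.append(c_statistic(pois, rates, n_bins))
        c_bern.append(c_statistic(bern, rates, n_bins))
    if gaussian_kde(c_pois, c_obs) > gaussian_kde(c_bern, c_obs):
        return "poisson"
    return "bernoulli"
```

Here `n_sim` plays the role of $K$ and `n_bins` the role of $M$; the defaults are arbitrary choices for the sketch.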
3.1.2 Time Lags
The physical relationship between hand movement and neuron firing implies the existence of a
(possibly small) time lag between the firing and the movement, referred to as $\tau^{(i)}$ in
Equation (2). To identify “the” lag for each of the neurons, we fit each model for different values
of the lag, $\tau = -40, \ldots, -1, 0, 1, \ldots, 40$, and choose the value that maximizes the statistic
$$D^*_\tau = \eta_\tau(D_\tau) = \eta_\tau\left( \frac{D_{0,\tau} - D_{1,\tau}}{D_{0,\tau}} \right), \tag{5}$$
in the center→out task and $D_\tau$ in the ellipse-drawing task. $D_{0,\tau}$ is the null deviance
$$D_{0,\tau} = 2\left( l(N^{(i)}_t; N^{(i)}_t) - l(\bar{N}^{(i)}; N^{(i)}_t) \right),$$
where $\bar{N}^{(i)}$ is the mean spike count. $D_{1,\tau}$ is the deviance
$$D_{1,\tau} = 2\left( l(N^{(i)}_t; N^{(i)}_t) - l(\hat{r}^{(i)}_t; N^{(i)}_t) \right),$$
where $\hat{r}^{(i)}_t$ is the estimated expected tuning function. Finally,
$$\eta_\tau(x) = \begin{cases} x & \text{if } -20 < \tau < 20, \\ \left( 2 - \dfrac{|\tau|}{20} \right) \cdot x & \text{otherwise.} \end{cases} \tag{6}$$
For models where the Poisson distribution is used, the log-likelihood is
$$l(\hat{r}^{(i)}_t; N^{(i)}_t) = \sum_{t=1}^{T} \left[ N^{(i)}_t \log\left(\hat{r}^{(i)}_t\right) - \hat{r}^{(i)}_t - \log\left(N^{(i)}_t!\right) \right] \tag{7}$$
with $i = 1, \ldots, 258$ and $T = 100$. For models where the Bernoulli distribution is used, we assume
that for each time bin, we record whether or not any spikes are observed during that bin.
Therefore, the log-likelihood can be written as:
$$l(\hat{r}^{(i)}_t; N^{(i)}_t) = \sum_{t=1}^{T} \left[ N^{(i)*}_t \log \hat{r}^{(i)}_t + \left(1 - N^{(i)*}_t\right) \log\left(1 - \hat{r}^{(i)}_t\right) \right] \tag{8}$$
with $N^{(i)*}_t = \min\{N^{(i)}_t, 1\}$, $i = 1, \ldots, 258$ and $T = 100$.
Note that we do not use the likelihoods of the models to compare them because for each lag
there is a different amount of data. On the other hand, we are not comparing nested models;
therefore, we do not use the deviance $D_1$ alone. Instead, we compare the “gain” from including the
predictive variables $p^{(i)}_t$, $v^{(i)}_t$, $a^{(i)}_t$, and $s^{(i)}_t$ in our model for different
lag values. The function $\eta$ is introduced to eliminate large values of $\tau$ in the center→out
task because, for these values, a high value of $D_\tau$ is probably an artifact of the small amount
of data left for high time lags.
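The lag-selection quantities can be written down directly from Equations (5) to (7). The sketch below is illustrative only: it covers the Poisson case, assumes the fitted rates for each candidate lag are already available from a GLM fit, and uses hypothetical function names.

```python
import math

def poisson_loglik(rates, counts):
    """Equation (7): Poisson log-likelihood summed over time bins."""
    return sum(n * math.log(r) - r - math.lgamma(n + 1)
               for r, n in zip(rates, counts))

def deviance_gain(fitted_rates, counts):
    """D_tau = (D0 - D1) / D0 for a Poisson model at one lag."""
    # Saturated log-likelihood l(N; N); the term n*log(n) is taken as 0 at n = 0.
    saturated = sum((n * math.log(n) if n > 0 else 0.0) - n - math.lgamma(n + 1)
                    for n in counts)
    mean_rate = sum(counts) / len(counts)
    d0 = 2.0 * (saturated - poisson_loglik([mean_rate] * len(counts), counts))
    d1 = 2.0 * (saturated - poisson_loglik(fitted_rates, counts))
    return (d0 - d1) / d0

def eta(tau, x):
    """Equation (6): leave x unchanged for |tau| < 20, shrink it linearly beyond."""
    return x if -20 < tau < 20 else (2.0 - abs(tau) / 20.0) * x

def best_lag(gain_by_lag, damp=True):
    """Arg-max over lags of D*_tau (damp=True) or of D_tau (damp=False)."""
    if damp:
        return max(gain_by_lag, key=lambda tau: eta(tau, gain_by_lag[tau]))
    return max(gain_by_lag, key=lambda tau: gain_by_lag[tau])
```

Note that $\eta_\tau$ is continuous at $|\tau| = 20$ and reaches zero at $|\tau| = 40$, so the most extreme lags can never be selected when damping is on.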
3.1.3 Kinematic Variables
Our model does not fix the subset of kinematic variables to include, as most current
models do. This added flexibility generally leads to better models, since, as observed in the
literature (see, e.g., Shoham et al., 2003; Moran and Schwartz, 1999a), some neurons appear to code for
position while others code for velocity. However, with this added flexibility, we have to come up
with a procedure to decide which subset of kinematic variables to include, a task that is not trivial.
We use the following procedure for selection of kinematic variables. For each value of
$\tau = -40, \ldots, -1, 0, 1, \ldots, 40$, the GLM is fitted and the significant variables are
recorded. Then, we find the proportion of times that each kinematic variable was significant. Finally,
we take the subset of kinematic variables that were significant more than $\kappa$% of the time. We
choose $\kappa = 40$ in an ad hoc manner. If no variable is significant more than 40% of the time, we
just take the variable that was significant most often. Note that if a variable is significant in
only one or two directions ($x$, $y$, or $z$), we include all the variables of the same “class”; for
instance, if position in the $x$-direction is selected, position in the $y$- and $z$-directions is also
included. We are now able to specify our automatic model choice procedure, which appears in Figure 3.
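The thresholding rule just described is simple enough to state as code. The following Python sketch assumes that significance has already been assessed per lag and collapsed to the level of variable classes (position, velocity, acceleration, speed); the function names and the input layout are hypothetical.

```python
def percent_significant(classes_by_lag):
    """Percentage of lags at which each variable class was significant.

    classes_by_lag maps each lag tau to the set of classes found
    significant in the GLM fitted at that lag.
    """
    n_lags = len(classes_by_lag)
    all_classes = set().union(*classes_by_lag.values())
    return {c: 100.0 * sum(c in s for s in classes_by_lag.values()) / n_lags
            for c in all_classes}

def select_kinematic_classes(percentages, kappa=40.0):
    """Keep classes significant in more than kappa percent of the lags.

    If no class passes the threshold, fall back to the single class
    that was significant most often.
    """
    chosen = [name for name, pct in percentages.items() if pct > kappa]
    if not chosen:
        chosen = [max(percentages, key=percentages.get)]
    return chosen
```

Applied to the percentages reported for neuron #156 in Table 1, this rule returns velocity alone for the center→out task and position plus velocity for the ellipse-drawing task.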
3.1.4 Example: The selection procedure applied to neuron #156
To get a better understanding of the procedure described in Figure 3, we apply it to neuron #156.
We start by deciding between the Poisson and the Bernoulli distributions. The kernel density
estimates of $f(c)$ obtained by applying the procedure described in Section 3.1.1 appear in
Figure 4(a). As can be seen in Figure 4(a), we select the Bernoulli assumption for neuron #156
because $0.2446 = \hat{f}_p(C_*) = \hat{f}_p(-0.5024) < \hat{f}_b(-0.5024) = 2.6934$. Note that this
result was obtained using the center→out task, but exactly the same decision is made using the
ellipse-drawing task.
1. Choose $p(\cdot \mid r^{(i)}_t)$ using the procedure defined in Section 3.1.1.
2. Select the set of significant kinematic variables using the procedure defined in Section 3.1.3,
given that $p(\cdot \mid r^{(i)}_t)$ has been chosen to be Bernoulli or Poisson.
3. Estimate the time lag, $\tau$: select $\hat{\tau}$ such that
$\hat{\tau} = \arg\max_{\tau \in \{-40, \ldots, 40\}} D^*_\tau$ for the center→out task, or
$\hat{\tau} = \arg\max_{\tau \in \{-40, \ldots, 40\}} D_\tau$ for the ellipse-drawing task, where
$$D^*_\tau = \eta(D_\tau) = \eta\left( \frac{D_{0,\tau} - D_{1,\tau}}{D_{0,\tau}} \right)$$
is calculated from the GLM with the $p(\cdot \mid r^{(i)}_t)$ chosen in step 1 and the set of
kinematic variables selected in step 2.
Figure 3: Selection procedure
Given that we have selected $p(\cdot \mid r^{(i)}_t)$, we follow the approach described in
Section 3.1.3 and evaluate the percentage of times a kinematic variable is significant for values of
$\tau = -40, \ldots, 40$. These percentages appear in Table 1 for both the center→out and
ellipse-drawing tasks. From the results in Table 1, velocity is included for both data sets, while
position is included only for the ellipse-drawing data set.
Table 1: Percentage of times a kinematic variable is significant in the GLM, for neuron #156 and
values of $\tau = -40, \ldots, 40$
| Variable | center→out task | ellipse-drawing task |
|---|---|---|
| Position | 32 | 49 |
| Velocity | 62 | 46 |
| Acceleration | 19 | 16 |
| Speed | 5 | 9 |
[Figure 4 plots. Panel (a): x-axis $C$, y-axis "Density", with $C^*$ marked. Panel (b): x-axis
$\tau$ from −40 to 40, with curves labeled $D^*_\tau$ and $D_\tau$.]
(a) Kernel density estimates of $f(c)$ based on $C^b_1, \ldots, C^b_K$ (solid line) and
$C^p_1, \ldots, C^p_K$ (dashed line)
(b) Values of $D^*_\tau$ for the center→out task and $D_\tau$ for the ellipse-drawing task
Figure 4: Distribution and lag selection for neuron #156
Finally, having selected the kinematic variables, we choose the lag value. As mentioned before,
we choose $\tau$ such that the value of $D^*_\tau$, in the case of the center→out task, or $D_\tau$,
for the ellipse-drawing task, is maximized. Figure 4(b) displays the values of $D^*_\tau$ and
$D_\tau$, plus a smoothing curve of these values. The lags selected are 20 for the center→out task
and 1 for the ellipse-drawing task. Notice how $D_\tau$ is always greater than $D^*_\tau$. On one
hand, the difference is due to the fact that $D_\tau$ was calculated using position and velocity,
while $D^*_\tau$ was calculated using only velocity. On the other hand, the gain from including
kinematic variables in the GLM is greater for the ellipse-drawing task than for the center→out task.
It is also clear that the shape of the curve differs between the two tasks; this difference, and the
fact that different sets of kinematic variables appear significant for these tasks, lead us to
conclude that neurons may not always encode the same information, and that their time lags do not
remain constant across tasks or sets of kinematic variables. (The fact that time lags vary was noted
by Moran and Schwartz, 1999b, who observed that time lags depend on the inherent curvature of a
particular task.)
4 Results
4.1 Summary of the automatic selection procedure for the rhesus data
Fifty-seven percent of neurons in the center→out task and seventy percent in the ellipse-drawing
task were classified as Bernoulli distributed; the remaining 43% and 30%, respectively, were
classified as Poisson distributed. Bernoulli-like neurons are thus the majority in both tasks, which
is notable given that the most commonly used distribution is the Poisson. On the other hand, the
agreement between the chosen $p(\cdot \mid r^{(i)}_t)$ for the two tasks is 68%; therefore, we
may expect to have to repeat the process of identifying $p(\cdot \mid r^{(i)}_t)$ for each new task.
Regarding time lags, the differences between the two tasks are quite substantial. As can be seen
in Figure 5, the lags estimated from the two tasks seem not to be related at all. Furthermore, no
relationship between $p(\cdot \mid r^{(i)}_t)$, monkey, arm and time lag was found.
[Figure 5 plot: x-axis "Time Lag. Ellipse-drawing task" (−40 to 40), y-axis "Time Lag. Center-out
task" (−30 to 30).]
Figure 5: Scatter plot of time lags estimated using the ellipse-drawing task vs. time lags
estimated using the center→out task
Finally, the proportion of neurons for which each kinematic variable is significant appears in Table 2.
Notice how more neurons encode position than velocity in the center→out task, while in the
ellipse-drawing task, the number of neurons encoding velocity is a little higher than the number
encoding position. Another important feature is that only a few neurons seem to encode speed; this
variable will therefore rarely be part of the model, in contrast to the model proposed by Brockwell
et al. (2004).
Table 2: Percentage of neurons for which each kinematic variable is significant
| Variable | center→out task (%) | ellipse-drawing task (%) |
|---|---|---|
| p | 34 | 39 |
| v | 23 | 43 |
| a | 23 | 27 |
| s | 25 | 10 |
4.2 Decoding
As mentioned in Section 1, the decoding algorithms in the literature include the population vector,
optimal linear estimation, and Bayesian approaches. In this paper, we focus on Bayesian decoding
using particle filtering, since it has been shown that particle filtering performs better than the
other methods (Brockwell et al., 2004; Gao et al., 2003). This method is briefly explained in
Appendix C.
4.2.1 The model
The unobserved kinematic variables (hidden states) $\{X_t, t = 0, 1, 2, \ldots\}$ are modeled as a
Markov process with initial distribution $p(X_0)$ and transition density $p(X_t \mid X_{t-1})$. The
cortical signals $\{N_t, t = 0, 1, 2, \ldots\}$ are assumed to be conditionally independent given the
process $\{X_t, t = 0, 1, 2, \ldots\}$, with marginal distribution $p(N_t \mid X_t)$. In summary, the
basic model is described by
$$p(X_0)$$
$$p(X_t \mid X_{t-1}) \quad \text{for } t \geq 1 \tag{9}$$
$$p(N_t \mid X_t) \quad \text{for } t \geq 1. \tag{10}$$
The fundamental objective is to find the posterior distributions,
$p(X_t \mid N_0, N_1, \ldots, N_t)$, $t = 1, 2, \ldots$. These posterior distributions are typically
impossible to find analytically, so numerical approximations are needed. One numerical approximation
is the particle filter (PF), also known as the “bootstrap filter” (Doucet et al., 2001), which is
applicable to a large class of models and is easy to implement. This algorithm can be seen in Figure 8.
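As a rough illustration of how a bootstrap particle filter approximates these posterior distributions, consider the toy Python sketch below. It is not the algorithm of Figure 8 or Appendix C (neither is reproduced here); it assumes a scalar random-walk state, a standard normal initial distribution, Poisson observations with rate $\exp(b_0 + b_1 x)$ per neuron, and multinomial resampling. All names and default values are hypothetical.

```python
import math
import random

def poisson_logpmf(n, rate):
    """Log of the Poisson probability mass function."""
    return n * math.log(rate) - rate - math.lgamma(n + 1)

def bootstrap_filter(observations, betas, n_particles=500, sigma=0.1, seed=0):
    """Posterior-mean estimates of a scalar random-walk state.

    observations : list over time of spike-count lists (one count per neuron)
    betas        : list of (b0, b1); neuron i fires at rate exp(b0 + b1 * x)
    """
    rng = random.Random(seed)
    # Draw the initial particles from p(X_0), assumed standard normal here.
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for counts in observations:
        # Propagate each particle through the transition p(X_t | X_{t-1}).
        particles = [x + rng.gauss(0.0, sigma) for x in particles]
        # Weight by p(N_t | X_t); neurons are independent given the state.
        log_w = [sum(poisson_logpmf(n, math.exp(b0 + b1 * x))
                     for n, (b0, b1) in zip(counts, betas))
                 for x in particles]
        top = max(log_w)  # subtract the maximum for numerical stability
        weights = [math.exp(lw - top) for lw in log_w]
        total = sum(weights)
        estimates.append(
            sum(w * x for w, x in zip(weights, particles)) / total)
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

The weighted particle mean at each step approximates the conditional mean of $X_t$ given $N_1, \ldots, N_t$, which is the estimate that minimizes mean squared error under the model.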
4.2.2 Decoding of cortical signals
Given that our final goal is to improve the decoding of cortical signals, we analyze the decoding
accuracy of hand velocity using the model proposed by Brockwell et al. (2004) (M1) and the model
we propose (M2), by comparing the integrated squared error (ISE) of the reconstruction of the
original velocity, in the fourth loop of the first repetition of the ellipse-drawing task. Since each
neuron was recorded separately, there are 258 different velocity trajectories; we average all of them
to get the actual velocity. Finally, we assume that the vector of velocities follows a random walk.
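For M1, all neurons use the same set of kinematic variables and all are assumed to follow the
Poisson distribution. The state-space model is then as follows.

State equation

$$ v^{ed}_t = v^{ed}_{t-1} + \varepsilon_t \tag{11} $$

Observation equation

$$ N_t = \begin{pmatrix} N^{(1)}_t \\ \vdots \\ N^{(n)}_t \end{pmatrix}, \qquad N^{(i)}_t \mid x_t \sim \mathrm{Poisson}(f_i(x_t)) \tag{12} $$

where $\varepsilon_t \sim N_3(0, Q)$,

$$ f_i(x_t) = \exp\left\{ \hat\beta_0 + \hat\beta_1 v^{(i)}_x + \hat\beta_2 v^{(i)}_y + \hat\beta_3 v^{(i)}_z + \hat\beta_4 s^{(i)}_t \right\}, $$

and the estimates $\hat\beta_j$, $j = 0, \ldots, 4$, are found by IRLS using the first three loops of the
ellipse-drawing task.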
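As an illustration of Equations (11) and (12), the following minimal sketch simulates a random-walk velocity and Poisson spike counts with a log-linear rate. Several details are assumptions made only for the example: the covariance is taken as Q = q_std² I, the term s is taken to be the speed |v|, the neuron-specific time lags are ignored, and the coefficient values and function names (`poisson_sample`, `m1_rate`, `simulate_m1`) are hypothetical.

```python
import math
import random


def poisson_sample(rate, rng):
    """Draw one Poisson(rate) variate by Knuth's multiplication method.

    Adequate for the small rates used in this illustration.
    """
    threshold = math.exp(-rate)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def m1_rate(beta, v):
    """Log-linear rate exp(b0 + b1*vx + b2*vy + b3*vz + b4*s).

    Assumption: s is the speed |v|; time lags are ignored here.
    """
    speed = math.sqrt(sum(c * c for c in v))
    eta = beta[0] + beta[1] * v[0] + beta[2] * v[1] + beta[3] * v[2] + beta[4] * speed
    return math.exp(eta)


def simulate_m1(betas, n_steps, q_std, seed=0):
    """Simulate the state equation (random walk) and the Poisson observations."""
    rng = random.Random(seed)
    v = [0.0, 0.0, 0.0]
    velocities, counts = [], []
    for _ in range(n_steps):
        # State equation: v_t = v_{t-1} + eps_t, with eps_t ~ N3(0, q_std^2 I) (assumed).
        v = [c + rng.gauss(0.0, q_std) for c in v]
        velocities.append(v)
        # Observation equation: one conditionally independent count per neuron.
        counts.append([poisson_sample(m1_rate(b, v), rng) for b in betas])
    return velocities, counts


# Two hypothetical neurons with made-up coefficients.
example_betas = [[0.5, 1.0, 0.0, 0.0, 0.2], [0.1, 0.0, -1.0, 0.0, 0.0]]
velocities, counts = simulate_m1(example_betas, n_steps=50, q_std=0.05)
```

In a decoding setting the coefficients would come from the IRLS fit rather than being set by hand.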
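For M2 the state-space model is as follows.

State equation

$$ \begin{pmatrix} p_t \\ v_t \\ a_t \end{pmatrix} = \begin{pmatrix} I & \delta I & 0 \\ 0 & I & 0 \\ 0 & 0 & 0 \end{pmatrix} \cdot \begin{pmatrix} p_{t-1} \\ v_{t-1} \\ a_{t-1} \end{pmatrix} + \begin{pmatrix} 0 \\ \varepsilon_t \\ \varepsilon_t/\delta \end{pmatrix} \tag{13} $$

Observation equation

$$ N_t = \begin{pmatrix} N^{(1)}_t \\ \vdots \\ N^{(n)}_t \end{pmatrix}, \qquad N^{(i)}_t \mid x_t \sim p_i(f_i(x_t)) \tag{14} $$

where $p_i$, $f_i$, and $x^{(i)}_t$ are chosen using the selection procedure defined in Section 3.1.3, and
$\varepsilon_t \sim N_3(0, Q)$.

$$ f_i(x_t) = \frac{\exp\{\hat\beta_i x^{(i)}\}}{1 + \exp\{\hat\beta_i x^{(i)}\}}, \quad \text{if } p_i \text{ is Bernoulli.} \tag{15} $$

$$ f_i(x_t) = \exp\{\hat\beta x^{(i)}\}, \quad \text{if } p_i \text{ is Poisson.} \tag{16} $$

$\hat\beta = (\hat\beta_0, \ldots, \hat\beta_p)$ is the GLM estimate of β in M2, obtained from the first three loops of the
ellipse-drawing task, with p being the column dimension of x.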
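The transition in Equation (13) and the two mean functions in Equations (15) and (16) can be sketched as follows. This is only an illustration under stated assumptions: Q is taken as q_std² I, the covariate vector x is assumed to include a leading 1 for the intercept, and the names `m2_step` and `mean_response` are hypothetical.

```python
import math
import random


def m2_step(p, v, delta, q_std, rng):
    """One transition of the M2 state equation for position, velocity, acceleration.

    p_t = p_{t-1} + delta * v_{t-1}
    v_t = v_{t-1} + eps_t
    a_t = eps_t / delta
    Assumption: eps_t ~ N3(0, q_std^2 I).
    """
    eps = [rng.gauss(0.0, q_std) for _ in range(3)]
    p_new = [pi + delta * vi for pi, vi in zip(p, v)]
    v_new = [vi + ei for vi, ei in zip(v, eps)]
    a_new = [ei / delta for ei in eps]
    return p_new, v_new, a_new


def mean_response(beta, x, family):
    """Mean function f_i for a neuron, given its selected covariates x.

    Assumption: x carries a leading 1 so that beta[0] acts as the intercept.
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    if family == "bernoulli":
        # Logistic form, equal to exp(eta) / (1 + exp(eta)).
        return 1.0 / (1.0 + math.exp(-eta))
    if family == "poisson":
        return math.exp(eta)
    raise ValueError("family must be 'bernoulli' or 'poisson'")


rng = random.Random(1)
p1, v1, a1 = m2_step([0.0, 0.0, 0.0], [1.0, -2.0, 0.5], delta=0.1, q_std=0.05, rng=rng)
```

Note that the acceleration block of the transition matrix is zero, so a_t depends only on the current innovation: a_t = (v_t − v_{t−1})/δ.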
Notice that the estimates of β are based only on the ellipse-drawing task, as opposed to
including information from the center→out task (Brockwell et al., 2004). This seems more natural,
since it is not clear how to carry information from one task over to another. Moreover, as mentioned
before, neurons seem to encode different information for each task. Finally, even though we have
a total of 258 neurons, we use only those neurons for which the vector of parameters
(β1, . . . , βp) is significantly different from the p-dimensional vector of zeros, for a total of 118
neurons.
4.2.3 Decoding results
Results from applying the particle filter with the two different models are shown in Figure 6, and
the ISE and the maximum squared error (maxSE) are given in Table 3. Notice how the reconstruction
based on the PF algorithm and M2 is visually closer to the true trajectory than the one
using M1, with its accuracy, in terms of mean-squared error, improved by a factor of approximately
three. This result may be interpreted, as was done by Brockwell et al. (2004), as saying that ∼3
times more neurons would be needed when using the PF algorithm and M1 to obtain the same
reconstruction error achieved when using the PF algorithm and M2.

Table 3: Decoding errors summarized across time, for M1 and M2
          M1        M2
ISE       4.6864    1.3997
maxSE     0.1350    0.0374
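The error summaries in Table 3 can be computed along the following lines. The sketch assumes that the ISE is the sum over time bins of the squared Euclidean error between actual and decoded velocity (unit bin width) and that maxSE is the largest such per-bin error; the exact normalization used for the table is not restated here, and the function names and toy data are hypothetical. The last two lines simply take the ratios of the tabulated values.

```python
def squared_errors(actual, decoded):
    """Per-time-bin squared Euclidean error between two sequences of 3-vectors."""
    return [sum((a - d) ** 2 for a, d in zip(av, dv)) for av, dv in zip(actual, decoded)]


def ise(actual, decoded):
    """Integrated square error, taken here as a plain sum over time bins (assumed)."""
    return sum(squared_errors(actual, decoded))


def max_se(actual, decoded):
    """Maximum squared error over time bins."""
    return max(squared_errors(actual, decoded))


# Toy data, only to exercise the functions.
toy_actual = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
toy_decoded = [(0.0, 0.0, 0.5), (1.0, 0.0, 1.0)]
toy_ise = ise(toy_actual, toy_decoded)
toy_max = max_se(toy_actual, toy_decoded)

# Ratios of the values reported in Table 3 (M1 over M2).
ise_ratio = 4.6864 / 1.3997      # about 3.35
maxse_ratio = 0.1350 / 0.0374    # about 3.61
```

Both ratios are somewhat above three, consistent with the "factor of approximately three" quoted in the text.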
The improvement obtained by using M2 seems mainly due to the explicit specification of the
variability among neurons in terms of the hand kinematics that they encode and the probabilistic
and deterministic relationships they have with the spike trains.
Finally, it is worth mentioning that the PF using M2 is less sensitive to the estimated time lag
than the PF using M1. The reason for this phenomenon is unknown, but we suspect that it is due
to the fact that we fix the subset of kinematic variables in M1.
5 Conclusions
We have seen (confirming the results of a number of prior studies) that neurons in the ventral
premotor cortex behave heterogeneously. Not only do they exhibit different time lags, but they also
encode different information relevant to hand movement, and the variability of their spike count
distributions also differs. This heterogeneity is also found when different tasks are performed. By
accounting for this behavior and making model distinctions accordingly for each neuron, we are able
to achieve more accurate decoding of hand position from motor-cortical signals. Further decoding
improvement may be obtained by means of using more flexible functions in our model, for instance,
splines or polynomials. In addition, following in the spirit of Paninski et al. (2004), it may be useful
to adapt our model to handle correlation (beyond simply correlation induced by neurons having
similar “preferred directions”) between neurons as well.
[Figure 6 panels. A, B, C: trajectories in the (x, y), (z, y) and (z, x) planes, with legend entries Actual, PF M1, PF M2. D: “Actual and PF Velocity, Model 1” (velocity against time bin; components x, y, z). E: “Actual and PF Velocity, Model 2” (velocity against time bin; components x, y, z). F: “Squared Errors in Velocity” (SE against time bin; M1 and M2).]
Figure 6: Decoding of x, y, and z components of the 4th loop of the ellipse-drawing task, based on
258 neurons in the ventral premotor cortex. A,B,C: Actual position trajectory along with decoded
trajectories; D, E: actual velocity and decoded velocity for the PF algorithm, over the 100 time
bins in the 4th loop using M1 and M2, respectively; and F: squared errors of decoded velocities for
the PF algorithm using M1 and M2.
6 Acknowledgements
This work was partially supported by NSF Grant IIS-083148.
A Time-Rescaling Theorem.
Let 0 < s1 < s2 < . . . < sn < T be a realization from a point process with a conditional intensity
function r(t) satisfying 0 < r(t) for all t ∈ (0, T ]. Define the transformation

$$ \Lambda(s_j) = \int_0^{s_j} r(u)\,du, \tag{17} $$

for j = 1, . . . , n, and assume Λ(t) < ∞ with probability one for all t ∈ (0, T ]. Then the Λ(sj)'s are
a Poisson process with unit rate.

Given that the Λ(sj)'s are a Poisson process with unit rate, the rescaled inter-spike times
τk = Λ(sk+1) − Λ(sk), for k = 1, . . . , n − 1, satisfy τk ∼ Exponential(1) (see e.g., Taylor and
Karlin, 1998). By the probability integral transform, zk = 1 − exp (−τk) then follows a
uniform distribution, that is, zk ∼ Uniform(0, 1) for k = 1, ..., n − 1.
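The transformation from spike times to the zk can be sketched numerically as below. This is an illustration only: the integral in Equation (17) is approximated with the trapezoid rule on a fixed grid, and the function name `rescaled_uniforms`, the grid size, and the constant-rate example are assumptions made for the sketch.

```python
import math


def rescaled_uniforms(spike_times, rate, n_grid=1000):
    """Return z_k = 1 - exp(-tau_k), with tau_k = Lambda(s_{k+1}) - Lambda(s_k).

    `rate` is a callable giving the intensity r(t); each increment of Lambda
    is approximated by the trapezoid rule with n_grid sub-intervals.
    """
    def integral(a, b):
        h = (b - a) / n_grid
        total = 0.5 * (rate(a) + rate(b))
        total += sum(rate(a + j * h) for j in range(1, n_grid))
        return total * h

    taus = [integral(s0, s1) for s0, s1 in zip(spike_times, spike_times[1:])]
    return [1.0 - math.exp(-tau) for tau in taus]


# Constant intensity r(t) = 2: the rescaled intervals are 2 * (s_{k+1} - s_k).
z = rescaled_uniforms([0.5, 1.0, 2.0], rate=lambda t: 2.0)
```

With n spike times the function returns n − 1 values, matching the index range k = 1, . . . , n − 1 above.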
B Kolmogorov-Smirnov test and Quantile-Quantile plot
We use the Kolmogorov-Smirnov test and Quantile-Quantile plot to test goodness-of-fit of the spike
data model that assumes an inhomogeneous Poisson process. These tests are applied to one neuron
at a time and are described in Figure 7.
(i) Let r(·) = λ(·) in Equation (17), with λ(·) the observed intensity function defined in Section 2.
(ii) Calculate z1, . . . , zN−1.
(iii) Find the order statistics z(1), . . . , z(N−1).
(iv) Compute the cumulative distribution function of the uniform density, defined as
$b_k = \frac{k - 1/2}{N - 1}$, for k = 1, . . . , N − 1.
(v) Plot $\{b_k\}_{k=1}^{N-1}$ versus $\{z_{(k)}\}_{k=1}^{N-1}$.
(vi) Add the confidence bounds $b_k \pm \frac{1.63}{(N - 1)^{1/2}}$, which correspond to the Kolmogorov-Smirnov
test.
(vii) Add the 2.5th and 97.5th percentiles of the beta density with parameters k and N − k, which
correspond to the Quantile-Quantile plot.
Figure 7: Kolmogorov-Smirnov Test and Quantile-Quantile plot
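Steps (iii) to (vi) of Figure 7 can be sketched as follows, taking the rescaled values z1, . . . , zN−1 as input. The function name `ks_plot_points` and the toy input are hypothetical; the constant 1.63 is the one given in step (vi). The beta-percentile bands of step (vii) are omitted because they need a beta quantile function that the Python standard library does not provide.

```python
import math


def ks_plot_points(z, critical=1.63):
    """Return (b, z_sorted, half_width, inside) for the Kolmogorov-Smirnov plot.

    b[k-1] = (k - 1/2) / n with n = len(z) = N - 1, z_sorted are the order
    statistics, and `inside` is True when every point lies within
    b_k +/- critical / sqrt(n).
    """
    n = len(z)
    z_sorted = sorted(z)
    b = [(k - 0.5) / n for k in range(1, n + 1)]
    half_width = critical / math.sqrt(n)
    inside = all(abs(zk - bk) <= half_width for zk, bk in zip(z_sorted, b))
    return b, z_sorted, half_width, inside


# Toy input: four values that fall exactly on the uniform quantiles.
b, z_sorted, half_width, inside = ks_plot_points([0.625, 0.125, 0.875, 0.375])
```

Plotting `b` against `z_sorted` together with the two lines at ± `half_width` around the diagonal gives the Kolmogorov-Smirnov part of the display.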
C Particle Filter Algorithm
1. Initialize: t = 0.
2. For i = 1, . . . , N , sample $X_0^{(i)} \sim p(X_0)$, then set t = 1.
3. Importance sampling step:
   For i = 1, . . . , N , sample $\tilde{X}_t^{(i)} \sim p(X_t \mid X_{t-1}^{(i)})$ and set
   $(\tilde{X}_0^{(i)}, \ldots, \tilde{X}_t^{(i)}) = (X_0^{(i)}, \ldots, X_{t-1}^{(i)}, \tilde{X}_t^{(i)})$.
   For i = 1, . . . , N , evaluate the importance weights
   $\tilde{w}_t^{(i)} = p(N_t \mid \tilde{X}_1^{(i)}, \tilde{X}_2^{(i)}, \ldots, \tilde{X}_t^{(i)})$.
   Normalize the importance weights.
4. Selection step:
   Resample with replacement N particles $(X_0^{(i)}, \ldots, X_t^{(i)};\ i = 1, \ldots, N)$
   from the set $(\tilde{X}_0^{(i)}, \ldots, \tilde{X}_t^{(i)};\ i = 1, \ldots, N)$ according to the importance weights.
5. Set t ← t + 1 and go to step 3.
Figure 8: Particle Filter Algorithm
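The steps of Figure 8 can be sketched for a scalar state as follows. This is a minimal illustration rather than the implementation used for the results: it stores only the current particle rather than whole paths, returns the posterior mean after each observation, and is exercised on a toy model (Gaussian random-walk state, Poisson count with rate exp(x)). All names and the toy observations are hypothetical.

```python
import math
import random


def bootstrap_filter(observations, sample_prior, sample_transition, likelihood,
                     n_particles=500, seed=0):
    """Bootstrap particle filter for a scalar hidden state.

    Returns the particle estimate of the posterior mean after each observation.
    """
    rng = random.Random(seed)
    # Initialization: draw N particles from p(X0).
    particles = [sample_prior(rng) for _ in range(n_particles)]
    means = []
    for obs in observations:
        # Importance sampling step: propagate through the transition equation.
        proposed = [sample_transition(x, rng) for x in particles]
        weights = [likelihood(obs, x) for x in proposed]
        total = sum(weights)
        if total <= 0.0:
            # Degenerate case: fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Selection step: resample with replacement according to the weights.
        particles = rng.choices(proposed, weights=weights, k=n_particles)
        means.append(sum(particles) / n_particles)
    return means


def poisson_likelihood(count, x):
    """p(N_t | X_t) for a toy model with rate exp(x)."""
    rate = math.exp(x)
    return math.exp(-rate) * rate ** count / math.factorial(count)


means = bootstrap_filter(
    observations=[3, 4, 2, 5, 3],
    sample_prior=lambda rng: rng.gauss(0.0, 1.0),
    sample_transition=lambda x, rng: x + rng.gauss(0.0, 0.1),
    likelihood=poisson_likelihood,
)
```

For the vector-valued states of M1 and M2, the same loop applies with `sample_transition` implementing the state equation and `likelihood` the product of the per-neuron observation densities.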
References
F. J. Anscombe. The transformation of poisson, binomial and negative-binomial data. Biometrika,
35:246–254, 1948.
J. Ashe and A. P. Georgopoulos. Movement parameters and neural activity in motor cortex and
area 5. Cereb. Cortex, 6:590–600, 1994.
J. M. Black, E. Bienenstock, J. P. Donoghue, M. Serruya, W. Wu, and Y. Gao. Connecting
brains with machines: The neural control of 2d cursor movement. In Proceedings of the First
International IEEE/EMBS Conference on Neural Engineering, pages 580–583, 2003.
Anthony E Brockwell, Alex L Rojas, and Robert E Kass. Recursive bayesian decoding of cortical
signals by particle filter. J. Neurophysiol., 91:1899–1907, 2004.
E. N. Brown, R. Barbieri, V. Ventura, R. E. Kass, and L. M. Frank. The time-rescaling theorem
and its application to neural spike train data analysis. Neural Computation, 14:325–346, 2001.
E. N. Brown, L. M. Frank, D. Tang, M. C. Quirk, and M. A. Wilson. A statistical paradigm for
neural spike train decoding applied to position prediction from ensemble firing patterns of rat
hippocampal place cells. Neuroscience, 18:7411–7425, 1998.
George Casella and Roger Berger. Statistical Inference. Duxbury, 2nd edition, 2002.
W. S. Cleveland and S. J. Devlin. Locally-weighted regression: an approach to regression analysis
by local fitting. Journal of the American Statistical Association, 83:596–610, 1988.
A. Doucet, M. de Freitas, and N. J. Gordon, editors. Sequential Monte Carlo Methods in Practice.
Springer Verlag, New York, 2001.
Y. Gao, J. M. Black, E. Bienenstock, S. Shoham, and J. P. Donoghue. Probabilistic inference of
arm motion from neural activity in motor cortex. Advances in Neural Information Processing
Systems 14, The MIT Press, 2002.
Y. Gao, J. M. Black, E. Bienenstock, W. Wu, and J. P. Donoghue. A quantitative comparison of
linear and non-linear models of motor cortical activity for the encoding and decoding of arm
motions. In Proceedings of the First International IEEE/EMBS Conference on Neural Engineering,
pages 189–192, 2003.
A. P. Georgopoulos, R. Caminiti, J. F. Kalaska, and J. T. Massey. Spatial coding movement: a
hypothesis concerning the coding of movement direction by cortical populations. Exp. Brain Res.
Suppl., 7:327–336, 1983.
T. Hastie and R. Tibshirani. Generalized Additive Models. Chapman and Hall, London, England,
1990.
A Johnson and S Kotz. Distribution in statistics: Continuous univariate distributions – 2. Wiley,
New York, 1970.
P. McCullagh and J. A. Nelder. Generalized Linear Models. Chapman and Hall, London, England,
1989.
D. W. Moran and A. B. Schwartz. Motor cortical representation of speed and direction. J. Neuro-
physiol., 82:2676–2692, 1999a.
D. W. Moran and A. B. Schwartz. Motor cortical representation of speed and direction. J. Neuro-
physiol., 82:2693–2704, 1999b.
Liam Paninski, Shy Shoham, Matthew R. Fellows, Nicholas G. Hatsopoulos, and John P. Donoghue.
Superlinear population encoding of dynamic hand trajectory signals in primary motor cortex.
Journal of Neuroscience, 560(3):883–896, 2004.
G.A. Reina and A.B. Schwartz. Eye-hand coupling during closed-loop drawing: evidence of shared
motor planned? Hum. Move. Sci., 22:137–152, 2003.
E. Salinas and L. F. Abbott. Vector reconstruction from firing rates. J. Compu. Neuro., 1:89–107,
1994.
S. Shoham, L. M. Paninski, M. R. Fellows, N. G. Hatsopoulos, J. P. Donoghue, and R. A. Normann.
Optimal decoding for a primary motor cortical brain-computer interface: I. statistical encoding
models for mi neurons. IEEE, 2003.
B. W. Silverman. Density Estimation for Statistics and Data Analysis. Chapman and Hall, London,
1986.
D. M. Taylor, S. I. Helms Tillery, and A. B. Schwartz. Direct cortical control of 3d neuroprosthetic
devices. Science, 296:1829–1832, 2002.
Howard M Taylor and Samuel Karlin. An Introduction To Stochastic Modeling. Academic Press,
New York, 3rd edition, 1998.
W. Wu, J. M. Black, Y. Gao, E. Bienenstock, M. Serruya, and J. P. Donoghue. Inferring hand
motion from multi-cell recordings in motor cortex using a kalman filter. Proceedings of the
SAB’02-Workshop on Motor Control in Humans and Robots: On the Interplay of Real Brains
and Artificial Devices, pages 66–73, 2002.
W. Wu, J. M. Black, Y. Gao, E. Bienenstock, M. Serruya, A. Shaikaouni, and J. P. Donoghue.
Neural decoding of cursor motion using a kalman filter. In Advances in Neural Information