Parallel FPGA Particle Filtering for Real-Time Neural Signal Processing John Mountney Co-advisors: Iyad Obeid and Dennis Silage
Page 1: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Parallel FPGA Particle Filtering for Real-Time Neural Signal Processing

John Mountney

Co-advisors: Iyad Obeid and Dennis Silage

Page 2: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Outline

• Introduction to Brain Machine Interfaces

• Decoding Algorithms

• Evaluation of the Bayesian Auxiliary Particle Filter

• Algorithm Implementation in Hardware

• Proposed Future Work

Page 3: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Brain Machine Interface (BMI)

A BMI is a device which directly interacts with ensembles of neurons in the central nervous system

Page 4: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Applications of the BMI

Gain knowledge of the operation and functionality of the brain

Decode neural activity to estimate intended biological signals (neuroprosthetics)

Encode signals which can be interpreted by the brain (cochlear, retinal implants)

Page 5: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Interpreting Neural Activity

The neural tuning model is the key component to encoding and decoding biological signals

Given the current state x(t) of a neuron, the model describes its firing behavior in response to a stimulus

Page 6: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Tuning Function Example

Place cells fire when an animal is in a specific location and are responsible for spatial mapping.

Assumed firing model: $\lambda(t) = \exp\!\left(\alpha - \frac{(s(t) - \mu)^2}{2\xi^2}\right)$

Maximum firing rate: $e^{\alpha}$

Center of the receptive field: $\mu$

Width of the receptive field: $\xi$

[Figure: Tuning function for a single place cell. Firing rate (Hz) versus position (cm).]
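As a minimal sketch of the Gaussian-shaped firing model on this slide, the function below evaluates the rate at a given position. The parameter names `alpha`, `mu` and `xi` and the numeric values (30 Hz peak, field centered at 10 cm, 8 cm width) are illustrative assumptions, not values taken from the presentation.

```python
import math

def firing_rate(s, alpha, mu, xi):
    """Gaussian-shaped place-cell tuning curve: rate (Hz) at position s (cm)."""
    return math.exp(alpha - (s - mu) ** 2 / (2.0 * xi ** 2))

# Hypothetical parameters: 30 Hz peak, field centered at 10 cm, width 8 cm.
alpha, mu, xi = math.log(30.0), 10.0, 8.0

peak = firing_rate(mu, alpha, mu, xi)            # rate at the field center
one_width = firing_rate(mu + xi, alpha, mu, xi)  # rate one width away
```

The rate peaks at `exp(alpha)` when the animal sits at the field center and falls by a factor of `exp(-0.5)` one width away.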

Page 7: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Neural Plasticity

Neural plasticity can be the result of environmental changes, learning, acting or brain injury

Based on how active a neuron is during an experience, the synapses grow stronger or weaker

Plasticity results in a dynamic state vector of the neural tuning model

Page 8: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Time-varying Tuning Function

Dynamic firing model: $\lambda(t) = \exp\!\left(\alpha(t) - \frac{(s(t) - \mu(t))^2}{2\xi(t)^2}\right)$

Dynamic state vector: $\mathbf{x}(t) = \left[\alpha(t),\ \mu(t),\ \xi(t)\right]^{T}$

[Figure: Dynamic tuning function for a single place cell, showing the tuning function at times t0, t1 and t2. Firing rate (Hz) versus position (cm).]

Page 9: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Decoding Algorithms

Page 10: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Wiener Filter

• Linear transversal filter

• Coefficients minimize the error between filter output and a desired response

• Applied in recreating center out reaching tasks and 2D cursor movements (Gao, 2002)

• Assumes the input signal is stationary and also has an invertible autocorrelation matrix

Page 11: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Least Mean Square (LMS)

• Iterative algorithm that converges to the Wiener solution

• Avoids inverting the input autocorrelation matrix to provide computational savings

• If the autocorrelation matrix is ill conditioned, a large number of iterations may be required for convergence
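The LMS recursion described above can be sketched as follows. This is an illustrative example under stated assumptions: the two-tap system `true_w`, the step size and the white Gaussian input are all hypothetical choices, picked so that convergence to the Wiener solution is easy to observe.

```python
import random

def lms(inputs, desired, num_taps, step):
    """Adapt transversal-filter weights with the LMS update w <- w + step*e*u."""
    w = [0.0] * num_taps
    for n in range(num_taps - 1, len(inputs)):
        u = [inputs[n - k] for k in range(num_taps)]  # tap-input vector
        y = sum(wk * uk for wk, uk in zip(w, u))       # filter output
        e = desired[n] - y                             # estimation error
        w = [wk + step * e * uk for wk, uk in zip(w, u)]
    return w

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
true_w = [0.5, -0.3]  # hypothetical system to identify
d = [0.0] + [true_w[0] * x[n] + true_w[1] * x[n - 1] for n in range(1, len(x))]
w_hat = lms(x, d, num_taps=2, step=0.01)
```

No autocorrelation matrix is formed or inverted. With a white input the matrix is well conditioned, so the weights settle quickly; a strongly correlated input would slow convergence, as the last bullet notes.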

Page 12: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Kalman Filter

• Solves the same problem as the Wiener filter without the constraint of stationarity

• Recursively updates the state estimate using current observations

• Applied in arm movement reconstruction experiments (Wu, 2002)

• Assumes all noise processes have a known Gaussian distribution
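The recursive update can be illustrated with a scalar Kalman filter. This sketch assumes a random-walk state model and a direct noisy observation of the state; the variances `q` and `r` are hypothetical parameters, and the function is not taken from the presentation.

```python
def kalman_step(x_est, p_est, z, q, r):
    """One predict/update cycle of a scalar Kalman filter (random-walk state)."""
    # Predict: x(t) = x(t-1) + process noise with variance q.
    x_pred = x_est
    p_pred = p_est + q
    # Update: z(t) = x(t) + measurement noise with variance r.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new
```

Both noise terms enter only through their variances, which is where the Gaussian assumption in the last bullet shows up.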

Page 13: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Extended Kalman Filter

• Attempts to linearize the model around the current state through a first-order Taylor expansion

• Successfully implemented in the control and tracking of spatiotemporal cortical activity (Schiff, 2008)

• State transition and measurement functions must be differentiable

• Requires evaluation of Jacobians at each iteration

Page 14: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Unscented Kalman Filter

• The probability density is approximated by transforming a set of sigma points through the nonlinear prediction and update functions

• Easier to approximate a probability distribution than it is to approximate an arbitrary nonlinear transformation

• Recently applied in real-time closed loop BMI experiments (Li, 2009)

Page 15: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Unscented Kalman Filter (cont.)

• Statistical properties of the transformed sigma points become distorted through the linearization process

• If the initial state estimates are incorrect, filter divergence can quickly become an issue

• Gaussian environment is still assumed

Page 16: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Particle Filtering

• Numerical solution to nonlinear non-Gaussian state-space estimation

• Use Monte Carlo integration to approximate analytically intractable integrals

• Represent the posterior density by a set of randomly chosen weighted samples or particles

• Weight each particle by how well it represents the posterior, given the current observations

Page 17: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Resampling

• Replicate particles with high weights, discard particles with small weights

• Higher weighted particles are more likely to approximate the posterior with better accuracy

• Known as the sampling importance resampling (SIR) particle filter (Gordon, 1993)
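One way to replicate high-weight particles and discard low-weight ones is systematic resampling, sketched below. The slides do not say which resampling scheme is used, so this choice is an assumption; the weights in the example are made up.

```python
import random

def systematic_resample(weights, rng):
    """Return particle indices drawn in proportion to normalized weights."""
    p = len(weights)
    start = rng.random() / p          # single random offset in [0, 1/P)
    indices, cumulative, i = [], weights[0], 0
    for m in range(p):
        threshold = start + m / p     # evenly spaced thresholds
        while cumulative < threshold and i < p - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices

# Particle 2 carries most of the weight, so it is replicated.
idx = systematic_resample([0.05, 0.05, 0.8, 0.1], random.Random(1))
```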

Page 18: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

SIR Particle Filtering Algorithm

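• Sample each particle from a proposal density π that approximates the current posterior:

$\mathbf{x}^{r}(t) \sim \pi\left(\mathbf{x}(t) \mid \mathbf{x}^{r}(t-1), \mathbf{N}(t)\right)$

• Assign each particle a weight according to how probable its sample is under the target posterior:

$w^{r}(t) \propto w^{r}(t-1)\,\frac{p(\mathbf{N}(t) \mid \mathbf{x}^{r}(t))\; p(\mathbf{x}^{r}(t) \mid \mathbf{x}^{r}(t-1))}{\pi(\mathbf{x}^{r}(t) \mid \mathbf{x}^{r}(t-1), \mathbf{N}(t))}$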

Page 19: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

SIR Particle Filtering Algorithm

• Normalize the particle weights:

$w^{r}(t) = \frac{w^{r}(t)}{\sum_{n=1}^{P} w^{n}(t)}$

• Perform Resampling

• Re-initialize weights:

$w^{r} = \frac{1}{P}, \quad r = 1, \ldots, P$

Page 20: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

SIR Particle Filtering Algorithm

• Form an estimate of the state as a weighted sum:

$\hat{\mathbf{x}}_{k} = \sum_{r=1}^{P} w_{k}^{r}\,\mathbf{x}_{k}^{r}$

• Repeat
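The SIR steps on these slides can be sketched for a one-dimensional state as below. Several simplifications are assumptions made for illustration: the proposal is a Gaussian random walk (the transition prior), so the weight update reduces to multiplying by the likelihood; the likelihood is Gaussian rather than the Poisson spike model used later; and resampling is multinomial.

```python
import math
import random

def sir_step(particles, weights, observation, likelihood, step_std, rng):
    """One SIR iteration using the transition prior as the proposal density."""
    p = len(particles)
    # Sample from the proposal: a Gaussian random walk around each particle.
    particles = [x + rng.gauss(0.0, step_std) for x in particles]
    # With the prior as proposal, the weight update reduces to w * p(N | x).
    weights = [w * likelihood(observation, x) for w, x in zip(weights, particles)]
    total = sum(weights)
    weights = [w / total for w in weights]            # normalize
    # Multinomial resampling, then re-initialize the weights to 1/P.
    particles = rng.choices(particles, weights=weights, k=p)
    weights = [1.0 / p] * p
    # State estimate as the weighted sum of the particles.
    estimate = sum(w * x for w, x in zip(weights, particles))
    return particles, weights, estimate

def gaussian_likelihood(z, x, noise_std=1.0):
    """Hypothetical observation model: z = x + Gaussian noise."""
    return math.exp(-0.5 * ((z - x) / noise_std) ** 2)

rng = random.Random(0)
particles = [rng.uniform(-10.0, 10.0) for _ in range(500)]
weights = [1.0 / 500] * 500
for _ in range(20):
    particles, weights, estimate = sir_step(
        particles, weights, 3.0, gaussian_likelihood, 0.2, rng)
```

Feeding the same observation repeatedly pulls the particle cloud, and therefore the estimate, toward that value.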

Page 21: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

SIR Particle Filtering

• Applied to reconstruct hand movement trajectories (Eden, 2004)

• SIR particle filters suffer from degeneracy

– Particles with high weights are duplicated many times

– May collapse to a single point (loss of diversity)

• Computationally expensive

Page 22: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Bayesian Auxiliary Particle Filter (BAPF)

Addresses two limitations of the SIR particle filter

1. Poor outlier performance

2. Degeneracy

Introduced by Pitt & Shephard (1999), later extended by Liu & West (2002) to include a smoothing factor

Page 23: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

BAPF

• Favor particles that are likely to survive at the next iteration of the algorithm

• Perform resampling at time $t_{k-1}$ using the available measurements at time $t_{k}$

• Use a two-stage weighting process to compensate for the predicted point and the actual sample

Page 24: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

BAPF Algorithm

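• Sample each particle from a proposal density π that approximates the current posterior:

$\hat{\mathbf{x}}^{r}(t) \sim \pi\left(\mathbf{x}(t) \mid \mathbf{x}^{r}(t-1), \mathbf{N}(t)\right)$

• Assign 1st stage weights g(t) according to how probable each sample is under the target posterior:

$g^{r}(t) \propto w^{r}(t-1)\,\frac{p(\mathbf{N}(t) \mid \hat{\mathbf{x}}^{r}(t))\; p(\hat{\mathbf{x}}^{r}(t) \mid \hat{\mathbf{x}}^{r}(t-1))}{\pi(\hat{\mathbf{x}}^{r}(t) \mid \hat{\mathbf{x}}^{r}(t-1), \mathbf{N}(t))}$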

Page 25: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

BAPF Algorithm

• Normalize the importance weights

• Resample according to g(t)

• Sample each particle from a second proposal density q:

$\mathbf{x}^{r}(t) \sim q\left(\hat{\mathbf{x}}(t) \mid \hat{\mathbf{x}}^{r}(t-1), \mathbf{N}(t)\right)$

Page 26: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

BAPF Algorithm

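• Assign the 2nd stage weights:

$w^{r}(t) \propto \frac{p(\mathbf{N}(t) \mid \mathbf{x}^{r}(t))}{p(\mathbf{N}(t) \mid \hat{\mathbf{x}}^{r}(t))}$

• Compute an estimate as a weighted sum:

$\hat{\mathbf{x}}_{k} = \sum_{r=1}^{P} w_{k}^{r}\,\mathbf{x}_{k}^{r}$

• Repeat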
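The two-stage weighting can be sketched for a one-dimensional state as follows. This is a simplified illustration, not the presented implementation: both proposals are assumed to be Gaussian random walks (with hypothetical spreads `std1` and `std2`), the likelihood is Gaussian instead of the Poisson spike model, resampling is multinomial, and the Liu and West smoothing factor is left out.

```python
import math
import random

def bapf_step(particles, weights, observation, likelihood, std1, std2, rng):
    """One auxiliary-particle-filter iteration with two-stage weights."""
    p = len(particles)
    # Stage 1: predicted points and first-stage weights g = w(t-1) * p(N | x_hat).
    predicted = [x + rng.gauss(0.0, std1) for x in particles]
    g = [w * likelihood(observation, xh) for w, xh in zip(weights, predicted)]
    total = sum(g)
    g = [v / total for v in g]
    # Resample particle indices according to g.
    chosen = rng.choices(range(p), weights=g, k=p)
    # Stage 2: draw the actual samples from the second proposal.
    samples = [particles[i] + rng.gauss(0.0, std2) for i in chosen]
    # Second-stage weights compare each sample with its predicted point.
    w2 = [likelihood(observation, x) / likelihood(observation, predicted[i])
          for x, i in zip(samples, chosen)]
    total = sum(w2)
    w2 = [v / total for v in w2]
    estimate = sum(w * x for w, x in zip(w2, samples))
    return samples, w2, estimate

def gaussian_likelihood(z, x, noise_std=1.0):
    """Hypothetical observation model: z = x + Gaussian noise."""
    return math.exp(-0.5 * ((z - x) / noise_std) ** 2)

rng = random.Random(0)
particles = [rng.uniform(-10.0, 10.0) for _ in range(500)]
weights = [1.0 / 500] * 500
for _ in range(20):
    particles, weights, estimate = bapf_step(
        particles, weights, 3.0, gaussian_likelihood, 0.2, 0.2, rng)
```

Resampling on the predicted points uses the current measurement before the final samples are drawn, which is how the filter favors particles likely to survive.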

Page 27: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Evaluation of the Bayesian Auxiliary Particle Filter

Page 28: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Gaussian Shaped Tuning Function

[Figure: Tuning function for a single place cell. Firing rate (Hz) versus position (cm).]

$\lambda_{j}(t) = \exp\!\left(\alpha_{j}(t) - \frac{(s(t) - \mu_{j}(t))^2}{2\xi_{j}(t)^2}\right), \quad j = 1, \ldots, K$

Page 29: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Simulation Results

Preliminary Data

• Observe an ensemble of hippocampal place cells whose firing times have an inhomogeneous Poisson arrival rate $\lambda_{j}(t)\,\Delta t$

• Estimate the animal’s position on a one-dimensional 300 cm track, generated as a random walk

• Evaluated under noisy conditions

• Performance is compared to the Wiener filter and the sampling importance resampling particle filter
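A data set of the kind described above could be generated roughly as follows. This is a sketch under assumptions: the evenly spaced field centers, the 30 Hz peak rate, the 10 cm field width, the random-walk step size and the bin width `dt` are hypothetical, and the Poisson sampler is Knuth's method because the Python standard library has none.

```python
import math
import random

def poisson(lam, rng):
    """Draw a Poisson count with mean lam (Knuth's method, fine for small lam)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate(num_cells, num_steps, dt, rng, track=300.0, step_std=2.0):
    """Random-walk position on the track plus Poisson spike counts per cell."""
    centers = [track * (j + 0.5) / num_cells for j in range(num_cells)]
    position, path, spikes = track / 2.0, [], []
    for _ in range(num_steps):
        # Random-walk position, clipped to the track.
        position = min(track, max(0.0, position + rng.gauss(0.0, step_std)))
        # Gaussian-shaped tuning: 30 Hz peak, 10 cm width (assumed values).
        rates = [math.exp(math.log(30.0) - (position - c) ** 2 / (2.0 * 10.0 ** 2))
                 for c in centers]
        path.append(position)
        spikes.append([poisson(rate * dt, rng) for rate in rates])
    return path, spikes

path, spikes = simulate(num_cells=20, num_steps=200, dt=0.01, rng=random.Random(0))
```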

Page 30: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Mean Square Error vs. Number of Neurons

[Figure: MSE (log scale) versus number of neurons for the BAPF, the SIR particle filter (PFSIR) and the Wiener filter (WF).]

Page 31: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Signal Estimation

• 100 particles

• 100 neurons

Page 32: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

95% Confidence Intervals

Black: true position

Red: BAPF interval

Green: PF interval

• 100 particles

• 50 neurons

• 100 simulations of a single data set

Page 33: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Mean Square Error vs. Missed Firings

[Figure: MSE (log scale) versus percentage of missed firings for the BAPF and the SIR particle filter (PFSIR).]

• 100 particles

• 50 neurons

Page 34: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Mean Square Error vs. Rate of False Detections

[Figure: MSE (log scale) versus rate of false alarms for the BAPF and the SIR particle filter (PFSIR).]

• 100 particles

• 50 neurons

Page 35: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Mean Square Error vs. Spike Sorting Error

[Figure: MSE (log scale) versus spike sorting error rate for the BAPF and the particle filter (PF).]

• 100 particles

• 50 neurons

Page 36: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Algorithm Implementation in Hardware

Page 37: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Algorithm Implementation

• The target hardware is a field programmable gate array (FPGA)

• Dedicated hardware avoids fetching and decoding of instructions

• FPGAs are capable of executing multiple computations simultaneously

Page 38: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

FPGA Resources

• Configurable logic blocks (CLB)

– Look-up tables (LUT)

– Multiplexers

– Flip-flops

– Logic gates (AND, OR, NOT)

• Programmable interconnects

– Routing matrix controls signal routing

• Input-Output cells

– Latch data at the I/O pins

Page 39: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

FPGA Resources

• Embedded fixed-point multipliers (DSP48E)

– 25-bit x 18-bit inputs

• On-chip memory

– Up to 32 MB

• Digital clock managers

– Multirate signal processing

– Phase locked loops

Page 40: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

ML506 SX50T

Resource                Available
Slices                  8160
Embedded Multipliers    288
RAM                     5 MB
3.8 Gb/s Transceivers   12
I/O Pins                480
Maximum Clock Rate      550 MHz

Page 41: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Design Flow

[Figure: design flow in four numbered steps.]

Page 42: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Hardware Co-Simulation

Page 43: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Top-Level Block Diagram

Page 44: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Top-Level Block Diagram

Page 45: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Box-Muller Transformation

Generates two independent standard normal sequences from two uniform distributions

Let $\theta = 2\pi U_{1}$

Let $R = \sqrt{-2 \ln U_{2}}$

$N_{1}(0,1) = R\cos\theta$

$N_{2}(0,1) = R\sin\theta$
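A floating-point sketch of the Box-Muller transformation is shown below for reference. It only illustrates the arithmetic; a hardware version would use fixed-point approximations of the logarithm, square root and trigonometric functions. The uniform inputs here come from Python's `random` module rather than an LFSR.

```python
import math
import random

def box_muller(u1, u2):
    """Map two uniforms (u2 in (0, 1]) to two standard normal samples."""
    theta = 2.0 * math.pi * u1
    radius = math.sqrt(-2.0 * math.log(u2))
    return radius * math.cos(theta), radius * math.sin(theta)

rng = random.Random(0)
samples = []
for _ in range(10000):
    # 1 - random() lies in (0, 1], which keeps log() away from zero.
    n1, n2 = box_muller(rng.random(), 1.0 - rng.random())
    samples += [n1, n2]

mean = sum(samples) / len(samples)
var = sum((v - mean) ** 2 for v in samples) / len(samples)
```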

Page 46: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Box-Muller Transformation

Page 47: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Box-Muller Transformation

Page 48: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Linear Feedback Shift Register (LFSR)

• Shift register made of m flip-flops

• Mod-2 adders configured according to a generator polynomial

• Represent a value between 0 and 1:

$r = \sum_{n=0}^{m-1} \frac{x_{n}}{2^{n+1}}$

1)( 134 xxxxg
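A small software model of an LFSR and its mapping to a value between 0 and 1 is sketched below. The 4-bit register and the generator polynomial x^4 + x^3 + 1 are assumptions chosen for illustration because they give a maximal-length sequence; they are not claimed to be the polynomial on the slide. Taking bit 0 of the register as x_0 is also an assumption.

```python
def lfsr_step(state):
    """Advance a 4-bit Fibonacci LFSR (assumed polynomial x^4 + x^3 + 1)."""
    bit = (state ^ (state >> 1)) & 1       # mod-2 sum of the two tapped bits
    return (state >> 1) | (bit << 3)       # shift by one position, insert feedback

def to_fraction(state, m=4):
    """Interpret register bits x_0..x_{m-1} as sum of x_n / 2^(n+1)."""
    return sum(((state >> n) & 1) / 2 ** (n + 1) for n in range(m))

# Walk through one full period starting from a nonzero seed.
state, visited = 1, []
for _ in range(15):
    visited.append(state)
    state = lfsr_step(state)
```

Starting from any nonzero seed, this register visits all 15 nonzero states before repeating.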

Page 49: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

LFSR (cont.)

• LFSR output has correlation

• Bits are only shifted one position

• Has a lowpass effect on the output sequence

Page 50: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Linear Feedback Shift Register with Skip-ahead Logic

• Advances the state of the LFSR multiple states

• Bits are shifted multiple positions

• Removes correlation in the uniform distribution
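Because an LFSR update is linear over GF(2), advancing k states at once reduces to a fixed XOR network. The sketch below models that idea in software: it precomputes where each single-bit state lands after k steps and XORs the results selected by the current state. The 4-bit register, the polynomial x^4 + x^3 + 1 and the skip distance of 4 are hypothetical choices.

```python
def lfsr_step(state):
    """Advance a 4-bit Fibonacci LFSR (assumed polynomial x^4 + x^3 + 1)."""
    bit = (state ^ (state >> 1)) & 1
    return (state >> 1) | (bit << 3)

def skip_columns(k, m=4):
    """Image of each single-bit state after k steps (columns of the k-step map)."""
    columns = []
    for n in range(m):
        s = 1 << n
        for _ in range(k):
            s = lfsr_step(s)
        columns.append(s)
    return columns

def skip_ahead(state, columns):
    """Advance k states in one operation by XOR-ing the selected columns."""
    out = 0
    for n, col in enumerate(columns):
        if (state >> n) & 1:
            out ^= col
    return out

columns = skip_columns(4)
jumped = skip_ahead(1, columns)   # same result as four single steps from state 1
```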

Page 51: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Box-Muller Transformation

$\theta = 2\pi U_{1}$

$R = \sqrt{-2 \ln U_{2}}$

Page 52: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Top-Level Block Diagram

Page 53: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Top-Level Block Diagram

Page 54: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Particle Block Diagram

Page 55: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Steps 1 and 2 of the BAPF Algorithm

[Figure: particle block diagram for steps (1) and (2), a zero-mean Gaussian draw followed by the particle state update.]

Page 56: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Particle Block Diagram

[Figure: particle block diagram covering steps (1) to (3).]

(3) $g^{r}(t) \propto w^{r}(t-1)\, p(\mathbf{N}(t) \mid \hat{\mathbf{x}}^{r}(t))$

Page 57: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Compute the 1st Stage Weights

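$g^{r}(t) \propto w^{r}(t-1)\, p(\mathbf{N}(t) \mid \hat{\mathbf{x}}^{r}(t)) \propto w^{r}(t-1) \prod_{j,\,B} \left[\lambda_{j}(t)\,\Delta t\right]^{N_{j}(t)} e^{-\lambda_{j}(t)\,\Delta t}$

$\lambda_{j}(t) = \exp\!\left(\alpha_{j}(t) - \frac{(s(t) - \mu_{j}(t))^2}{2\xi_{j}(t)^2}\right)$

Exponent term: $\frac{(s(t) - \mu_{j}(t))^2}{2\xi_{j}(t)^2}$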
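The first-stage weight for one particle can be sketched as a product of per-neuron Poisson terms. The function and argument names below are hypothetical, and the factorial term of the Poisson probability is dropped on the assumption that it is the same for every particle and cancels when the weights are normalized.

```python
import math

def first_stage_weight(prev_weight, counts, rates, dt):
    """g = w(t-1) * prod_j (lambda_j*dt)^N_j * exp(-lambda_j*dt)."""
    g = prev_weight
    for n_j, lam_j in zip(counts, rates):
        # Raise to an integer power, then scale by the exponential term.
        g *= (lam_j * dt) ** n_j * math.exp(-lam_j * dt)
    return g

# Two neurons: one spike from a 10 Hz cell, none from a 20 Hz cell, 10 ms bin.
g = first_stage_weight(0.5, [1, 0], [10.0, 20.0], 0.01)
```

Each factor needs an exponential and a raise-to-integer-power operation, which correspond to separate arithmetic blocks in a hardware design.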

Page 58: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Compute the 1st Stage Weights

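$g^{r}(t) \propto w^{r}(t-1)\, p(\mathbf{N}(t) \mid \hat{\mathbf{x}}^{r}(t)) \propto w^{r}(t-1) \prod_{j,\,B} \left[\lambda_{j}(t)\,\Delta t\right]^{N_{j}(t)} e^{-\lambda_{j}(t)\,\Delta t}$

$e^{x} = e^{w}\,e^{v}$, where w is the integer part of x and v is the fraction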

||11 vwx eee For x<0:
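Splitting the exponent into an integer part and a fraction can be sketched as follows. This is an illustrative software model with assumptions: the integer part is looked up in a precomputed table (`INT_TABLE`, covering -16 to 16), the fractional part uses a short Taylor series with an arbitrary number of terms, and negative arguments are handled by taking the floor so that the fraction stays in [0, 1), which may differ from the separate x<0 case on the slide.

```python
import math

# Hypothetical table of e^w for integer w; hardware could hold this in a LUT.
INT_TABLE = {w: math.exp(w) for w in range(-16, 17)}

def exp_split(x):
    """Evaluate e^x as e^w * e^v with integer w and fraction v in [0, 1)."""
    w = math.floor(x)        # integer part (floor keeps v non-negative)
    v = x - w                # fractional part
    # Short Taylor series for e^v; the term count is an arbitrary choice.
    frac, term = 1.0, 1.0
    for k in range(1, 12):
        term *= v / k
        frac += term
    return INT_TABLE[w] * frac
```

Restricting the series to a fraction below one keeps the number of terms small, while the table covers the wide dynamic range of the integer part.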

Page 59: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Compute the 1st Stage Weights

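$g^{r}(t) \propto w^{r}(t-1)\, p(\mathbf{N}(t) \mid \hat{\mathbf{x}}^{r}(t)) \propto w^{r}(t-1) \prod_{j,\,B} \left[\lambda_{j}(t)\,\Delta t\right]^{N_{j}(t)} e^{-\lambda_{j}(t)\,\Delta t}$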

Page 60: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Resample the 1st Stage Weights

Page 61: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Particle Block Diagram

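[Figure: particle block diagram covering steps (1) to (6). Steps (1), (2) and steps (4), (5) are zero-mean Gaussian draws followed by particle state updates.]

(3) $g^{r}(t) \propto w^{r}(t-1)\, p(\mathbf{N}(t) \mid \hat{\mathbf{x}}^{r}(t))$

(6) $w^{r}(t) \propto \frac{p(\mathbf{N}(t) \mid \mathbf{x}^{r}(t))}{p(\mathbf{N}(t) \mid \hat{\mathbf{x}}^{r}(t))}$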

Page 62: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Estimated Output Signal as a Weighted Sum

$\hat{\mathbf{x}}(t) = \sum_{r=1}^{P} w^{r}(t)\,\mathbf{x}^{r}(t)$

Page 63: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Synthesis Results

Module                    Slices   DSP48Es   Clock cycles           Latency
Random Number Generator   3506     0         1 (after pipelining)   3.7 ns
Exponential               55       1         5                      1.4 ns
Exponential Quantity      12       2         3                      3.0 ns
Raise to Integer Power    51       3         4 per sample           1.6 ns

Page 64: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Proposed Future Work

Page 65: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Parallel Resampling

• Particles with high weights are retained

• Particles with low weights are discarded

• All particles can be resampled in two clock cycles

• On the first cycle, all particles are copied to temporary registers

• On the second cycle, all particles are compared and assigned new values

Page 66: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Automated Controller

• Design as a finite state machine (FSM)

• Sampling period, block size, number of neurons and number of particles determine control signals

• Signals include: enable lines for data registers, multipliers and counters, select lines for multiplexers and reset signals

• Build the FSM from counters, comparators and multiplexers

Page 67: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Verification

• Filter output compared to the MATLAB simulations

• Quantization error is expected

• Determine the number of bits needed for acceptable precision of the estimated signal

• Further evaluation of the filter with an increase in particles and neurons
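One way to study the word length is to quantize reference values to a fixed-point grid and measure the worst-case error. The helper names, the sample values and the tolerance below are hypothetical; in practice the reference values would come from the MATLAB simulation output.

```python
def quantize(value, frac_bits):
    """Round to the nearest multiple of 2**-frac_bits (fixed-point grid)."""
    scale = 1 << frac_bits
    return round(value * scale) / scale

def bits_needed(values, tolerance, max_bits=32):
    """Smallest fractional bit count whose worst-case rounding error meets tolerance."""
    for bits in range(max_bits + 1):
        if max(abs(v - quantize(v, bits)) for v in values) <= tolerance:
            return bits
    return None

reference = [0.3, 1.7, 2.45]          # stand-in for simulated filter outputs
bits = bits_needed(reference, 0.01)   # fractional bits for a 0.01 error bound
```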

Page 68: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Throughput Comparison

• The parallel processing architecture will be compared to a sequential implementation

• Current benchmark is MATLAB running on the Java Virtual Machine (not a true comparison)

• Comparisons will be made for throughput as a function of particles as well as neurons

Page 69: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Timeline

[Figure: Gantt chart spanning May to December with the tasks Throughput Comparison, Verification, Evaluation of the number of particles/neurons, Synthesize Controller, Simulate Controller and Synthesize Modules.]

Page 70: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Acknowledgements

Thank you, advisors and committee members.

• Dr. Iyad Obeid

• Dr. Dennis Silage

• Dr. Joseph Picone

• Dr. Marc Sobel

Page 71: John Mountney Co-advisors: Iyad Obeid and Dennis Silage.

Questions?