Perspectives of GPU computing in Physics and Astrophysics – 15 – 17 Sep 2014 Distributed simulation of Polychronous and plastic Spiking Neural Networks: experiments with GPUs Francesco Simula (INFN) for the APE Lab: FP7 FET PROJECT GRANT N. 247846 2010-2014 www.euretile.eu
Large scale modeling of
neuro-synaptic activity and plasticity
■ DPSNN-STDP is a Distributed simulator of Polychronous Spiking Neural Nets with synaptic Spike-Timing-Dependent Plasticity: an efficient C++ plus MPI code developed by the APE lab of INFN to be used as a benchmark for the development of specialized HW/SW platforms (see arXiv:1310.8478, P. S. Paolucci et al.).
■ A brain simulation benchmark has 3 points of interest:
• As a source of requirements and architectural inspiration towards extreme parallelism
• As a parallel/distributed coding challenge
• As a scientific grand challenge
Where to start?
Paradigmatic neuron
Models of neural activity
at spiking abstraction level
Neural Spiking Model: the Izhikevich neuron
The Izhikevich model is:
■ computationally light: 13 FLOPs per neuron per ms of simulated time (a physiologically accurate model needs ~1200 FLOPs per neuron per ms)
■ universal: the same equations for all known types of cortical neurons
■ rich in dynamics: able to capture all 20 known neuron spiking patterns
Summary of the neurocomputational properties of biological spiking neurons. The same model (Izhikevich – 2003) with different values of parameters a, b, c and d is able to reproduce the behaviour of several types of cortical neurons. Each horizontal bar corresponds to 20 ms.
The dynamical variables of the single neuron are v(t) and u(t):
■ v(t) is the neural membrane potential; this is the key observable! When v reaches vpeak, a neural spike is produced.
■ u(t) is an auxiliary variable (the recovery current bringing v back to equilibrium).
■ I(t) is the potential change generated by the sum of the currents from all synapses incoming to the neuron. It is a ‘forcing function’: incoming currents are present if spikes arrived from pre-synaptic neurons.
→ When a neuron spikes, all its M outgoing synapses add a current Wi to the neurons they are connected to, with a set of different delays ti (polychronicity).
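As a minimal sketch of the v(t), u(t) dynamics described above, the C++ fragment below advances one neuron by 1 ms. It assumes the published 2003 form of the model, v' = 0.04v² + 5v + 140 − u + I and u' = a(bv − u), with the reset v ← c, u ← u + d when v reaches vpeak = 30 mV, and the regular-spiking parameter set (a, b, c, d) = (0.02, 0.2, −65, 8). The struct and function names are hypothetical, and the integration scheme actually used in DPSNN-STDP may differ.

```cpp
// Hypothetical names; equations and parameters follow the published 2003 model.
struct IzhikevichNeuron {
    double a = 0.02, b = 0.2, c = -65.0, d = 8.0; // regular-spiking values
    double v = -65.0;                             // membrane potential (mV)
    double u = 0.2 * -65.0;                       // recovery variable, set to b*v
};

constexpr double kVPeak = 30.0; // spike cutoff (mV)

// Advance one neuron by dt ms with an Euler-style step under input current I.
// u is updated with the freshly computed v. Returns true if the neuron spiked.
inline bool step(IzhikevichNeuron& n, double I, double dt = 1.0) {
    n.v += dt * (0.04 * n.v * n.v + 5.0 * n.v + 140.0 - n.u + I);
    n.u += dt * n.a * (n.b * n.v - n.u);
    if (n.v >= kVPeak) { // spike: apply the after-spike reset
        n.v = n.c;
        n.u += n.d;
        return true;
    }
    return false;
}

// Count the spikes produced over `steps` ms of constant input current I.
inline int count_spikes(double I, int steps) {
    IzhikevichNeuron n;
    int spikes = 0;
    for (int t = 0; t < steps; ++t) spikes += step(n, I) ? 1 : 0;
    return spikes;
}
```

With no input the neuron relaxes to its resting state and stays silent; a constant positive current such as I = 10 removes the resting state and produces repeated spikes.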
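[Figure: neurons A, B, C, D. A spike emitted at t = t0 travels along synapses 1, 2, ..., M with weights W1, W2, ..., WM; each target receives its contribution after the corresponding delay: I(t0 + t1) = ... + W1 + ..., I(t0 + t2) = ... + W2 + ..., I(t0 + tM) = ... + WM + ...]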
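The delayed delivery I(t0 + ti) = ... + Wi + ... can be sketched with a circular buffer of future input currents. This is an illustration under stated assumptions, not the DPSNN-STDP data structure: the `Synapse` layout, the `DelayedCurrents` class and integer delays in ms between 1 and `maxDelay` are all hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical data layout, for illustration only.
struct Synapse {
    std::size_t target; // index of the post-synaptic neuron
    double weight;      // W_i: current injected on arrival
    int delay;          // t_i: delay in ms, assumed 1 <= delay <= maxDelay
};

// Circular buffer holding the currents scheduled for the next maxDelay ms.
class DelayedCurrents {
public:
    DelayedCurrents(std::size_t neurons, int maxDelay)
        : neurons_(neurons),
          slots_(static_cast<std::size_t>(maxDelay) + 1),
          buf_(neurons_ * slots_, 0.0) {}

    // A spike emitted at time t0 schedules W_i for delivery at t0 + t_i.
    void on_spike(int t0, const std::vector<Synapse>& outgoing) {
        for (const Synapse& s : outgoing)
            at(t0 + s.delay, s.target) += s.weight;
    }

    // Read and clear the current I(t) accumulated for `neuron` at time t.
    double consume(int t, std::size_t neuron) {
        double& slot = at(t, neuron);
        const double I = slot;
        slot = 0.0;
        return I;
    }

private:
    double& at(int t, std::size_t neuron) {
        return buf_[(static_cast<std::size_t>(t) % slots_) * neurons_ + neuron];
    }

    std::size_t neurons_;
    std::size_t slots_;
    std::vector<double> buf_;
};
```

The simulation loop is expected to call `consume` for every neuron at every ms, so that each slot is empty again before the buffer wraps around.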
Synaptic dynamics: Spike Timing-Dependent
Plasticity (STDP)
Capturing the timing-dependent causal/anti-causal relationship between pairs of neurons:
■ Causal potentiation: the synapse is maximally potentiated if its signal arrives at the target just before the post-synaptic spike.
■ Anti-causal depression: the weight is maximally depressed if the signal arrives just after the post-synaptic spike.
S. Song et al., Nature Neuroscience 3 (2000)
In DPSNN-STDP, all synaptic weight variations are accumulated over 1000 steps (timestep: 1 ms), then applied to the weights W (long-term plasticity).
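The accumulate-then-apply scheme can be sketched as follows. The exponential shape of the window follows Song et al. (2000); the amplitudes and the time constant below are illustrative assumptions, not values taken from the slides, and all names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative constants (assumptions): amplitudes and time constant in ms.
constexpr double kAPlus = 0.1, kAMinus = 0.12, kTau = 20.0;

// Weight change for dt = t_post - t_arrival (ms), exponential STDP window.
// dt > 0: the signal arrived before the post-synaptic spike -> potentiation.
// dt < 0: the signal arrived after it -> depression.
inline double stdp_delta(double dt) {
    if (dt > 0.0) return  kAPlus  * std::exp(-dt / kTau);
    if (dt < 0.0) return -kAMinus * std::exp( dt / kTau);
    return 0.0;
}

struct PlasticSynapses {
    std::vector<double> w;  // current weights W
    std::vector<double> dw; // variations accumulated since the last update

    PlasticSynapses(std::size_t n, double w0) : w(n, w0), dw(n, 0.0) {}

    // Accumulate the contribution of one arrival/spike pairing.
    void record(std::size_t syn, double dt) { dw[syn] += stdp_delta(dt); }

    // Called once per 1 ms step; applies the accumulated variations to W
    // every `period` steps and then clears the accumulators.
    void end_of_step(long step, long period = 1000) {
        if ((step + 1) % period != 0) return;
        for (std::size_t i = 0; i < w.size(); ++i) {
            w[i] += dw[i];
            dw[i] = 0.0;
        }
    }
};
```

Between two updates the weights seen by the neural dynamics stay constant, which keeps the per-step work on W read-only.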
Distribution of Cortical Fields and
Cortical Modules among Software Processes
Spiking Activity and Synaptic Plasticity
(from 100K to 6.6 Giga synapses,
from 1 to 128 software processes)
■ The picture represents the evolution of a neural network computed by the DPSNN-STDP code
■ In the picture:
• 200 inhibitory neurons
• 800 excitatory neurons
• 100 000 synapses in total
• Time resolution: 1 ms (horizontal axis)
• Each dot in the rastergram represents an individual spike
■ The evolution of the membrane potential of each neuron is simulated
■ The evolution of individual synaptic strength is computed (not shown in the picture)
■ Polychronism: individual synaptic delays are taken into account
■ Individual connections and neural types can be programmed
Emergent Biological Behaviour:
Spontaneous Evolution of Rhythmic Activity
due to Polychronism and Synaptic Plasticity
■ As synaptic weights evolve according to STDP (synaptic spike-timing-dependent plasticity), the initial delta frequency oscillations (2-4 Hz, during the first second of activity) dissolve for a while into uncorrelated Poissonian activity (at 100 s), and then gamma frequency activity emerges (30-100 Hz, at 3600 s)
[Figure panels: delta rhythm @ first second; uncorrelated activity @ 100 s; gamma rhythm after 3600 s]
DPSNN-STDP: MPI version - Strong and
Weak Scaling
■ Strong scaling: from 1 to 128 cores @ 2.4 GHz, simulating various total network sizes (from 51 Msyn to 6.6 Gsyn). Execution times are normalized to synapse count.
■ Weak scaling for various local network sizes. Execution times are normalized to synapse count.
From Program Flow and Profiling …
Function of the block | Relative execution time | Note
Long term potentiation + after-spike dynamics | (9.7 ± 0.7)% | Gather + computation
Barrier (optional) | (29.9 ± 6.1)% | Workload fluctuations
Communication: inter-process multicast: Spikes dim | (0.77 ± 0.10)% | Message passing
Communication: inter-process multicast: Spikes payload | (0.82 ± 0.20)% | Message passing
Axonal to synaptic spikes: intra-process multicast | (16.8 ± 2.3)% | Dereferencing
Add synaptic currents + long term depression | (19.2 ± 2.7)% | Computation
Thalamic input | 0.01% | Simplified model
Ordinary neural dynamics | (11.8 ± 1.4)% | Computation
Rastergram & other statistical functions | (1.9 ± 0.1)% | Computation
Long term synaptic plasticity | (9.2 ± 1.8)% | Computation
These 2 functions have:
• regular memory access patterns
• significant amounts of FP computing
... how do they behave on the GPU?
GPU Environment
■ Trials were performed on:
Intel Xeon CPUs:
• E5620 2.40GHz (Westmere)
• E5-2630 v2 2.60GHz (Ivy Bridge)
NVIDIA GPUs:
• S2050 (Fermi-class/sm_2.0, PCIe Gen2 int.)
• K20Xm (Kepler-class/sm_3.5, PCIe Gen2 int.)
• K40m (Kepler-class/sm_3.5, PCIe Gen3 int.)
CUDA 6.5
■ Using the CUDA Thrust template library, version 1.8 on GitHub (with support for CUDA streams):
• Convert CPU arrays to Thrust device_vectors (with caveats!)
• Convert CPU functions to Thrust functors... and you are done!
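The array-plus-functor pattern behind this recipe can be sketched in host-only C++, so that it compiles without the CUDA toolkit. Here `std::vector` stands in for `thrust::device_vector` and `std::transform` for `thrust::transform`, which takes the same iterator and functor arguments; for Thrust the functor's `operator()` would also be marked `__host__ __device__`. The functor and function names are hypothetical, and the weight update is only an illustrative payload.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical functor: applies an accumulated variation to one weight.
// For Thrust, operator() would be declared __host__ __device__.
struct ApplyWeightChange {
    float scale;
    float operator()(float w, float dw) const { return w + scale * dw; }
};

// Element-wise update over whole arrays. With Thrust the containers would be
// thrust::device_vector<float> and the call thrust::transform(...).
inline void apply_all(std::vector<float>& w,
                      const std::vector<float>& dw,
                      float scale) {
    std::transform(w.begin(), w.end(), dw.begin(), w.begin(),
                   ApplyWeightChange{scale});
}
```

The slides do not spell out the caveats. One well-known one is that reading a single `device_vector` element from host code triggers a device-to-host copy, so per-element access loops written for CPU arrays should be replaced by whole-array algorithms like the one above.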
Long Term Plasticity on GPUs
Example:
■ Connectivity: M (synapses per neuron) = 100
■ Using 131072 neurons → 13107200 synapses (24 B/synapse) → 300 MiB
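The sizing above can be checked at compile time. The sketch assumes that "24b" means 24 bytes per synapse and that the 300 figure is in binary megabytes (MiB); under that reading the numbers agree exactly. Function names are hypothetical.

```cpp
#include <cstdint>

// Total synapse count for a network with a fixed fan-out per neuron.
constexpr std::uint64_t synapse_count(std::uint64_t neurons,
                                      std::uint64_t perNeuron) {
    return neurons * perNeuron;
}

// Memory footprint in bytes for a given per-synapse record size.
constexpr std::uint64_t footprint_bytes(std::uint64_t synapses,
                                        std::uint64_t bytesPerSynapse) {
    return synapses * bytesPerSynapse;
}

static_assert(synapse_count(131072, 100) == 13107200ULL,
              "131072 neurons x 100 synapses each");
static_assert(footprint_bytes(13107200ULL, 24) == 314572800ULL,
              "24 bytes per synapse");
static_assert(314572800ULL / (1024ULL * 1024ULL) == 300ULL,
              "exactly 300 MiB");
```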