==================================================================================================
Robert Legenstein, Dejan Pecevski, Wolfgang Maass (2008)
"A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback"

--------------------------------------------------------------------------------------------------
Hence it also provides a possible functional explanation for trial-to-trial variability, which is characteristic of cortical networks of neurons but has no analogue in currently existing artificial computing systems.

Numerous experimental studies (see [1] for a review; [2] discusses more recent in-vivo results) have shown that the efficacy of synapses changes depending on the time difference $\delta t = t_{post} - t_{pre}$ between the firing times $t_{pre}$ and $t_{post}$ of the pre- and postsynaptic neurons. This effect is called spike-timing-dependent plasticity (STDP).
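To make the dependence on $\delta t$ concrete, here is a minimal sketch of an STDP learning window. The exponential shape, the amplitudes `a_plus` and `a_minus`, and the time constants `tau_plus` and `tau_minus` are illustrative assumptions; the excerpt above does not specify a functional form.

```python
import math

def stdp_window(delta_t, a_plus=1.0, a_minus=1.0, tau_plus=20.0, tau_minus=20.0):
    """Weight change proposed for one spike pair, delta_t = t_post - t_pre (ms).

    Assumed exponential window; all parameter values are illustrative.
    """
    if delta_t > 0:
        # pre before post: potentiation, decaying with the time difference
        return a_plus * math.exp(-delta_t / tau_plus)
    if delta_t < 0:
        # post before pre: depression, decaying with the time difference
        return -a_minus * math.exp(delta_t / tau_minus)
    return 0.0  # simultaneous spikes: no change in this sketch
```

For example, `stdp_window(10.0)` is positive and `stdp_window(-10.0)` is negative, and both shrink in magnitude as `abs(delta_t)` grows.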

     [1] Abbott L. F. and Nelson S.B. (2000) "Synaptic plasticity: taming the beast." Nat. Neurosci. 3: 1178-1183.


Corresponding spike-based rules for synaptic plasticity of the form $\frac{d}{dt} w_{ji}(t) = c_{ji}(t)\,d(t)$ (Equation 1) have been proposed in [12] and [13], where $w_{ji}$ is the weight of a synapse from neuron $i$ to neuron $j$, $c_{ji}(t)$ is an eligibility trace of this synapse which collects weight changes proposed by STDP, and $d(t) = h(t) - \bar{h}$ results from a neuromodulatory signal $h(t)$ with mean value $\bar{h}$.

It was shown in [12] that a number of interesting learning tasks in large networks of neurons can be accomplished with this simple rule in Equation 1.
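The rule above can be sketched as a simple Euler integration for a single synapse. This is only an illustration: the exponentially decaying eligibility trace with time constant `tau_c`, the caller-supplied baseline `h_mean`, and the function name `simulate_rstdp` are assumptions, not taken from [12] or [13].

```python
def simulate_rstdp(stdp_events, reward, h_mean=0.0, dt=1.0, tau_c=1000.0, w0=0.5):
    """Euler integration of dw/dt = c(t) * d(t) for one synapse.

    stdp_events[k]: weight change proposed by STDP in time step k
    reward[k]:      neuromodulatory signal h(t) in time step k
    h_mean:         assumed baseline of h(t), so that d(t) = h(t) - h_mean
    tau_c:          assumed decay time constant of the eligibility trace (ms)
    """
    w, c = w0, 0.0
    weights = []
    for s, h in zip(stdp_events, reward):
        c += -c / tau_c * dt + s   # trace decays and collects STDP proposals
        d = h - h_mean             # reward signal relative to its baseline
        w += c * d * dt            # the weight moves only when c and d are both nonzero
        weights.append(w)
    return weights
```

In this sketch a pre-before-post pairing followed 100 ms later by a reward pulse increases the weight, because the trace is still nonzero when the reward arrives; a reward pulse that precedes the pairing leaves the weight unchanged.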

     [12] Izhikevich E.M. (2007) "Solving the distal reward problem through linkage of STDP and dopamine signaling." Cereb. Cortex. 17: 2443--2452.

or if one maximizes the likelihood of postsynaptic firing at desired firing times [16].

     [13] Florian R. V. (2007) "Reinforcement learning through modulation of spike-timing dependent synaptic plasticity." Neural. Comput. 19(6): 1468--1502.

     [14] Baxter J, Bartlett PL (1999) Direct gradient-based reinforcement learning: I. Gradient estimation algorithms. Technical report. Research School of Information Sciences and Engineering, Australian National University.

     [15] Baras D, Meir R (2007) "Reinforcement learning, spike-time-dependent plasticity, and the BCM rule." Neural. Comput. 19: 2245--2279.

     [16] Pfister J. P., Toyoizumi T., Barber D. and Gerstner W. (2006) "Optimal spike-timing dependent plasticity for precise action potential firing in supervised learning." Neural. Comput. 18: 1318--1348.





==================================================================================================
[1] L. F. Abbott and Sacha B. Nelson (2000) "Synaptic plasticity: taming the beast"

Recent experimental results suggest several novel mechanisms for regulating levels of activity in conjunction with Hebbian synaptic modification. We review three of them (synaptic scaling, spike-timing dependent plasticity and synaptic redistribution) and discuss their functional implications.



==================================================================================================
Eugene M. Izhikevich
Solving the Distal Reward Problem through
Linkage of STDP and Dopamine Signaling


to read
Dorit Baras and Ron Meir
Reinforcement Learning, Spike Time Dependent Plasticity and the BCM Rule

==================================================================================================
LECTURE 5
Spike timing-dependent plasticity (STDP)

In many regions of the brain, neurons are found to exhibit bidirectional plasticity
in which the strength of a synapse can increase or decrease depending on the
stimulus protocol [109, 110, 111, 112, 113]. Long-term potentiation (LTP) is
a persistent increase in synaptic efficacy produced by high-frequency stimulation
of presynaptic afferents or by the pairing of low frequency presynaptic stimulation
with robust postsynaptic depolarization. Long-term synaptic depression (LTD)
is a long-lasting decrease in synaptic strength induced by low-frequency stimulation
of presynaptic afferents. More recent experimental studies suggest that both
the sign and degree of synaptic modification arising from repeated pairing of pre- and
postsynaptic action potentials depend on their relative timing [114, 115, 116].
Long-term strengthening of synapses occurs if presynaptic action potentials precede
postsynaptic firing by no more than about 50 ms. Presynaptic action potentials that
follow postsynaptic spikes produce long-term weakening of synapses. The largest
changes in synaptic efficacy occur when the time difference between pre- and postsynaptic
action potentials is small, and there is a sharp transition from strengthening
to weakening. This phenomenon of spike timing-dependent plasticity (STDP) is
illustrated in figure 1.


==================================================================================================
The Spike Response Model ... Wulfram Gerstner

A description of neuronal activity on the level of ion channels as in the Hodgkin-Huxley model leads to a set of coupled nonlinear differential equations which are difficult to analyze. ...
-> 
a reduction of the nonlinear spike dynamics to a threshold process. ...

Spikes occur if the membrane potential $u(t)$ reaches a threshold $\theta$. The voltage response to spike input is described by the postsynaptic potential $\epsilon$. Postsynaptic potentials of several input spikes are added linearly until $u$ reaches $\theta$. The output pulse itself and the reset/refractory period which follow the pulse are described by a function $\eta$. Since $\epsilon$ and $\eta$ can be interpreted as response kernels, the resulting model is called the Spike Response Model (SRM).
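The description above can be sketched in a few lines of Python. The kernel shapes are assumptions chosen for illustration (a double-exponential $\epsilon$ and an exponential $\eta$), as are all parameter values and the function names; the excerpt only states that $\epsilon$ and $\eta$ are response kernels.

```python
import math

def epsilon(s, tau_m=10.0, tau_s=2.5):
    """Postsynaptic potential kernel (assumed double exponential); zero for s <= 0."""
    if s <= 0:
        return 0.0
    return math.exp(-s / tau_m) - math.exp(-s / tau_s)

def eta(s, eta0=5.0, tau_r=10.0):
    """Reset/refractory kernel (assumed exponential); zero for s <= 0."""
    if s <= 0:
        return 0.0
    return -eta0 * math.exp(-s / tau_r)

def simulate_srm(input_spikes, weights, theta=1.0, t_max=100.0, dt=0.1):
    """Return output spike times (ms); input_spikes[i] lists the spike times of input i."""
    output_spikes = []
    for k in range(int(round(t_max / dt))):
        t = k * dt
        # contribution of the neuron's own past spikes (reset and refractoriness)
        u = sum(eta(t - t_out) for t_out in output_spikes)
        # postsynaptic potentials of all input spikes, added linearly
        u += sum(w * epsilon(t - t_in)
                 for w, spikes in zip(weights, input_spikes)
                 for t_in in spikes)
        if u >= theta:  # threshold crossing: emit a spike
            output_spikes.append(t)
    return output_spikes
```

With these assumed parameters, a single input of weight 1.2 stays below threshold, while two such inputs arriving together sum linearly and trigger an output spike.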

(1) Hodgkin-Huxley dynamics with time-dependent input can be reproduced to a high degree of accuracy by the SRM.

(2) The simple integrate-and-fire neuron is a special case of the Spike Response Model.
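One way to see statement (2), sketched under assumptions that are not in the notes above: take a leaky integrate-and-fire neuron $\tau_m \frac{du}{dt} = -u + R\,I(t)$ with resting potential 0, capacitance $C = \tau_m / R$, reset potential $u_r$, and no absolute refractory period. Because the dynamics between spikes are linear, each reset from $\theta$ to $u_r$ acts like a negative jump that decays with $\tau_m$, and integrating the equation gives the SRM form with exponential kernels (here $\kappa$ is the response to input current, playing the role of $\epsilon$ for current input):

$$u(t) = \sum_f \eta\big(t - t^{(f)}\big) + \int_0^\infty \kappa(s)\, I(t - s)\, ds, \qquad \eta(s) = -(\theta - u_r)\, e^{-s/\tau_m}, \qquad \kappa(s) = \frac{1}{C}\, e^{-s/\tau_m} \quad (s > 0),$$

where $t^{(f)}$ are the past firing times of the neuron.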




On-the-complexity-of-learning-for-a-spiking-neuron.

Polychronization Computation-with-spikes

Evolutionary-supervision-of-a-dynamical-neural-network-allows-learning-with-on-going-weights.