H. Paugam-Moisy, R. Martinez and S. Bengio (????)
"A supervised learning approach based on STDP and polychronization in spiking neuron networks."

"We propose a network model of spiking neurons, without preimposed topology and driven by STDP (Spike-Time-Dependent Plasticity), a temporal Hebbian unsupervised learning mode, biologically observed."

"We use excitatory and inhibitory temporal windows as proposed in [7] and apply a multiplicative weight update."

[7] D. Meunier and H. Paugam-Moisy. Evolutionary supervision of a dynamical neural network allows learning with on-going weights. In IJCNN'2005, Int. Joint Conf. on Neural Networks, pages 1493-1498. IEEE-INNS, 2005.

==================================================================================================

Wiebke Potjans, Abigail Morrison, Markus Diesmann
A Spiking Neural Network Model of an Actor-Critic Learning Agent

"A third category of studies investigates aspects of TD learning in the context of spiking activity; however, they focus only on the prediction problem of when rewards can be expected and do not address the control problem of what actions to take (Rao & Sejnowski, 2001) or only simple control problems where a reward or penalty is awarded after every decision (Izhikevich, 2007; Farries & Fairhall, 2007)."

- Izhikevich, E. M. (2007). Solving the distal reward problem through linkage of STDP and dopamine signaling. Cereb. Cortex, 17(10), 2443-2452.
- Farries, M. A., & Fairhall, A. L. (2007). Reinforcement learning with modulated spike timing-dependent synaptic plasticity. J. Neurophysiol., 98, 3648-3665.