Who started the method

Reinforcement learning theory (Sutton and Barto, 1998)
Sutton, R. S., D. Precup, and S. Singh (1999) "Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning." Artificial Intelligence, Vol. 112, pp. 181-211.
Broekens, J., and D. DeGroot (2004) "Emergent Representations and Reasoning in Adaptive Agents." Proceedings of the 3rd International Conference on Machine Learning and Applications, pp. 207-214.

==================================================

For Further Challenge

"A Neural Reinforcement Learning System." Report from Studies of Artificial Neural Systems, Royal Institute of Technology, Sweden.
Zilli, E. A., and M. E. Hasselmo (2008) "Modeling the Role of Working Memory and Episodic Memory in Behavioral Tasks." Hippocampus, Vol. 18(2), pp. 193-209.
Johansson, C., P. Raicevic, and A. Lansner (2003) "Reinforcement Learning Based on a Bayesian Confidence Propagating Neural Network." SAIS-SSLS Joint Workshop, Center for Applied Autonomous Sensor Systems, Orebro, Sweden.
Broekens, J. (2005) "Internal Simulation of Behavior has an Adaptive Advantage." Proceedings of the Cognitive Science Conference, pp. 342-347.

--------------------------------------------------

Risi, S., and K. O. Stanley (2010) "Indirectly Encoding Neural Plasticity as a Pattern of Local Rules." Proceedings of the International Conference on Simulation of Adaptive Behavior (From Animals to Animats 11). Springer Lecture Notes in Computer Science, Vol. 6226, pp. 533-543.
Blynel, J., and D. Floreano (2003) "Exploring the T-Maze: Evolving Learning-Like Robot Behaviors using CTRNNs." Proceedings of the European Workshop on Evolutionary Robotics, pp. 593-604.
Soltoggio, A., J. A. Bullinaria, C. Mattiussi, P. Durr, and D. Floreano (2008) "Evolutionary Advantages of Neuromodulated Plasticity in Dynamic, Reward-based Scenarios." Proceedings of the International Conference on Artificial Life (Alife XI), pp. 569-576.
Soltoggio, A. (2008) "Neural Plasticity and Minimal Topologies for Reward-Based Learning." Proceedings of the International Conference on Hybrid Intelligent Systems, pp. 637-642.
Ziemke, T., and M. Thieme (2002) "Neuromodulation of Reactive Sensorimotor Mappings as Short-Term Memory Mechanism in Delayed Response Tasks." Adaptive Behavior, Vol. 10, pp. 185-199.
[note: actual rat]
Ramirez, A. B., and A. W. Ridel (2006) "Bio-inspired Model of Robot Adaptive Learning and Mapping." Proceedings of the International Conference on Intelligent Robots and Systems, pp. 4750-4755.