Modular inverse reinforcement learning for visuomotor behavior
Authors: Constantin A. Rothkopf, Dana H. Ballard
Institutions:
1. Frankfurt Institute for Advanced Studies, Goethe University, 60438 Frankfurt, Germany
2. Institute of Cognitive Science, University of Osnabrück, 49076 Osnabrück, Germany
3. Technical University Darmstadt, 64283 Darmstadt, Germany
4. Department of Computer Science, University of Texas at Austin, Austin, TX 78712, USA
Abstract: In a large variety of situations one would like to have an expressive and accurate model of observed animal or human behavior. While general-purpose mathematical models may successfully capture properties of observed behavior, it is desirable to root models in biological facts. Because of ample empirical evidence for reward-based learning in visuomotor tasks, we use a computational model based on the assumption that the observed agent is balancing the costs and benefits of its behavior to meet its goals. This leads to the framework of reinforcement learning, which additionally provides well-established algorithms for learning visuomotor task solutions. To quantify the agent’s goals as rewards implicit in the observed behavior, we propose to use inverse reinforcement learning. Based on the assumption of a modular cognitive architecture, we introduce a modular inverse reinforcement learning algorithm that estimates the relative reward contributions of the component tasks in navigation, which consist of following a path while avoiding obstacles and approaching targets. We show how to recover the component reward weights for the individual tasks and how variability in observed trajectories can be explained succinctly through behavioral goals. Simulations demonstrate that good estimates can be obtained even with modest amounts of observation data, which in turn allows the prediction of behavior in novel configurations.
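
The modular estimation idea summarized above lends itself to a compact illustration. The sketch below is a hypothetical reconstruction, not the authors' published algorithm: it assumes the composite action value is a weighted sum of per-module action values, Q(s, a) = Σᵢ wᵢ Qᵢ(sᵢ, a), that the observed agent acts via a softmax policy over that sum, and that the weights are recovered by maximum likelihood from observed state-action pairs. All function names, array shapes, and the finite-difference optimizer are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of modular reward-weight recovery (illustrative only):
# the composite action value is assumed to be Q(s, a) = sum_i w_i * Q_i(s_i, a),
# the observed agent is assumed to act by a softmax policy over Q, and the
# weights w are fitted by maximizing the likelihood of the observed actions.

def softmax_policy(q_modules, w):
    """Action probabilities under a softmax over the weighted module values.

    q_modules: (n_modules, n_actions) module action values at the current state.
    w:         (n_modules,) reward weights to be estimated.
    """
    q = w @ q_modules            # composite Q(s, .), shape (n_actions,)
    q = q - q.max()              # stabilize the exponential
    p = np.exp(q)
    return p / p.sum()

def neg_log_likelihood(w, demos):
    """Negative log-likelihood of the observed actions under weights w."""
    return -sum(np.log(softmax_policy(q_m, w)[a] + 1e-12) for q_m, a in demos)

def estimate_weights(demos, n_modules, lr=0.1, steps=300):
    """Recover module weights on the simplex by projected gradient descent
    with a finite-difference gradient (a placeholder for a real optimizer)."""
    w, eps = np.ones(n_modules) / n_modules, 1e-5
    for _ in range(steps):
        grad = np.zeros(n_modules)
        for i in range(n_modules):
            d = np.zeros(n_modules)
            d[i] = eps
            grad[i] = (neg_log_likelihood(w + d, demos)
                       - neg_log_likelihood(w - d, demos)) / (2 * eps)
        w = np.clip(w - lr * grad, 0.0, None)
        w = w / w.sum()          # keep the weights on the probability simplex
    return w

if __name__ == "__main__":
    # Toy check: two modules, three actions, data generated with weights (0.7, 0.3).
    rng = np.random.default_rng(0)
    true_w = np.array([0.7, 0.3])
    demos = []
    for _ in range(300):
        q_m = rng.normal(size=(2, 3))
        a = rng.choice(3, p=softmax_policy(q_m, true_w))
        demos.append((q_m, a))
    print(estimate_weights(demos, n_modules=2))  # expected to land near [0.7, 0.3]
```

In the paper itself the component values would come from modules trained on the navigation subtasks (path following, obstacle avoidance, target approach); here the per-module values are drawn at random purely to exercise the estimator.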