Performance-gated deliberation: A context-adapted strategy in which urgency is opportunity cost

Authors:	Maximilian Puelma Touzel, Paul Cisek, Guillaume Lajoie

Institution:	1. Mila, Québec AI Institute, Montréal, Canada; 2. Department of Computer Science & Operations Research, Université de Montréal, Montréal, Canada; 3. Department of Neuroscience, Université de Montréal, Montréal, Canada; 4. Department of Mathematics & Statistics, Université de Montréal, Montréal, Canada; Ecole Normale Superieure, France

Abstract:	Finding the right amount of deliberation, between insufficient and excessive, is a hard decision-making problem that depends on the value we place on our time. Average reward, putatively encoded by tonic dopamine, serves in existing reinforcement learning theory as the opportunity cost of time, including deliberation time. Importantly, this cost can itself vary with the environmental context and is not trivial to estimate. Here, we propose how the opportunity cost of deliberation can be estimated adaptively on multiple timescales to account for non-stationary contextual factors. We use it in a simple decision-making heuristic based on average-reward reinforcement learning (AR-RL) that we call Performance-Gated Deliberation (PGD). We propose PGD as a strategy used by animals wherein deliberation cost is implemented directly as urgency, a previously characterized neural signal effectively controlling the speed of the decision-making process. We show that PGD outperforms AR-RL solutions in explaining behaviour and urgency of non-human primates in a context-varying random walk prediction task and is consistent with relative performance and urgency in a context-varying random dot motion task. We make readily testable predictions for both neural activity and behaviour.
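
The abstract's idea of treating average reward as the opportunity cost of time, tracked on more than one timescale, can be illustrated with a minimal sketch. This is not the authors' PGD model: the exponential filter, the class name `RewardRateTracker`, the helper `opportunity_cost`, and the two timescales (5 and 100 steps) are all assumptions chosen for illustration. The sketch only shows how a fast and a slow running estimate of reward rate diverge after a change of context, and how either estimate converts deliberation time into a cost.

```python
from dataclasses import dataclass


@dataclass
class RewardRateTracker:
    """Running estimate of reward per unit time (exponential filter).

    `timescale` is a hypothetical parameter for illustration only;
    it is not a value taken from the paper.
    """
    timescale: float
    rate: float = 0.0

    def update(self, reward: float, dt: float = 1.0) -> float:
        # Move the estimate toward the latest observed reward rate.
        alpha = min(1.0, dt / self.timescale)
        self.rate += alpha * (reward / dt - self.rate)
        return self.rate


def opportunity_cost(rate: float, deliberation_time: float) -> float:
    """Reward forgone by spending `deliberation_time` deliberating."""
    return rate * deliberation_time


def demo() -> tuple[float, float]:
    """Feed a rich context followed by a poor one to two trackers."""
    fast = RewardRateTracker(timescale=5.0)     # assumed short timescale
    slow = RewardRateTracker(timescale=100.0)   # assumed long timescale
    rewards = [1.0] * 50 + [0.2] * 50           # synthetic context switch
    for r in rewards:
        fast.update(r)
        slow.update(r)
    return fast.rate, slow.rate


if __name__ == "__main__":
    fast_rate, slow_rate = demo()
    print(f"fast estimate: {fast_rate:.3f}, slow estimate: {slow_rate:.3f}")
    print(f"cost of 2 time units at the fast rate: "
          f"{opportunity_cost(fast_rate, 2.0):.3f}")
```

In this toy run the fast tracker settles near the new, lower reward rate while the slow tracker still carries the memory of the richer context, which is one simple way non-stationary context could make the cost of deliberation depend on the timescale of estimation.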
