期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Explicit neural signals reflecting reward uncertainty

Schultz W Preuschoff K Camerer C Hsu M Fiorillo CD Tobler PN Bossaerts P 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》2008,363(1511):3801-3811

The acknowledged importance of uncertainty in economic decision making has stimulated the search for neural signals that could influence learning and inform decision mechanisms. Current views distinguish two forms of uncertainty, namely risk and ambiguity, depending on whether the probability distributions of outcomes are known or unknown. Behavioural neurophysiological studies on dopamine neurons revealed a risk signal, which covaried with the standard deviation or variance of the magnitude of juice rewards and occurred separately from reward value coding. Human imaging studies identified similarly distinct risk signals for monetary rewards in the striatum and orbitofrontal cortex (OFC), thus fulfilling a requirement for the mean variance approach of economic decision theory. The orbitofrontal risk signal covaried with individual risk attitudes, possibly explaining individual differences in risk perception and risky decision making. Ambiguous gambles with incomplete probabilistic information induced stronger brain signals than risky gambles in OFC and amygdala, suggesting that the brain's reward system signals the partial lack of information. The brain can use the uncertainty signals to assess the uncertainty of rewards, influence learning, modulate the value of uncertain rewards and make appropriate behavioural choices between only partly known options. 相似文献

2.

Midbrain dopamine neurons encode a quantitative reward prediction error signal 总被引：15，自引：0，他引：15

Bayer HM Glimcher PW 《Neuron》2005,47(1):129-141

相似文献

3.

Dissociable reward and timing signals in human midbrain and ventral striatum

Klein-Flügge MC Hunt LT Bach DR Dolan RJ Behrens TE 《Neuron》2011,72(4):654-664

Reward prediction error (RPE) signals are central to current models of reward-learning. Temporal difference (TD) learning models posit that these signals should be modulated by predictions, not only of magnitude but also timing of reward. Here we show that BOLD activity in the VTA conforms to such TD predictions: responses to unexpected rewards are modulated by a temporal hazard function and activity between a predictive stimulus and reward is depressed in proportion to predicted reward. By contrast, BOLD activity in ventral striatum (VS) does not reflect a TD RPE, but instead encodes a signal on the variable relevant for behavior, here timing but not magnitude of reward. The results have important implications for dopaminergic models of cortico-striatal learning and suggest a modification of the conventional view that VS BOLD necessarily reflects inputs from dopaminergic VTA neurons signaling an RPE. 相似文献

4.

Stimulus-Response-Outcome Coding in the Pigeon Nidopallium Caudolaterale

Sarah Starosta Onur Güntürkün Maik C. Stüttgen 《PloS one》2013,8(2)

A prerequisite for adaptive goal-directed behavior is that animals constantly evaluate action outcomes and relate them to both their antecedent behavior and to stimuli predictive of reward or non-reward. Here, we investigate whether single neurons in the avian nidopallium caudolaterale (NCL), a multimodal associative forebrain structure and a presumed analogue of mammalian prefrontal cortex, represent information useful for goal-directed behavior. We subjected pigeons to a go-nogo task, in which responding to one visual stimulus (S+) was partially reinforced, responding to another stimulus (S–) was punished, and responding to test stimuli from the same physical dimension (spatial frequency) was inconsequential. The birds responded most intensely to S+, and their response rates decreased monotonically as stimuli became progressively dissimilar to S+; thereby, response rates provided a behavioral index of reward expectancy. We found that many NCL neurons'' responses were modulated in the stimulus discrimination phase, the outcome phase, or both. A substantial fraction of neurons increased firing for cues predicting non-reward or decreased firing for cues predicting reward. Interestingly, the same neurons also responded when reward was expected but not delivered, and could thus provide a negative reward prediction error or, alternatively, signal negative value. In addition, many cells showed motor-related response modulation. In summary, NCL neurons represent information about the reward value of specific stimuli, instrumental actions as well as action outcomes, and therefore provide signals useful for adaptive behavior in dynamically changing environments. 相似文献

5.

Saccade reward signals in posterior cingulate cortex 总被引：7，自引：0，他引：7

McCoy AN Crowley JC Haghighian G Dean HL Platt ML 《Neuron》2003,40(5):1031-1040

Movement selection depends on the outcome of prior behavior. Posterior cingulate cortex (CGp) is strongly connected with both limbic and oculomotor circuitry, and CGp neurons respond following saccades, suggesting a role in signaling the motivational outcome of gaze shifts. To test this hypothesis, single CGp neurons were studied in monkeys while they shifted gaze to visual targets for liquid rewards that varied in size or were delivered probabilistically. CGp neurons responded following saccades as well as following reward delivery, and these responses were correlated with reward size. CGp neurons also responded following the omission of predicted rewards. The timing of CGp activation and its modulation by reward could provide signals useful for updating representations of expected saccade value. 相似文献

6.

Multiple reward signals in the brain

Schultz W 《Nature reviews. Neuroscience》2000,1(3):199-207

The fundamental biological importance of rewards has created an increasing interest in the neuronal processing of reward information. The suggestion that the mechanisms underlying drug addiction might involve natural reward systems has also stimulated interest. This article focuses on recent neurophysiological studies in primates that have revealed that neurons in a limited number of brain structures carry specific signals about past and future rewards. This research provides the first step towards an understanding of how rewards influence behaviour before they are received and how the brain might use reward information to control learning and goal-directed behaviour. 相似文献

7.

The Role of Serotonin in the Regulation of Patience and Impulsivity

Miyazaki K Miyazaki KW Doya K 《Molecular neurobiology》2012,45(2):213-224

Classic theories suggest that central serotonergic neurons are involved in the behavioral inhibition that is associated with the prediction of negative rewards or punishment. Failed behavioral inhibition can cause impulsive behaviors. However, the behavioral inhibition that results from predicting punishment is not sufficient to explain some forms of impulsive behavior. In this article, we propose that the forebrain serotonergic system is involved in “waiting to avoid punishment” for future punishments and “waiting to obtain reward” for future rewards. Recently, we have found that serotonergic neurons increase their tonic firing rate when rats await food and water rewards and conditioned reinforcer tones. The rate of tonic firing during the delay period was significantly higher when rats were waiting for rewards than for tones, and rats were unable to wait as long for tones as for rewards. These results suggest that increased serotonergic neuronal firing facilitates waiting behavior when there is the prospect of a forthcoming reward and that serotonergic activation contributes to the patience that allows rats to wait longer. We propose a working hypothesis to explain how the serotonergic system regulates patience while waiting for future rewards. 相似文献

8.

Coding of reward risk by orbitofrontal neurons is mostly distinct from coding of reward value

O'Neill M Schultz W 《Neuron》2010,68(4):789-800

Risky decision-making is altered in humans and animals with damage to the orbitofrontal cortex. However, the cellular function of the intact orbitofrontal cortex in processing information relevant for risky decisions is unknown. We recorded responses of single orbitofrontal neurons while monkeys viewed visual cues representing the key decision parameters, reward risk and value. Risk was defined as the mathematical variance of binary symmetric probability distributions of reward magnitudes; value was defined as non-risky reward magnitude. Monkeys displayed graded behavioral preferences for risky outcomes, as they did for value. A population of orbitofrontal neurons showed a distinctive risk signal: their cues and reward responses covaried monotonically with the variance of the different reward distributions without monotonically coding reward value. Furthermore, a small but statistically significant fraction of risk responses also coded reward value. These risk signals may provide physiological correlates for the role of the orbitofrontal cortex in risk processing. 相似文献

9.

Expectations and outcomes: decision-making in the primate brain

Allison?N.?McCoy Michael?L.?Platt Email author 《Journal of comparative physiology. A, Neuroethology, sensory, neural, and behavioral physiology》2005,191(3):201-211

Success in a constantly changing environment requires that decision-making strategies be updated as reward contingencies change. How this is accomplished by the nervous system has, until recently, remained a profound mystery. New studies coupling economic theory with neurophysiological techniques have revealed the explicit representation of behavioral value. Specifically, when fluid reinforcement is paired with visually-guided eye movements, neurons in parietal cortex, prefrontal cortex, the basal ganglia, and superior colliculus—all nodes in a network linking visual stimulation with the generation of oculomotor behavior—encode the expected value of targets lying within their response fields. Other brain areas have been implicated in the processing of reward-related information in the abstract: midbrain dopaminergic neurons, for instance, signal an error in reward prediction. Still other brain areas link information about reward to the selection and performance of specific actions in order for behavior to adapt to changing environmental exigencies. Neurons in posterior cingulate cortex have been shown to carry signals related to both reward outcomes and oculomotor behavior, suggesting that they participate in updating estimates of orienting value. 相似文献

10.

Evolution of honest reward signal in flowers

Koichi Ito Miki F. Suzuki Ko Mochizuki 《Proceedings. Biological sciences / The Royal Society》2021,288(1943)

Some flowering plants signal the abundance of their rewards by changing their flower colour, scent or other floral traits as rewards are depleted. These floral trait changes can be regarded as honest signals of reward states for pollinators. Previous studies have hypothesized that these signals are used to maintain plant-level attractiveness to pollinators, but the evolutionary conditions leading to the development of honest signals have not been well investigated from a theoretical basis. We examined conditions leading to the evolution of honest reward signals in flowers by applying a theoretical model that included pollinator response and signal accuracy. We assumed that pollinators learn floral traits and plant locations in association with reward states and use this information to decide which flowers to visit. While manipulating the level of associative learning, we investigated optimal flower longevity, the proportion of reward and rewardless flowers, and honest- and dishonest-signalling strategies. We found that honest signals are evolutionarily stable only when flowers are visited by pollinators with both high and low learning abilities. These findings imply that behavioural variation in learning within a pollinator community can lead to the evolution of an honest signal even when there is no contribution of rewardless flowers to pollinator attractiveness. 相似文献

11.

Encoding of time-discounted rewards in orbitofrontal cortex is independent of value representation 总被引：3，自引：0，他引：3

Roesch MR Taylor AR Schoenbaum G 《Neuron》2006,51(4):509-520

We monitored single-neuron activity in the orbitofrontal cortex of rats performing a time-discounting task in which the spatial location of the reward predicted whether the delay preceding reward delivery would be short or long. We found that rewards delivered after a short delay elicited a stronger neuronal response than those delivered after a long delay in most neurons. Activity in these neurons was not influenced by reward size when delays were held constant. This was also true for a minority of neurons that exhibited sustained increases in firing in anticipation of delayed reward. Thus, encoding of time-discounted rewards in orbitofrontal cortex is independent of the encoding of absolute reward value. These results are contrary to the proposal that orbitofrontal neurons signal the value of delayed rewards in a common currency and instead suggest alternative proposals for the role this region plays in guiding responses for delayed versus immediate rewards. 相似文献

12.

Single units in the pigeon brain integrate reward amount and time-to-reward in an impulsive choice task 总被引：6，自引：0，他引：6

Kalenscher T Windmann S Diekamp B Rose J Güntürkün O Colombo M 《Current biology : CB》2005,15(7):594-602

BACKGROUND: Animals prefer small over large rewards when the delays preceding large rewards exceed an individual tolerance limit. Such impulsive choice behavior occurs even in situations in which alternative strategies would yield more optimal outcomes. Behavioral research has shown that an animal's choice is guided by the alternative rewards' subjective values, which are a function of reward amount and time-to-reward. Despite increasing knowledge about the pharmacology and anatomy underlying impulsivity, it is still unknown how the brain combines reward amount and time-to-reward information to represent subjective reward value. RESULTS: We trained pigeons to choose between small, immediate rewards and large rewards delivered after gradually increasing delays. Single-cell recordings in the avian Nidopallium caudolaterale, the presumed functional analog of the mammalian prefrontal cortex, revealed that neural delay activation decreased with increasing delay length but also covaried with the expected reward amount. This integrated neural response was modulated by reward amount and delay, as predicted by a hyperbolical equation, of subjective reward value derived from behavioral studies. Furthermore, the neural activation pattern reflected the current reward preference and the time point of the shift from large to small rewards. CONCLUSIONS: The reported activity was modulated by the temporal devaluation of the anticipated reward in addition to reward amount. Our findings contribute to the understanding of neuropathologies such as drug addiction, pathological gambling, frontal lobe syndrome, and attention-deficit disorders, which are characterized by inappropriate temporal discounting and increased impulsiveness. 相似文献

13.

Activation of VTA GABA neurons disrupts reward consumption

van Zessen R Phillips JL Budygin EA Stuber GD 《Neuron》2012,73(6):1184-1194

The activity of ventral tegmental area (VTA) dopamine (DA) neurons promotes behavioral responses to rewards and environmental stimuli that predict them. VTA GABA inputs synapse directly onto DA neurons and may regulate DA neuronal activity to alter reward-related behaviors; however, the functional consequences of selective activation of VTA GABA neurons remains unknown. Here, we show that in?vivo optogenetic activation of VTA GABA neurons disrupts reward consummatory behavior but not conditioned anticipatory behavior in response to reward-predictive cues. In addition, direct activation of VTA GABA projections to the nucleus accumbens (NAc) resulted in detectable GABA release but did not alter reward consumption. Furthermore, optogenetic stimulation of VTA GABA neurons directly suppressed the activity and excitability of neighboring DA neurons as well as the release of DA in the NAc, suggesting that the dynamic interplay between VTA DA and GABA neurons can control the initiation and termination of reward-related behaviors. 相似文献

14.

下丘脑食欲素在药物成瘾中的作用

Han J Li YH Bai YJ Sui N 《生理科学进展》2007,38(4):327-330

下丘脑是调控自然奖赏的重要脑区,它能特异性地表达一种神经肽——食欲素(orexin),这种神经肽在药物奖赏中的作用受到广泛关注。在成瘾研究中,发现不同脑区中的食欲素神经元对奖赏和动机行为的调节作用是不相同的:围穹窿区(PFA)和背内侧下丘脑区(DMH)的食欲素神经元主要参与激活应激系统,而外侧下丘脑(LH)的食欲素神经元主要通过激活与奖赏学习相关的大脑环路参与奖赏行为的调控。提示食欲素系统可在延长戒断防止复吸发生中成为新的研究目标,食欲素受体可以作为治疗药物成瘾的一种新的治疗靶标。相似文献

15.

Neural coding of basic reward terms of animal learning theory, game theory, microeconomics and behavioural ecology

Schultz W 《Current opinion in neurobiology》2004,14(2):139-147

Neurons in a small number of brain structures detect rewards and reward-predicting stimuli and are active during the expectation of predictable food and liquid rewards. These neurons code the reward information according to basic terms of various behavioural theories that seek to explain reward-directed learning, approach behaviour and decision-making. The involved brain structures include groups of dopamine neurons, the striatum including the nucleus accumbens, the orbitofrontal cortex and the amygdala. The reward information is fed to brain structures involved in decision-making and organisation of behaviour, such as the dorsolateral prefrontal cortex and possibly the parietal cortex. The neural coding of basic reward terms derived from formal theories puts the neurophysiological investigation of reward mechanisms on firm conceptual grounds and provides neural correlates for the function of rewards in learning, approach behaviour and decision-making. 相似文献

16.

Role of anticipated reward in cognitive behavioral control

Watanabe M 《Current opinion in neurobiology》2007,17(2):213-219

The lateral prefrontal cortex (LPFC), which is important for higher cognitive activity, is also concerned with motivational operations; this is exemplified by its activity in relation to expectancy of rewards. In the LPFC, motivational information is integrated with cognitive information, as demonstrated by the enhancement of working-memory-related activity by reward expectancy. Such activity would be expected to induce changes in attention and, subsequently, to modify behavioral performance. Recently, the effects of motivation and emotion on neural activities have been examined in several areas of the brain in relation to cognitive-task performance. Of these areas, the LPFC seems to have the most important role in adaptive goal-directed behavior, by sending top-down attention-control signals to other areas of the brain. 相似文献

17.

Models of heterogeneous dopamine signaling in an insect learning and memory center

Linnie Jiang Ashok Litwin-Kumar 《PLoS computational biology》2021,17(8)

The Drosophila mushroom body exhibits dopamine dependent synaptic plasticity that underlies the acquisition of associative memories. Recordings of dopamine neurons in this system have identified signals related to external reinforcement such as reward and punishment. However, other factors including locomotion, novelty, reward expectation, and internal state have also recently been shown to modulate dopamine neurons. This heterogeneity is at odds with typical modeling approaches in which these neurons are assumed to encode a global, scalar error signal. How is dopamine dependent plasticity coordinated in the presence of such heterogeneity? We develop a modeling approach that infers a pattern of dopamine activity sufficient to solve defined behavioral tasks, given architectural constraints informed by knowledge of mushroom body circuitry. Model dopamine neurons exhibit diverse tuning to task parameters while nonetheless producing coherent learned behaviors. Notably, reward prediction error emerges as a mode of population activity distributed across these neurons. Our results provide a mechanistic framework that accounts for the heterogeneity of dopamine activity during learning and behavior. 相似文献

18.

Timing in reward and decision processes

Maria A. Bermudez Wolfram Schultz 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》2014,369(1637)

Sensitivity to time, including the time of reward, guides the behaviour of all organisms. Recent research suggests that all major reward structures of the brain process the time of reward occurrence, including midbrain dopamine neurons, striatum, frontal cortex and amygdala. Neuronal reward responses in dopamine neurons, striatum and frontal cortex show temporal discounting of reward value. The prediction error signal of dopamine neurons includes the predicted time of rewards. Neurons in the striatum, frontal cortex and amygdala show responses to reward delivery and activities anticipating rewards that are sensitive to the predicted time of reward and the instantaneous reward probability. Together these data suggest that internal timing processes have several well characterized effects on neuronal reward processing. 相似文献

19.

Dopamine in motivational control: rewarding, aversive, and alerting

Bromberg-Martin ES Matsumoto M Hikosaka O 《Neuron》2010,68(5):815-834

Midbrain dopamine neurons are well known for their strong responses to rewards and their critical role in?positive motivation. It has become increasingly clear, however, that dopamine neurons also transmit signals related to salient but nonrewarding experiences such as aversive and alerting events. Here we review recent advances in understanding the reward and nonreward functions of dopamine. Based on this data, we propose that dopamine neurons come in multiple types that are connected with distinct brain networks and have distinct roles in motivational control. Some dopamine neurons encode motivational value, supporting brain networks for seeking, evaluation, and value learning. Others encode motivational salience, supporting brain networks for orienting, cognition, and general motivation. Both types of dopamine neurons are augmented by an alerting signal involved in rapid detection of potentially important sensory cues. We hypothesize that these dopaminergic pathways for value, salience, and alerting cooperate to support adaptive behavior. 相似文献

20.

Absence of Spatial Tuning in the Orbitofrontal Cortex

Lauren E. Grattan Paul W. Glimcher 《PloS one》2014,9(11)

There is limited data in the literature to explicitly support the notion that neurons in OFC are truly action-independent in their coding. We set out to specifically test the hypothesis that OFC value-related neurons in area 13 m of the monkey do not carry information about the action required to obtain that reward – that activity in this area represents reward values in an abstract and action-independent manner. To accomplish that goal we had two monkeys select and execute saccadic eye movements to 81 locations in the visual field for three different kinds of juice rewards. Our detailed analysis of the response fields indicates that these neurons are insensitive to the amplitude or direction of the saccade required to obtain these rewards. Our data thus validate earlier proposals that neurons of 13 m in the OFC encode subjective value independent of the saccadic action required to obtain that reward. 相似文献