Similar articles
20 similar articles found.
1.
Saccade reward signals in posterior cingulate cortex
McCoy AN  Crowley JC  Haghighian G  Dean HL  Platt ML 《Neuron》2003,40(5):1031-1040
Movement selection depends on the outcome of prior behavior. Posterior cingulate cortex (CGp) is strongly connected with both limbic and oculomotor circuitry, and CGp neurons respond following saccades, suggesting a role in signaling the motivational outcome of gaze shifts. To test this hypothesis, single CGp neurons were studied in monkeys while they shifted gaze to visual targets for liquid rewards that varied in size or were delivered probabilistically. CGp neurons responded following saccades as well as following reward delivery, and these responses were correlated with reward size. CGp neurons also responded following the omission of predicted rewards. The timing of CGp activation and its modulation by reward could provide signals useful for updating representations of expected saccade value.

2.
The fundamental biological importance of rewards has created an increasing interest in the neuronal processing of reward information. The suggestion that the mechanisms underlying drug addiction might involve natural reward systems has also stimulated interest. This article focuses on recent neurophysiological studies in primates that have revealed that neurons in a limited number of brain structures carry specific signals about past and future rewards. This research provides the first step towards an understanding of how rewards influence behaviour before they are received and how the brain might use reward information to control learning and goal-directed behaviour.

3.
Getting formal with dopamine and reward
Schultz W 《Neuron》2002,36(2):241-263
Recent neurophysiological studies reveal that neurons in certain brain structures carry specific signals about past and future rewards. Dopamine neurons display a short-latency, phasic reward signal indicating the difference between actual and predicted rewards. The signal is useful for enhancing neuronal processing and learning behavioral reactions. It is distinctly different from dopamine's tonic enabling of numerous behavioral processes. Neurons in the striatum, frontal cortex, and amygdala also process reward information but provide more differentiated information for identifying and anticipating rewards and organizing goal-directed behavior. The different reward signals have complementary functions, and the optimal use of rewards in voluntary behavior would benefit from interactions between the signals. Addictive psychostimulant drugs may exert their action by amplifying the dopamine reward signal.
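The phasic dopamine signal described here is usually formalized as a reward prediction error, the difference between the reward actually received and the reward predicted. A minimal sketch of that formalism in Python (a Rescorla-Wagner-style update; the learning rate and reward sequence are illustrative assumptions, not values from the article):

```python
# Minimal reward prediction error (RPE) sketch: the error is the difference
# between actual and predicted reward, and the prediction learns from the error.
# The learning rate and the reward sequence are illustrative assumptions.

def rpe_update(prediction: float, reward: float, alpha: float = 0.1):
    """Return the prediction error and the updated reward prediction."""
    error = reward - prediction          # actual minus predicted reward
    prediction += alpha * error          # move the prediction toward the outcome
    return error, prediction

# A repeatedly delivered reward becomes predicted, and the error shrinks toward zero.
v = 0.0
for r in [1.0] * 20:
    delta, v = rpe_update(v, r)
print(round(delta, 3), round(v, 3))
```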

4.
Roesch MR  Taylor AR  Schoenbaum G 《Neuron》2006,51(4):509-520
We monitored single-neuron activity in the orbitofrontal cortex of rats performing a time-discounting task in which the spatial location of the reward predicted whether the delay preceding reward delivery would be short or long. We found that rewards delivered after a short delay elicited a stronger neuronal response than those delivered after a long delay in most neurons. Activity in these neurons was not influenced by reward size when delays were held constant. This was also true for a minority of neurons that exhibited sustained increases in firing in anticipation of delayed reward. Thus, encoding of time-discounted rewards in orbitofrontal cortex is independent of the encoding of absolute reward value. These results are contrary to the proposal that orbitofrontal neurons signal the value of delayed rewards in a common currency and instead suggest alternative proposals for the role this region plays in guiding responses for delayed versus immediate rewards.

5.
To accurately predict rewards associated with states or actions, the variability of observations has to be taken into account. In particular, when the observations are noisy, the individual rewards should have less influence on tracking of average reward, and the estimate of the mean reward should be updated to a smaller extent after each observation. However, it is not known how the magnitude of the observation noise might be tracked and used to control prediction updates in the brain reward system. Here, we introduce a new model that uses simple, tractable learning rules that track the mean and standard deviation of reward, and leverages prediction errors scaled by uncertainty as the central feedback signal. We show that the new model has an advantage over conventional reinforcement learning models in a value tracking task, and approaches a theoretical limit of performance provided by the Kalman filter. Further, we propose a possible biological implementation of the model in the basal ganglia circuit. In the proposed network, dopaminergic neurons encode reward prediction errors scaled by standard deviation of rewards. We show that such scaling may arise if the striatal neurons learn the standard deviation of rewards and modulate the activity of dopaminergic neurons. The model is consistent with experimental findings concerning dopamine prediction error scaling relative to reward magnitude, and with many features of striatal plasticity. Our results span the levels of implementation, algorithm, and computation, and might have important implications for understanding the dopaminergic prediction error signal and its relation to adaptive and effective learning.
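A minimal sketch of the kind of learning rules this abstract describes: running estimates of the reward mean and standard deviation, with the mean updated by a prediction error scaled by the current uncertainty estimate, so that noisier rewards produce smaller updates. The specific update rules and parameter values below are illustrative assumptions, not the authors' exact model.

```python
import random

def track_reward(rewards, alpha_mu=0.1, alpha_sigma=0.1, sigma_floor=1e-3):
    """Track the mean and spread of a reward stream with uncertainty-scaled errors."""
    mu, sigma = 0.0, 1.0
    for r in rewards:
        delta = r - mu                                   # raw prediction error
        scaled = delta / max(sigma, sigma_floor)         # error scaled by uncertainty
        mu += alpha_mu * scaled                          # noisier rewards -> smaller updates
        sigma += alpha_sigma * (abs(delta) - sigma)      # running estimate of reward spread
    return mu, sigma

# Noisy rewards drawn around a true mean of 5 with standard deviation 2.
stream = [random.gauss(5.0, 2.0) for _ in range(2000)]
mu_hat, sigma_hat = track_reward(stream)
print(round(mu_hat, 2), round(sigma_hat, 2))  # mean near 5; spread proportional to the true SD
```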

6.
Human research in delay discounting has omitted several procedures typical of animal studies: forced-choice trials, consequences following each response, and assessment of stable response patterns. The present study manipulated these procedures across two conditions in which real or hypothetical rewards were arranged. Six college students participated in daily sessions, in which steady-state discounting of hypothetical and real rewards was assessed. No systematic effect of repeated exposure to hypothetical rewards was detected when compared with first-day assessments of discounting. Likewise, no systematic effect of reward type (real versus hypothetical) was detected. When combined with previous research failing to detect a difference between hypothetical and potentially real rewards, these findings suggest that assessing discounting of hypothetical rewards in single sessions is a practical and valid procedure in the study of delay discounting.
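For context, delay-discounting studies of this kind conventionally summarize choices with a hyperbolic discounting function, V = A / (1 + kD): the subjective value V of an amount A falls off with delay D at a rate set by a fitted parameter k. A small illustrative sketch (the k value and amounts are assumptions, not estimates from this study):

```python
# Hyperbolic delay discounting, the standard summary model in this literature.
# The discount rate k and the reward amounts are illustrative assumptions.

def discounted_value(amount: float, delay: float, k: float = 0.05) -> float:
    """Subjective value of `amount` delivered after `delay` (hyperbolic model)."""
    return amount / (1.0 + k * delay)

# How a hypothetical $100 reward loses subjective value as its delay (in days) grows.
for delay_days in (0, 7, 30, 180, 365):
    print(delay_days, round(discounted_value(100.0, delay_days), 2))
```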

7.
The acknowledged importance of uncertainty in economic decision making has stimulated the search for neural signals that could influence learning and inform decision mechanisms. Current views distinguish two forms of uncertainty, namely risk and ambiguity, depending on whether the probability distributions of outcomes are known or unknown. Behavioural neurophysiological studies on dopamine neurons revealed a risk signal, which covaried with the standard deviation or variance of the magnitude of juice rewards and occurred separately from reward value coding. Human imaging studies identified similarly distinct risk signals for monetary rewards in the striatum and orbitofrontal cortex (OFC), thus fulfilling a requirement for the mean-variance approach of economic decision theory. The orbitofrontal risk signal covaried with individual risk attitudes, possibly explaining individual differences in risk perception and risky decision making. Ambiguous gambles with incomplete probabilistic information induced stronger brain signals than risky gambles in OFC and amygdala, suggesting that the brain's reward system signals the partial lack of information. The brain can use the uncertainty signals to assess the uncertainty of rewards, influence learning, modulate the value of uncertain rewards and make appropriate behavioural choices between only partly known options.
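The mean-variance approach referred to here values a risky option by its expected payoff adjusted in proportion to its variance, with the weight on variance reflecting the individual's risk attitude. A minimal sketch under that interpretation (the gamble and the risk-attitude parameter b are illustrative assumptions):

```python
# Mean-variance valuation of a gamble: utility = expected value - b * variance.
# b > 0 corresponds to risk aversion, b < 0 to risk seeking.
# The outcomes, probabilities, and b are illustrative assumptions.

def mean_variance_utility(outcomes, probabilities, b=0.5):
    ev = sum(p * x for p, x in zip(probabilities, outcomes))
    var = sum(p * (x - ev) ** 2 for p, x in zip(probabilities, outcomes))
    return ev - b * var

certain = mean_variance_utility([5.0], [1.0])             # a sure 5
gamble  = mean_variance_utility([0.0, 10.0], [0.5, 0.5])  # 50/50 gamble with the same mean
print(certain, gamble)  # with b > 0 the certain option scores higher
```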

8.
Sensitivity to time, including the time of reward, guides the behaviour of all organisms. Recent research suggests that all major reward structures of the brain process the time of reward occurrence, including midbrain dopamine neurons, striatum, frontal cortex and amygdala. Neuronal reward responses in dopamine neurons, striatum and frontal cortex show temporal discounting of reward value. The prediction error signal of dopamine neurons includes the predicted time of rewards. Neurons in the striatum, frontal cortex and amygdala show responses to reward delivery and activities anticipating rewards that are sensitive to the predicted time of reward and the instantaneous reward probability. Together these data suggest that internal timing processes have several well characterized effects on neuronal reward processing.

9.
Midbrain dopamine neurons are an essential part of the circuitry underlying motivation and reinforcement. They are activated by rewards or reward-predicting cues and inhibited by reward omission. The lateral habenula (lHb), an epithalamic structure that forms reciprocal connections with midbrain dopamine neurons, shows the opposite response, being activated by reward omission or aversive stimuli and inhibited by reward-predicting cues. It has been hypothesized that habenular input to midbrain dopamine neurons is conveyed via a feedforward inhibitory pathway involving the GABAergic mesopontine rostromedial tegmental area. Here, we show that exposing rats to low-intensity footshock (four 0.5 mA shocks over 20 min) induces cFos expression in the rostromedial tegmental area and that this effect is prevented by lesions of the fasciculus retroflexus (fr), the principal output pathway of the habenula. cFos expression is also observed in the medial portion of the lateral habenula, an area that receives dense DA innervation via the fr, and in the paraventricular nucleus of the thalamus, a stress-sensitive area that also receives dopaminergic input. High-intensity footshock (120 0.8 mA shocks over 40 min) also elevates cFos expression in the rostromedial tegmental area, medial and lateral aspects of the lateral habenula and the paraventricular thalamus. In contrast to low-intensity footshock, increases in cFos expression within the rostromedial tegmental area are not altered by fr lesions, suggesting a role for non-habenular inputs during exposure to highly aversive stimuli. These data confirm the involvement of the lateral habenula in modulating the activity of rostromedial tegmental area neurons in response to mild aversive stimuli and suggest that dopamine input may contribute to footshock-induced activation of cFos expression in the lateral habenula.

10.
Neurons in a small number of brain structures detect rewards and reward-predicting stimuli and are active during the expectation of predictable food and liquid rewards. These neurons code the reward information according to basic terms of various behavioural theories that seek to explain reward-directed learning, approach behaviour and decision-making. The involved brain structures include groups of dopamine neurons, the striatum including the nucleus accumbens, the orbitofrontal cortex and the amygdala. The reward information is fed to brain structures involved in decision-making and organisation of behaviour, such as the dorsolateral prefrontal cortex and possibly the parietal cortex. The neural coding of basic reward terms derived from formal theories puts the neurophysiological investigation of reward mechanisms on firm conceptual grounds and provides neural correlates for the function of rewards in learning, approach behaviour and decision-making.

11.
Reward prediction error (RPE) signals are central to current models of reward-learning. Temporal difference (TD) learning models posit that these signals should be modulated by predictions, not only of magnitude but also timing of reward. Here we show that BOLD activity in the VTA conforms to such TD predictions: responses to unexpected rewards are modulated by a temporal hazard function and activity between a predictive stimulus and reward is depressed in proportion to predicted reward. By contrast, BOLD activity in ventral striatum (VS) does not reflect a TD RPE, but instead encodes a signal on the variable relevant for behavior, here timing but not magnitude of reward. The results have important implications for dopaminergic models of cortico-striatal learning and suggest a modification of the conventional view that VS BOLD necessarily reflects inputs from dopaminergic VTA neurons signaling an RPE.
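A minimal temporal-difference sketch of the kind of RPE discussed here, using one state per within-trial time step (a tapped-delay-line representation) so that the error depends on when the reward arrives, not just on how much is delivered. The trial layout and parameters are illustrative assumptions, not the fMRI paradigm itself.

```python
import numpy as np

# TD(0) over within-trial time steps: one state per step after the cue, so the
# learned predictions, and hence the prediction error, carry reward timing.
# Trial length, reward time, and learning parameters are illustrative assumptions.

T, reward_time = 10, 7          # steps per trial; reward delivered at step 7
alpha, gamma = 0.1, 0.98
V = np.zeros(T + 1)             # value of each within-trial time step (V[T] is terminal)

for trial in range(500):
    for t in range(T):
        r = 1.0 if t == reward_time else 0.0
        delta = r + gamma * V[t + 1] - V[t]   # TD reward prediction error
        V[t] += alpha * delta

# After learning, omitting the reward at its usual time yields a negative,
# timing-specific prediction error.
omission_error = 0.0 + gamma * V[reward_time + 1] - V[reward_time]
print(round(omission_error, 3))
```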

12.
Nahum L  Gabriel D  Schnider A 《PloS one》2011,6(1):e16173
Acute lesions of the posterior medial orbitofrontal cortex (OFC) in humans may induce a state of reality confusion marked by confabulation, disorientation, and currently inappropriate actions. This clinical state is strongly associated with an inability to abandon previously valid anticipations, that is, a failure of extinction capacity. In healthy subjects, the filtering of memories according to their relation with ongoing reality is associated with activity in the posterior medial OFC (area 13) and is electrophysiologically expressed at 220-300 ms. These observations indicate that the human OFC also functions as a generic reality-monitoring system. For this function, it is presumably more important for the OFC to evaluate the current behavioral appropriateness of anticipations rather than their hedonic value. In the present study, we put this hypothesis to the test. Participants performed a reversal learning task with intermittent absence of reward delivery. High-density evoked potential analysis showed that the omission of expected reward induced a specific electrocortical response in trials signaling the necessity to abandon the hitherto reward-predicting choice, but not when omission of reward had no such connotation. This processing difference occurred at 200-300 ms. Source estimation using inverse solution analysis indicated that it emanated from the posterior medial OFC. We suggest that the human brain uses this signal from the OFC to keep thought and behavior in phase with reality.

13.
Some flowering plants signal the abundance of their rewards by changing their flower colour, scent or other floral traits as rewards are depleted. These floral trait changes can be regarded as honest signals of reward states for pollinators. Previous studies have hypothesized that these signals are used to maintain plant-level attractiveness to pollinators, but the evolutionary conditions leading to the development of honest signals have not been well investigated from a theoretical basis. We examined conditions leading to the evolution of honest reward signals in flowers by applying a theoretical model that included pollinator response and signal accuracy. We assumed that pollinators learn floral traits and plant locations in association with reward states and use this information to decide which flowers to visit. While manipulating the level of associative learning, we investigated optimal flower longevity, the proportion of reward and rewardless flowers, and honest- and dishonest-signalling strategies. We found that honest signals are evolutionarily stable only when flowers are visited by pollinators with both high and low learning abilities. These findings imply that behavioural variation in learning within a pollinator community can lead to the evolution of an honest signal even when there is no contribution of rewardless flowers to pollinator attractiveness.

14.
There are limited data in the literature to explicitly support the notion that neurons in OFC are truly action-independent in their coding. We set out to specifically test the hypothesis that OFC value-related neurons in area 13m of the monkey do not carry information about the action required to obtain that reward – that activity in this area represents reward values in an abstract and action-independent manner. To accomplish that goal we had two monkeys select and execute saccadic eye movements to 81 locations in the visual field for three different kinds of juice rewards. Our detailed analysis of the response fields indicates that these neurons are insensitive to the amplitude or direction of the saccade required to obtain these rewards. Our data thus validate earlier proposals that neurons of area 13m in the OFC encode subjective value independent of the saccadic action required to obtain that reward.

15.
Many aspects of hedonic behavior, including self-administration of natural and drug rewards, as well as human positive affect, follow a diurnal cycle that peaks during the species-specific active period. This variation has been linked to circadian modulation of the mesolimbic dopamine system, and is hypothesized to serve an adaptive function by driving an organism to engage with the environment during times when the opportunity for obtaining rewards is high. However, relatively little is known about whether more complex facets of hedonic behavior – in particular, reward learning – follow the same diurnal cycle. The current study aimed to address this gap by examining evidence for diurnal variation in reward learning on a well-validated probabilistic reward learning task (PRT). PRT data from a large normative sample (n = 516) of non-clinical individuals, recruited across eight studies, were examined for the current study. The PRT uses an asymmetrical reinforcement ratio to induce a behavioral response bias, and reward learning was operationalized as the strength of this response bias across blocks of the task. Results revealed significant diurnal variation in reward learning; however, in contrast to patterns previously observed in other aspects of hedonic behavior, reward learning was lowest in the middle of the day. Although a diurnal pattern was also observed on a measure of more general task performance (discriminability), this did not account for the variation observed in reward learning. Taken together, these findings point to a distinct diurnal pattern in reward learning that differs from that observed in other aspects of hedonic behavior. The results of this study have important implications for our understanding of clinical disorders characterized by both circadian and reward learning disturbances, and future research is needed to confirm whether this diurnal variation has a truly circadian origin.
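The response bias in the probabilistic reward task is conventionally quantified with the signal-detection statistic log b, alongside discriminability log d, computed per block from correct and incorrect responses to the rich (more often rewarded) and lean stimuli. A sketch of that standard computation (the 0.5 cell correction is a common convention and the counts are made up, not data from this study):

```python
import math

def prt_indices(rich_correct, rich_incorrect, lean_correct, lean_incorrect, c=0.5):
    """Response bias (log b) and discriminability (log d) for one PRT block.
    A small constant c is added to every cell to avoid division by zero."""
    rc, ri = rich_correct + c, rich_incorrect + c
    lc, li = lean_correct + c, lean_incorrect + c
    log_b = 0.5 * math.log10((rc * li) / (ri * lc))   # bias toward the rich stimulus
    log_d = 0.5 * math.log10((rc * lc) / (ri * li))   # overall discrimination accuracy
    return log_b, log_d

# Hypothetical block counts, not data from the study described above.
print(prt_indices(rich_correct=40, rich_incorrect=10,
                  lean_correct=30, lean_incorrect=20))
```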

16.
Abnormal reward processing is one of the core features of internet gaming disorder (IGD). Recent studies applying cognitive-neuroscience techniques to the neural mechanisms of IGD have found that IGD shares neural substrates with drug addiction, although considerable controversy remains. This article reviews research progress on reward processing in individuals with IGD, organized by reward type and by stage of reward processing. In response to game-related cues, activation of the reward system during the reward anticipation stage in individuals with IGD may be related to attentional bias, emotional experience, and increased craving. At the same time, individuals with IGD show fairly consistent hyposensitivity to natural rewards, a feature that appears mainly during the outcome evaluation stage. Future research could exclude comorbid factors and use vivid game-reward stimuli to further examine how abnormalities in reward anticipation and outcome evaluation drive the development of IGD.

17.
Han J  Li YH  Bai YJ  Sui N 《生理科学进展》2007,38(4):327-330
The hypothalamus is a key brain region for regulating natural reward, and it specifically expresses a neuropeptide, orexin, whose role in drug reward has attracted wide attention. Addiction research has found that orexin neurons in different brain regions regulate reward and motivated behavior in different ways: orexin neurons in the perifornical area (PFA) and the dorsomedial hypothalamus (DMH) are mainly involved in activating the stress system, whereas orexin neurons in the lateral hypothalamus (LH) participate in regulating reward behavior chiefly by activating brain circuits related to reward learning. These findings suggest that the orexin system could become a new research target for prolonging abstinence and preventing relapse, and that orexin receptors may serve as a novel therapeutic target for the treatment of drug addiction.

18.
Modulation of caudate activity by action contingency
Tricomi EM  Delgado MR  Fiez JA 《Neuron》2004,41(2):281-292
Research has increasingly implicated the striatum in the processing of reward-related information in both animals and humans. However, it is unclear whether human striatal activation is driven solely by the hedonic properties of rewards or whether such activation is reliant on other factors, such as anticipation of upcoming reward or performance of an action to earn a reward. We used event-related functional magnetic resonance imaging to investigate hemodynamic responses to monetary rewards and punishments in three experiments that made use of an oddball paradigm. We presented reward and punishment displays randomly in time, following an anticipatory cue, or following a button press response. Robust and differential activation of the caudate nucleus occurred only when a perception of contingency existed between the button press response and the outcome. This finding suggests that the caudate is involved in reinforcement of action potentially leading to reward, rather than in processing reward per se.

19.
Midbrain dopamine neurons encode a quantitative reward prediction error signal
Bayer HM  Glimcher PW 《Neuron》2005,47(1):129-141

20.
A neural network model of how dopamine and prefrontal cortex activity guides short- and long-term information processing within the cortico-striatal circuits during reward-related learning of approach behavior is proposed. The model predicts two types of reward-related neuronal responses generated during learning: (1) cell activity signaling errors in the prediction of the expected time of reward delivery and (2) neural activations coding for errors in the prediction of the amount and type of reward or stimulus expectancies. The former type of signal is consistent with the responses of dopaminergic neurons, while the latter signal is consistent with reward expectancy responses reported in the prefrontal cortex. It is shown that a neural network architecture that satisfies the design principles of the adaptive resonance theory of Carpenter and Grossberg (1987) can account for the dopamine responses to novelty, generalization, and discrimination of appetitive and aversive stimuli. These hypotheses are scrutinized via simulations of the model in relation to the delivery of free food outside a task, the timed contingent delivery of appetitive and aversive stimuli, and an asymmetric, instructed delay response task.
