Similar articles (20 results)
1.
The ventral striatum (VS), like its cortical afferents, is closely associated with the processing of rewards, but the relative contributions of striatal and cortical reward systems remain unclear. Most theories posit distinct roles for these structures, despite their similarities. We compared responses of VS neurons to those of ventromedial prefrontal cortex (vmPFC) Area 14 neurons, recorded in a risky choice task. Five major response patterns observed in vmPFC were also observed in VS: (1) offer value encoding, (2) value difference encoding, (3) preferential encoding of chosen relative to unchosen value, (4) a correlation between residual variance in responses and choices, and (5) prominent encoding of outcomes. We also observed some differences; in particular, preferential encoding of the chosen option was stronger and started earlier in VS than in vmPFC. Nonetheless, the close match between vmPFC and VS suggests that cortex and its striatal targets make overlapping contributions to economic choice.

2.
We investigated the neural bases of navigation based on spatial or sequential egocentric representations during completion of the starmaze, a complex goal-directed navigation task. In this maze, mice had to swim along a path composed of three choice points to find a hidden platform. As reported previously, this task can be solved using two hippocampal-dependent strategies encoded in parallel: (i) the allocentric strategy, requiring encoding of contextual information, and (ii) the sequential egocentric strategy, requiring temporal encoding of a sequence of successive body movements associated with specific choice points. Mice were trained for one day and tested the following day in a single probe trial to reveal which of the two strategies was spontaneously preferred by each animal. Imaging of the activity-dependent gene c-fos revealed that both strategies are supported by an overlapping network involving the dorsal hippocampus, the dorsomedial striatum (DMS), and the medial prefrontal cortex. Significantly higher activation of the ventral CA1 subregion was observed when mice used the sequential egocentric strategy. To investigate the potentially different roles of the dorsal hippocampus and the DMS in the two types of navigation, we performed region-specific excitotoxic lesions of each structure. Mice with dorsal hippocampus lesions were unable to learn the sequence optimally but improved their performance by developing a serial strategy instead. Mice with DMS lesions were severely impaired, failing to learn the task. Our data support the view that the hippocampus organizes information into a spatio-temporal representation, which can then be used by the DMS to perform goal-directed navigation.

3.
Getting formal with dopamine and reward
Schultz W. Neuron, 2002, 36(2): 241-263
Recent neurophysiological studies reveal that neurons in certain brain structures carry specific signals about past and future rewards. Dopamine neurons display a short-latency, phasic reward signal indicating the difference between actual and predicted rewards. The signal is useful for enhancing neuronal processing and learning behavioral reactions. It is distinctly different from dopamine's tonic enabling of numerous behavioral processes. Neurons in the striatum, frontal cortex, and amygdala also process reward information but provide more differentiated information for identifying and anticipating rewards and organizing goal-directed behavior. The different reward signals have complementary functions, and the optimal use of rewards in voluntary behavior would benefit from interactions between the signals. Addictive psychostimulant drugs may exert their action by amplifying the dopamine reward signal.
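The phasic signal described above, the difference between actual and predicted rewards, can be sketched as a temporal-difference (TD) prediction error. This is a generic illustration of that computation, not code from the study; the discount factor and values are assumptions.

```python
def td_error(reward, v_next, v_current, gamma=0.9):
    """TD reward prediction error: positive when outcomes beat expectations,
    negative when a predicted reward is omitted, zero when fully predicted."""
    return reward + gamma * v_next - v_current

# Unexpected reward -> positive burst; omitted predicted reward -> negative dip.
unexpected = td_error(reward=1.0, v_next=0.0, v_current=0.0)
omitted = td_error(reward=0.0, v_next=0.0, v_current=1.0)
```

A fully predicted terminal reward (`reward=1.0`, `v_current=1.0`, `v_next=0.0` with `gamma=0.9` scaling only future value) yields no phasic signal, mirroring the reported transfer of the dopamine response away from predicted rewards.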

4.
Daw ND, Gershman SJ, Seymour B, Dayan P, Dolan RJ. Neuron, 2011, 69(6): 1204-1215
The mesostriatal dopamine system is prominently implicated in model-free reinforcement learning, with fMRI BOLD signals in ventral striatum notably covarying with model-free prediction errors. However, latent learning and devaluation studies show that behavior also shows hallmarks of model-based planning, and the interaction between model-based and model-free values, prediction errors, and preferences is underexplored. We designed a multistep decision task in which model-based and model-free influences on human choice behavior could be distinguished. By showing that choices reflected both influences we could then test the purity of the ventral striatal BOLD signal as a model-free report. Contrary to expectations, the signal reflected both model-free and model-based predictions in proportions matching those that best explained choice behavior. These results challenge the notion of a separate model-free learner and suggest a more integrated computational architecture for high-level human decision-making.
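The mixture of influences tested here is commonly formalized as a weighted combination of model-free and model-based action values. The sketch below illustrates that idea only; the weight and toy values are assumptions, not the paper's fitted parameters.

```python
def hybrid_value(q_mf, q_mb, w):
    """Weighted mix of cached (model-free) and planned (model-based) values.
    w = 0 is purely model-free; w = 1 is purely model-based."""
    return (1.0 - w) * q_mf + w * q_mb

# An intermediate weight lets choice reflect both systems at once,
# as the ventral striatal signal reportedly did.
v = hybrid_value(q_mf=0.2, q_mb=0.8, w=0.5)
```

Fitting `w` per subject (as hybrid models of this task typically do) is what lets choice behavior arbitrate between the two accounts.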

5.
Cai X, Kim S, Lee D. Neuron, 2011, 69(1): 170-182
In choosing between different rewards expected after unequal delays, humans and animals often prefer the smaller but more immediate reward, indicating that the subjective value or utility of reward is depreciated according to its delay. Here, we show that neurons in the primate caudate nucleus and ventral striatum modulate their activity according to temporally discounted values of rewards with a similar time course. However, neurons in the caudate nucleus encoded the difference in the temporally discounted values of the two alternative targets more reliably than neurons in the ventral striatum. In contrast, neurons in the ventral striatum largely encoded the sum of the temporally discounted values, and therefore, the overall goodness of available options. These results suggest a more pivotal role for the dorsal striatum in action selection during intertemporal choice.
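The two quantities contrasted above, the difference and the sum of temporally discounted values, can be sketched with a simple hyperbolic discount. The discount rate and reward amounts below are illustrative assumptions.

```python
def discounted_value(amount, delay, k=0.1):
    """Hyperbolic temporal discounting: subjective value falls with delay."""
    return amount / (1.0 + k * delay)

left = discounted_value(amount=2.0, delay=0.0)    # smaller-sooner option
right = discounted_value(amount=4.0, delay=8.0)   # larger-later option
difference = left - right   # caudate-like signal, useful for action selection
total = left + right        # ventral-striatum-like signal, overall goodness
```

The difference signal discriminates between the options, whereas the sum is the same whichever option is chosen, which is why the abstract links the dorsal striatum more closely to action selection.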

6.

Background

The ability to select an action by considering both delays and amount of reward outcome is critical for maximizing long-term benefits. Although previous animal experiments on impulsivity have suggested a role of serotonin in behaviors requiring prediction of delayed rewards, the underlying neural mechanism is unclear.

Methodology/Principal Findings

To elucidate the role of serotonin in the evaluation of delayed rewards, we performed a functional brain imaging experiment in which subjects chose small-immediate or large-delayed liquid rewards under dietary regulation of tryptophan, a precursor of serotonin. A model-based analysis revealed that the activity of the ventral part of the striatum was correlated with reward prediction at shorter time scales, and this correlated activity was stronger at low serotonin levels. By contrast, the activity of the dorsal part of the striatum was correlated with reward prediction at longer time scales, and this correlated activity was stronger at high serotonin levels.

Conclusions/Significance

Our results suggest that serotonin controls the time scale of reward prediction by differentially regulating activity within the striatum.

7.
The cognitive and neural mechanisms for recognizing and categorizing behavior are not well understood in non-human animals. In the current experiments, pigeons and humans learned to categorize two non-repeating, complex human behaviors (“martial arts” vs. “Indian dance”). Using multiple video exemplars of a digital human model, pigeons discriminated these behaviors in a go/no-go task and humans in a choice task. Experiment 1 found that pigeons already experienced with discriminating the locomotive actions of digital animals acquired the discrimination more rapidly when action information was available than when only pose information was available. Experiments 2 and 3 found this same dynamic superiority effect with naïve pigeons and human participants. Both species used the same combination of immediately available static pose information and more slowly perceived dynamic action cues to discriminate the behavioral categories. Theories based on generalized visual mechanisms, as opposed to embodied, species-specific action networks, offer a parsimonious account of how these different animals recognize behavior across and within species.

8.
Mental and physical efforts, such as paying attention and lifting weights, have been shown to involve different brain systems. These cognitive and motor systems, respectively, include cortical networks (prefronto-parietal and precentral regions) as well as subregions of the dorsal basal ganglia (caudate and putamen). Both systems appear sensitive to incentive motivation: their activity increases when we work for higher rewards. Another brain system, including the ventral prefrontal cortex and the ventral basal ganglia, has been implicated in encoding expected rewards. How this motivational system drives the cognitive and motor systems remains poorly understood. More specifically, it is unclear whether cognitive and motor systems can be driven by a common motivational center or whether they are driven by distinct, dedicated motivational modules. To address this issue, we used functional MRI to scan healthy participants while they performed a task in which incentive motivation, cognitive, and motor demands were varied independently. We reasoned that a common motivational node should (1) represent the reward expected from effort exertion, (2) correlate with the performance attained, and (3) switch effective connectivity between cognitive and motor regions depending on task demand. The ventral striatum fulfilled all three criteria and therefore qualified as a common motivational node capable of driving both cognitive and motor regions of the dorsal striatum. Thus, we suggest that the interaction between a common motivational system and the different task-specific systems underpinning behavioral performance might occur within the basal ganglia.

9.
We studied the behavioral and emotional dynamics displayed by two people trying to resolve a conflict. Fifty-nine groups of two people were asked to talk for 20 minutes to try to reach a consensus on a topic about which they disagreed. The topics were abortion, affirmative action, the death penalty, and euthanasia. Behavior data were determined from audio recordings in which each second of the conversation was assessed as proself, neutral, or prosocial. We determined the probability density function of the durations of time spent in each behavioral state. These durations were well fit by a stretched exponential distribution with an exponent of approximately 0.3. This indicates that the switching between behavioral states is not a random Markov process, but one where the probability of switching behavioral states decreases with the time already spent in that state. The degree of this “memory” was stronger in those groups who did not reach a consensus, and where the conflict grew more destructive, than in those that did. Emotion data were measured by having each person listen to the audio recording and move a computer mouse to recall their negative or positive emotional valence at each moment in the conversation. We used Hurst rescaled range analysis and the power spectrum to determine the correlations in the fluctuations of emotional valence. The emotional valence was well described by a random walk whose increments were uncorrelated. Thus, the behavior data demonstrated a “memory” of the duration already spent in a behavioral state, while the emotion data fluctuated as a random walk whose steps had no “memory” of previous steps. This work demonstrates that statistical analysis, more commonly used to analyze physical phenomena, can also shed interesting light on the dynamics of processes in social psychology and conflict management.
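The stretched-exponential description of bout durations can be sketched as a survival function. The exponent of roughly 0.3 follows the value reported above; the scale parameter `tau` is an illustrative assumption.

```python
import math

def survival(t, tau=1.0, beta=0.3):
    """P(T > t) for a stretched exponential: exp(-(t / tau) ** beta).
    beta < 1 means the hazard of switching falls with time already spent
    in a state (the "memory" effect); beta = 1 is the memoryless case."""
    return math.exp(-((t / tau) ** beta))
```

With `beta = 1` the function reduces to the ordinary exponential of a memoryless Markov process, which is exactly the null model the study rejects.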

10.
While there is a growing body of functional magnetic resonance imaging (fMRI) evidence implicating a corpus of brain regions in value-based decision-making in humans, the limited temporal resolution of fMRI cannot address the relative temporal precedence of different brain regions in decision-making. To address this question, we adopted a computational model-based approach to electroencephalography (EEG) data acquired during a simple binary choice task. fMRI data were also acquired from the same participants for source localization. Post-decision value signals emerged 200 ms post-stimulus in a predominantly posterior source in the vicinity of the intraparietal sulcus and posterior temporal lobe cortex, alongside a weaker anterior locus. The signal then shifted to a predominantly anterior locus 850 ms following trial onset, localized to the ventromedial prefrontal cortex and lateral prefrontal cortex. Comparison signals between unchosen and chosen options emerged late in the trial, at 1050 ms, in dorsomedial prefrontal cortex, suggesting that such comparison signals may not be directly associated with the decision itself but rather may play a role in post-decision action selection. Taken together, these results provide new insight into the temporal dynamics of decision-making in the brain, suggesting that for a simple binary choice task, decisions may be encoded predominantly in posterior areas such as the intraparietal sulcus, before shifting anteriorly.

11.
Despite explicitly wanting to quit, long-term addicts find themselves powerless to resist drugs, despite knowing that drug-taking may be a harmful course of action. Such inconsistency between the explicit knowledge of negative consequences and the compulsive behavioral patterns represents a cognitive/behavioral conflict that is a central characteristic of addiction. Neurobiologically, differential cue-induced activity in distinct striatal subregions, as well as the dopamine connectivity spiraling from ventral striatal regions to the dorsal regions, play critical roles in compulsive drug seeking. However, the functional mechanism that integrates these neuropharmacological observations with the above-mentioned cognitive/behavioral conflict is unknown. Here we provide a formal computational explanation for the drug-induced cognitive inconsistency that is apparent in the addicts' “self-described mistake”. We show that addictive drugs gradually produce a motivational bias toward drug-seeking at low-level habitual decision processes, despite the low abstract cognitive valuation of this behavior. This pathology emerges within the hierarchical reinforcement learning framework when chronic exposure to the drug pharmacologically produces pathologically persistent phasic dopamine signals. Thereby the drug hijacks the dopaminergic spirals that cascade the reinforcement signals down the ventro-dorsal cortico-striatal hierarchy. Neurobiologically, our theory accounts for the rapid development of drug cue-elicited dopamine efflux in the ventral striatum and a delayed response in the dorsal striatum. Our theory also shows how this response pattern depends critically on the dopamine spiraling circuitry. Behaviorally, our framework explains the gradual insensitivity of drug-seeking to drug-associated punishments, the blocking phenomenon for drug outcomes, and the persistent preference for drugs over natural rewards by addicts.
The model suggests testable predictions and, beyond that, sets the stage for a view of addiction as a pathology of hierarchical decision-making processes. This view is complementary to the traditional interpretation of addiction as an interaction between habitual and goal-directed decision systems.

12.
What kinds of strategies subjects follow in various behavioral circumstances has been a central issue in decision making. In particular, which behavioral strategy, maximizing or matching, is more fundamental to an animal's decision behavior has been a matter of debate. Here, we prove that any algorithm to achieve the stationary condition for maximizing the average reward should lead to matching when it ignores the dependence of the expected outcome on the subject's past choices. We may term this strategy of partial reward maximization the “matching strategy”. This strategy is then applied to the case where the subject's decision system updates the information used for making a decision. Such information includes the subject's past actions or sensory stimuli, and the internal storage of this information is often called “state variables”. We demonstrate that the matching strategy provides an easy way to maximize reward when combined with exploration of the state variables that correctly represent the crucial information for reward maximization. Our results reveal for the first time how a strategy to achieve matching behavior is beneficial to reward maximization, providing a novel insight into the relationship between maximizing and matching.
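The matching behavior at issue, allocating responses in proportion to obtained reinforcement, can be written in one line. This is a generic statement of the matching law, not the paper's derivation, and the reinforcement rates below are made-up numbers.

```python
def matching_allocation(r1, r2):
    """Matching law: the fraction of responses to each option equals that
    option's fraction of total obtained reinforcement."""
    total = r1 + r2
    return r1 / total, r2 / total

# Reinforcement rates of 3 and 1 predict a 75% / 25% response split.
p1, p2 = matching_allocation(3.0, 1.0)
```

A maximizing agent would instead allocate all responses to whichever option has the higher expected payoff given its own choice history, which is precisely the dependence the abstract says the matching strategy ignores.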

13.
The insight that animals' cognitive abilities are linked to their evolutionary history, and hence their ecology, provides the framework for the comparative approach. Despite primates' renowned dietary complexity and social cognition, including cooperative abilities, we here demonstrate that cleaner wrasse outperform three primate species (capuchin monkeys, chimpanzees, and orang-utans) in a foraging task involving a choice between two actions, both of which yield identical immediate rewards, but only one of which yields an additional delayed reward. The foraging task decisions involve partner choice in cleaners: they must service visiting client reef fish before resident clients to access both; otherwise the former switch to a different cleaner. Wild-caught adult, but not juvenile, cleaners learned to solve the task quickly and relearned the task when it was reversed. The majority of primates failed to perform above chance after 100 trials, in sharp contrast to previous studies showing that primates easily learn to choose an action that yields immediate double rewards over an alternative action. In conclusion, the adult cleaners' ability to choose a superior action with initially neutral consequences is likely due to repeated exposure in nature, which leads to specific learned optimal foraging decision rules.

14.
Human subjects are proficient at tracking the mean and variance of rewards and updating these via prediction errors. Here, we addressed whether humans can also learn about higher-order relationships between distinct environmental outcomes, a defining ecological feature of contexts where multiple sources of rewards are available. By manipulating the degree to which distinct outcomes are correlated, we show that subjects implemented an explicit model-based strategy to learn the associated outcome correlations and were adept in using that information to dynamically adjust their choices in a task that required a minimization of outcome variance. Importantly, the experimentally generated outcome correlations were explicitly represented neuronally in right midinsula with a learning prediction error signal expressed in rostral anterior cingulate cortex. Thus, our data show that the human brain represents higher-order correlation structures between rewards, a core adaptive ability whose immediate benefit is optimized sampling.

15.
The future is uncertain because some forthcoming events are unpredictable and also because our ability to foresee the myriad consequences of our own actions is limited. Here we studied how humans select actions under such extrinsic and intrinsic uncertainty, in view of an exponentially expanding number of prospects on a branching multivalued visual stimulus. A triangular grid of disks of different sizes scrolled down a touchscreen at a variable speed. The larger disks represented larger rewards. The task was to maximize the cumulative reward by touching one disk at a time in a rapid sequence, forming an upward path across the grid, while every step along the path constrained the part of the grid accessible in the future. This task captured some of the complexity of natural behavior in the risky and dynamic world, where ongoing decisions alter the landscape of future rewards. By comparing human behavior with behavior of ideal actors, we identified the strategies used by humans in terms of how far into the future they looked (their “depth of computation”) and how often they attempted to incorporate new information about the future rewards (their “recalculation period”). We found that, for a given task difficulty, humans traded off their depth of computation for the recalculation period. The form of this tradeoff was consistent with a complete, brute-force exploration of all possible paths up to a resource-limited finite depth. A step-by-step analysis of the human behavior revealed that participants took into account very fine distinctions between the future rewards and that they abstained from some simple heuristics in assessment of the alternative paths, such as seeking only the largest disks or avoiding the smaller disks. The participants preferred to reduce their depth of computation or increase the recalculation period rather than sacrifice the precision of computation.
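The "brute-force exploration of all possible paths up to a resource-limited finite depth" can be sketched as exhaustive recursion over a branching reward tree. The tree and reward values below are toy assumptions for illustration, not the experimental stimulus.

```python
def best_path_value(node, rewards, children, depth):
    """Exhaustively evaluate every path from `node` down to `depth` further
    steps and return the best cumulative reward (a depth-limited planner)."""
    if depth == 0 or not children.get(node):
        return rewards[node]
    return rewards[node] + max(
        best_path_value(child, rewards, children, depth - 1)
        for child in children[node]
    )

# Toy branching grid: the greedy one-step choice is not always the best path.
rewards = {"a": 1, "b": 5, "c": 2, "d": 0, "e": 9}
children = {"a": ["b", "c"], "b": ["d"], "c": ["e"]}
```

At depth 1 the planner from `"a"` prefers the big immediate disk `"b"`; at depth 2 it discovers that the smaller disk `"c"` opens a richer continuation, which is the kind of distinction a shallow depth of computation misses.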

16.
Pavlovian associations drive approach towards reward-predictive cues, and avoidance of punishment-predictive cues. These associations “misbehave” when they conflict with correct instrumental behavior. This raises the question of how Pavlovian and instrumental influences on behavior are arbitrated. We test a computational theory according to which Pavlovian influence will be stronger when inferred controllability of outcomes is low. Using a model-based analysis of a Go/NoGo task with human subjects, we show that theta-band oscillatory power in frontal cortex tracks inferred controllability, and that these inferences predict Pavlovian action biases. Functional MRI data revealed an inferior frontal gyrus correlate of action probability and a ventromedial prefrontal correlate of outcome valence, both of which were modulated by inferred controllability.

17.
Value representations in the primate striatum during matching behavior
Lau B, Glimcher PW. Neuron, 2008, 58(3): 451-463
Choosing the most valuable course of action requires knowing the outcomes associated with the available alternatives. The striatum may be important for representing the values of actions. We examined this in monkeys performing an oculomotor choice task. The activity of phasically active neurons (PANs) in the striatum covaried with two classes of information: action-values and chosen-values. Action-value PANs were correlated with value estimates for one of the available actions, and these signals were frequently observed before movement execution. Chosen-value PANs were correlated with the value of the action that had been chosen, and these signals were primarily observed later in the task, immediately before or persistently after movement execution. These populations may serve distinct functions mediated by the striatum: some PANs may participate in choice by encoding the values of the available actions, while other PANs may participate in evaluative updating by encoding the reward value of chosen actions.

18.
Organisms prefer to make their own choices. However, emerging research from behavioral decision making sciences has demonstrated that there are boundaries to the preference for choice. Specifically, many decision makers find an extensive array of choice options to be aversive, often leading to negative emotional states and poor behavioral outcomes. This study examined the degree to which human participants discounted hypothetical rewards that were (a) delayed, (b) probabilistic, and (c) chosen from a large array of options. The present results suggest that the "paradox of choice" effect may be explained within a discounting model for individual patterns of decision making.
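The delay and probability discounting framework this study applies is commonly modeled with hyperbolic forms, with probability converted to odds against receipt. The functions below are a generic sketch of that framework, and the `k` and `h` parameters are illustrative assumptions, not the study's fitted values.

```python
def delay_discounted(amount, delay, k=0.05):
    """Hyperbolic delay discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)

def probability_discounted(amount, p, h=1.0):
    """Hyperbolic probability discounting over odds against receipt,
    theta = (1 - p) / p, so V = A / (1 + h * theta)."""
    odds_against = (1.0 - p) / p
    return amount / (1.0 + h * odds_against)
```

Treating choice-array size as a third discounting dimension, analogous to `delay` or `odds_against`, is the move the abstract's "discounting model" account suggests.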

19.
Kim S, Hwang J, Lee D. Neuron, 2008, 59(1): 161-172
Reward from a particular action is seldom immediate, and the influence of such a delayed outcome on choice decreases with delay. It has been postulated that when faced with immediate and delayed rewards, decision makers choose the option with the maximum temporally discounted value. We examined the preference of monkeys for delayed reward in an intertemporal choice task and the neural basis for real-time computation of temporally discounted values in the dorsolateral prefrontal cortex. During this task, the locations of the targets associated with small or large rewards and their corresponding delays were randomly varied. We found that prefrontal neurons often encoded the temporally discounted value of reward expected from a particular option. Furthermore, activity tended to increase with discounted values for targets presented in the neuron's preferred direction, suggesting that activity related to temporally discounted values in the prefrontal cortex might determine the animal's behavior during intertemporal choice.

20.
Ding L, Gold JI. Neuron, 2012, 75(5): 865-874
In contrast to the well-established roles of the striatum in movement generation and value-based decisions, its contributions to perceptual decisions lack direct experimental support. Here, we show that electrical microstimulation in the monkey caudate nucleus influences both choice and saccade response time on a visual motion discrimination task. Within a drift-diffusion framework, these effects consist of two components. The perceptual component biases choices toward ipsilateral targets, away from the neurons' predominantly contralateral response fields. The choice bias is consistent with a nonzero starting value of the diffusion process, which increases and decreases decision times for contralateral and ipsilateral choices, respectively. The nonperceptual component decreases and increases nondecision times toward contralateral and ipsilateral targets, respectively, consistent with the caudate's role in saccade generation. The results imply a causal role for the caudate in perceptual decisions used to select saccades that may be distinct from its role in executing those saccades.
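The drift-diffusion account above, a nonzero starting value that biases choices and shifts decision times, can be illustrated with a toy simulation. Every parameter here is an illustrative assumption, not a value fitted to the monkey data.

```python
import random

def ddm_trial(drift, start=0.0, bound=1.0, dt=0.001, noise=1.0, rng=random):
    """Accumulate noisy evidence from `start` until a bound is crossed.
    Returns (choice, decision_time); choice is +1 (upper) or -1 (lower)."""
    x, t = start, 0.0
    while abs(x) < bound:
        x += drift * dt + noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
    return (1 if x >= bound else -1), t

# With zero drift, a positive starting point biases choices toward the +1
# bound, mimicking the ipsilateral choice bias induced by microstimulation.
random.seed(0)
biased_choices = [ddm_trial(drift=0.0, start=0.5)[0] for _ in range(200)]
```

With the starting point closer to one bound, crossings of that bound are both more frequent and faster, reproducing the paired choice-bias and decision-time effects the abstract describes.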
