期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Basolateral amygdala neurons facilitate reward-seeking behavior by exciting nucleus accumbens neurons

Ambroggi F Ishikawa A Fields HL Nicola SM 《Neuron》2008,59(4):648-661

Both the nucleus accumbens (NAc) and basolateral amygdala (BLA) contribute to learned behavioral choice. Neurons in both structures that encode reward-predictive cues may underlie the decision to respond to such cues, but the neural circuits by which the BLA influences reward-seeking behavior have not been established. Here, we test the hypothesis that the BLA drives NAc neuronal responses to reward-predictive cues. First, using a disconnection experiment, we show that the BLA and dopamine projections to the NAc interact to promote the reward-seeking behavioral response. Next, we demonstrate that BLA neuronal responses to cues precede those of NAc neurons and that cue-evoked excitation of NAc neurons depends on BLA input. These results indicate that BLA input is required for dopamine to enhance the cue-evoked firing of NAc neurons and that this enhanced firing promotes reward-seeking behavior. 相似文献

2.

Midbrain dopamine neurons encode a quantitative reward prediction error signal 总被引：15，自引：0，他引：15

Bayer HM Glimcher PW 《Neuron》2005,47(1):129-141

相似文献

3.

Reinforcement Learning of Targeted Movement in a Spiking Neuronal Model of Motor Cortex

George L. Chadderdon Samuel A. Neymotin Cliff C. Kerr William W. Lytton 《PloS one》2012,7(10)

Sensorimotor control has traditionally been considered from a control theory perspective, without relation to neurobiology. In contrast, here we utilized a spiking-neuron model of motor cortex and trained it to perform a simple movement task, which consisted of rotating a single-joint “forearm” to a target. Learning was based on a reinforcement mechanism analogous to that of the dopamine system. This provided a global reward or punishment signal in response to decreasing or increasing distance from hand to target, respectively. Output was partially driven by Poisson motor babbling, creating stochastic movements that could then be shaped by learning. The virtual forearm consisted of a single segment rotated around an elbow joint, controlled by flexor and extensor muscles. The model consisted of 144 excitatory and 64 inhibitory event-based neurons, each with AMPA, NMDA, and GABA synapses. Proprioceptive cell input to this model encoded the 2 muscle lengths. Plasticity was only enabled in feedforward connections between input and output excitatory units, using spike-timing-dependent eligibility traces for synaptic credit or blame assignment. Learning resulted from a global 3-valued signal: reward (+1), no learning (0), or punishment (−1), corresponding to phasic increases, lack of change, or phasic decreases of dopaminergic cell firing, respectively. Successful learning only occurred when both reward and punishment were enabled. In this case, 5 target angles were learned successfully within 180 s of simulation time, with a median error of 8 degrees. Motor babbling allowed exploratory learning, but decreased the stability of the learned behavior, since the hand continued moving after reaching the target. Our model demonstrated that a global reinforcement signal, coupled with eligibility traces for synaptic plasticity, can train a spiking sensorimotor network to perform goal-directed motor behavior. 相似文献

4.

Coexistence of Reward and Unsupervised Learning During the Operant Conditioning of Neural Firing Rates

Robert R. Kerr David B. Grayden Doreen A. Thomas Matthieu Gilson Anthony N. Burkitt 《PloS one》2014,9(1)

A fundamental goal of neuroscience is to understand how cognitive processes, such as operant conditioning, are performed by the brain. Typical and well studied examples of operant conditioning, in which the firing rates of individual cortical neurons in monkeys are increased using rewards, provide an opportunity for insight into this. Studies of reward-modulated spike-timing-dependent plasticity (RSTDP), and of other models such as R-max, have reproduced this learning behavior, but they have assumed that no unsupervised learning is present (i.e., no learning occurs without, or independent of, rewards). We show that these models cannot elicit firing rate reinforcement while exhibiting both reward learning and ongoing, stable unsupervised learning. To fix this issue, we propose a new RSTDP model of synaptic plasticity based upon the observed effects that dopamine has on long-term potentiation and depression (LTP and LTD). We show, both analytically and through simulations, that our new model can exhibit unsupervised learning and lead to firing rate reinforcement. This requires that the strengthening of LTP by the reward signal is greater than the strengthening of LTD and that the reinforced neuron exhibits irregular firing. We show the robustness of our findings to spike-timing correlations, to the synaptic weight dependence that is assumed, and to changes in the mean reward. We also consider our model in the differential reinforcement of two nearby neurons. Our model aligns more strongly with experimental studies than previous models and makes testable predictions for future experiments. 相似文献

5.

Neural mechanisms of reward-related motor learning 总被引：10，自引：0，他引：10

Wickens JR Reynolds JN Hyland BI 《Current opinion in neurobiology》2003,13(6):685-690

The analysis of the neural mechanisms responsible for reward-related learning has benefited from recent studies of the effects of dopamine on synaptic plasticity. Dopamine-dependent synaptic plasticity may lead to strengthening of selected inputs on the basis of an activity-dependent conjunction of sensory afferent activity, motor output activity, and temporally related firing of dopamine cells. Such plasticity may provide a link between the reward-related firing of dopamine cells and the acquisition of changes in striatal cell activity during learning. This learning mechanism may play a special role in the translation of reward signals into context-dependent response probability or directional bias in movement responses. 相似文献

6.

Coordinated accumbal dopamine release and neural activity drive goal-directed behavior

Cheer JF Aragona BJ Heien ML Seipel AT Carelli RM Wightman RM 《Neuron》2007,54(2):237-244

Intracranial self-stimulation (ICSS) activates the neural pathways that mediate reward, including dopaminergic terminal areas such as the nucleus accumbens (NAc). However, a direct role of dopamine in ICSS-mediated reward has been questioned. Here, simultaneous voltammetric and electrophysiological recordings from the same electrode reveal that, at certain sites, the onset of anticipatory dopamine surges and changes in neuronal firing patterns during ICSS are coincident, whereas sites lacking dopamine changes also lack patterned firing. Intrashell microinfusion of a D1, but not a D2 receptor antagonist, blocks ICSS. An iontophoresis approach was implemented to explore the effect of dopamine antagonists on firing patterns without altering behavior. Similar to the microinfusion experiments, ICSS-related firing is selectively attenuated following D1 receptor blockade. This work establishes a temporal link between anticipatory rises of dopamine and firing patterns in the NAc shell during ICSS and suggests that they may play a similar role with natural rewards and during drug self-administration. 相似文献

7.

Cerebellar complex spike firing is suitable to induce as well as to stabilize motor learning

Catz N Dicke PW Thier P 《Current biology : CB》2005,15(24):2179-2189

BACKGROUND: Cerebellar Purkinje cells (PC) generate two responses: the simple spike (SS), with high firing rates (>100 Hz), and the complex spike (CS), characterized by conspicuously low discharge rates (1-2 Hz). Contemporary theories of cerebellar learning suggest that the CS discharge pattern encodes an error signal that drives changes in SS activity, ultimately related to motor behavior. This then predicts that CS will discharge in relation to the error and at random once the error has been nulled by the new behavior. RESULTS: We tested this hypothesis with saccadic adaptation in macaque monkeys as a model of cerebellar-dependent motor learning. During saccadic adaptation, error information unconsciously changes the endpoint of a saccade prompted by a visual target that shifts its final position during the saccade. We recorded CS from PC of the posterior vermis before, during, and after saccadic adaptation. In clear contradiction to the "error signal" concept, we found that CS occurred at random before adaptation onset, i.e., when the error was maximal, and built up to a specific saccade-related discharge profile during the course of adaptation. This profile became most pronounced at the end of adaptation, i.e., when the error had been nulled. CONCLUSIONS: We suggest that CS firing may underlie the stabilization of a learned motor behavior, rather than serving as an electrophysiological correlate of an error. 相似文献

8.

Neural encoding in ventral striatum during olfactory discrimination learning 总被引：9，自引：0，他引：9

Setlow B Schoenbaum G Gallagher M 《Neuron》2003,38(4):625-636

A growing body of evidence implicates the ventral striatum in using information acquired through associative learning. The present study examined the activity of ventral striatal neurons in awake, behaving rats during go/no-go odor discrimination learning and reversal. Many neurons fired selectively to odor cues predictive of either appetitive (sucrose) or aversive (quinine) outcomes. Few neurons were selective when first exposed to the odors, but many acquired this differential activity as rats learned the significance of the cues. A substantial proportion of these neurons encoded the cues' learned motivational significance, and these neurons tended to reverse their firing selectivity after reversal of odor-outcome contingencies. Other neurons that became selectively activated during learning did not reverse, but instead appeared to encode specific combinations of cues and associated motor responses. The results support a role for ventral striatum in using the learned significance, both appetitive and aversive, of predictive cues to guide behavior. 相似文献

9.

An imperfect dopaminergic error signal can drive temporal-difference learning

Potjans W Diesmann M Morrison A 《PLoS computational biology》2011,7(5):e1001133

An open problem in the field of computational neuroscience is how to link synaptic plasticity to system-level learning. A promising framework in this context is temporal-difference (TD) learning. Experimental evidence that supports the hypothesis that the mammalian brain performs temporal-difference learning includes the resemblance of the phasic activity of the midbrain dopaminergic neurons to the TD error and the discovery that cortico-striatal synaptic plasticity is modulated by dopamine. However, as the phasic dopaminergic signal does not reproduce all the properties of the theoretical TD error, it is unclear whether it is capable of driving behavior adaptation in complex tasks. Here, we present a spiking temporal-difference learning model based on the actor-critic architecture. The model dynamically generates a dopaminergic signal with realistic firing rates and exploits this signal to modulate the plasticity of synapses as a third factor. The predictions of our proposed plasticity dynamics are in good agreement with experimental results with respect to dopamine, pre- and post-synaptic activity. An analytical mapping from the parameters of our proposed plasticity dynamics to those of the classical discrete-time TD algorithm reveals that the biological constraints of the dopaminergic signal entail a modified TD algorithm with self-adapting learning parameters and an adapting offset. We show that the neuronal network is able to learn a task with sparse positive rewards as fast as the corresponding classical discrete-time TD algorithm. However, the performance of the neuronal network is impaired with respect to the traditional algorithm on a task with both positive and negative rewards and breaks down entirely on a task with purely negative rewards. Our model demonstrates that the asymmetry of a realistic dopaminergic signal enables TD learning when learning is driven by positive rewards but not when driven by negative rewards. 相似文献

10.

State-dependency in C. elegans

Bettinger JC McIntire SL 《Genes, Brain & Behavior》2004,3(5):266-272

Memory and the expression of learned behaviors by an organism are often triggered by contextual cues that resemble those that were present when the initial learning occurred. In state-dependent learning, the cue eliciting a learned behavior is a neuroactive drug; behaviors initially learned during exposure to centrally acting compounds such as ethanol are subsequently recalled better if the drug stimulus is again present during testing. Although state-dependent learning is well documented in many vertebrate systems, the molecular mechanisms underlying state-dependent learning and other forms of contextual learning are not understood. Here we demonstrate and present a genetic analysis of state- dependent adaptation in Caenorhabditis elegans. C. elegans normally exhibits adaptation, or reduced behavioral response, to an olfactory stimulus after prior exposure to the stimulus. If the adaptation to the olfactory stimulus is acquired during ethanol administration, the adaptation is subsequently displayed only if the ethanol stimulus is again present. cat-1 and cat-2 mutant animals are defective in dopaminergic neuron signaling and are impaired in state dependency, indicating that dopamine functions in state-dependent adaptation in C. elegans. 相似文献

11.

Parallel reinforcement pathways for conditioned food aversions in the honeybee

Wright GA Mustard JA Simcock NK Ross-Taylor AA McNicholas LD Popescu A Marion-Poll F 《Current biology : CB》2010,20(24):2234-2240

Avoiding toxins in food is as important as obtaining nutrition. Conditioned food aversions have been studied in animals as diverse as nematodes and humans [1, 2], but the neural signaling mechanisms underlying this form of learning have been difficult to pinpoint. Honeybees quickly learn to associate floral cues with food [3], a trait that makes them an excellent model organism for studying the neural mechanisms of learning and memory. Here we show that honeybees not only detect toxins but can also learn to associate odors with both the taste of toxins and the postingestive consequences of consuming them. We found that two distinct monoaminergic pathways mediate learned food aversions in the honeybee. As for other insect species conditioned with salt or electric shock reinforcers [4-7], learned avoidances of odors paired with bad-tasting toxins are mediated by dopamine. Our experiments are the first to identify a second, postingestive pathway for learned olfactory aversions that involves serotonin. This second pathway may represent an ancient mechanism for food aversion learning conserved across animal lineages. 相似文献

12.

Convergent processing of both positive and negative motivational signals by the VTA dopamine neuronal populations

Wang DV Tsien JZ 《PloS one》2011,6(2):e17047

Dopamine neurons in the ventral tegmental area (VTA) have been traditionally studied for their roles in reward-related motivation or drug addiction. Here we study how the VTA dopamine neuron population may process fearful and negative experiences as well as reward information in freely behaving mice. Using multi-tetrode recording, we find that up to 89% of the putative dopamine neurons in the VTA exhibit significant activation in response to the conditioned tone that predict food reward, while the same dopamine neuron population also respond to the fearful experiences such as free fall and shake events. The majority of these VTA putative dopamine neurons exhibit suppression and offset-rebound excitation, whereas ～25% of the recorded putative dopamine neurons show excitation by the fearful events. Importantly, VTA putative dopamine neurons exhibit parametric encoding properties: their firing change durations are proportional to the fearful event durations. In addition, we demonstrate that the contextual information is crucial for these neurons to respectively elicit positive or negative motivational responses by the same conditioned tone. Taken together, our findings suggest that VTA dopamine neurons may employ the convergent encoding strategy for processing both positive and negative experiences, intimately integrating with cues and environmental context. 相似文献

13.

Adolescent Changes in Dopamine D1 Receptor Expression in Orbitofrontal Cortex and Piriform Cortex Accompany an Associative Learning Deficit

Anna K. Garske Chloe R. Lawyer Brittni M. Peterson Kurt R. Illig 《PloS one》2013,8(2)

The orbitofrontal cortex (OFC) and piriform cortex are involved in encoding the predictive value of olfactory stimuli in rats, and neural responses to olfactory stimuli in these areas change as associations are learned. This experience-dependent plasticity mirrors task-related changes previously observed in mesocortical dopamine neurons, which have been implicated in learning the predictive value of cues. Although forms of associative learning can be found at all ages, cortical dopamine projections do not mature until after postnatal day 35 in the rat. We hypothesized that these changes in dopamine circuitry during the juvenile and adolescent periods would result in age-dependent differences in learning the predictive value of environmental cues. Using an odor-guided associative learning task, we found that adolescent rats learn the association between an odor and a palatable reward significantly more slowly than either juvenile or adult rats. Further, adolescent rats displayed greater distractibility during the task than either juvenile or adult rats. Using real-time quantitative PCR and immunohistochemical methods, we observed that the behavioral deficit in adolescence coincides with a significant increase in D1 dopamine receptor expression compared to juvenile rats in both the OFC and piriform cortex. Further, we found that both the slower learning and increased distractibility exhibited in adolescence could be alleviated by experience with the association task as a juvenile, or by an acute administration of a low dose of either the dopamine D1 receptor agonist SKF-38393 or the D2 receptor antagonist eticlopride. These results suggest that dopaminergic modulation of cortical function may be important for learning the predictive value of environmental stimuli, and that developmental changes in cortical dopaminergic circuitry may underlie age-related differences in associative learning. 相似文献

14.

Leptin receptor signaling in midbrain dopamine neurons regulates feeding

Hommel JD Trinko R Sears RM Georgescu D Liu ZW Gao XB Thurmon JJ Marinelli M DiLeone RJ 《Neuron》2006,51(6):801-810

The leptin hormone is critical for normal food intake and metabolism. While leptin receptor (Lepr) function has been well studied in the hypothalamus, the functional relevance of Lepr expression in the ventral tegmental area (VTA) has not been investigated. The VTA contains dopamine neurons that are important in modulating motivated behavior, addiction, and reward. Here, we show that VTA dopamine neurons express Lepr mRNA and respond to leptin with activation of an intracellular JAK-STAT pathway and a reduction in firing rate. Direct administration of leptin to the VTA caused decreased food intake while long-term RNAi-mediated knockdown of Lepr in the VTA led to increased food intake, locomotor activity, and sensitivity to highly palatable food. These data support a critical role for VTA Lepr in regulating feeding behavior and provide functional evidence for direct action of a peripheral metabolic signal on VTA dopamine neurons. 相似文献

15.

Dissociation between Active and Observational Learning from Positive and Negative Feedback in Parkinsonism

Stefan Kobza Stefano Ferrea Alfons Schnitzler Bettina Pollok Martin Südmeyer Christian Bellebaum 《PloS one》2012,7(11)

Feedback to both actively performed and observed behaviour allows adaptation of future actions. Positive feedback leads to increased activity of dopamine neurons in the substantia nigra, whereas dopamine neuron activity is decreased following negative feedback. Dopamine level reduction in unmedicated Parkinson’s Disease patients has been shown to lead to a negative learning bias, i.e. enhanced learning from negative feedback. Recent findings suggest that the neural mechanisms of active and observational learning from feedback might differ, with the striatum playing a less prominent role in observational learning. Therefore, it was hypothesized that unmedicated Parkinson’s Disease patients would show a negative learning bias only in active but not in observational learning. In a between-group design, 19 Parkinson’s Disease patients and 40 healthy controls engaged in either an active or an observational probabilistic feedback-learning task. For both tasks, transfer phases aimed to assess the bias to learn better from positive or negative feedback. As expected, actively learning patients showed a negative learning bias, whereas controls learned better from positive feedback. In contrast, no difference between patients and controls emerged for observational learning, with both groups showing better learning from positive feedback. These findings add to neural models of reinforcement-learning by suggesting that dopamine-modulated input to the striatum plays a minor role in observational learning from feedback. Future research will have to elucidate the specific neural underpinnings of observational learning. 相似文献

16.

A role for phasic dopamine neuron firing in habit learning

Aggarwal M Wickens JR 《Neuron》2011,72(6):892-894

In this issue of Neuron, Wang et?al. (2011) show that mice with dopamine neuron-specific NMDAR1 deletion have attenuated phasic dopamine neuron firing and a deficit in habit learning. These findings indicate that brain regions sensitive to phasic dopamine signals may underlie habit learning. 相似文献

17.

Dynamical systems for predictive control of autonomous robots

J. Michael Herrmann 《Theorie in den Biowissenschaften》2001,120(3-4):241-252

Summary Regularities in the environment are accessible to an autonomous agents as reproducible relations between actions and perceptions and can be exploited by unsupervised learning. Our approach is based on the possibility to perform and to verify predictions about perceivable consequences of actions. It is implemented as a three-layer neural network that combines predictive perception, internal-state transitions and action selection into a loop which closes via the environment. In addition to minimizing prediction errors, the goal of network adaptation comprises also an optimization of the minimization rate such that new behaviors are favored over already learned ones, which would result in a vanishing improvement of predictability. Previously learned behaviors are reactivated or continued if triggering stimuli are available and an externally or otherwise given reward overcompensates the decay of the learning rate. In the model, behavior learning and learning behavior are brought about by the same mechanism, namely the drive to continuously experience learning success. Behavior learning comprises representation and storage of learned behaviors and finally their inhibition such that a further exploration of the environment is possible. Learning behavior, in contrast, detects the frontiers of the manifold of learned behaviors and provides estimates of the learnability of behaviors leading outwards the field of expertise. The network module has been implemented in a Khepera miniature robot. We also consider hierarchical architectures consisting of several modules in one agent as well as groups of several agents, which are controlled by such networks. 相似文献

18.

Models of heterogeneous dopamine signaling in an insect learning and memory center

Linnie Jiang Ashok Litwin-Kumar 《PLoS computational biology》2021,17(8)

The Drosophila mushroom body exhibits dopamine dependent synaptic plasticity that underlies the acquisition of associative memories. Recordings of dopamine neurons in this system have identified signals related to external reinforcement such as reward and punishment. However, other factors including locomotion, novelty, reward expectation, and internal state have also recently been shown to modulate dopamine neurons. This heterogeneity is at odds with typical modeling approaches in which these neurons are assumed to encode a global, scalar error signal. How is dopamine dependent plasticity coordinated in the presence of such heterogeneity? We develop a modeling approach that infers a pattern of dopamine activity sufficient to solve defined behavioral tasks, given architectural constraints informed by knowledge of mushroom body circuitry. Model dopamine neurons exhibit diverse tuning to task parameters while nonetheless producing coherent learned behaviors. Notably, reward prediction error emerges as a mode of population activity distributed across these neurons. Our results provide a mechanistic framework that accounts for the heterogeneity of dopamine activity during learning and behavior. 相似文献

19.

Changes in striatal dopamine release associated with human motor-skill acquisition

Kawashima S Ueki Y Kato T Matsukawa N Mima T Hallett M Ito K Ojika K 《PloS one》2012,7(2):e31728

The acquisition of new motor skills is essential throughout daily life and involves the processes of learning new motor sequence and encoding elementary aspects of new movement. Although previous animal studies have suggested a functional importance for striatal dopamine release in the learning of new motor sequence, its role in encoding elementary aspects of new movement has not yet been investigated. To elucidate this, we investigated changes in striatal dopamine levels during initial skill-training (Day 1) compared with acquired conditions (Day 2) using (11)C-raclopride positron-emission tomography. Ten volunteers learned to perform brisk contractions using their non-dominant left thumbs with the aid of visual feedback. On Day 1, the mean acceleration of each session was improved through repeated training sessions until performance neared asymptotic levels, while improved motor performance was retained from the beginning on Day 2. The (11)C-raclopride binding potential (BP) in the right putamen was reduced during initial skill-training compared with under acquired conditions. Moreover, voxel-wise analysis revealed that (11)C-raclopride BP was particularly reduced in the right antero-dorsal to the lateral part of the putamen. Based on findings from previous fMRI studies that show a gradual shift of activation within the striatum during the initial processing of motor learning, striatal dopamine may play a role in the dynamic cortico-striatal activation during encoding of new motor memory in skill acquisition. 相似文献

20.

Pavlovian conditioning in Hermissenda: a circuit analysis

Crow T Tian LM 《The Biological bulletin》2006,210(3):289-297

相似文献