
1.

Effects of subclinical depression on prefrontal–striatal model-based and model-free learning

Suyeon Heo Yoondo Sung Sang Wan Lee 《PLoS computational biology》2021,17(5)

Depression is characterized by deficits in the reinforcement learning (RL) process. Although many computational and neural studies have extended our knowledge of the impact of depression on RL, most focus on habitual control (model-free RL), yielding a relatively poor understanding of goal-directed control (model-based RL) and arbitration control to find a balance between the two. We investigated the effects of subclinical depression on model-based and model-free learning in the prefrontal–striatal circuitry. First, we found that subclinical depression is associated with the attenuated state and reward prediction error representation in the insula and caudate. Critically, we found that it accompanies the disrupted arbitration control between model-based and model-free learning in the predominantly inferior lateral prefrontal cortex and frontopolar cortex. We also found that depression undermines the ability to exploit viable options, called exploitation sensitivity. These findings characterize how subclinical depression influences different levels of the decision-making hierarchy, advancing previous conflicting views that depression simply influences either habitual or goal-directed control. Our study creates possibilities for various clinical applications, such as early diagnosis and behavioral therapy design.

2.

Model-Based Reasoning in Humans Becomes Automatic with Training

Marcos Economides Zeb Kurth-Nelson Annika Lübbert Marc Guitart-Masip Raymond J. Dolan 《PLoS computational biology》2015,11(9)

Model-based and model-free reinforcement learning (RL) have been suggested as algorithmic realizations of goal-directed and habitual action strategies. Model-based RL is more flexible than model-free but requires sophisticated calculations using a learnt model of the world. This has led model-based RL to be identified with slow, deliberative processing, and model-free RL with fast, automatic processing. In support of this distinction, it has recently been shown that model-based reasoning is impaired by placing subjects under cognitive load, a hallmark of non-automaticity. Here, using the same task, we show that cognitive load does not impair model-based reasoning if subjects receive prior training on the task. This finding is replicated across two studies and a variety of analysis methods. Thus, task familiarity permits use of model-based reasoning in parallel with other cognitive demands. The ability to deploy model-based reasoning in an automatic, parallelizable fashion has widespread theoretical implications, particularly for the learning and execution of complex behaviors. It also suggests a range of important failure modes in psychiatric disorders.

3.

Spontaneous decisions and operant conditioning in fruit flies

Brembs B 《Behavioural processes》2011,87(1):157-164

Already in the 1930s Skinner, Konorski and colleagues debated the commonalities, differences and interactions among the processes underlying what was then known as “conditioned reflexes type I and II”, but which is today better known as classical (Pavlovian) and operant (instrumental) conditioning. Subsequent decades of research have confirmed that the interactions between the various learning systems engaged during operant conditioning are complex and difficult to disentangle. Today, modern neurobiological tools allow us to dissect the biological processes underlying operant conditioning and study their interactions. These processes include initiating spontaneous behavioral variability, world-learning and self-learning. The data suggest that behavioral variability is generated actively by the brain, rather than as a by-product of a complex, noisy input-output system. The function of this variability, in part, is to detect how the environment responds to such actions. World-learning denotes the biological process by which value is assigned to environmental stimuli. Self-learning is the biological process which assigns value to a specific action or movement. In an operant learning situation using visual stimuli for flies, world-learning inhibits self-learning via a prominent neuropil region, the mushroom-bodies. Only extended training can overcome this inhibition and lead to habit formation by engaging the self-learning mechanism. Self-learning transforms spontaneous, flexible actions into stereotyped, habitual responses.

4.

Dopamine Enhances Model-Based over Model-Free Choice Behavior

K Wunderlich P Smittenaar RJ Dolan 《Neuron》2012,75(3):418-424

Decision making is often considered to arise out of contributions from a model-free habitual system and a model-based goal-directed system. Here, we investigated the effect of a dopamine manipulation on the degree to which either system contributes to instrumental behavior in a two-stage Markov decision task, which has been shown to discriminate model-free from model-based control. We found increased dopamine levels promote model-based over model-free choice.

5.

The role of the basal ganglia in habit formation

Yin HH Knowlton BJ 《Nature reviews. Neuroscience》2006,7(6):464-476

Many organisms, especially humans, are characterized by their capacity for intentional, goal-directed actions. However, similar behaviours often proceed automatically, as habitual responses to antecedent stimuli. How are goal-directed actions transformed into habitual responses? Recent work combining modern behavioural assays and neurobiological analysis of the basal ganglia has begun to yield insights into the neural basis of habit formation.

6.

Impairments in Goal-Directed Actions Predict Treatment Response to Cognitive-Behavioral Therapy in Social Anxiety Disorder

Gail A. Alvares Bernard W. Balleine Adam J. Guastella 《PloS one》2014,9(4)

Social anxiety disorder is characterized by excessive fear and habitual avoidance of social situations. Decision-making models suggest that patients with anxiety disorders may fail to exhibit goal-directed control over actions. We therefore investigated whether such biases may also be associated with social anxiety and examined the relationship between such behavior and outcomes from cognitive-behavioral therapy. Patients diagnosed with social anxiety and controls completed an instrumental learning task in which two actions were performed to earn food outcomes. After outcome devaluation, where one outcome was consumed to satiety, participants were re-tested in extinction. Results indicated that, as expected, controls were goal-directed, selectively reducing responding on the action that previously delivered the devalued outcome. Patients with social anxiety, however, exhibited no difference in responding on either action. This loss of a devaluation effect was associated with greater symptom severity and poorer response to therapy. These findings indicate that variations in goal-directed control in social anxiety may represent a behavioral endophenotype and may also be used to predict individuals who will respond to learning-based therapies.

7.

Imbalanced Decision Hierarchy in Addicts Emerging from Drug-Hijacked Dopamine Spiraling Circuit

Mehdi Keramati Boris Gutkin 《PloS one》2013,8(4)

Despite explicitly wanting to quit, long-term addicts find themselves powerless to resist drugs, despite knowing that drug-taking may be a harmful course of action. Such inconsistency between the explicit knowledge of negative consequences and the compulsive behavioral patterns represents a cognitive/behavioral conflict that is a central characteristic of addiction. Neurobiologically, differential cue-induced activity in distinct striatal subregions, as well as the dopamine connectivity spiraling from ventral striatal regions to the dorsal regions, play critical roles in compulsive drug seeking. However, the functional mechanism that integrates these neuropharmacological observations with the above-mentioned cognitive/behavioral conflict is unknown. Here we provide a formal computational explanation for the drug-induced cognitive inconsistency that is apparent in the addicts' “self-described mistake”. We show that addictive drugs gradually produce a motivational bias toward drug-seeking at low-level habitual decision processes, despite the low abstract cognitive valuation of this behavior. This pathology emerges within the hierarchical reinforcement learning framework when chronic exposure to the drug pharmacologically produces pathologically persistent phasic dopamine signals. Thereby the drug hijacks the dopaminergic spirals that cascade the reinforcement signals down the ventro-dorsal cortico-striatal hierarchy. Neurobiologically, our theory accounts for rapid development of drug cue-elicited dopamine efflux in the ventral striatum and a delayed response in the dorsal striatum. Our theory also shows how this response pattern depends critically on the dopamine spiraling circuitry. Behaviorally, our framework explains gradual insensitivity of drug-seeking to drug-associated punishments, the blocking phenomenon for drug outcomes, and the persistent preference for drugs over natural rewards by addicts.
The model suggests testable predictions and beyond that, sets the stage for a view of addiction as a pathology of hierarchical decision-making processes. This view is complementary to the traditional interpretation of addiction as interaction between habitual and goal-directed decision systems.

8.

The Effect of Ratio and Interval Training on Pavlovian-Instrumental Transfer in Mice

Brian J. Wiltgen Courtney Sinclair Chadrick Lane Frank Barrows Martín Molina Chloe Chabanon-Hicks 《PloS one》2012,7(10)

Conditional stimuli (CS) that are paired with reward can be used to motivate instrumental responses. This process is called Pavlovian-instrumental transfer (PIT). A recent study in rats suggested that habitual responses are particularly sensitive to the motivational effects of reward cues. The current experiments examined this idea using ratio and interval training in mice. Two groups of animals were trained to lever press for food pellets that were delivered on random ratio or random interval schedules. Devaluation tests revealed that interval training led to habitual responding while ratio training produced goal-directed actions. The presentation of CSs paired with reward led to positive transfer in both groups; however, the size of this effect was much larger in mice that were trained on interval schedules. This result suggests that habitual responses are more sensitive to the motivational influence of reward cues than goal-directed actions. The implications for neurobiological models of motivation and drug seeking behaviors are discussed.

9.

Evidence for Habitual and Goal-Directed Behavior Following Devaluation of Cocaine: A Multifaceted Interpretation of Relapse

David H. Root Anthony T. Fabbricatore David J. Barker Sisi Ma Anthony P. Pawlak Mark O. West 《PloS one》2009,4(9)

Background

Cocaine addiction is characterized as a chronically relapsing disorder. It is believed that cues present during self-administration become learned and increase the probability that relapse will occur when they are confronted during abstinence. However, the way in which relapse-inducing cues are interpreted by the user has remained elusive. Recent theories of addiction posit that relapse-inducing cues cause relapse habitually or automatically, bypassing processing information related to the consequences of relapse. Alternatively, other theories hypothesize that relapse-inducing cues produce an expectation of the drug's consequences, designated as goal-directed relapse. Discrete discriminative stimuli signaling the availability of cocaine produce robust cue-induced responding after thirty days of abstinence. However, it is not known whether cue-induced responding is a goal-directed action or habit.

Methodology/Principal Findings

We tested whether cue-induced responding is a goal-directed action or habit by explicitly pairing or unpairing cocaine with LiCl-induced sickness (n = 7/group), thereby decreasing or not altering the value of cocaine, respectively. Following thirty days of abstinence, no difference in responding between groups was found when animals were reintroduced to the self-administration environment alone, indicating habitual behavior. However, upon discriminative stimulus presentations, cocaine-sickness paired animals exhibited decreased cue-induced responding relative to unpaired controls, indicating goal-directed behavior. In spite of the difference between groups revealed during abstinent testing, no differences were found between groups when animals were under the influence of cocaine.

Conclusions/Significance

Unexpectedly, both habitual and goal-directed responding occurred during abstinent testing. Furthermore, habitual or goal-directed responding may have been induced by cues that differed in their correlation with the cocaine infusion. Non-discriminative stimulus cues were weak correlates of the infusion, which failed to evoke a representation of the value of cocaine and led to habitual behavior. However, the discriminative stimulus, nearly perfectly correlated with the infusion, likely evoked a representation of the value of the infusion and led to goal-directed behavior. These data indicate that abstinent cue-induced responding is multifaceted, dynamically engendering habitual or goal-directed behavior. Moreover, since goal-directed behavior terminated habitual behavior during testing, therapeutic approaches aimed at reducing the perceived value of cocaine in addicted individuals may reduce the capacity of cues to induce relapse.

10.

Humans but Not Chimpanzees Vary Face-Scanning Patterns Depending on Contexts during Action Observation

Masako Myowa-Yamakoshi Chisato Yoshida Satoshi Hirata 《PloS one》2015,10(11)

Human and nonhuman primates comprehend the actions of other individuals by detecting social cues, including others’ goal-directed motor actions and faces. However, little is known about how this information is integrated with action understanding. Here, we present the ontogenetic and evolutionary foundations of this capacity by comparing face-scanning patterns of chimpanzees and humans as they viewed goal-directed human actions within contexts that differ in whether or not the predicted goal is achieved. Human adults and children attend to the actor’s face during action sequences, and this tendency is particularly pronounced in adults when observing that the predicted goal is not achieved. Chimpanzees rarely attend to the actor’s face during the goal-directed action, regardless of whether the predicted action goal is achieved or not. These results suggest that in humans, but not chimpanzees, attention to actor’s faces conveying referential information toward the target object indicates the process of observers making inferences about the intentionality of an action. Furthermore, this remarkable predisposition to observe others’ actions by integrating the prediction of action goals and the actor’s intention is developmentally acquired.

11.

Goal-directed and habitual control in the basal ganglia: implications for Parkinson's disease

Redgrave P Rodriguez M Smith Y Rodriguez-Oroz MC Lehericy S Bergman H Agid Y DeLong MR Obeso JA 《Nature reviews. Neuroscience》2010,11(11):760-772

Progressive loss of the ascending dopaminergic projection in the basal ganglia is a fundamental pathological feature of Parkinson's disease. Studies in animals and humans have identified spatially segregated functional territories in the basal ganglia for the control of goal-directed and habitual actions. In patients with Parkinson's disease the loss of dopamine is predominantly in the posterior putamen, a region of the basal ganglia associated with the control of habitual behaviour. These patients may therefore be forced into a progressive reliance on the goal-directed mode of action control that is mediated by comparatively preserved processing in the rostromedial striatum. Thus, many of their behavioural difficulties may reflect a loss of normal automatic control owing to distorting output signals from habitual control circuits, which impede the expression of goal-directed action.

12.

Novelty is not surprise: Human exploratory and adaptive behavior in sequential decision-making

He A. Xu Alireza Modirshanechi Marco P. Lehmann Wulfram Gerstner Michael H. Herzog 《PLoS computational biology》2021,17(6)

Classic reinforcement learning (RL) theories cannot explain human behavior in the absence of external reward or when the environment changes. Here, we employ a deep sequential decision-making paradigm with sparse reward and abrupt environmental changes. To explain the behavior of human participants in these environments, we show that RL theories need to include surprise and novelty, each with a distinct role. While novelty drives exploration before the first encounter of a reward, surprise increases the rate of learning of a world-model as well as of model-free action-values. Even though the world-model is available for model-based RL, we find that human decisions are dominated by model-free action choices. The world-model is only marginally used for planning, but it is important to detect surprising events. Our theory predicts human action choices with high probability and allows us to dissociate surprise, novelty, and reward in EEG signals.

13.

Mushroom Bodies Regulate Habit Formation in Drosophila

Björn Brembs 《Current biology : CB》2009,19(16):1351-1355

To make good decisions, we evaluate past choices to guide later decisions. In most situations, we have the opportunity to simultaneously learn about both the consequences of our choice (i.e., operantly) and the stimuli associated with correct or incorrect choices (i.e., classically) [1]. Interestingly, in many species, including humans, these learning processes occasionally lead to irrational decisions [2]. An extreme case is the habitual drug user consistently administering the drug despite the negative consequences, but we all have experience with our own, less severe habits. The standard animal model employs a combination of operant and classical learning components to bring about habit formation in rodents [3] and [4]. After extended training, these animals will press a lever even if the outcome associated with lever-pressing is no longer desired [5]. In this study, experiments with wild-type and transgenic flies revealed that a prominent insect neuropil, the mushroom bodies (MBs), regulates habit formation in flies by inhibiting the operant learning system when a predictive stimulus is present. This inhibition enables generalization of the classical memory and prevents premature habit formation. Extended training in wild-type flies produced a phenocopy of MB-impaired flies, such that generalization was abolished and goal-directed actions were transformed into habitual responses.

14.

Repeated Cocaine Exposure Facilitates the Expression of Incentive Motivation and Induces Habitual Control in Rats

Kimberly H. LeBlanc Nigel T. Maidment Sean B. Ostlund 《PloS one》2013,8(4)

There is growing evidence that mere exposure to drugs can induce long-term alterations in the neural systems that mediate reward processing, motivation, and behavioral control, potentially causing the pathological pursuit of drugs that characterizes the addicted state. The incentive sensitization theory proposes that drug exposure potentiates the influence of reward-paired cues on behavior. It has also been suggested that drug exposure biases action selection towards the automatic execution of habits and away from more deliberate goal-directed control. The current study investigated whether rats given repeated exposure to peripherally administered cocaine would show alterations in incentive motivation (assayed using the Pavlovian-to-instrumental transfer (PIT) paradigm) or habit formation (assayed using sensitivity to reward devaluation). After instrumental and Pavlovian training for food pellet rewards, rats were given 6 daily injections of cocaine (15 mg/kg, IP) or saline, followed by a 10-d period of rest. Consistent with the incentive sensitization theory, cocaine-treated rats showed stronger cue-evoked lever pressing than saline-treated rats during the PIT test. The same rats were then trained on a new instrumental action with a new food pellet reward before undergoing reward devaluation testing. Although saline-treated rats exhibited sensitivity to reward devaluation, indicative of goal-directed performance, cocaine-treated rats were insensitive to this treatment, suggesting a reliance on habitual processes. These findings, when taken together, indicate that repeated exposure to cocaine can cause broad alterations in behavioral control, spanning both motivational and action selection processes, and could therefore help explain aberrations of decision-making that underlie drug addiction.

15.

Dynamic Changes in Single Unit Activity and Gamma Oscillations in a Thalamocortical Circuit during Rapid Instrumental Learning

Chunxiu Yu David Fan Alberto Lopez Henry H. Yin 《PloS one》2012,7(11)

The medial prefrontal cortex (mPFC) and mediodorsal thalamus (MD) together form a thalamocortical circuit that has been implicated in the learning and production of goal-directed actions. In this study we measured neural activity in both regions simultaneously, as rats learned to press a lever to earn food rewards. In both MD and mPFC, instrumental learning was accompanied by dramatic changes in the firing patterns of the neurons, in particular the rapid emergence of single-unit neural activity reflecting the completion of the action and reward delivery. In addition, we observed distinct patterns of changes in the oscillatory LFP response in MD and mPFC. With learning, there was a significant increase in theta band oscillations (6–10 Hz) in the MD, but not in the mPFC. By contrast, gamma band oscillations (40–55 Hz) increased in the mPFC, but not in the MD. Coherence between these two regions also changed with learning: gamma coherence in relation to reward delivery increased, whereas theta coherence did not. Together these results suggest that, as rats learned the instrumental contingency between action and outcome, the emergence of task related neural activity is accompanied by enhanced functional interaction between MD and mPFC in response to the reward feedback.

16.

Making Smart Social Judgments Takes Time: Infants' Recruitment of Goal Information When Generating Action Predictions

Sheila Krogh-Jespersen Amanda L. Woodward 《PloS one》2014,9(5)

Previous research has shown that young infants perceive others' actions as structured by goals. One open question is whether the recruitment of this understanding when predicting others' actions imposes a cognitive challenge for young infants. The current study explored infants' ability to utilize their knowledge of others' goals to rapidly predict future behavior in complex social environments and distinguish goal-directed actions from other kinds of movements. Fifteen-month-olds (N = 40) viewed videos of an actor engaged in either a goal-directed (grasping) or an ambiguous (brushing the back of her hand) action on a Tobii eye-tracker. At test, critical elements of the scene were changed and infants' predictive fixations were examined to determine whether they relied on goal information to anticipate the actor's future behavior. Results revealed that infants reliably generated goal-based visual predictions for the grasping action, but not for the back-of-hand behavior. Moreover, response latencies were longer for goal-based predictions than for location-based predictions, suggesting that goal-based predictions are cognitively taxing. Analyses of areas of interest indicated that heightened attention to the overall scene, as opposed to specific patterns of attention, was the critical indicator of successful judgments regarding an actor's future goal-directed behavior. These findings shed light on the processes that support “smart” social behavior in infants, as it may be a challenge for young infants to use information about others' intentions to inform rapid predictions.

17.

Conditioning and sexual behavior: a review

Pfaus JG Kippin TE Centeno S 《Hormones and behavior》2001,40(2):291-321

Sexual behavior is directed by a sophisticated interplay between steroid hormone actions in the brain that give rise to sexual arousability and experience with sexual reward that gives rise to expectations of competent sexual activity, sexual desire, arousal, and performance. Sexual experience allows animals to form instrumental associations between internal or external stimuli and behaviors that lead to different sexual rewards. Furthermore, Pavlovian associations between internal and external stimuli allow animals to predict sexual outcomes. These two types of learning build upon instinctual mechanisms to create distinctive, and seemingly "automated," patterns of sexual response. This article reviews the literature on conditioning and sexual behavior with a particular emphasis on incentive sequences of sexual behavior that move animals from distal to proximal with regard to sexual stimuli during appetitive phases of behavior and ultimately result in copulatory interaction and mating during consummatory phases of behavior. Accordingly, the role of learning in sexual excitement, in behaviors that bring about the opportunity to mate, in courtship and solicitation displays, in sexual arousal and copulatory behaviors, in sexual partner preferences, and the short- and long-term influence of copulatory experience on sexual and reproductive function is examined. Although hormone actions set the stage for sexual activity by generating the ability of animals to become sexually excited and aroused, it is each animal's unique experience with sexual behavior and sexual reward that molds the strength of responses made toward sexual incentives.

18.

Ethanol seeking by Long Evans rats is not always a goal-directed behavior

RA Mangieri RU Cofresí RA Gonzales 《PloS one》2012,7(8):e42886

Background

Two parallel and interacting processes are said to underlie animal behavior, whereby learning and performance of a behavior is at first via conscious and deliberate (goal-directed) processes, but after initial acquisition, the behavior can become automatic and stimulus-elicited (habitual). With respect to instrumental behaviors, animal learning studies suggest that the duration of training and the action-outcome contingency are two factors involved in the emergence of habitual seeking of “natural” reinforcers (e.g., sweet solutions, food or sucrose pellets). To rigorously test whether behaviors reinforced by abused substances such as ethanol, in particular, similarly become habitual was the primary aim of this study.

Methodology/Principal Findings

Male Long Evans rats underwent extended or limited operant lever press training with 10% sucrose/10% ethanol (10S10E) reinforcement (variable interval (VI) or variable ratio (VR) schedule of reinforcement), or with 10% sucrose (10S) reinforcement (VI schedule only). Once training and pretesting were complete, the impact of outcome devaluation on operant behavior was evaluated after lithium chloride injections were paired with the reinforcer, or unpaired 24 hours later. After limited, but not extended instrumental training, lever pressing by groups trained under VR with 10S10E and under VI with 10S was sensitive to outcome devaluation. In contrast, responding by both the extended and limited training 10S10E VI groups was not sensitive to ethanol devaluation during the test for habitual behavior.

Conclusions/Significance

Operant behavior by rats trained to self-administer an ethanol-sucrose solution showed variable sensitivity to a change in the value of ethanol, with relative insensitivity developing sooner in animals that received time-variable ethanol reinforcement during training sessions. One important implication, with respect to substance abuse in humans, is that initial learning about the relationship between instrumental actions and the opportunity to consume ethanol-containing drinks can influence the time course for the development or expression of habitual ethanol seeking behavior.

19.

Instrumental Learning in Neodecorticate Rabbits

DAVID A. OAKLEY 《Nature: New biology》1971,233(40):185-187

Although subtotal neocortical lesions seem not to impair an animal's ability to acquire a new habit in classical (Pavlovian) conditioning procedures^1,2, instrumental learning is retarded by this surgical procedure in proportion to the mass of tissue removed^3–6. Little is known, however, about an animal's ability to benefit from formal training procedures if the entire neocortex is removed. Earlier experiments have shown that a decorticate can acquire simple salivary^7,8, leg-flexion^9, or diffuse^10,11 Pavlovian conditional responses and Bromiley^12 has reported a restrained, decorticate dog which produced leg flexions to avoid shock, although only in favourable conditions. A more recent study^13, involving rats with 90% ablations of neocortex, showed that Pavlovian autonomic conditioning was little affected by cortical lesions which abolished instrumental learning of the same responses. I have investigated the possibility of establishing the instrumental response of lever pressing for food in freely moving, totally neodecorticated rabbits in conditions of prolonged training.

20.

Objects Mediate Goal Integration in Ventrolateral Prefrontal Cortex during Action Observation

Mari Hrkać Moritz F. Wurm Anne B. Kühn Ricarda I. Schubotz 《PloS one》2015,10(7)
