Similar articles
Found 20 similar articles (search time: 15 ms)
1.
Studies of sequential decision-making in humans frequently find suboptimal performance relative to an ideal actor that has perfect knowledge of the model of how rewards and events are generated in the environment. Rather than being suboptimal, we argue that the learning problem humans face is more complex, in that it also involves learning the structure of reward generation in the environment. We formulate the problem of structure learning in sequential decision tasks using Bayesian reinforcement learning, and show that learning the generative model for rewards qualitatively changes the behavior of an optimal learning agent. To test whether people exhibit structure learning, we performed experiments involving a mixture of one-armed and two-armed bandit reward models, where structure learning produces many of the qualitative behaviors deemed suboptimal in previous studies. Our results demonstrate humans can perform structure learning in a near-optimal manner.

2.
Our knowledge about the computational mechanisms underlying human learning and recognition of sound sequences, especially speech, is still very limited. One difficulty in deciphering the exact means by which humans recognize speech is that there are scarce experimental findings at a neuronal, microscopic level. Here, we show that our neuronal-computational understanding of speech learning and recognition may be vastly improved by looking at an animal model, i.e., the songbird, which faces the same challenge as humans: to learn and decode complex auditory input, in an online fashion. Motivated by striking similarities between the human and songbird neural recognition systems at the macroscopic level, we assumed that the human brain uses the same computational principles at a microscopic level and translated a birdsong model into a novel human sound learning and recognition model with an emphasis on speech. We show that the resulting Bayesian model with a hierarchy of nonlinear dynamical systems can learn speech samples such as words rapidly and recognize them robustly, even in adverse conditions. In addition, we show that recognition can be performed even when words are spoken by different speakers and with different accents, an everyday situation in which current state-of-the-art speech recognition models often fail. The model can also be used to qualitatively explain behavioral data on human speech learning and derive predictions for future experiments.

3.
The skills required for the learning and use of language are the focus of extensive research, and their evolutionary origins are widely debated. Using agent-based simulations in a range of virtual environments, we demonstrate that challenges of foraging for food can select for cognitive mechanisms supporting complex, hierarchical, sequential learning, the need for which arises in language acquisition. Building on previous work, where we explored the conditions under which reinforcement learning is out-competed by seldom-reinforced continuous learning that constructs a network model of the environment, we now show that realistic features of the foraging environment can select for two critical advances: (i) chunking of meaningful sequences found in the data, leading to representations composed of units that better fit the prevalent statistical patterns in the environment; and (ii) generalization across units based on their contextual similarity. Importantly, these learning processes, which in our framework evolved for making better foraging decisions, had been earlier shown to reproduce a range of findings in language learning in humans. Thus, our results suggest a possible evolutionary trajectory that may have led from basic learning mechanisms to complex hierarchical sequential learning that can support advanced cognitive abilities of the kind needed for language acquisition.

4.
Many decisions in life are sequential and constrained by a time window. Although mathematically derived optimal solutions exist, it has been reported that humans often deviate from making optimal choices. Here, we used a secretary problem, a classic example of finite sequential decision-making, and investigated the mechanisms underlying individuals’ suboptimal choices. Across three independent experiments, we found that a dynamic programming model comprising a subjective value function explains individuals’ deviations from optimality and predicts their choice behavior under fewer and more opportunities. We further identified that pupil dilation reflected the level of decision difficulty and the subsequent choice to accept or reject the stimulus at each opportunity. The value sensitivity, a model-based estimate that characterizes each individual’s subjective valuation, correlated with the extent to which individuals’ physiological responses tracked stimulus information. Our results provide model-based and physiological evidence for subjective valuation in finite sequential decision-making, rediscovering human suboptimality in subjectively optimal decision-making processes.
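For readers unfamiliar with the secretary problem, the textbook cutoff rule can be sketched in a few lines. This is the classical objective-value solution, not the subjective-value dynamic programming model the abstract describes; the function names `success_probability` and `best_cutoff` are illustrative assumptions.

```python
def success_probability(n: int, skipped: int) -> float:
    """Probability of picking the single best of n candidates when the
    first `skipped` candidates are always rejected and the first later
    candidate who beats everyone seen so far is accepted."""
    if skipped == 0:
        # Accepting the very first candidate wins with probability 1/n.
        return 1.0 / n
    # Classical closed form: (k/n) * sum_{i=k}^{n-1} 1/i
    return (skipped / n) * sum(1.0 / i for i in range(skipped, n))


def best_cutoff(n: int) -> int:
    """Number of initial candidates to skip that maximizes the success
    probability (ties resolve to the smallest cutoff)."""
    return max(range(n), key=lambda k: success_probability(n, k))


if __name__ == "__main__":
    n = 100
    k = best_cutoff(n)
    print(k, round(success_probability(n, k), 4))
```

With 100 opportunities the rule skips roughly the first n/e candidates, which is the benchmark against which deviations such as stopping too early are usually measured.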

5.
How do we use our memories of the past to guide decisions we've never had to make before? Although extensive work describes how the brain learns to repeat rewarded actions, decisions can also be influenced by associations between stimuli or events not directly involving reward (such as when planning routes using a cognitive map or chess moves using predicted countermoves), and these sorts of associations are critical when deciding among novel options. This process is known as model-based decision making. While the learning of environmental relations that might support model-based decisions is well studied, and separately this sort of information has been inferred to impact decisions, there is little evidence concerning the full cycle by which such associations are acquired and drive choices. Of particular interest is whether decisions are directly supported by the same mnemonic systems characterized for relational learning more generally, or instead rely on other, specialized representations. Here, building on our previous work, which isolated dual representations underlying sequential predictive learning, we directly demonstrate that one such representation, encoded by the hippocampal memory system and adjacent cortical structures, supports goal-directed decisions. Using interleaved learning and decision tasks, we monitor predictive learning directly and also trace its influence on decisions for reward. We quantitatively compare the learning processes underlying multiple behavioral and fMRI observables using computational model fits. Across both tasks, a quantitatively consistent learning process explains reaction times, choices, and both expectation- and surprise-related neural activity. The same hippocampal and ventral stream regions engaged in anticipating stimuli during learning are also engaged in proportion to the difficulty of decisions. These results support a role for predictive associations learned by the hippocampal memory system to be recalled during choice formation.

6.
Oxidative damage in the brain may lead to cognitive impairments in aged humans. Further, in age-associated neurodegenerative disease, oxidative damage may be exacerbated and associated with additional neuropathology. Epidemiological studies in humans show both positive and negative effects of the use of antioxidant supplements on healthy cognitive aging and on the risk of developing Alzheimer disease (AD). This contrasts with consistent behavioral improvements in aged rodent models. In a higher mammalian model system that naturally accumulates human-type pathology and cognitive decline (aged dogs), an antioxidant-enriched diet leads to rapid learning improvements, memory improvements after prolonged treatment, and cognitive maintenance. Cognitive benefits can be further enhanced by the addition of behavioral enrichment. In the brains of aged treated dogs, oxidative damage is reduced and there is some evidence of reduced AD-like neuropathology. In combination, antioxidants may be beneficial for promoting healthy brain aging and reducing the risk of neurodegenerative disease. Special issue article in honor of Dr. Akitane Mori.

7.
In learning from trial and error, animals need to relate behavioral decisions to environmental reinforcement even though it may be difficult to assign credit to a particular decision when outcomes are uncertain or subject to delays. When considering the biophysical basis of learning, the credit-assignment problem is compounded because the behavioral decisions themselves result from the spatio-temporal aggregation of many synaptic releases. We present a model of plasticity induction for reinforcement learning in a population of leaky integrate and fire neurons which is based on a cascade of synaptic memory traces. Each synaptic cascade correlates presynaptic input first with postsynaptic events, next with the behavioral decisions and finally with external reinforcement. For operant conditioning, learning succeeds even when reinforcement is delivered with a delay so large that temporal contiguity between decision and pertinent reward is lost due to intervening decisions which are themselves subject to delayed reinforcement. This shows that the model provides a viable mechanism for temporal credit assignment. Further, learning speeds up with increasing population size, so the plasticity cascade simultaneously addresses the spatial problem of assigning credit to synapses in different population neurons. Simulations on other tasks, such as sequential decision making, serve to contrast the performance of the proposed scheme to that of temporal difference-based learning. We argue that, due to their comparative robustness, synaptic plasticity cascades are attractive basic models of reinforcement learning in the brain.

8.
Recruitment via pheromone trails by ants is arguably one of the best-studied examples of self-organization in animal societies. Yet it is still unclear if and how trail recruitment allows a colony to adapt to changes in its foraging environment. We study foraging decisions by colonies of the ant Pheidole megacephala under dynamic conditions. Our experiments show that P. megacephala, unlike many other mass recruiting species, can make a collective decision for the better of two food sources even when the environment changes dynamically. We developed a stochastic differential equation model that explains our data qualitatively and quantitatively. Analysing this model reveals that both deterministic and stochastic effects (noise) work together to allow colonies to efficiently track changes in the environment. Our study thus suggests that a certain level of noise is not a disturbance in self-organized decision-making but rather serves an important functional role.

9.
Alzheimer's disease is the most common neurodegenerative disease. The aim of this study is to infer structural changes in brain connectivity resulting from disease progression using cortical thickness measurements from a cohort of participants who were healthy controls, had mild cognitive impairment, or were Alzheimer's disease patients. For this purpose, we develop a novel approach for inference of multiple networks with related edge values across groups. Specifically, we infer a Gaussian graphical model for each group within a joint framework, where we rely on Bayesian hierarchical priors to link the precision matrix entries across groups. Our proposal differs from existing approaches in that it flexibly learns which groups have the most similar edge values, and accounts for the strength of connection (rather than only edge presence or absence) when sharing information across groups. Our results identify key alterations in structural connectivity that may reflect disruptions to the healthy brain, such as decreased connectivity within the occipital lobe with increasing disease severity. We also illustrate the proposed method through simulations, where we demonstrate its performance in structure learning and precision matrix estimation with respect to alternative approaches.

10.
Loss of brain function is one of the most negative and feared aspects of aging. Studies of invertebrates have taught us much about the physiology of aging and how this progression may be slowed. Yet, how aging affects complex brain functions, e.g., the ability to acquire new memory when previous experience is no longer valid, is an almost exclusive question of studies in humans and mammalian models. In these systems, age related cognitive disorders are assessed through composite paradigms that test different performance tasks in the same individual. Such studies could demonstrate that afflicted individuals show the loss of several and often-diverse memory faculties, and that performance usually varies more between aged individuals, as compared to conspecifics from younger groups. No comparable composite surveying approaches are established yet for invertebrate models in aging research. Here we test whether an insect can share patterns of decline similar to those that are commonly observed during mammalian brain aging. Using honey bees, we combine restrained learning with free-flight assays. We demonstrate that reduced olfactory learning performance correlates with a reduced ability to extinguish the spatial memory of an abandoned nest location (spatial memory extinction). Adding to this, we show that learning performance is more variable in old honey bees. Taken together, our findings point to generic features of brain aging and provide the prerequisites to model individual aspects of learning dysfunction with insect models.

11.
In this article, we use game theory to understand the emergence of various kinds of territorial arrangements in the Maine lobster fishery during the past century. Using the Nash equilibria of models of the fishery as our theoretical framework, we show that informal territorial arrangements in this fishery went through three sequential stages. These stages are the result of decisions by groups of lobster fishermen to defend fishing areas or invade those of other groups. A large number of factors influence these defensive and offensive strategies: concentrations of lobsters, the adoption of better technology, transportation costs, ecological changes, trap monitoring costs, the ability to organize defensive and offensive groups, and better law enforcement, all of which are captured by crucial parameters of our model. We argue that this technique can be applied to elucidate territorial changes more generally.

12.
The spatial distribution of visual items allows us to infer the presence of latent causes in the world. For instance, a spatial cluster of ants allows us to infer the presence of a common food source. However, optimal inference requires the integration of a computationally intractable number of world states in real world situations. For example, optimal inference about whether a common cause exists based on N spatially distributed visual items requires marginalizing over both the location of the latent cause and 2^N possible affiliation patterns (where each item may be affiliated or non-affiliated with the latent cause). How might the brain approximate this inference? We show that subject behaviour deviates qualitatively from Bayes-optimal, in particular showing an unexpected positive effect of N (the number of visual items) on the false-alarm rate. We propose several “point-estimating” observer models that fit subject behaviour better than the Bayesian model. They each avoid a costly computational marginalization over at least one of the variables of the generative model by “committing” to a point estimate of at least one of the two generative model variables. These findings suggest that the brain may implement partially committal variants of Bayesian models when detecting latent causes based on complex real world data.
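The combinatorial growth behind the marginalization can be made concrete with a short enumeration. This is a minimal sketch, assuming each item is coded 1 (affiliated with the latent cause) or 0 (not affiliated); the function names are hypothetical and nothing here reproduces the observer models from the study.

```python
from itertools import product


def affiliation_patterns(n_items: int):
    """Lazily yield every assignment of n_items to affiliated (1) or
    non-affiliated (0); there are 2**n_items such assignments."""
    return product((0, 1), repeat=n_items)


def count_patterns(n_items: int) -> int:
    """Count the patterns by brute-force enumeration."""
    return sum(1 for _ in affiliation_patterns(n_items))


if __name__ == "__main__":
    for n in (2, 5, 10, 15):
        print(n, count_patterns(n))
```

Each extra item doubles the number of patterns an exact marginalization must visit, which is the cost that committing to a point estimate would avoid.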

13.
Since the world consists of objects that stimulate multiple senses, it is advantageous for a vertebrate to integrate all the sensory information available. However, the precise mechanisms governing the temporal dynamics of multisensory processing are not well understood. We develop a computational modeling approach to investigate these mechanisms. We present an oscillatory neural network model for multisensory learning based on sparse spatio-temporal encoding. Recently published results in cognitive science show that multisensory integration produces greater and more efficient learning. We apply our computational model to qualitatively replicate these results. We vary learning protocols and system dynamics, and measure the rate at which our model learns to distinguish superposed presentations of multisensory objects. We show that the use of multiple channels accelerates learning and recall by up to 80%. When a sensory channel becomes disabled, the performance degradation is less than that experienced during the presentation of non-congruent stimuli. This research furthers our understanding of fundamental brain processes, paving the way for multiple advances including the building of machines with more human-like capabilities.

14.
To analyze the relationship between clinical indices of depression and disruptions in neurocognitive decision-making mechanisms, whether based on logic and reasoning or, in a situation of uncertainty, on emotional experience (emotional learning), a multidisciplinary clinical, psychological, and neurophysiological study was conducted in 28 patients suffering from depression (women aged 18–56) and 50 healthy volunteers (women aged 18–55). The intensity of depression was estimated quantitatively by the Hamilton Depression Rating Scale (HDRS-17); to qualitatively estimate cognitive functions, the “Ten Words” technique, the Wisconsin Card-Sorting Test (WCST), and the Iowa Gambling Task (IGT) were used; and to assess the functional brain state of all patients suffering from depression, a multichannel recording of the background electroencephalogram (EEG) was made. It was demonstrated that in depression, a neurocognitive deficiency was observed that correlates positively with the intensity of the depressive symptomatology. In addition, the ability to make decisions is reduced both when they are based on logic and reasoning (in the WCST), which is associated with EEG features of hypofrontality, and when they are based on emotional learning (in the IGT). Only in patients suffering from depression with a reduced ability to make rational decisions based on logic and reasoning was a “compensatory shift” observed toward decision making based on emotions, which leads to relatively higher indices of emotional learning. It is assumed that hypofrontality, which results in difficulties in making decisions requiring logical thought, leads to interruption of subcortical, including hippocampal, structures, an increase in the activation of which is related to better indices of emotional learning.

15.
Model-based analysis of fMRI data is an important tool for investigating the computational role of different brain regions. With this method, theoretical models of behavior can be leveraged to find the brain structures underlying variables from specific algorithms, such as prediction errors in reinforcement learning. One potential weakness with this approach is that models often have free parameters and thus the results of the analysis may depend on how these free parameters are set. In this work we asked whether this hypothetical weakness is a problem in practice. We first developed general closed-form expressions for the relationship between results of fMRI analyses using different regressors, e.g., one corresponding to the true process underlying the measured data and one a model-derived approximation of the true generative regressor. Then, as a specific test case, we examined the sensitivity of model-based fMRI to the learning rate parameter in reinforcement learning, both in theory and in two previously-published datasets. We found that even gross errors in the learning rate lead to only minute changes in the neural results. Our findings thus suggest that precise model fitting is not always necessary for model-based fMRI. They also highlight the difficulty in using fMRI data for arbitrating between different models or model parameters. While these specific results pertain only to the effect of learning rate in simple reinforcement learning models, we provide a template for testing for effects of different parameters in other models.
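To make the role of the learning rate concrete, the sketch below generates prediction-error regressors from a simple delta-rule (Rescorla-Wagner style) learner. It is an illustration under stated assumptions: the update rule, the initial value `v0`, and the function name `prediction_errors` are not taken from the paper, and the reward sequence in the usage example is invented.

```python
from typing import List, Sequence


def prediction_errors(rewards: Sequence[float], alpha: float,
                      v0: float = 0.0) -> List[float]:
    """Trial-by-trial prediction errors of a delta-rule learner.

    On each trial: delta = reward - value, then value += alpha * delta.
    `alpha` is the learning rate whose influence is being examined.
    """
    value = v0
    deltas = []
    for r in rewards:
        delta = r - value        # prediction error on this trial
        deltas.append(delta)
        value += alpha * delta   # update the expected value
    return deltas


if __name__ == "__main__":
    rewards = [1, 0, 1, 1, 0, 1]  # invented reward sequence
    for alpha in (0.1, 0.5, 0.9):
        print(alpha, [round(d, 3) for d in prediction_errors(rewards, alpha)])
```

Regressors built this way for different values of `alpha` can then be compared (for example, by correlating them) to see how much the choice of learning rate changes the signal entered into the analysis.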

16.
In mammals, goal-directed and planning processes support flexible behaviour used to face new situations that cannot be tackled through more efficient but rigid habitual behaviours. Within the Bayesian modelling approach of brain and behaviour, models have been proposed to perform planning as probabilistic inference but this approach encounters a crucial problem: explaining how such inference might be implemented in brain spiking networks. Recently, the literature has proposed some models that face this problem through recurrent spiking neural networks able to internally simulate state trajectories, the core function at the basis of planning. However, the proposed models have relevant limitations that make them biologically implausible, namely their world model is trained ‘off-line’ before solving the target tasks, and they are trained with supervised learning procedures that are biologically and ecologically not plausible. Here we propose two novel hypotheses on how the brain might overcome these problems, and operationalise them in a novel architecture pivoting on a spiking recurrent neural network. The first hypothesis allows the architecture to learn the world model in parallel with its use for planning: to this purpose, a new arbitration mechanism decides when to explore, for learning the world model, or when to exploit it, for planning, based on the entropy of the world model itself. The second hypothesis allows the architecture to use an unsupervised learning process to learn the world model by observing the effects of actions. The architecture is validated by reproducing and accounting for the learning profiles and reaction times of human participants learning to solve a visuomotor learning task that is new for them. Overall, the architecture represents the first instance of a model bridging probabilistic planning and spiking-processes that has a degree of autonomy analogous to that of real organisms.

17.
The use of motor learning strategies may enhance rehabilitation outcomes of individuals with neurological injuries (e.g., stroke or cerebral palsy). A common strategy to facilitate learning of challenging tasks is to use sequential progression, i.e., initially reduce task difficulty and slowly increase task difficulty until the desired difficulty level is reached. However, the evidence related to the use of such sequential progressions to improve learning is mixed for functional skill learning tasks, especially considering situations where practice duration is limited. Here, we studied the benefits of sequential progression using a functional motor learning task that has been previously used in gait rehabilitation. Three groups of participants (N = 43) learned a novel motor task during treadmill walking using different learning strategies. Participants in the specific group (n = 21) practiced only the criterion task (i.e., matching a target template that was scaled-up by 30%) throughout the training. Participants in the sequential group (n = 11) gradually progressed to the criterion task (from 3% to 30% in increments of 3%), whereas participants in the random group (n = 11) started at 3% and progressed in random increments (involving both increases and decreases in task difficulty) to the criterion task. At the end of training, kinematic tracking performance on the criterion task was evaluated in all participants both with and without visual feedback. Results indicated that the tracking error was significantly lower in the specific group, and no differences were observed between the sequential and the random progression groups. The findings indicate that the amount of practice in the criterion task is more critical than the difficulty and variations of task practice when learning new gait patterns during treadmill walking.

18.
The brain can learn and detect mixed input signals masked by various types of noise, and spike-timing-dependent plasticity (STDP) is the candidate synaptic level mechanism. Because sensory inputs typically have spike correlation, and local circuits have dense feedback connections, input spikes cause the propagation of spike correlation in lateral circuits; however, it is largely unknown how this secondary correlation generated by lateral circuits influences learning processes through STDP, or whether it is beneficial to achieve efficient spike-based learning from uncertain stimuli. To explore the answers to these questions, we construct models of feedforward networks with lateral inhibitory circuits and study how propagated correlation influences STDP learning, and what kind of learning algorithm such circuits achieve. We derive analytical conditions at which neurons detect minor signals with STDP, and show that depending on the origin of the noise, different correlation timescales are useful for learning. In particular, we show that non-precise spike correlation is beneficial for learning in the presence of cross-talk noise. We also show that by considering excitatory and inhibitory STDP at lateral connections, the circuit can acquire a lateral structure optimal for signal detection. In addition, we demonstrate that the model performs blind source separation in a manner similar to the sequential sampling approximation of the Bayesian independent component analysis algorithm. Our results provide a basic understanding of STDP learning in feedback circuits by integrating analyses from both dynamical systems and information theory.

19.
We explore humans’ rule-based category learning using analytic approaches that highlight their psychological transitions during learning. These approaches confirm that humans show qualitatively sudden psychological transitions during rule learning. These transitions contribute to the theoretical literature contrasting single vs. multiple category-learning systems, because they seem to reveal a distinctive learning process of explicit rule discovery. A complete psychology of categorization must describe this learning process, too. Yet extensive formal-modeling analyses confirm that a wide range of current (gradient-descent) models cannot reproduce these transitions, including influential rule-based models (e.g., COVIS) and exemplar models (e.g., ALCOVE). It is an important theoretical conclusion that existing models cannot explain humans’ rule-based category learning. The problem these models have is the incremental algorithm by which learning is simulated. Humans descend no gradient in rule-based tasks. Very different formal-modeling systems will be required to explain humans’ psychology in these tasks. An important next step will be to build a new generation of models that can do so.

20.
Acute inflammation is a severe medical condition defined as an inflammatory response of the body to an infection. Its rapid progression requires quick and accurate decisions from clinicians. Inadequate and delayed decisions make acute inflammation the 10th leading cause of death overall in the United States, with an estimated cost of treatment of about $17 billion annually. However, despite the need, there is a limited number of methods that could assist clinicians in determining optimal therapies for acute inflammation. We developed a data-driven method for suggesting optimal therapy by using a machine learning model that is learned from historical patients' behaviors. To reduce both the risk of failure and the expense of clinical trials, our method is evaluated on virtual patients generated by a mathematical model that emulates the inflammatory response. In the experiments conducted, acute inflammation was handled with two complementary pro- and anti-inflammatory medications, whose adequate timing and doses are crucial for a successful outcome. Our experiments show that the dosage regimen assigned with our data-driven method significantly improves the percentage of healthy patients when compared to results by other methods used in clinical practice and found in the literature. Our method saved 88% of patients that would otherwise die within a week, while the best method found in the literature saved only 73% of patients. At the same time, our method used lower doses of medications than the alternatives. In addition, our method achieved better results than the alternatives when only incomplete or noisy measurements were available over time, and it was less affected by therapy delay. The presented results provide strong evidence that models from the artificial intelligence community have potential for the development of personalized treatment strategies for acute inflammation.
