1.

Very Slow Search and Reach: Failure to Maximize Expected Gain in an Eye-Hand Coordination Task

Hang Zhang Camille Morvan Louis-Alexandre Etezad-Heydari Laurence T. Maloney 《PLoS computational biology》2012,8(10)

We examined an eye-hand coordination task where optimal visual search and hand movement strategies were inter-related. Observers were asked to find and touch a target among five distractors on a touch screen. Their reward for touching the target was reduced by an amount proportional to how long they took to locate and reach to it. Coordinating the eye and the hand appropriately would markedly reduce the search-reach time. Using statistical decision theory we derived the sequence of interrelated eye and hand movements that would maximize expected gain and we predicted how hand movements should change as the eye gathered further information about target location. We recorded human observers' eye movements and hand movements and compared them with the optimal strategy that would have maximized expected gain. We found that most observers failed to adopt the optimal search-reach strategy. We analyze and describe the strategies they did adopt.

2.

Humans can adopt optimal discounting strategy under real-time constraints

Schweighofer N Shishida K Han CE Okamoto Y Tanaka SC Yamawaki S Doya K 《PLoS computational biology》2006,2(11):e152

Critical to our many daily choices between larger delayed rewards, and smaller more immediate rewards, are the shape and the steepness of the function that discounts rewards with time. Although research in artificial intelligence favors exponential discounting in uncertain environments, studies with humans and animals have consistently shown hyperbolic discounting. We investigated how humans perform in a reward decision task with temporal constraints, in which each choice affects the time remaining for later trials, and in which the delays vary at each trial. We demonstrated that most of our subjects adopted exponential discounting in this experiment. Further, we confirmed analytically that exponential discounting, with a decay rate comparable to that used by our subjects, maximized the total reward gain in our task. Our results suggest that the particular shape and steepness of temporal discounting is determined by the task that the subject is facing, and question the notion of hyperbolic reward discounting as a universal principle.
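
The contrast between the two discounting shapes named in this abstract can be illustrated with a minimal Python sketch. The functional forms used here (r·exp(−kt) and r/(1+kt)) are the standard textbook definitions rather than anything taken from the paper, and the reward sizes, delays, and rate `k = 0.5` are purely illustrative assumptions.

```python
import math


def exponential_discount(reward, delay, rate):
    """Discounted value under exponential discounting: reward * exp(-rate * delay)."""
    return reward * math.exp(-rate * delay)


def hyperbolic_discount(reward, delay, rate):
    """Discounted value under hyperbolic discounting: reward / (1 + rate * delay)."""
    return reward / (1.0 + rate * delay)


def prefers_larger_later(discount, small, small_delay, large, large_delay, rate):
    """Return True if the larger, later reward has the higher discounted value."""
    return discount(large, large_delay, rate) > discount(small, small_delay, rate)


# Illustrative (assumed) choice: 5 units now versus 10 units after 5 time steps,
# then the same pair pushed 10 time steps into the future.
for name, discount in (("exponential", exponential_discount),
                       ("hyperbolic", hyperbolic_discount)):
    near = prefers_larger_later(discount, 5, 0, 10, 5, 0.5)
    far = prefers_larger_later(discount, 5, 10, 10, 15, 0.5)
    print(name, "near:", near, "far:", far)
```

With these assumed numbers the hyperbolic chooser switches preference once both options are delayed by the same amount, whereas the exponential chooser does not, because the ratio of two exponentially discounted values depends only on the difference between their delays.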

3.

Statistical decision theory and trade-offs in the control of motor response

Trommershäuser J Maloney LT Landy MS 《Spatial Vision》2003,16(3-4):255-275

We present a novel approach to the modeling of motor responses based on statistical decision theory. We begin with the hypothesis that subjects are ideal motion planners who choose movement trajectories to minimize expected loss. We derive predictions of the hypothesis for movement in environments where contact with specified regions carries rewards or penalties. The model predicts shifts in a subject's aiming point in response to changes in the reward and penalty structure of the environment and with changes in the subject's uncertainty in carrying out planned movements. We tested some of these predictions in an experiment where subjects were rewarded if they succeeded in touching a target region on a computer screen within a specified time limit. Near the target was a penalty region which, if touched, resulted in a penalty. We varied distance between the penalty region and the target and the cost of hitting the penalty region. Subjects shift their mean points of contact with the computer screen in response to changes in penalties and location of the penalty region relative to the target region in qualitative agreement with the predictions of the hypothesis. Thus, movement planning takes into account extrinsic costs and the subject's own motor uncertainty.
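
The predicted shift of the aiming point away from a penalty region can be sketched in Python under strong simplifying assumptions that are not taken from the paper: the task is reduced to one dimension, the target and penalty regions are intervals, and motor error is Gaussian with a fixed standard deviation. All function names and numbers below are hypothetical.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def prob_in_interval(interval, mean, sd):
    """Probability that a Gaussian endpoint lands inside (lo, hi)."""
    lo, hi = interval
    return normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)


def expected_gain(aim, sd, target, penalty, reward, cost):
    """Expected gain of aiming at `aim`: reward for the target, cost for the penalty region."""
    return (reward * prob_in_interval(target, aim, sd)
            - cost * prob_in_interval(penalty, aim, sd))


def best_aim(sd, target, penalty, reward, cost, step=0.01):
    """Grid search over aim points inside the target for the maximum expected gain."""
    lo, hi = target
    n_steps = int(round((hi - lo) / step))
    candidates = [lo + i * step for i in range(n_steps + 1)]
    return max(candidates,
               key=lambda aim: expected_gain(aim, sd, target, penalty, reward, cost))


# Assumed layout: target on (0, 2), penalty region immediately to its left on (-2, 0).
target, penalty = (0.0, 2.0), (-2.0, 0.0)
print(best_aim(0.5, target, penalty, reward=1.0, cost=0.0))  # no penalty
print(best_aim(0.5, target, penalty, reward=1.0, cost=5.0))  # costly penalty
```

In this toy setting the gain-maximizing aim point sits at the target center when the penalty costs nothing and moves away from the penalty region when the cost is raised, which is the qualitative pattern the abstract describes.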

4.

Interference and Shaping in Sensorimotor Adaptations with Rewards

Ran Darshan Arthur Leblois David Hansel 《PLoS computational biology》2014,10(1)

When a perturbation is applied in a sensorimotor transformation task, subjects can adapt and maintain performance by either relying on sensory feedback, or, in the absence of such feedback, on information provided by rewards. For example, in a classical rotation task where movement endpoints must be rotated to reach a fixed target, human subjects can successfully adapt their reaching movements solely on the basis of binary rewards, although this proves much more difficult than with visual feedback. Here, we investigate such a reward-driven sensorimotor adaptation process in a minimal computational model of the task. The key assumption of the model is that synaptic plasticity is gated by the reward. We study how the learning dynamics depend on the target size, the movement variability, the rotation angle and the number of targets. We show that when the movement is perturbed for multiple targets, the adaptation process for the different targets can interfere destructively or constructively depending on the similarities between the sensory stimuli (the targets) and the overlap in their neuronal representations. Destructive interferences can result in a drastic slowdown of the adaptation. As a result of interference, the time to adapt varies non-linearly with the number of targets. Our analysis shows that these interferences are weaker if the reward varies smoothly with the subject's performance instead of being binary. We demonstrate how shaping the reward or shaping the task can accelerate the adaptation dramatically by reducing the destructive interferences. We argue that experimentally investigating the dynamics of reward-driven sensorimotor adaptation for more than one sensory stimulus can shed light on the underlying learning rules.

5.

Food-exchange with humans in brown capuchin monkeys

Drapier M Chauvin C Dufour V Uhlrich P Thierry B 《Primates; journal of primatology》2005,46(4):241-248

To assess how brown capuchin monkeys (Cebus apella) delay gratification and maximize payoff, we carried out four experiments in which six subjects could exchange food pieces with a human experimenter. The pieces differed either in quality or quantity. In qualitative exchanges, all subjects gave a piece of food to receive another of higher value. When the difference of value between the rewards to be returned and those expected was higher, subjects performed better. Only two subjects refrained from nibbling the piece of food before returning it. All subjects performed two or three qualitative exchanges in succession to obtain a given reward. In quantitative exchanges, three subjects returned a food item to obtain a bigger one, but two of them nibbled the item before returning it. Individual differences were marked. Subjects had some difficulties when the food to be returned was similar or equal in quality to that expected.

6.

Can Monkeys Make Investments Based on Maximized Pay-off?

Steelandt S Dufour V Broihanne MH Thierry B 《PloS one》2011,6(3):e17801

Animals can maximize benefits but it is not known if they adjust their investment according to expected pay-offs. We investigated whether monkeys can use different investment strategies in an exchange task. We tested eight capuchin monkeys (Cebus apella) and thirteen macaques (Macaca fascicularis, Macaca tonkeana) in an experiment where they could adapt their investment to the food amounts proposed by two different experimenters. One, the doubling partner, returned a reward that was twice the amount given by the subject, whereas the other, the fixed partner, always returned a constant amount regardless of the amount given. To maximize pay-offs, subjects should invest a maximal amount with the first partner and a minimal amount with the second. When tested with the fixed partner only, one third of monkeys learned to remove a maximal amount of food for immediate consumption before investing a minimal one. With both partners, most subjects failed to maximize pay-offs by using different decision rules according to each partner's quality. A single Tonkean macaque succeeded in investing a maximal amount with one experimenter and a minimal amount with the other. The fact that only one of the 21 subjects learned to maximize benefits by adapting investment according to the experimenters' quality indicates that such a task is difficult for monkeys, albeit not impossible.
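
The reasoning behind the two optimal investment rules can be worked through in a short Python sketch. It assumes, as an illustration only, that the subject starts with a fixed number of food items, must give at least one, and consumes whatever it does not invest; the endowment of 6 items and the fixed return of 2 items are hypothetical values, not figures from the study.

```python
def payoff_doubling(endowment, invested):
    """Keep the uninvested items and receive twice the invested amount back."""
    return (endowment - invested) + 2 * invested


def payoff_fixed(endowment, invested, fixed_return=2):
    """Keep the uninvested items and receive a constant amount back."""
    return (endowment - invested) + fixed_return


def best_investment(payoff, endowment, minimum=1):
    """Investment (between `minimum` and `endowment`) that maximizes total payoff."""
    return max(range(minimum, endowment + 1),
               key=lambda invested: payoff(endowment, invested))


endowment = 6  # assumed number of food items available to invest
print(best_investment(payoff_doubling, endowment))  # invest everything
print(best_investment(payoff_fixed, endowment))     # invest the minimum
```

Under these assumptions the total payoff with the doubling partner grows by one item for every item invested, while with the fixed partner it shrinks by one item, so the best rules are to invest everything with the former and the minimum with the latter.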

7.

Probability of Seeing Increases Saccadic Readiness

Thérèse Collins 《PloS one》2012,7(11)

Associating movement directions or endpoints with monetary rewards or costs influences movement parameters in humans, and associating movement directions or endpoints with food reward influences movement parameters in non-human primates. Rewarded movements are facilitated relative to non-rewarded movements. The present study examined to what extent successful foveation facilitated saccadic eye movement behavior, with the hypothesis that foveation may constitute an informational reward. Human adults performed saccades to peripheral targets that either remained visible after saccade completion or were extinguished, preventing visual feedback. Saccades to targets that were systematically extinguished were slower and easier to inhibit than saccades to targets that afforded successful foveation, and this effect was modulated by the probability of successful foveation. These results suggest that successful foveation facilitates behavior, and that obtaining the expected sensory consequences of a saccadic eye movement may serve as a reward for the oculomotor system.

8.

Speeded Reaching Movements around Invisible Obstacles

Todd E. Hudson Uta Wolfe Laurence T. Maloney 《PLoS computational biology》2012,8(9)

We analyze the problem of obstacle avoidance from a Bayesian decision-theoretic perspective using an experimental task in which reaches around a virtual obstacle were made toward targets on an upright monitor. Subjects received monetary rewards for touching the target and incurred losses for accidentally touching the intervening obstacle. The locations of target-obstacle pairs within the workspace were varied from trial to trial. We compared human performance to that of a Bayesian ideal movement planner (who chooses motor strategies maximizing expected gain) using the Dominance Test employed in Hudson et al. (2007). The ideal movement planner suffers from the same sources of noise as the human, but selects movement plans that maximize expected gain in the presence of that noise. We find good agreement between the predictions of the model and actual performance in most but not all experimental conditions.

9.

Does General Motivation Energize Financial Reward-Seeking Behavior? Evidence from an Effort Task

Justin Chumbley Ernst Fehr 《PloS one》2014,9(9)

We aimed to predict how hard subjects work for financial rewards from their general trait and state reward-motivation. We specifically asked 1) whether individuals high in general trait “reward responsiveness” work harder and 2) whether task-irrelevant cues can make people work harder by increasing general motivation. Each trial of our task contained a 1-second earning interval in which male subjects earned money for each button press. This was preceded by one of three predictive cues: an erotic picture of a woman, a man, or a geometric figure. We found that individuals high in trait “reward responsiveness” worked harder and earned more, irrespective of the predictive cue. Because female predictive cues are more rewarding, we expected them to increase general motivation in our male subjects and invigorate work, but found a more complex pattern.

10.

Gambling in the visual periphery: a conjoint-measurement analysis of human ability to judge visual uncertainty

Zhang H Morvan C Maloney LT 《PLoS computational biology》2010,6(12):e1001023

Recent work in motor control demonstrates that humans take their own motor uncertainty into account, adjusting the timing and goals of movement so as to maximize expected gain. Visual sensitivity varies dramatically with retinal location and target, and models of optimal visual search typically assume that the visual system takes retinal inhomogeneity into account in planning eye movements. Such models can then use the entire retina rather than just the fovea to speed search. Using a simple decision task, we evaluated human ability to compensate for retinal inhomogeneity. We first measured observers' sensitivity for targets, varying contrast and eccentricity. Observers then repeatedly chose between targets differing in eccentricity and contrast, selecting the one they would prefer to attempt: e.g., a low contrast target at 2° versus a high contrast target at 10°. Observers knew they would later attempt some of their chosen targets and receive rewards for correct classifications. We evaluated performance in three ways. Equivalence: Do observers' judgments agree with their actual performance? Do they correctly trade off eccentricity and contrast and select the more discriminable target in each pair? Transitivity: Are observers' choices self-consistent? Dominance: Do observers understand that increased contrast improves performance? Decreased eccentricity? All observers exhibited patterned failures of equivalence, and seven out of eight observers failed transitivity. There were significant but small failures of dominance. All these failures together reduced their winnings by 10%–18%.

11.

Improved dairy cattle mating plans at herd level using genomic information

《Animal : an international journal of animal bioscience》2021,15(1):100016

From 2012 to 2018, 223 180 Montbéliarde females were genotyped in France and the number of newly genotyped females increased at a rate of about 33% each year. With female genotyping information, farmers have access to the genomic estimated breeding values of the females in their herd and to their carrier status for genetic defects or major genes segregating in the breed. This information, combined with genomic coancestry, can be used when planning matings in order to maximize the expected on-farm profit of future female offspring. We compared different mating allocation approaches for their capacity to maximize the expected genetic gain while limiting expected progeny inbreeding and the probability of conceiving an offspring homozygous for a lethal recessive allele. Three mate allocation strategies (random mating (RAND), sequential mating (gSEQ€) and linear programming mating (gLP€)) were compared on 160 actual Montbéliarde herds using male and female genomic information. Then, we assessed the benefit of using female genomic information by comparing matings planned using only female pedigree information with the equivalent strategy using genomic information. We measured the benefit of adding genomic expected inbreeding and the risk of conceiving an offspring homozygous for a lethal recessive allele to Net Merit in mating plans. The influence of three constraints was tested: by relaxing the constraint on availability of a particular semen type (sexed or conventional) for bulls, by adding an upper limit of 8.5% coancestry between mate pairs or by using a more stringent maximum use of a bull in a herd (5% vs 10%). The use of genomic information instead of pedigree information improved the mate allocation method in terms of progeny expected genetic merit, genetic diversity and the risk of conceiving an offspring homozygous for a lethal recessive allele. Optimizing mate allocation using linear programming and constraining coancestry to a maximum of 8.5% per mate pair reduced the average coancestry with a small impact on expected Net Merit. In summary, for male and female selection pathways, using genomic information is more efficient than using pedigree information to maximize genetic gain while constraining the expected inbreeding of the progeny and the risk of conceiving an offspring homozygous for a lethal recessive allele. This study also underlines the key role of semen type (sexed vs conventional) and the associated constraints on the mate allocation algorithm to maximize genetic gain while maintaining genetic diversity and limiting the risk of conceiving an offspring homozygous for a lethal recessive allele.

12.

Midbrain dopamine neurons encode a quantitative reward prediction error signal (total citations: 15; self-citations: 0; citations by others: 15)

Bayer HM Glimcher PW 《Neuron》2005,47(1):129-141

13.

Get It While It’s Hot: A Peak-First Bias in Self-Generated Choice Order in Rhesus Macaques

Kanghoon Jung Jerald D. Kralik 《PloS one》2013,8(12)

Animals typically must make a number of successive choices to achieve a goal: e.g., eating multiple food items before becoming satiated. However, it is unclear whether choosing the best first or saving the best for last represents the best choice strategy to maximize overall reward. Specifically, since outcomes can be evaluated prospectively (with future rewards discounted and more immediate rewards preferred) or retrospectively (with prior rewards discounted and more recent rewards preferred), the conditions under which each is used remain unclear. On the one hand, humans and non-human animals clearly discount future reward, preferring immediate rewards to delayed ones, suggesting prospective evaluation; on the other hand, it has also been shown that a sequence that ends well, i.e., with the best event or item last, is often preferred, suggesting retrospective evaluation. Here we hypothesized that when individuals are allowed to build the sequence themselves they are more likely to evaluate each item individually and therefore build a sequence using prospective evaluation. We examined the relationship between self-generated choice order and preference in rhesus monkeys in two experiments in which the distinctiveness of options was relatively high and low, respectively. We observed a positive linear relationship between choice order and preference among highly distinct options, indicating that the rhesus monkeys chose their preferred food first: i.e., a peak-first order preference. Overall, choice order depended on the degree of relative preference among alternatives and a peak-first bias, providing evidence for prospective evaluation when choice order is self-generated.

14.

The Neural Circuitry of Reward Processing in Complex Social Comparison: Evidence from an Event-Related fMRI Study

Xue Du Meng Zhang DongTao Wei Wenfu Li Qinglin Zhang Jiang Qiu 《PloS one》2013,8(12)

In this study, functional magnetic resonance imaging (fMRI) was used to investigate the mechanisms underlying brain activity in a complex social comparison context. One true subject and two pseudo-subjects were asked to complete a simple number estimation task at the same time, which included upward and downward comparisons. Two categories of social comparison rewards (fair and unfair reward distributions) were presented by comparing the true subject with the other two pseudo-subjects. In particular, there were five conditions of unfair distribution in which all three subjects were correct but received different rewards. Behavioral data indicated that the ability to self-regulate was important in satisfaction judgment when the subject perceived an unfair reward distribution. fMRI data indicated that the interaction between the ventral striatum and the prefrontal cortex was important in self-regulation under specific conditions in complex social comparison, especially during reward processing when there were two different reward values and the subject failed to exhibit upward comparison.

15.

Hedging Your Bets: Intermediate Movements as Optimal Behavior in the Context of an Incomplete Decision

Adrian M. Haith David M. Huberdeau John W. Krakauer 《PLoS computational biology》2015,11(3)

Existing theories of movement planning suggest that it takes time to select and prepare the actions required to achieve a given goal. These theories often appeal to circumstances where planning apparently goes awry. For instance, if reaction times are forced to be very low, movement trajectories are often directed between two potential targets. These intermediate movements are generally interpreted as errors of movement planning, arising either from planning being incomplete or from parallel movement plans interfering with one another. Here we present an alternative view: that intermediate movements reflect uncertainty about movement goals. We show how intermediate movements are predicted by an optimal feedback control model that incorporates an ongoing decision about movement goals. According to this view, intermediate movements reflect an exploitation of compatibility between goals. Consequently, reducing the compatibility between goals should reduce the incidence of intermediate movements. In human subjects, we varied the compatibility between potential movement goals in two distinct ways: by varying the spatial separation between targets and by introducing a virtual barrier constraining trajectories to the target and penalizing intermediate movements. In both cases we found that decreasing goal compatibility led to a decreasing incidence of intermediate movements. Our results and theory suggest a more integrated view of decision-making and movement planning in which the primary bottleneck to generating a movement is deciding upon task goals. Determining how to move to achieve a given goal is rapid and automatic.

16.

Nectarivore foraging ecology: rewards differing in sugar types (total citations: 1; self-citations: 0; citations by others: 1)

HARRINGTON WELLS PEGGY S. HILL PATRICK H. WELLS 《Ecological Entomology》1992,17(3):280-288

(1) Honey bees, visiting artificial flower patches, were used as a model system to study the effects of sugar type (sucrose, glucose, fructose, and mixed monosaccharide), caloric reward, and floral colour on nectarivore foraging behaviour. Observed behaviour was compared to the predictions of various (sometimes contradictory) foraging models.
(2) Bees drank indiscriminately from flowers in patches with a blue-white flower dimorphism when caloric values of rewards were equal (e.g. 1 M sucrose in both colours; 1 M sucrose versus 2 M monosaccharide of either type), but when nectar caloric rewards were unequal, they switched to the flower colour with the calorically greater reward.
(3) In yellow-blue dimorphic flower patches, on the other hand, bees did not maximize caloric reward. Rather, bees were individually constant, some to blue, others to yellow, regardless of the sugar types or energy content of the rewards provided in the two flower morphs.
(4) The results suggest that optimal foraging theory (maximization of net caloric gain per unit time) is a robust predictor of behaviour with regard to the sugar types common to nectars; such optimal foraging is, however, limited by a superstructure of individual constancy.

17.

Optimal compensation for temporal uncertainty in movement planning

Hudson TE Maloney LT Landy MS 《PLoS computational biology》2008,4(7):e1000130

Motor control requires the generation of a precise temporal sequence of control signals sent to the skeletal musculature. We describe an experiment that, for good performance, requires human subjects to plan movements taking into account uncertainty in their movement duration and the increase in that uncertainty with increasing movement duration. We do this by rewarding movements performed within a specified time window, and penalizing slower movements in some conditions and faster movements in others. Our results indicate that subjects compensated for their natural duration-dependent temporal uncertainty as well as an overall increase in temporal uncertainty that was imposed experimentally. Their compensation for temporal uncertainty, both the natural duration-dependent and imposed overall components, was nearly optimal in the sense of maximizing expected gain in the task. The motor system is able to model its temporal uncertainty and compensate for that uncertainty so as to optimize the consequences of movement.

18.

Absence of Spatial Tuning in the Orbitofrontal Cortex

Lauren E. Grattan Paul W. Glimcher 《PloS one》2014,9(11)

There are limited data in the literature to explicitly support the notion that neurons in OFC are truly action-independent in their coding. We set out to specifically test the hypothesis that OFC value-related neurons in area 13m of the monkey do not carry information about the action required to obtain that reward; that is, that activity in this area represents reward values in an abstract and action-independent manner. To accomplish that goal we had two monkeys select and execute saccadic eye movements to 81 locations in the visual field for three different kinds of juice rewards. Our detailed analysis of the response fields indicates that these neurons are insensitive to the amplitude or direction of the saccade required to obtain these rewards. Our data thus validate earlier proposals that neurons of 13m in the OFC encode subjective value independent of the saccadic action required to obtain that reward.

19.

Resource allocation for epidemic control over short time horizons

Zaric GS Brandeau ML 《Mathematical biosciences》2001,171(1):33-58

We present a model for allocation of epidemic control resources among a set of interventions. We assume that the epidemic is modeled by a general compartmental epidemic model, and that interventions change one or more of the parameters that describe the epidemic. Associated with each intervention is a 'production function' that relates the amount invested in the intervention to values of parameters in the epidemic model. The goal is to maximize quality-adjusted life years gained or the number of new infections averted over a fixed time horizon, subject to a budget constraint. Unlike previous models, our model allows for interacting populations and non-linear interacting production functions and does not require a long time horizon. We show that an analytical solution to the model may be difficult or impossible to derive, even for simple cases. Therefore, we derive a method of approximating the objective functions. We use the approximations to gain insight into the optimal resource allocation for three problem instances. We also develop heuristics for solving the general resource allocation problem. We present results of numerical studies using our approximations and heuristics. Finally, we discuss implications and applications of this work.

20.

Will travel for food: spatial discounting in two new world monkeys (total citations: 6; self-citations: 0; citations by others: 6)

Stevens JR Rosati AG Ross KR Hauser MD 《Current biology : CB》2005,15(20):1855-1860

Nonhuman animals steeply discount the future, showing a preference for small, immediate over large, delayed rewards. Currently unclear is whether discounting functions depend on context. Here, we examine the effects of spatial context on discounting in cotton-top tamarins (Saguinus oedipus) and common marmosets (Callithrix jacchus), species known to differ in temporal discounting. We presented subjects with a choice between small, nearby rewards and large, distant rewards. Tamarins traveled farther for the large reward than marmosets, attending to the ratio of reward differences rather than their absolute values. This species difference contrasts with performance on a temporal task in which marmosets waited longer than tamarins for the large reward. These comparative data indicate that context influences choice behavior, with the strongest effect seen in marmosets who discounted more steeply over space than over time. These findings parallel details of each species' feeding ecology. Tamarins range over large distances and feed primarily on insects, which requires using quick, impulsive action. Marmosets range over shorter distances than tamarins and feed primarily on tree exudates, a clumped resource that requires patience to wait for sap to exude. These results show that discounting functions are context specific, shaped by a history of ecological pressures.