Similar literature
 20 similar articles found (search time: 15 ms)
1.
The development of a secondary reinforcer as a result of associating a neutral stimulus (buzzer) with intravenous (IV) doses of morphine was studied in rats. Secondary reinforcement developed in the absence of physical dependence and followed the association of the stimulus with either response-contingent or non-contingent injections of morphine. Strength of the conditioned reinforcer, measured in terms of responding on a lever for the stimulus plus infusion of saline solution, was proportional to the unit dosage of morphine employed in pairings of buzzer and drug. When extinction of the lever-press response for IV morphine was conducted (by substituting saline for morphine solution) in the absence of the conditioned reinforcing stimulus, it was seen later that the stimulus could still elicit lever responses, until it too had been present for a sufficient interval of non-reinforced responding. Similarly, extinction of the response for morphine by blocking its action with naloxone in the absence of the stimulus did not eliminate the conditioned reinforcement. Another study showed that a passive, subcutaneous (SC) dose of morphine served to maintain lever-pressing on a contingency of buzzer plus saline infusion. Furthermore, the stimuli resulting from the presence of morphine (after an SC injection) were able to reinstate the lever-responding with only the buzzer-saline contingency when such responses had previously been extinguished. Moreover, it was shown that d-amphetamine could restore responding under the same conditions, and that morphine could also do so for rats in which the primary reinforcer had been d-amphetamine. It is suggested that animal data such as these show that procedures designed for the elimination of human drug-taking behavior must take into account secondary reinforcers as well as the primary reinforcer(s).

2.
Empirical quantitative models were constructed for Populus deltoides describing temporal and spatial changes in vessel characteristics of metaxylem, both within individual central leaf traces and within all central leaf traces considered as a morphological unit at a given transverse level in the stem (the central trace sympodia). Similar models were constructed for secondary vessel characteristics. The growth processes of the stem segment through which the vasculature extended were incorporated in these models to illustrate how a functional vascular system is maintained in the stem as a whole. The central trace sympodia represented the integrals of the temporal and spatial functions for individual central leaf traces. Metaxylem vessel production ceased in individual leaf traces two plastochrons before the cessation was reflected in the central trace sympodia because of the integrative nature of the sympodial complex. A functional continuum of development was apparent between metaxylem vessels of the central trace sympodia and secondary vessels of the stem. The transition between metaxylem and secondary xylem production in the central trace sympodia corresponded with cessation of leaf and internode elongation.
Keywords: Populus deltoides Bartr. ex Marsh., cottonwood, primary xylem, secondary xylem, primary-secondary vascular transition, leaf growth, xylogenesis

3.
It is shown that the current “two-factor” theory of nerve excitation can account for sustained inhibition or enhancement by a sequence of stimulus pulses, and for the decrease in the reinforcement period with each successive pulse of the train.

4.
In a large variety of situations one would like to have an expressive and accurate model of observed animal or human behavior. While general-purpose mathematical models may successfully capture properties of observed behavior, it is desirable to root models in biological facts. Because of ample empirical evidence for reward-based learning in visuomotor tasks, we use a computational model based on the assumption that the observed agent is balancing the costs and benefits of its behavior to meet its goals. This leads to using the framework of reinforcement learning, which additionally provides well-established algorithms for learning of visuomotor task solutions. To quantify the agent’s goals as rewards implicit in the observed behavior, we propose to use inverse reinforcement learning. Based on the assumption of a modular cognitive architecture, we introduce a modular inverse reinforcement learning algorithm that estimates the relative reward contributions of the component tasks in navigation, consisting of following a path while avoiding obstacles and approaching targets. It is shown how to recover the component reward weights for individual tasks and that variability in observed trajectories can be explained succinctly through behavioral goals. It is demonstrated through simulations that good estimates can be obtained already with modest amounts of observation data, which in turn allows the prediction of behavior in novel configurations.
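
The idea of recovering component reward weights can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the authors' algorithm: it assumes two modules (the abstract describes three component tasks), a softmax choice rule over a weighted sum of per-module action values, idealized noise-free choice frequencies, and a grid search for the maximum-likelihood weight. All names and Q-values are hypothetical.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of action values."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def policy(q_modules, weights):
    """Action probabilities from a weighted sum of per-module Q-values.

    q_modules[m][a] is module m's value for action a in one state.
    """
    n_actions = len(q_modules[0])
    combined = [sum(w * q[a] for w, q in zip(weights, q_modules))
                for a in range(n_actions)]
    return softmax(combined)

def log_likelihood(states, choice_freqs, weights):
    """Expected log-likelihood of the observed choice frequencies."""
    total = 0.0
    for q_modules, freqs in zip(states, choice_freqs):
        probs = policy(q_modules, weights)
        total += sum(f * math.log(p) for f, p in zip(freqs, probs))
    return total

def estimate_weight(states, choice_freqs, grid=21):
    """Grid search for the weight w of module 0 (module 1 gets 1 - w)."""
    candidates = [i / (grid - 1) for i in range(grid)]
    return max(candidates,
               key=lambda w: log_likelihood(states, choice_freqs, (w, 1.0 - w)))

# Hypothetical Q-values: each state lists [avoidance module, approach module]
# over three candidate actions.
states = [
    [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]],
    [[1.0, 0.0, 3.0], [0.0, 2.0, 1.0]],
]
true_w = 0.7
# Idealized observations: the agent's exact choice probabilities.
choice_freqs = [policy(q, (true_w, 1.0 - true_w)) for q in states]
estimated = estimate_weight(states, choice_freqs)
print(f"estimated weight of module 0: {estimated:.2f}")
```

With real, finite observation data the frequencies would be noisy counts and the estimate would only approximate the generating weight.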

5.
Operant generalization has been demonstrated in neonates only recently. To investigate the development of intradimensional stimulus control immediately after hatching, northern bobwhite chicks (Colinus virginianus) pecked for brief heat presentations while hearing a high-pitched sound repeated at two constant rates: an S+ tempo signaling a rich reinforcement schedule, alternating with an S− tempo signaling a leaner schedule. Tempo generalization was then assessed in extinction. The expected excitatory gradients were produced after a threshold number of training sessions; unexpectedly, below that threshold, gradients were inhibitory. The chicks’ rapidly developing thermoregulatory capability may have resulted in a change from perceived negative reinforcement initially to positive reinforcement later. Given past research showing excitatory gradients after negative reinforcement, we suggest that these results demonstrate that not all negative reinforcement is equivalent and, further, that classical conditioning effects require consideration.

6.
The effect of nonsemantic context on the perception of simple nonverbal visual stimuli has been studied in ten healthy volunteers by the event-related potential (ERP) method. The nonsemantic context was specified by the formation of a memory trace of a test visual stimulus via its repeated presentation without any instruction except gaze fixation. Then, this stimulus randomly alternated with control stimuli that did not form memory traces before their presentation. It has been found that an ERP in the interval 260–340 ms after presentation of a simple nonverbal stimulus significantly differs from the control ERPs. The results suggest that some stages of the processing of visual stimuli may be modified by nonsemantic context.

7.
This paper investigates the effectiveness of spiking agents when trained with reinforcement learning (RL) in a challenging multiagent task. In particular, it explores learning through reward-modulated spike-timing-dependent plasticity (STDP) and compares it to reinforcement of stochastic synaptic transmission in the general-sum game of the Iterated Prisoner's Dilemma (IPD). More specifically, a computational model is developed where we implement two spiking neural networks as two "selfish" agents learning simultaneously but independently, competing in the IPD game. The purpose of our system (or collective) is to maximise its accumulated reward in the presence of reward-driven competing agents within the collective. This can only be achieved when the agents engage in a behaviour of mutual cooperation during the IPD. Previously, we successfully applied reinforcement of stochastic synaptic transmission to the IPD game. The current study utilises reward-modulated STDP with an eligibility trace, and results show that the system managed to exhibit the desired behaviour by establishing mutual cooperation between the agents. It is noted that the cooperative outcome was attained after a relatively short learning period, which enhanced the accumulation of reward by the system. As in our previous implementation, the successful application of the learning algorithm to the IPD became possible only after we extended it with additional global reinforcement signals in order to enhance competition at the neuronal level. Moreover, it is also shown that learning is enhanced (as indicated by an increased IPD cooperative outcome) through: (i) strong memory for each agent (regulated by a high eligibility trace time constant) and (ii) firing irregularity produced by equipping the agents' LIF neurons with a partial somatic reset mechanism.
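
The role of the eligibility trace time constant can be illustrated with a minimal sketch. It assumes a commonly used form of reward-modulated STDP, in which each pre/post spike pairing increments a decaying eligibility trace and the weight changes only when a reward signal arrives. The abstract does not give the authors' equations, so every function name and parameter value here is hypothetical.

```python
import math

def stdp(delta_t, a_plus=1.0, a_minus=1.0, tau=20.0):
    """Pair-based STDP window; delta_t = t_post - t_pre in ms."""
    if delta_t > 0:
        return a_plus * math.exp(-delta_t / tau)   # pre before post: potentiate
    return -a_minus * math.exp(delta_t / tau)      # post before pre: depress

def weight_change(pairings, rewards, tau_e, lr=0.01, steps=1000):
    """Total weight change of one synapse over `steps` time steps of 1 ms.

    pairings: {step: delta_t} spike pairings that feed the eligibility trace
    rewards:  {step: reward} reward signals that gate the weight update
    tau_e:    eligibility trace time constant in ms
    """
    eligibility, dw = 0.0, 0.0
    decay = math.exp(-1.0 / tau_e)
    for step in range(steps):
        eligibility *= decay                        # trace decays every step
        if step in pairings:
            eligibility += stdp(pairings[step])     # pairing tags the synapse
        dw += lr * rewards.get(step, 0.0) * eligibility
    return dw

# A causal pairing at 100 ms followed by a reward 500 ms later.
pairings = {100: 5.0}
rewards = {600: 1.0}
short_memory = weight_change(pairings, rewards, tau_e=100.0)
long_memory = weight_change(pairings, rewards, tau_e=1000.0)
print(short_memory, long_memory)
```

With these illustrative numbers, the longer time constant preserves far more of the trace until the delayed reward arrives, so the resulting weight change is larger.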

8.
A probability analysis was carried out of the appearance of single elements of rats' behaviour during extinction of a conditioned alimentary motor reflex. The dynamics of effector behavioural components after a sudden cessation of reinforcement (the usual schedule of extinction) was compared with cessation of reinforcement signalled by a previously differentiated signal and with cessation of reinforcement preceded by a stimulus initially unknown to the animal. If the cessation of reinforcement is signalled by a previously differentiated (negative) stimulus, the animals "lose the aim" in response to it, which is revealed in a rapid, complete reduction of all elements of the goal-directed alimentary behaviour. Evidently, the differentiation signal activates the memory trace of "nonreinforcement" formed during the animal's previous negative experience; this is revealed in accelerated inhibition of the alimentary motor reflex under extinction.

9.
Human behavior displays hierarchical structure: simple actions cohere into subtask sequences, which work together to accomplish overall task goals. Although the neural substrates of such hierarchy have been the target of increasing research, they remain poorly understood. We propose that the computations supporting hierarchical behavior may relate to those in hierarchical reinforcement learning (HRL), a machine-learning framework that extends reinforcement-learning mechanisms into hierarchical domains. To test this, we leveraged a distinctive prediction arising from HRL. In ordinary reinforcement learning, reward prediction errors are computed when there is an unanticipated change in the prospects for accomplishing overall task goals. HRL entails that prediction errors should also occur in relation to task subgoals. In three neuroimaging studies we observed neural responses consistent with such subgoal-related reward prediction errors, within structures previously implicated in reinforcement learning. The results reported support the relevance of HRL to the neural processes underlying hierarchical behavior.
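
The contrast between task-level and subgoal-level prediction errors can be made concrete with a small sketch. It assumes the standard temporal-difference form of the reward prediction error, r + gamma * V(s') - V(s), applied once to a task-level value function and once to a separate subgoal-level value function. The abstract does not specify the computation, and all values below are hypothetical.

```python
def prediction_error(reward, value_next, value_current, gamma=1.0):
    """Temporal-difference reward prediction error: r + gamma*V(s') - V(s)."""
    return reward + gamma * value_next - value_current

# Hypothetical event: the subgoal suddenly becomes easier to reach while
# the prospects for the overall task goal stay the same.
task_delta = prediction_error(reward=0.0, value_next=4.5, value_current=4.5)
subgoal_delta = prediction_error(reward=0.0, value_next=0.8, value_current=0.5)

print(f"task-level error: {task_delta:.2f}")
print(f"subgoal-level error: {subgoal_delta:.2f}")
```

In this illustration an ordinary reinforcement learner would register no error, whereas a hierarchical learner that also tracks subgoal values registers a positive one.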

10.
Partial reinforcement (PR) effects on animal locomotor behavior were studied in the golden hamster, using food-hoarding activity as a reinforcer. The first experiment demonstrated that hoarding reinforces a running response towards the goal section of a straight-alley runway, and that no such learning occurs when sated hamsters were not allowed to hoard food. However, a second experiment using various partial reinforcement schedules and a continuous reinforcement schedule did not give any evidence for the existence of a partial reinforcement acquisition effect (PRAE). The third experiment confirmed these results with an extended training procedure and showed a slight partial reinforcement extinction effect (PREE) mainly in the first sessions of the extinction phase.

11.
It is extremely difficult to trace the causal pathway relating gene products or molecular pathways to the expression of behavior. This is especially true for social behavior, which, being dependent on interactions and communication between individuals, is even further removed from molecular-level events. In this review, we discuss how behavioral models can aid molecular analyses of social behavior. Various models of behavior exist, each of which suggests strategies to dissect complex behavior into simpler behavioral 'modules.' The resulting modules are easier to relate to neural processes and thus suggest hypotheses for neural and molecular function. Here we discuss how three different models of behavior have facilitated understanding the molecular bases of aspects of social behavior. We discuss the response threshold model and two different approaches to modeling motivation, the state space model and models of reinforcement and reward processing. The examples we have chosen illustrate how models can generate testable hypotheses for neural and molecular function and also how molecular analyses probe the validity of a model of behavior. We do not champion one model over another; rather, our examples illustrate how modeling and molecular analyses can be synergistic in exploring the molecular bases of social behavior.

12.
The present article outlines the contribution of the mismatch negativity (MMN), and its magnetic equivalent MMNm, to our understanding of the perception of speech sounds in the human brain. MMN data indicate that each sound, both speech and non-speech, develops its neural representation corresponding to the percept of this sound in the neurophysiological substrate of auditory sensory memory. The accuracy of this representation, determining the accuracy of the discrimination between different sounds, can be probed with MMN separately for any auditory feature or stimulus type such as phonemes. Furthermore, MMN data show that the perception of phonemes, and probably also of larger linguistic units (syllables and words), is based on language-specific phonetic traces developed in the posterior part of the left-hemisphere auditory cortex. These traces serve as recognition models for the corresponding speech sounds in listening to speech.

13.
Fear conditioning can be rapidly obtained over long trace intervals, but its specificity with respect to both time and stimulus is uncertain. Long-trace fear conditioning often parallels contextual conditioning, and it is sensitive to hippocampal lesions. These properties of trace conditioning are not directly addressed by timing models and multiple-time-scale models of conditioning. It is proposed that during early stages of conditioning, a joint representation of the context and the stimulus trace may underlie conditioned responses, and that discriminative processes allow the emergence of specific responses in a later stage.

14.
Reinforcement learning is ubiquitous. Unlike other forms of learning, it involves the processing of fast yet content-poor feedback information to correct assumptions about the nature of a task or of a set of stimuli. This feedback information is often delivered as generic rewards or punishments, and has little to do with the stimulus features to be learned. How can such low-content feedback lead to such an efficient learning paradigm? Through a review of existing neuro-computational models of reinforcement learning, we suggest that the efficiency of this type of learning resides in the dynamic and synergistic cooperation of brain systems that use different levels of computations. The implementation of reward signals at the synaptic, cellular, network and system levels gives the organism the necessary robustness, adaptability and processing speed required for evolutionary and behavioral success.

15.
Some of the most frequently used methods in the study of conditioned reinforcement seem to be insufficient to demonstrate the effect. The clearest way to assess this phenomenon is the training of a new response. In the present study, rats were exposed to a situation in which a primary reinforcer and an arbitrary stimulus were paired and subsequently the effect of this arbitrary event was assessed by presenting it following a new response. Subjects under these conditions emitted more responses compared to their own responding before the pairing and to their responding on a similar operandum that was available concurrently that had no programmed consequences. Response rates also were higher compared to responding by subjects in similar conditions in which there was no contingency (a) between the arbitrary stimulus and the reinforcer, (b) between the response and the arbitrary stimulus or (c) both. Results are discussed in terms of necessary and sufficient conditions to study conditioned reinforcement.

16.
Healthy adult subjects showed no difference in the latency or amplitude of the P300 wave elicited by positive ("good") and negative ("error") reinforcing stimuli. After negative reinforcement, the P300 amplitude decreases in response to the standard stimulus (light bars) and increases to a lesser degree in response to test stimuli (the same bars presented with different pauses). In the process of learning to assess time microintervals in comparison with the standard, the latency of the P300 wave to the test stimuli shortens. It is suggested that the formation and consolidation of a feedback connection elaborated with the participation of a reinforcing verbal stimulus constitute the physiological basis for learning the comparative assessment of time microintervals.

17.
There are three basic paradigms of classical conditioning: delay, trace and context conditioning, in which presentation of a conditioned stimulus (CS) or a context typically predicts an unconditioned stimulus (US). In delay conditioning CS and US normally coterminate, whereas in trace conditioning an interval of time exists between CS termination and US onset. The modeling of trace conditioning is a rather difficult computational problem and is a challenge to the behavioral and connectionist approaches, mainly because of the time gap between CS and US. To account for trace conditioning, Pavlov (Conditioned reflexes: an investigation of the physiological activity of the cerebral cortex, Oxford University Press, London, 1927) postulated the existence of a stimulus “trace” in the nervous system. There are, however, many other options for solving this association problem. There are several excellent reviews of computational models of classical conditioning, but none has thus far been devoted to trace conditioning. Eight representative models of trace conditioning are briefly reviewed below with the aim of building a prospective model. As a result, one of them, comprising the most important features of its predecessors, can be suggested as a real candidate for a unified model of trace conditioning.
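
One simple way to bridge the CS-US gap, in the spirit of Pavlov's stimulus trace, is to let the CS leave an exponentially decaying trace and to drive a Rescorla-Wagner-style update with whatever trace remains at US onset. The sketch below is an illustrative assumption, not one of the eight reviewed models; the function names, the exponential form and all parameter values are hypothetical.

```python
import math

def stimulus_trace(time_since_cs_offset, tau=2.0):
    """Exponentially decaying memory trace of the CS (1.0 at CS offset)."""
    return math.exp(-time_since_cs_offset / tau)

def train(trace_interval, trials=20, alpha=0.3, lam=1.0, tau=2.0):
    """Conditioned response strength after a number of CS-US pairings.

    trace_interval: time between CS offset and US onset
    alpha:          learning rate
    lam:            asymptote supported by the US
    """
    x = stimulus_trace(trace_interval, tau)    # trace left at US onset
    v = 0.0                                    # associative strength
    for _ in range(trials):
        prediction = v * x
        v += alpha * x * (lam - prediction)    # delta rule scaled by the trace
    return v * x

short_gap = train(trace_interval=0.5)
long_gap = train(trace_interval=4.0)
print(f"short trace interval: {short_gap:.3f}")
print(f"long trace interval:  {long_gap:.3f}")
```

Under these assumptions a longer trace interval leaves a weaker trace at US onset, so acquisition over the same number of trials is slower.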

18.
The role of schedules of reinforcement in the development of superstitious conditioning was investigated in a college-age population. Participants were randomly assigned to one of eight operant schedules and instructed to remove (escape), prevent and/or remove (avoidance and escape) or produce (positive) the appearance of a computer-generated stimulus using a response pad. Results from the experiment indicate that concomitant (escape and avoidance) schedules of reinforcement are most effective in facilitating acquisition of superstitious behavior as measured by self-reports of participants.

19.
This paper reviews the author's studies on neurophysiologic mechanisms of conditioned reflex learning. Electroencephalograms, evoked potentials, activity of neocortical and hippocampal neurons and the rabbits' behavior in the course of elaboration of defensive and inhibitory conditioned reflexes to light flashes have been recorded. Electric shock (ECS) applied to the paw served as reinforcement. The study demonstrated three types of reinforcement effect on the activity of cortical neurons: activating, disinhibitory, and inhibitory. EEG activation due to reinforcement is accompanied by a change in phasic cortical neuronal activity from chaotic or irregular, typical of rest or inhibition, to regular tonic discharges (in neocortex and hippocampus) and group discharges in the stress rhythm, 5-7 Hz in the hippocampus. Following a number of conditioning trials, the effect of reinforcement is simulated by the effect of a conditioned stimulus. With EEG activation and increased regularity in impulses, facilitation of motor reactions is observed.

20.
A previously described neural-network model (Desmond 1991; Desmond and Moore 1988; Moore et al. 1989) predicts that both CS-onset-evoked and CS-offset-evoked stimulus trace processes acquire associative strength during classical conditioning, and that CR waveforms can be altered by manipulating the time at which the processes are activated. In a trace conditioning paradigm, where CS offset precedes US onset, the model predicts that onset and offset traces act in synchrony to generate unimodal CR waveforms. However, if the CS duration is subsequently lengthened on CS-alone probe trials, the model predicts that onset and offset traces will asynchronously contribute to CR output and bimodal CRs will be generated. In a delay conditioning paradigm, in which US onset occurs prior to CS offset, the model predicts that only the onset process will gain associative strength, and hence, only unimodal CRs will occur. Using the rabbit conditioned nictitating membrane response preparation, we found experimental support for these predictions. This research was supported by National Science Foundation grant BNS 88-10624 and Air Force Office of Scientific Research grant 89-0391.
