Similar Articles
20 similar articles found.
1.
In this article we review the current literature on cross-modal recognition and present new findings from our studies on object and scene recognition. Specifically, we address the questions of what kind of representation in each sensory system facilitates convergence across the senses, and how perception is modified by the interaction of the senses. In the first set of experiments, the recognition of unfamiliar objects within and across the visual and haptic modalities was investigated under changes in orientation (0° or 180°). An orientation change increased recognition errors within each modality, but this effect was reduced across modalities. Our results suggest that cross-modal representations of objects are mediated by surface-dependent representations. In a second series of experiments, we investigated how spatial information is integrated across modalities and viewpoint using scenes of familiar, 3D objects as stimuli. We found that scene recognition was less efficient when there was a change in either modality or orientation between learning and test. Furthermore, haptic learning was selectively disrupted by a verbal interpolation task. Our findings are discussed with reference to separate spatial encoding of visual and haptic scenes. We conclude by discussing a number of constraints under which cross-modal integration is optimal for object recognition. These constraints include the nature of the task and the degree of spatial and temporal congruency of information across the modalities.

2.
How are invariant representations of objects formed in the visual cortex? We describe a neurophysiological and computational approach centred on a feature-hierarchy model in which invariant representations are built by self-organizing learning based on the statistics of the visual input. The model can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or spatial continuity in continuous transformation learning. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size and, as we show in this paper, lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects, such as looming, rotation, and object-based movement. It has also been extended with top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks, and to account for how the visual system can select single objects in complex visual scenes and how multiple objects can be represented in a scene.

3.
Recently Poggio and Edelman showed that for each object there exists a smooth mapping from an arbitrary view to its standard view, and that this mapping can be learned from a sparse data set. In this paper, we extend their scheme to deal with 3D flexible objects. We show that the mappings from an arbitrary view to the standard view and to its rotated view can be synthesized, even for a flexible object, by learning from examples. To classify 3D flexible objects, we propose two methods that require no special knowledge of the target flexible objects: (1) learning the characteristic function of the object, and (2) learning the view-change transformation. We demonstrate their performance by computer simulation. Received: 1 March 1993 / Accepted in revised form: 7 June 1993
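The smooth view-to-standard-view mapping learned from sparse examples can be illustrated with a toy Gaussian kernel-regression sketch. This is my own minimal illustration, not the authors' method; the function name, the kernel choice, and the bandwidth are assumptions:

```python
import math

def rbf_map(train_views, train_targets, query, sigma=1.0):
    """Smooth mapping from an arbitrary view to a standard view,
    interpolated from sparse (view, standard-view) training pairs
    via Gaussian kernel regression.

    train_views  : list of feature vectors (arbitrary views)
    train_targets: list of feature vectors (corresponding standard views)
    query        : feature vector of a novel view
    """
    # Gaussian weight for each training view, based on distance to the query
    weights = []
    for v in train_views:
        d2 = sum((a - b) ** 2 for a, b in zip(v, query))
        weights.append(math.exp(-d2 / (2 * sigma ** 2)))
    total = sum(weights)
    # weighted average of the target vectors = interpolated standard view
    dim = len(train_targets[0])
    return [sum(w * t[i] for w, t in zip(weights, train_targets)) / total
            for i in range(dim)]
```

A query close to a training view reproduces that view's target almost exactly, while intermediate queries are interpolated smoothly, which is the essential property the scheme relies on.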

4.
Current accounts of spatial cognition and human-object interaction suggest that the representation of peripersonal space depends on an action-specific system that remaps its representation according to action requirements. Here we demonstrate that this mechanism is sensitive to knowledge about the properties of objects. In two experiments we explored the interaction between physical distance and object attributes (functionality, desirability, graspability, etc.) through a reaching-estimation task in which participants indicated whether objects were near enough to be reached. Using both a real and a digital scenario, we demonstrate that perceived reaching distance is influenced by ease of grasp and by the affective valence of an object. Objects with a positive affective valence tend to be perceived as reachable at locations at which neutral or negative objects are perceived as non-reachable. In addition, reaction times to distant (non-reachable) positive objects suggest a bias to perceive positive objects as closer than negative and neutral objects (Experiment 2). These results highlight the importance of the affective valence of objects in the action-specific mapping of the peripersonal/extrapersonal space system.

5.
The cerebral cortex exploits spatiotemporal continuity in the world to help build invariant representations; in vision, these might be representations of objects. The temporal continuity typical of objects has been used in an associative learning rule with a short-term memory trace to help build invariant object representations. In this paper, we show that spatial continuity can also help a system self-organize invariant representations. We introduce a new learning paradigm, “continuous transformation learning”, which operates by mapping spatially similar input patterns onto the same postsynaptic neurons in a competitive learning system. As the inputs move through the space of possible continuous transforms (e.g. translation, rotation), the active synapses onto the set of postsynaptic neurons are modified. Because other transforms of the same stimulus overlap with previously learned exemplars, a common set of postsynaptic neurons is activated by the new transforms, and learning of the new active inputs onto the same postsynaptic neurons is facilitated. We demonstrate that a hierarchical model of cortical processing in the ventral visual system can be trained with continuous transformation learning, and highlight differences between the invariant representations learned in this way and those achieved by trace learning.
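The mechanism described above can be sketched in a few lines: a winner-take-all layer sees overlapping transforms of a stimulus in sequence, and the Hebbian strengthening on shared inputs keeps the same output neuron winning across the whole sequence. This is a toy sketch, not the authors' implementation; the initialization scheme and learning rate are assumptions made for reproducibility:

```python
def ct_learn(transforms, n_out=4, lr=1.0):
    """Toy continuous-transformation learning in a competitive layer.

    transforms: binary input patterns ordered so that successive
    transforms overlap (e.g. a bar shifted one position at a time).
    Returns the learned weights and the winning neuron per transform.
    """
    n_in = len(transforms[0])
    # deterministic toy initialization with a slight per-neuron bias
    w = [[0.1 + 0.01 * o + 0.001 * j for j in range(n_in)]
         for o in range(n_out)]
    winners = []
    for x in transforms:
        # competition: the most strongly activated output neuron wins
        acts = [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
        win = acts.index(max(acts))
        winners.append(win)
        # Hebbian update toward the active inputs, then weight normalization
        row = [wi + lr * xi for wi, xi in zip(w[win], x)]
        norm = sum(r * r for r in row) ** 0.5
        w[win] = [r / norm for r in row]
    return w, winners
```

Because each transform shares active inputs with the previous one, the strengthened weights on the overlapping inputs bias the next competition toward the same neuron, so a single output neuron ends up responding to every transform, i.e. an invariant representation.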

6.
Fragment-based learning of visual object categories
When we perceive a visual object, we implicitly or explicitly associate it with a category we know. It is known that the visual system can use local, informative image fragments of a given object, rather than the whole object, to classify it into a familiar category. How we acquire informative fragments has remained unclear. Here, we show that human observers acquire informative fragments during the initial learning of categories. We created new, but naturalistic, classes of visual objects by using a novel "virtual phylogenesis" (VP) algorithm that simulates key aspects of how biological categories evolve. Subjects were trained to distinguish two of these classes by using whole exemplar objects, not fragments. We hypothesized that if the visual system learns informative object fragments during category learning, then subjects must be able to perform the newly learned categorization by using only the fragments as opposed to whole objects. We found that subjects were able to successfully perform the classification task by using each of the informative fragments by itself, but not by using any of the comparable, but uninformative, fragments. Our results not only reveal that novel categories can be learned by discovering informative fragments but also introduce and illustrate the use of VP as a versatile tool for category-learning research.
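A fragment's informativeness is commonly quantified (e.g. in Ullman-style fragment approaches) as the mutual information between a binary "fragment detected" variable and the category label. A minimal estimator, using hypothetical presence/label data purely for illustration:

```python
import math

def fragment_information(presence, labels):
    """Mutual information (bits) between a binary fragment-detection
    variable and a binary category label, estimated from counts.

    presence: list of 0/1 flags, fragment detected in each image
    labels  : list of 0/1 category labels, one per image
    """
    n = len(labels)
    mi = 0.0
    for f in (0, 1):
        for c in (0, 1):
            # joint and marginal probabilities from co-occurrence counts
            joint = sum(1 for p, l in zip(presence, labels)
                        if p == f and l == c) / n
            pf = sum(1 for p in presence if p == f) / n
            pc = sum(1 for l in labels if l == c) / n
            if joint > 0:
                mi += joint * math.log2(joint / (pf * pc))
    return mi
```

A fragment present in exactly the images of one category carries 1 bit about the label, while a fragment whose presence is independent of the label carries none, which is the sense in which "informative" versus "uninformative" fragments differ.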

7.
A common method for testing preference for objects is to determine which of a pair of objects is approached first in a paired-choice paradigm. In comparison, many studies of preference for environmental enrichment (EE) devices have used paradigms in which total time spent with each of a pair of objects is used to determine preference. While each of these paradigms gives a specific measure of the preference for one object in comparison to another, neither method allows comparisons between multiple objects simultaneously. Since it is possible that several EE objects would be placed in a cage together to improve animal welfare, it is important to determine measures for rats’ preferences in conditions that mimic this potential home cage environment. While it would be predicted that each type of measure would produce similar rankings of objects, this has never been tested empirically. In this study, we compared two paradigms: EE objects were either presented in pairs (paired-choice comparison) or four objects were presented simultaneously (simultaneous presentation comparison). We used frequency of first interaction and time spent with each object to rank the objects in the paired-choice experiment, and time spent with each object to rank the objects in the simultaneous presentation experiment. We also considered the behaviours elicited by the objects to determine if these might be contributing to object preference. We demonstrated that object ranking based on time spent with objects from the paired-choice experiment predicted object ranking in the simultaneous presentation experiment. Additionally, we confirmed that behaviours elicited were an important determinant of time spent with an object. This provides convergent evidence that both paired choice and simultaneous comparisons provide valid measures of preference for EE objects in rats.
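Agreement between the two paradigms' rankings can be checked with a Spearman rank correlation over the time-spent orderings. A minimal sketch, with object names and times invented for illustration (not data from the study):

```python
def rank_by_time(time_spent):
    """Rank objects (1 = most time) from a dict of object -> total seconds."""
    ordered = sorted(time_spent, key=time_spent.get, reverse=True)
    return {obj: i + 1 for i, obj in enumerate(ordered)}

def spearman(rank_a, rank_b):
    """Spearman rank correlation between two tie-free rankings of the
    same set of objects."""
    n = len(rank_a)
    d2 = sum((rank_a[o] - rank_b[o]) ** 2 for o in rank_a)
    return 1 - 6 * d2 / (n * (n * n - 1))

# hypothetical time-spent data (seconds) from the two paradigms
paired = rank_by_time({"tube": 310, "ball": 150, "block": 90, "rope": 40})
simul = rank_by_time({"tube": 120, "ball": 80, "block": 30, "rope": 10})
```

Here `spearman(paired, simul)` equals 1.0 because the orderings agree, which is the pattern the study reports: paired-choice time-spent rankings predicted simultaneous-presentation rankings.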

8.
This paper introduces a new approach to assess visual representations underlying the recognition of objects. Human performance is modeled by CLARET, a machine learning and matching system, based on inductive logic programming and graph matching principles. The model is applied to data of a learning experiment addressing the role of prior experience in the ontogenesis of mental object representations. Prior experience was varied in terms of sensory modality, i.e. visual versus haptic versus visuohaptic. The analysis revealed distinct differences between the representational formats used by subjects with haptic versus those with no prior object experience. These differences suggest that prior haptic exploration stimulates the evolution of object representations which are characterized by an increased differentiation between attribute values and a pronounced structural encoding.

9.
We investigated the presence of a key feature of human word comprehension in a five-year-old Border Collie: the generalization of a word referring to an object to other objects of the same shape, also known as shape bias. Our first experiment confirmed a solid history of word learning in the dog, making it possible for certain object features to have become central in his word comprehension. Using an experimental paradigm originally employed to establish shape bias in children and human adults, we taught the dog arbitrary object names (e.g. "dax") for novel objects. Two further experiments showed that when briefly familiarized with word-object mappings the dog generalized object names not to object shape but to object size. A fourth experiment showed that when familiarized with a word-object mapping for a longer period, the dog tended to generalize the word to objects with the same texture. These results show that the dog tested did not display human-like word comprehension, but rather word generalization and word-reference development of a qualitatively different nature from that of humans. We conclude that the shape bias for word generalization in humans reflects the distinct evolutionary history of the human sensory system for object identification, and that more research is needed to confirm qualitative differences in word generalization between humans and dogs.

10.
Standards such as CORBA are spreading in the development of large-scale projects. However, CORBA lacks a mobility mechanism, a useful feature for dealing with system dynamics. In this paper, we propose a generic solution for object mobility in CORBA within the framework of its lifecycle service. An implementation at the object level handles the migration process using intermediary objects. A group mechanism is used to manage the object-creation infrastructure so as to allow scalability. We chose a multi-agent, self-organizing group mechanism to reduce the administrative burden for a large system. Performance tests show that reasonable performance can be achieved with a high-level, generic and portable implementation. This revised version was published online in July 2006 with corrections to the Cover Date.

11.
Active exploration of large-scale environments leads to better learning of spatial layout than does passive observation [1] [2] [3]. But active exploration might also help us to remember the appearance of individual objects in a scene. In fact, when we encounter new objects, we often manipulate them so that they can be seen from a variety of perspectives. We present here the first evidence that active control of the visual input in this way facilitates later recognition of objects. Observers who actively rotated novel, three-dimensional objects on a computer screen later showed more efficient visual recognition than observers who passively viewed the exact same sequence of images of these virtual objects. During active exploration, the observers focused mainly on the 'side' or 'front' views of the objects (see also [4] [5] [6]). The results demonstrate that how an object is represented for later recognition is influenced by whether or not one controls the presentation of visual input during learning.

12.
Distinct cerebral pathways for object identity and number in human infants
All humans, regardless of culture and education, possess an intuitive understanding of number. Behavioural evidence suggests that numerical competence may be present early in infancy. Here, we present brain-imaging evidence for distinct cerebral coding of number and object identity in 3-month-old infants. We compared the visual event-related potentials evoked by unforeseen changes either in the identity of the objects forming a set or in the cardinality of this set. In adults and 4-year-old children, number sense relies on a dorsal system of bilateral intraparietal areas, distinct from the ventral occipitotemporal system sensitive to object identity. Scalp voltage topographies and cortical source modelling revealed a similar distinction in 3-month-olds, with changes in object identity activating ventral temporal areas, whereas changes in number involved an additional right parietoprefrontal network. These results underscore the developmental continuity of number sense by pointing to early functional biases in brain organization that may channel subsequent learning to restricted brain areas.

13.
Regularities are gradually represented in cortex after extensive experience [1], and yet they can influence behavior after minimal exposure [2, 3]. What kind of representations support such rapid statistical learning? The medial temporal lobe (MTL) can represent information from even a single experience [4], making it a good candidate system for assisting in initial learning about regularities. We combined anatomical segmentation of the MTL, high-resolution fMRI, and multivariate pattern analysis to identify representations of objects in cortical and hippocampal areas of human MTL, assessing how these representations were shaped by exposure to regularities. Subjects viewed a continuous visual stream containing hidden temporal relationships: pairs of objects that reliably appeared nearby in time. We compared the pattern of blood-oxygen-level-dependent activity evoked by each object before and after this exposure, and found that perirhinal cortex, parahippocampal cortex, subiculum, CA1, and CA2/CA3/dentate gyrus (CA2/3/DG) encoded regularities by increasing the representational similarity of their constituent objects. Most regions exhibited bidirectional associative shaping, whereas CA2/3/DG represented regularities in a forward-looking, predictive manner. These findings suggest that object representations in MTL come to mirror the temporal structure of the environment, supporting rapid and incidental statistical learning.
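The representational-similarity comparison described above, contrasting the patterns evoked by a pair of objects before versus after exposure, can be sketched with a simple Pearson correlation over hypothetical voxel patterns. This is my own illustration of the general analysis, not the authors' pipeline:

```python
import math

def pearson(x, y):
    """Pearson correlation between two activity patterns (lists of floats)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def similarity_change(pre_a, pre_b, post_a, post_b):
    """Change in representational similarity of paired objects A and B
    from before to after exposure; positive values mean the two objects'
    patterns became more alike."""
    return pearson(post_a, post_b) - pearson(pre_a, pre_b)
```

A positive `similarity_change` for pairs embedded in the stream, relative to unpaired control objects, is the signature of regularity encoding reported for the MTL regions.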

14.
Object detection in the fly during simulated translatory flight
Translatory movement of an animal in its environment induces optic flow that contains information about the three-dimensional layout of the surroundings: as a rule, images of objects that are closer to the animal move faster across the retina than those of more distant objects. Such relative motion cues are used by flies to detect objects in front of a structured background. We confronted flying flies, tethered to a torque meter, with front-to-back motion of patterns displayed on two CRT screens, thereby simulating translatory motion of the background as experienced by an animal during straight flight. The torque meter measured the instantaneous turning responses of the fly around its vertical body axis. During short time intervals, object motion was superimposed on background pattern motion. The average turning response towards such an object depends on both object and background velocity in a characteristic way: (1) in order to elicit significant responses object motion has to be faster than background motion; (2) background motion within a certain range of velocities improves object detection. These properties can be interpreted as adaptations to situations as they occur in natural free flight. We confirmed that the measured responses were mediated mainly by a control system specialized for the detection of objects rather than by the compensatory optomotor system responsible for course stabilization. Accepted: 20 March 1997

15.
Recently we introduced a new version of the perceptual retouch model incorporating two interactive binding operations: binding features into objects, and binding the resulting feature-objects with a large-scale oscillatory system that acts as an intermediary through which perceptual information reaches consciousness-level representation. The relative level of synchronized firing of the neurons representing the features of an object, obtained after the second-stage synchronizing modulation, is used as the equivalent of conscious perception of the corresponding object. Here, this model is used to simulate the interaction of two successive featured objects as a function of stimulus onset asynchrony (SOA). The model output reproduces typical results of mutual masking: at the shortest and longest SOAs the correct-perception rates for the first and second objects are comparable, while at intermediate SOAs the second object dominates the first. Additionally, at the shortest SOAs the model simulates the misbinding of features that forms illusory objects.

16.
When we perceive a visual object, we implicitly or explicitly associate it with an object category we know. Recent research has shown that the visual system can use local, informative image fragments of a given object, rather than the whole object, to classify it into a familiar category. We have previously reported, using human psychophysical studies, that when subjects learn new object categories using whole objects, they incidentally learn informative fragments, even when not required to do so. However, the neuronal mechanisms by which we acquire and use informative fragments, as well as category knowledge itself, have remained unclear. Here we describe the methods by which we adapted the relevant human psychophysical methods to awake, behaving monkeys and replicated key previous psychophysical results. This establishes awake, behaving monkeys as a useful system for future neurophysiological studies not only of informative fragments in particular, but also of object categorization and category learning in general.

17.
Over successive stages, the ventral visual system of the primate brain develops neurons that respond selectively to particular objects or faces with translation, size and view invariance. The powerful neural representations found in inferotemporal cortex form a remarkably rapid and robust basis for object recognition, which belies the difficulties the system faces when learning in natural visual environments. A central issue in understanding biological object recognition is how these neurons learn to form separate representations of objects from complex visual scenes composed of multiple objects. We show how a one-layer competitive network of ‘spiking’ neurons is able to learn separate transformation-invariant representations (exemplified by one-dimensional translations) of visual objects that are always seen together, moving in lock-step but separated in space. This is achieved by combining ‘Mexican hat’ functional lateral connectivity with cell firing-rate adaptation to temporally segment the input representations of competing stimuli through anti-phase oscillations (perceptual cycles). These spiking dynamics are generated quickly and reliably, enabling selective modification of the feed-forward connections to neurons in the next layer through spike-timing-dependent plasticity (STDP), resulting in separate translation-invariant representations of each stimulus. Variations in key properties of the model are investigated with respect to the network’s ability to develop appropriate input and output representations through STDP. Contrary to earlier rate-coded models of this learning process, this work shows how spiking neural networks may learn about more than one stimulus at a time without suffering from the ‘superposition catastrophe’. We take these results to suggest that spiking dynamics are key to understanding biological visual object recognition.
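The STDP rule such models rely on is typically a pair-based exponential window: a presynaptic spike shortly before a postsynaptic spike potentiates the synapse, the reverse order depresses it. A minimal sketch (the parameter values are illustrative assumptions, not taken from this paper):

```python
import math

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Pair-based STDP weight change for one pre/post spike pair.

    dt = t_post - t_pre in milliseconds. Positive dt (pre fires before
    post) gives potentiation; negative dt gives depression; the magnitude
    decays exponentially with |dt| on timescale tau.
    """
    if dt > 0:
        return a_plus * math.exp(-dt / tau)   # pre -> post: strengthen
    elif dt < 0:
        return -a_minus * math.exp(dt / tau)  # post -> pre: weaken
    return 0.0
```

In the model, the anti-phase oscillations ensure that each output neuron's spikes fall consistently just after the spikes of only one stimulus's inputs, so this timing-sensitive rule selectively strengthens one stimulus's feed-forward connections.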

18.
19.
To investigate scene segmentation in the visual system, we present a model of two reciprocally connected visual areas comprising spiking neurons. The peripheral area P is modeled on the primary visual cortex, while the central area C is modeled as an associative memory representing stimulus objects according to Hebbian learning. Without feedback from area C, spikes corresponding to stimulus representations in P are synchronized only locally (the slow state). Feedback from C can induce fast oscillations and an increase in synchronization range (the fast state). When a superposition of several stimulus objects is presented, scene segmentation happens on a time scale of hundreds of milliseconds by alternating epochs of the slow and fast states, with neurons representing the same object simultaneously in the fast state. We relate our simulation results to various phenomena observed in neurophysiological experiments, such as stimulus-dependent synchronization of fast oscillations, synchronization on different time scales, ongoing activity, and attention-dependent neural activity.

20.
We propose a fish-detection system based on deep network architectures to robustly detect and count fish under a variety of benthic background and illumination conditions. The algorithm consists of an ensemble of Region-based Convolutional Neural Networks linked in a cascade structure by Long Short-Term Memory networks. The proposed network is trained efficiently, as all components are jointly trained by backpropagation. We train and test the system on a dataset of 18 videos taken in the wild. In our dataset, there are around 20 to 100 fish objects per frame, many with small pixel areas (less than 900 square pixels). Across a series of experiments and ablation tests, the proposed system preserves detection accuracy despite multi-scale distortions, cropping and varying background environments. We present an analysis showing how object-localization accuracy is increased by an automatic correction mechanism in the deep network's cascaded ensemble structure: the correction mechanism rectifies errors in the predictions as information progresses through the network cascade. Our findings regarding ensemble system architectures can be generalized to other object-detection applications.
