Similar Articles
20 similar articles found.
1.
Nere A, Olcese U, Balduzzi D, Tononi G. PLoS ONE. 2012;7(5):e36958.
In this work we investigate the possibilities offered by a minimal framework of artificial spiking neurons to be deployed in silico. Here we introduce a hierarchical network architecture of spiking neurons which learns to recognize moving objects in a visual environment and determine the correct motor output for each object. These tasks are learned through both supervised and unsupervised spike timing dependent plasticity (STDP). STDP is responsible for the strengthening (or weakening) of synapses in relation to pre- and post-synaptic spike times and has been described as a Hebbian paradigm taking place both in vitro and in vivo. We utilize a variation of STDP learning, called burst-STDP, which is based on the notion that, since spikes are expensive in terms of energy consumption, strong bursting activity carries more information than single (sparse) spikes. Furthermore, this learning algorithm takes advantage of homeostatic renormalization, which has been hypothesized to promote memory consolidation during NREM sleep. Using this learning rule, we design a spiking neural network architecture capable of object recognition, motion detection, attention towards important objects, and motor control outputs. We demonstrate the abilities of our design in a simple environment with distractor objects, multiple objects moving concurrently, and in the presence of noise. Most importantly, we show how this neural network is capable of performing these tasks using a simple leaky integrate-and-fire (LIF) neuron model with binary synapses, making it fully compatible with state-of-the-art digital neuromorphic hardware designs. As such, the building blocks and learning rules presented in this paper appear promising for scalable, fully neuromorphic systems implemented in hardware chips.
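
The abstract specifies LIF neurons with binary synapses trained by burst-STDP but does not give the exact update rule, so the following is only a rough Python sketch under assumed parameters: potentiation is gated by recent postsynaptic bursts, and a toy downscaling step stands in for homeostatic renormalization.

import numpy as np

rng = np.random.default_rng(0)

n_in, T = 64, 500                            # inputs, simulation steps
w = rng.integers(0, 2, n_in).astype(float)   # binary synapses (0 or 1)
v, v_thresh, leak = 0.0, 8.0, 0.9            # LIF membrane state and params

pre_spikes = rng.random((T, n_in)) < 0.05    # Poisson-like input spikes
trace = np.zeros(n_in)                       # recent presynaptic activity
burst = 0                                    # recent postsynaptic spike count

for t in range(T):
    trace = 0.8 * trace + pre_spikes[t]      # decaying eligibility trace
    v = leak * v + w @ pre_spikes[t]         # leaky integration
    if v >= v_thresh:
        v = 0.0                              # reset after a spike
        burst += 1
        if burst >= 3:                       # a burst: potentiate the
            w[trace > 0.5] = 1.0             # recently active inputs
            burst = 0
    else:
        burst = max(0, burst - 1)

# Toy homeostatic renormalization: if too many synapses are on, switch a
# random subset off, loosely echoing sleep-dependent synaptic downscaling.
on = np.flatnonzero(w)
if on.size > n_in // 2:
    w[rng.choice(on, size=on.size // 10, replace=False)] = 0.0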

2.
This paper evaluates the degree of saliency of texts in natural scenes using visual saliency models. A large-scale scene image database with pixel-level ground truth is created for this purpose. Using this database and five state-of-the-art models, visual saliency maps that represent the degree of saliency of the objects are calculated. The receiver operating characteristic curve is employed to evaluate the saliency of scene texts as calculated by the visual saliency models. A visualization is given of the distribution of scene texts and non-texts in the space constructed by three kinds of saliency maps, calculated using Itti's visual saliency model with intensity, color, and orientation features. This distribution indicates that text characters are more salient than their non-text neighbors and can be captured from the background; scene texts can therefore be extracted from scene images. With this in mind, a new visual saliency architecture, named the hierarchical visual saliency model, is proposed. The hierarchical model is based on Itti's model and consists of two stages. In the first stage, Itti's model is used to calculate the saliency map, and Otsu's global thresholding algorithm is applied to extract the salient region of interest. In the second stage, Itti's model is applied to the salient region to calculate the final saliency map. An experimental evaluation demonstrates that the proposed model outperforms Itti's model in terms of captured scene texts.
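
The two-stage pipeline is concrete enough to sketch. In the sketch below a toy center-surround operator stands in for Itti's full model (a loud assumption), and Otsu's threshold is implemented from its standard between-class-variance definition.

import numpy as np
from scipy.ndimage import uniform_filter

def toy_saliency(img, size=17):
    # Center-surround contrast: deviation from the local mean luminance.
    return np.abs(img - uniform_filter(img, size=size))

def otsu_threshold(x, bins=256):
    # Classic Otsu: pick the level maximizing between-class variance.
    hist, edges = np.histogram(x, bins=bins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                       # class-0 probability
    m = np.cumsum(p * centers)              # cumulative mean
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros_like(w0)
    between[valid] = (m[-1] * w0 - m)[valid] ** 2 / (w0 * w1)[valid]
    return centers[np.argmax(between)]

img = np.random.rand(128, 128)              # stand-in for a scene image
s1 = toy_saliency(img)                      # stage 1: full-image saliency
mask = s1 > otsu_threshold(s1)              # Otsu keeps the salient region
rows, cols = np.where(mask)
r0, r1 = rows.min(), rows.max() + 1
c0, c1 = cols.min(), cols.max() + 1
s2 = toy_saliency(img[r0:r1, c0:c1])        # stage 2: model on the region only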

3.
An important role of visual systems is to detect nearby predators, prey, and potential mates, which may be distinguished in part by their motion. When an animal is at rest, an object moving in any direction may easily be detected by motion-sensitive visual circuits. During locomotion, however, this strategy is compromised because the observer must detect a moving object within the pattern of optic flow created by its own motion through the stationary background. However, objects that move creating back-to-front (regressive) motion may be unambiguously distinguished from stationary objects because forward locomotion creates only front-to-back (progressive) optic flow. Thus, moving animals should exhibit an enhanced sensitivity to regressively moving objects. We explicitly tested this hypothesis by constructing a simple fly-sized robot that was programmed to interact with a real fly. Our measurements indicate that whereas walking female flies freeze in response to a regressively moving object, they ignore a progressively moving one. Regressive motion salience also explains observations of behaviors exhibited by pairs of walking flies. Because the assumptions underlying the regressive motion salience hypothesis are general, we suspect that the behavior we have observed in Drosophila may be widespread among eyed, motile organisms.
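
The regressive-motion test itself reduces to a one-line rule. The sketch below assumes azimuth is measured from straight ahead (0 degrees) toward the rear (180 degrees) on either side, so forward self-motion gives every stationary feature a positive azimuthal rate; the sign convention is an assumption, not the authors'.

import numpy as np

def classify_motion(dazimuth_dt_deg_s):
    """Forward self-motion sweeps stationary features rearward (positive
    rate); a negative rate (back-to-front) can only come from a genuinely
    moving object, so it should trigger the freeze response."""
    return np.where(dazimuth_dt_deg_s < 0,
                    "regressive (freeze)", "progressive (ignore)")

rates = np.array([12.0, 3.5, -8.0])   # deg/s for three tracked features
print(classify_motion(rates))         # only the third feature is regressive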

4.
Perception of objects and motions in the visual scene is one of the basic problems in the visual system. There exist ‘What’ and ‘Where’ pathways in the superior visual cortex, starting from the simple cells in the primary visual cortex. The former is able to perceive objects such as forms, color, and texture, and the latter perceives ‘where’, for example, velocity and direction of spatial movement of objects. This paper explores brain-like computational architectures of visual information processing. We propose a visual perceptual model and a computational mechanism for training it. The computational model is a three-layer network. The first layer is the input layer, which receives the stimuli from natural environments. The second layer represents the internal neural information. The connections between the first and second layers, called the receptive fields of neurons, are self-adaptively learned based on the principle of sparse neural representation. To this end, we introduce the Kullback-Leibler divergence as the measure of independence between neural responses and derive the learning algorithm by minimizing the cost function. The proposed algorithm is applied to train the basis functions, namely receptive fields, which are localized, oriented, and bandpass. The resulting receptive fields of neurons in the second layer have characteristics resembling those of simple cells in the primary visual cortex. Based on these basis functions, we further construct the third layer for perception of what and where in the superior visual cortex. The proposed model is able to perceive objects and their motions with high accuracy and strong robustness against additive noise. Computer simulation results in the final section show the feasibility of the proposed perceptual model and the high efficiency of the learning algorithm.
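
The abstract derives its learning rule from a Kullback-Leibler independence measure that it does not spell out. As a stand-in, the sketch below uses standard L1 sparse coding (scikit-learn's MiniBatchDictionaryLearning), which on natural image patches also yields localized, oriented, bandpass basis functions; the random patches here are placeholders.

import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
patches = rng.standard_normal((2000, 64))        # stand-in for 8x8 image patches
patches -= patches.mean(axis=1, keepdims=True)   # remove each patch's DC level

learner = MiniBatchDictionaryLearning(n_components=100, alpha=1.0,
                                      batch_size=200, random_state=0)
codes = learner.fit_transform(patches)           # sparse responses (layer 2)
basis = learner.components_                      # learned receptive fields

# On real natural-image patches the rows of `basis`, reshaped to 8x8, come
# out localized, oriented, and bandpass, like V1 simple-cell receptive fields.
print(basis.shape, "nonzero codes per patch:", (codes != 0).sum(1).mean())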

5.
Zahar Y, Wagner H, Gutfreund Y. PLoS ONE. 2012;7(6):e39559.
The saliency of visual objects is based on center-to-background contrast. In particular, objects differing in one feature from the background may be perceived as more salient. It is not clear to what extent this so-called "pop-out" effect, observed in humans and primates, governs saliency perception in non-primates as well. In this study we searched for neural correlates of pop-out perception in neurons located in the optic tectum of the barn owl. We measured the responses of tectal neurons to stimuli appearing within the visual receptive field, embedded in a large array of additional stimuli (the background). Responses were compared between contrasting and uniform conditions. In a contrasting condition the center was different from the background, while in the uniform condition it was identical to the background. Most tectal neurons responded better to stimuli in the contrasting condition than in the uniform condition when the contrast between center and background was the direction of motion, but not when it was the orientation of a bar. Tectal neurons also preferred contrasting over uniform stimuli when the center was looming and the background receding, but not when the center was receding and the background looming. Therefore, our results do not support the hypothesis that tectal neurons are sensitive to pop-out per se. The specific sensitivity to the motion-contrasting stimulus is consistent with the idea that object motion, and not large-field motion (e.g., self-induced motion), is coded in the neural responses of tectal neurons.

6.
The mechanism by which a complex auditory scene is parsed into coherent objects depends on poorly understood interactions between task-driven and stimulus-driven attentional processes. We illuminate these interactions in a simultaneous behavioral–neurophysiological study in which we manipulate participants' attention to different features of an auditory scene (with a regular target embedded in an irregular background). Our experimental results reveal that attention to the target, rather than to the background, correlates with a sustained (steady-state) increase in the measured neural target representation over the entire stimulus sequence, beyond auditory attention's well-known transient effects on onset responses. This enhancement, in both power and phase coherence, occurs exclusively at the frequency of the target rhythm, and is only revealed when contrasting two attentional states that direct participants' focus to different features of the acoustic stimulus. The enhancement originates in auditory cortex and covaries with both behavioral task and the bottom-up saliency of the target. Furthermore, the target's perceptual detectability improves over time, correlating strongly, within participants, with the target representation's neural buildup. These results have substantial implications for models of foreground/background organization, supporting a role of neuronal temporal synchrony in mediating auditory object formation.
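
The two neural measures named here, steady-state power and inter-trial phase coherence at the target rhythm's frequency, can be illustrated directly. The epoch shapes and the 4 Hz target rate below are illustrative assumptions, not the study's values.

import numpy as np

fs, f_target, n_trials, n_samp = 250.0, 4.0, 40, 1000
rng = np.random.default_rng(1)
t = np.arange(n_samp) / fs
# Simulated epochs: a weak 4 Hz component with consistent phase, plus noise.
epochs = (0.3 * np.sin(2 * np.pi * f_target * t)
          + rng.standard_normal((n_trials, n_samp)))

spectra = np.fft.rfft(epochs, axis=1)
freqs = np.fft.rfftfreq(n_samp, d=1 / fs)
k = np.argmin(np.abs(freqs - f_target))         # bin nearest the target rate

power = np.mean(np.abs(spectra[:, k]) ** 2)     # steady-state power
phase = np.angle(spectra[:, k])
itc = np.abs(np.mean(np.exp(1j * phase)))       # inter-trial phase coherence

print(f"power={power:.1f}, phase coherence={itc:.2f} (0=random, 1=locked)")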

7.
Perception of objects and motions in the visual scene is one of the basic problems in the visual system. There exist 'What' and 'Where' pathways in the superior visual cortex, starting from the simple cells in the primary visual cortex. The former is able to perceive objects such as forms, color, and texture, and the latter perceives 'where', for example, velocity and direction of spatial movement of objects. This paper explores brain-like computational architectures of visual information processing. We propose a visual perceptual model and a computational mechanism for training it. The computational model is a three-layer network. The first layer is the input layer, which receives the stimuli from natural environments. The second layer represents the internal neural information. The connections between the first and second layers, called the receptive fields of neurons, are self-adaptively learned based on the principle of sparse neural representation. To this end, we introduce the Kullback-Leibler divergence as the measure of independence between neural responses and derive the learning algorithm by minimizing the cost function. The proposed algorithm is applied to train the basis functions, namely receptive fields, which are localized, oriented, and bandpass. The resulting receptive fields of neurons in the second layer have characteristics resembling those of simple cells in the primary visual cortex. Based on these basis functions, we further construct the third layer for perception of what and where in the superior visual cortex. The proposed model is able to perceive objects and their motions with high accuracy and strong robustness against additive noise. Computer simulation results in the final section show the feasibility of the proposed perceptual model and the high efficiency of the learning algorithm.

8.
Existing computational models of structure-from-motion (the appearance of three-dimensional motion generated by moving two-dimensional patterns) are all based on variations of optical flow or feature-point correspondence within the interior of single objects. Three separate phenomena provide strong evidence that in human vision, structure-from-motion is significantly affected by surface boundary cues. In the first, a rotating cylinder is seen, though no variation in optical flow exists across the apparent cylinder. In the second, the shape of the bounding contour of a moving pattern dominates the actual differential motion within the pattern. In the third, the appearance of independently moving objects changes significantly when the boundary between them becomes indistinct. We describe a simple computational model sufficient to account for these effects. The model is based on qualitative constraints relating possible object motions to patterns of flow, together with an understanding of the patterns of flow that can be discriminated in practice.

9.
Auditory communication in humans and other animals frequently takes place in noisy environments with many co-occurring signallers. Receivers are thus challenged to rapidly recognize salient auditory signals and filter out irrelevant sounds. Most bird species produce a variety of complex vocalizations that function to communicate with other members of their own species, and behavioural evidence broadly supports preferences for conspecific over heterospecific sounds (auditory species recognition). However, it remains unclear whether such auditory signals are categorically recognized by the sensory and central nervous system. Here, we review 53 published studies that compare avian neural responses between conspecific and heterospecific vocalizations. Irrespective of the techniques used to characterize neural activity, distinct nuclei of the auditory forebrain are consistently shown to be conspecific-selective across taxa, even in response to unfamiliar individuals with distinct acoustic properties. Yet species-specific neural discrimination is not a stereotyped auditory response; it is modulated according to salience, depending, for example, on ontogenetic exposure to conspecific versus heterospecific stimuli. Neuromodulators, in particular norepinephrine, may mediate species recognition by regulating the accuracy of neuronal coding for salient conspecific stimuli. Our review lends strong support to the existence of neural structures that categorically recognize conspecific signals despite the highly variable physical properties of the stimulus. The available data support a 'perceptual filter'-based mechanism for determining the saliency of a signal, in which species identity and social experience combine to influence the neural processing of species-specific auditory stimuli. Finally, we present hypotheses and their testable predictions to propose next steps in species-recognition research within the emerging model of the neural conceptual construct in avian auditory recognition.

10.
Externally generated visual motion signals can cause the illusion of self-motion in space (vection) and corresponding visually evoked postural responses (VEPRs). These VEPRs are not simple responses to optokinetic stimulation but are modulated by the configuration of the environment. The aim of this paper is to explore what factors modulate VEPRs in a high-quality virtual reality (VR) environment where real and virtual foreground objects served as static visual, auditory and haptic reference points. Data from four experiments on visually evoked postural responses show that: 1) visually evoked postural sway in the lateral direction is modulated by the presence of static anchor points, which can be haptic, visual or auditory reference signals; 2) real objects and their matching virtual reality representations as visual anchors have different effects on postural sway; and 3) visual motion in the anterior-posterior plane induces robust postural responses that are not modulated by the presence of reference signals or by the reality of objects that can serve as visual anchors in the scene. We conclude that automatic postural responses to laterally moving visual stimuli are strongly influenced by the configuration and interpretation of the environment and draw on multisensory representations. Different postural responses were observed for real and virtual visual reference objects. On the basis that automatic visually evoked postural responses in high-fidelity virtual environments should mimic those seen in real situations, we propose to use the observed effect as a robust objective test for presence and fidelity in VR.

11.
Kazanovich Y, Borisyuk R. Biosystems. 2002;67(1-3):103-111.
We describe a new solution to the problem of consecutive selection of objects in a visual scene by an oscillatory neural network with global interaction realised through a central executive element (central oscillator). Frequency coding is used to represent greyscale images in the network. The functioning of the network is based on three main principles: (1) synchronisation of oscillators via phase-locking, (2) adaptation of the natural frequency of the central oscillator, and (3) resonant increase of the amplitudes of the oscillators which work in-phase with the central oscillator. Examples of network simulations are presented to show the reliability of the results of consecutive selection of objects under conditions of constant and varying brightness of the objects.
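
The three principles are explicit enough for a toy sketch. The coupling constants, the in-phase criterion, and the amplitude dynamics below are assumptions, not the authors' values.

import numpy as np

rng = np.random.default_rng(2)
n, dt, steps = 50, 0.01, 4000
omega = rng.uniform(5.0, 15.0, n)      # frequency-coded "pixel" oscillators
phi = rng.uniform(0, 2 * np.pi, n)     # peripheral phases
amp = np.ones(n)                       # peripheral amplitudes
phi_c, omega_c = 0.0, 10.0             # central oscillator phase/frequency

for _ in range(steps):
    diff = phi_c - phi
    phi += dt * (omega + 2.0 * np.sin(diff))   # (1) phase-locking to the center
    in_phase = np.cos(diff) > 0.9              # oscillators working in-phase
    if in_phase.any():
        # (2) the central natural frequency adapts toward the locked group
        omega_c += dt * 0.5 * (omega[in_phase].mean() - omega_c)
    phi_c += dt * omega_c
    # (3) resonance: in-phase amplitudes grow while the rest decay
    amp = np.clip(amp + dt * (np.where(in_phase, 1.0, -1.0) - 0.5 * amp),
                  0.0, None)

print("currently selected oscillators:", np.flatnonzero(amp > 1.0))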

12.
An important requirement for vision is to identify interesting and relevant regions of the environment for further processing. Some models assume that salient locations from a visual scene are encoded in a dedicated spatial saliency map [1, 2]. Then, a winner-take-all (WTA) mechanism [1, 2] is often believed to threshold the graded saliency representation and identify the most salient position in the visual field. Here we aimed to assess whether neural representations of graded saliency and the subsequent WTA mechanism can be dissociated. We presented images of natural scenes while subjects were in a scanner performing a demanding fixation task, and thus their attention was directed away. Signals in early visual cortex and posterior intraparietal sulcus (IPS) correlated with graded saliency as defined by a computational saliency model. Multivariate pattern classification [3, 4] revealed that the most salient position in the visual field was encoded in anterior IPS and frontal eye fields (FEF), thus reflecting a potential WTA stage. Our results thus confirm that graded saliency and WTA-thresholded saliency are encoded in distinct neural structures. This could provide the neural representation required for rapid and automatic orientation toward salient events in natural environments.
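
The distinction between a graded saliency map and its WTA readout is easy to make concrete. The sketch below uses a random map as a stand-in for a model-derived one, and adds inhibition of return so successive winners emerge; the map and suppression radius are assumptions.

import numpy as np

rng = np.random.default_rng(3)
saliency = rng.random((32, 32))                    # graded saliency map
s = saliency.copy()
rr, cc = np.ogrid[:32, :32]
for i in range(3):                                 # WTA readout, one winner at a time
    r, c = np.unravel_index(np.argmax(s), s.shape)
    print(f"winner {i + 1} at (row, col) = ({r}, {c})")
    s[(rr - r) ** 2 + (cc - c) ** 2 <= 16] = 0.0   # inhibition of return (radius 4)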

13.
Animals must quickly recognize objects in their environment and act accordingly. Previous studies indicate that looming visual objects trigger avoidance reflexes in many species [1-5]; however, such reflexes operate over a close range and might not detect a threatening stimulus at a safe distance. We analyzed how fruit flies (Drosophila melanogaster) respond to simple visual stimuli both in free flight and in a tethered-flight simulator. Whereas Drosophila, like many other insects, are attracted toward long vertical objects [6-10], we found that smaller visual stimuli elicit not weak attraction but rather strong repulsion. Because aversion to small spots depends on the vertical size of a moving object, and not on looming, it can function at a much greater distance than expansion-dependent reflexes. The opposing responses to long stripes and small spots reflect a simple but effective object classification system. Attraction toward long stripes would lead flies toward vegetative perches or feeding sites, whereas repulsion from small spots would help them avoid aerial predators or collisions with other insects. The motion of flying Drosophila depends on a balance of these two systems, providing a foundation for studying the neural basis of behavioral choice in a genetic model organism.

14.
How are complex visual entities such as scenes represented in the human brain? More concretely, along what visual and semantic dimensions are scenes encoded in memory? One hypothesis is that global spatial properties provide a basis for categorizing the neural response patterns arising from scenes. In contrast, non-spatial properties, such as single objects, also account for variance in neural responses. The list of critical scene dimensions has continued to grow, sometimes in a contradictory manner, coming to encompass properties such as geometric layout, big/small, crowded/sparse, and three-dimensionality. We demonstrate that these dimensions may be better understood within the more general framework of associative properties. That is, across both the perceptual and semantic domains, features of scene representations are related to one another through learned associations. Critically, the components of such associations are consistent with the dimensions that are typically invoked to account for scene understanding and its neural bases. Using fMRI, we show that non-scene stimuli displaying novel associations across identities or locations recruit putatively scene-selective regions of the human brain (the parahippocampal/lingual region, the retrosplenial complex, and the transverse occipital sulcus/occipital place area). Moreover, we find that the voxel-wise neural patterns arising from these associations are significantly correlated with the neural patterns arising from everyday scenes, providing critical evidence as to whether the same encoding principles underlie both types of processing. These neuroimaging results support the hypothesis that the neural representation of scenes is better understood within the broader theoretical framework of associative processing. In addition, the results demonstrate a division of labor across scene-selective regions when processing associations and scenes, providing a better understanding of the functional roles of each region within the cortical network that mediates scene processing.

15.
Temporal correlation of neuronal activity has been suggested as a criterion for multiple object recognition. In this work, a two-dimensional network of simplified Wilson-Cowan oscillators is used to manage the binding and segmentation problem of a visual scene according to the connectedness Gestalt criterion. Binding is achieved via original coupling terms that link excitatory units to both excitatory and inhibitory units of adjacent neurons. These local coupling terms are time-independent, i.e., they do not require Hebbian learning during the simulations. Segmentation is realized by a two-layer processing of the visual image. The first layer extracts all object contours from the image by means of “retinal cells” with an “on-center” receptive field. Information on contours is used to selectively inhibit Wilson-Cowan oscillators in the second layer, thus realizing a strong separation among neurons in different objects. Accidental synchronism between oscillations in different objects is prevented with the use of a global inhibitor, i.e., a global neuron that computes the overall activity in the Wilson-Cowan network and sends back an inhibitory signal. Simulations performed on a 50×50 neural grid with 21 different visual scenes (containing up to eight objects + background) with random initial conditions demonstrate that the network can correctly segment objects in almost 100% of cases using a single set of parameters, i.e., without the need to adjust parameters from one visual scene to the next. The network is robust with respect to dynamical noise superimposed on the oscillatory neurons. Moreover, the network can segment both black objects on a white background and vice versa, and is able to deal with the problem of “fragmentation.” The main limitation of the network is its sensitivity to static noise superimposed on the objects. Overcoming this problem requires implementation of more robust mechanisms for contour enhancement in the first layer, in agreement with mechanisms actually realized in the visual cortex.
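
A single excitatory-inhibitory pair of the kind tiled across the grid can be sketched directly. The couplings below are adapted from the classic Wilson-Cowan limit-cycle example; with the simplified logistic used here the exact oscillatory regime may shift, so treat the values as illustrative rather than the authors'.

import numpy as np

def S(x):
    return 1.0 / (1.0 + np.exp(-x))    # logistic activation

c1, c2, c3, c4, P = 16.0, 12.0, 15.0, 3.0, 1.25
dt, steps = 0.05, 2000
E, I = 0.1, 0.05
E_trace = []
for _ in range(steps):
    dE = -E + S(c1 * E - c2 * I + P)   # excitatory unit with external drive P
    dI = -I + S(c3 * E - c4 * I)       # inhibitory unit
    E, I = E + dt * dE, I + dt * dI
    E_trace.append(E)

# In the full model one such pair sits at every pixel; coupling between
# neighbors synchronizes pairs belonging to one object, while a global
# inhibitor (summing all activity) keeps different objects out of phase.
print("E range, last half:", min(E_trace[1000:]), max(E_trace[1000:]))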

16.
A simple, biologically motivated neural network for segmentation of a moving object from a visual scene is presented. The model consists of two parts: an object selection model which employs a scaling approach for receptive field sizes, and a subsequent network implementing a spotlight by means of multiplicative synapses. The network selects one object out of several, segments the rough contour of the object, and encodes the winner object's position with high accuracy. Analytical equations for the performance level of the network, e.g., for the critical distance of two objects above which they are perceived as separate, are derived. The network preferentially chooses the object with the largest angular velocity and the largest angular width. An equation for the velocity and width preferences is presented. Additionally, it is shown that for certain neurons of the model, flat receptive fields are more favourable than Gaussian ones. The network exhibits performance similar to that known from amphibians. Various electrophysiological and behavioral results, e.g., the distribution of the diameters of the receptive fields of tectal neurons of the tongue-projecting salamander Hydromantes italicus and the range of optimal prey velocities for prey catching, can be understood on the basis of the model.
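
The multiplicative-spotlight stage reduces to gating. The toy below selects the strongest local activity blob as a stand-in for the model's velocity- and width-based preference; the activity vector and window size are assumptions.

import numpy as np

activity = np.array([0.1, 0.9, 0.8, 0.9, 0.1, 0.4, 0.1])   # two candidate objects
blob = np.convolve(activity, np.ones(3), mode='same')      # local evidence
center = int(np.argmax(blob))                              # winner location
spotlight = np.zeros_like(activity)
spotlight[max(0, center - 1):center + 2] = 1.0             # spotlight window
print(activity * spotlight)   # multiplicative synapses pass only the winner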

17.
Temporal integration in the visual system causes fast-moving objects to generate static, oriented traces (‘motion streaks’), which could be used to help judge direction of motion. While human psychophysics and single-unit studies in non-human primates are consistent with this hypothesis, direct neural evidence from the human cortex is still lacking. First, we provide psychophysical evidence that faster and slower motions are processed by distinct neural mechanisms: faster motion raised human perceptual thresholds for static orientations parallel to the direction of motion, whereas slower motion raised thresholds for orthogonal orientations. We then used functional magnetic resonance imaging to measure brain activity while human observers viewed either fast (‘streaky’) or slow random dot stimuli moving in different directions, or corresponding static-oriented stimuli. We found that local spatial patterns of brain activity in early retinotopic visual cortex reliably distinguished between static orientations. Critically, a multivariate pattern classifier trained on brain activity evoked by these static stimuli could then successfully distinguish the direction of fast (‘streaky’) but not slow motion. Thus, signals encoding static-oriented streak information are present in human early visual cortex when viewing fast motion. These experiments show that motion streaks are present in the human visual system for faster motion.
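
The cross-decoding logic, training on static orientations and testing on motion, can be sketched with synthetic voxel patterns. The pattern shapes and streak strengths below are assumptions chosen to mimic the reported outcome.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n_vox = 100
templates = rng.standard_normal((2, n_vox))      # two orientation patterns

def simulate(cond, n, streak_strength):
    # Motion responses contain the parallel orientation's pattern only to
    # the extent that the motion leaves a streak (strong for fast motion).
    return streak_strength * templates[cond] + rng.standard_normal((n, n_vox))

X_train = np.vstack([simulate(c, 50, 1.0) for c in (0, 1)])   # static stimuli
y = np.repeat([0, 1], 50)
clf = LogisticRegression(max_iter=1000).fit(X_train, y)

for label, strength in [("fast ('streaky')", 0.8), ("slow", 0.05)]:
    X_test = np.vstack([simulate(c, 50, strength) for c in (0, 1)])
    acc = clf.score(X_test, y)
    print(f"{label} motion decoded from static-trained classifier: {acc:.2f}")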

18.
To interpret visual scenes, visual systems need to segment or integrate multiple moving features into distinct objects or surfaces. Previous studies have found that the perceived direction separation between two transparently moving random-dot stimuli is wider than the actual direction separation. This perceptual “direction repulsion” is useful for segmenting overlapping motion vectors. Here we investigate the effects of motion noise on the directional interaction between overlapping moving stimuli. Human subjects viewed two overlapping random-dot patches moving in different directions and judged the direction separation between the two motion vectors. We found that the perceived direction separation progressively changed from wide to narrow as the level of motion noise in the stimuli was increased, showing a switch from direction repulsion to attraction (i.e. smaller than the veridical direction separation). We also found that direction attraction occurred at a wider range of direction separations than direction repulsion. The normalized effects of both direction repulsion and attraction were the strongest near the direction separation of ∼25° and declined as the direction separation further increased. These results support the idea that motion noise prompts motion integration to overcome stimulus ambiguity. Our findings provide new constraints on neural models of motion transparency and segmentation.

19.
The neural correlates of visual awareness are elusive because of its fleeting nature. Here we have addressed this issue by using single-trial statistical “brain reading” of neurophysiological event-related potential (ERP) signatures of conscious perception of visual attributes with different levels of saliency. Behavioral reports were taken on every trial in four experiments addressing conscious access to color, luminance, and local phase-offset cues. We found that single-trial neurophysiological signatures of target presence can be observed around 300 ms at central parietal sites. Such signatures are significantly related to conscious perception, and their probability is related to sensory saliency levels. These findings identify a general neural correlate of conscious perception at the single-trial level, since conscious perception can be decoded as such independently of stimulus salience and fluctuations of threshold levels. This approach can be generalized to successfully detect target presence in other individuals.
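
Single-trial detection of a signature around 300 ms can be illustrated with a window-mean threshold on simulated epochs. The sampling rate, window, and component shape below are assumptions, and the study's actual method is statistical brain reading, not this simple rule.

import numpy as np

fs = 250.0
t = np.arange(int(0.6 * fs)) / fs                  # 0-600 ms epochs
rng = np.random.default_rng(5)

def epochs(present, n=100, amp=1.0):
    p3 = amp * np.exp(-((t - 0.3) ** 2) / (2 * 0.05 ** 2))   # ~300 ms bump
    return present * p3 + rng.standard_normal((n, t.size))

win = (t > 0.25) & (t < 0.35)                      # window around 300 ms
score_present = epochs(1).mean(axis=1, where=win)  # per-trial window mean
score_absent = epochs(0).mean(axis=1, where=win)
thresh = (score_present.mean() + score_absent.mean()) / 2
print(f"hit rate={(score_present > thresh).mean():.2f}, "
      f"false alarms={(score_absent > thresh).mean():.2f}")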

20.
In humans, as well as most animal species, perception of object motion is critical to successful interaction with the surrounding environment. Yet, as the observer also moves, the retinal projections of the various motion components add to each other and extracting accurate object motion becomes computationally challenging. Recent psychophysical studies have demonstrated that observers use a flow-parsing mechanism to estimate and subtract self-motion from the optic flow field. We investigated whether concurrent acoustic cues for motion can facilitate visual flow parsing, thereby enhancing the detection of moving objects during simulated self-motion. Participants identified an object (the target) that moved either forward or backward within a visual scene containing nine identical textured objects simulating forward observer translation. We found that spatially co-localized, directionally congruent, moving auditory stimuli enhanced object motion detection. Interestingly, subjects who performed poorly on the visual-only task benefited more from the addition of moving auditory stimuli. When auditory stimuli were not co-localized to the visual target, improvements in detection rates were weak. Taken together, these results suggest that parsing object motion from self-motion-induced optic flow can operate on multisensory object representations.
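
Flow parsing, estimating and subtracting the self-motion component of the flow field, can be caricatured with a robust global estimate. The median subtraction below is a deliberate simplification of the real flow geometry, and the nine-object layout mirrors the stimulus only loosely.

import numpy as np

rng = np.random.default_rng(6)
flow = np.tile([1.0, 0.0], (9, 1)) + 0.05 * rng.standard_normal((9, 2))
flow[4] += [0.0, 0.6]                      # one object also moves independently

self_motion = np.median(flow, axis=0)      # robust estimate of global flow
object_flow = flow - self_motion           # parse out the self-motion part
moving = np.linalg.norm(object_flow, axis=1) > 0.3
print("moving object indices:", np.flatnonzero(moving))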
