Similar Literature
20 similar records found (search time: 31 ms)
1.
Many computational saliency models have been proposed to simulate the bottom-up visual attention mechanism of the human visual system. However, most of them deal only with certain kinds of images or aim at specific applications. In fact, human beings can correctly select attentive focuses on objects of arbitrary size within any scene. This paper proposes a new bottom-up computational model, formulated in the frequency domain and based on the biological discovery of the non-Classical Receptive Field (nCRF) in the retina; a saliency map is obtained following the idea of the Extended Classical Receptive Field. The model consists of three major steps: first, decompose the input image into several feature maps that represent different frequency bands covering the whole frequency domain, using the Gabor wavelet; second, whiten the feature maps to highlight the embedded saliency information; third, select optimal maps, simulating the response of the receptive field and especially the nCRF, to generate the saliency map. Experimental results show that the proposed algorithm performs stably and well in a variety of situations, as human beings do, and is adaptive to both psychological patterns and natural images. Beyond that, the biological plausibility of the nCRF and of the Gabor wavelet transform makes this approach reliable.
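A minimal sketch of this three-step pipeline, under two assumptions the abstract does not spell out: that "whitening" means flattening each feature map's amplitude spectrum while keeping its phase, and that response entropy can stand in for the nCRF-based selection of "optimal" maps:

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(freq, theta):
    """Real Gabor kernel tuned to spatial frequency `freq` (cycles/pixel)
    and orientation `theta`, with roughly one-octave bandwidth."""
    sigma = 0.56 / freq
    half = int(3 * sigma)
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * xr)

def whiten(m):
    """Spectral whitening: keep the phase, flatten the amplitude spectrum."""
    spectrum = np.fft.fft2(m)
    return np.real(np.fft.ifft2(spectrum / (np.abs(spectrum) + 1e-8)))

def entropy(m):
    hist, _ = np.histogram(m, bins=64, density=True)
    hist = hist[hist > 0]
    return -np.sum(hist * np.log(hist))

def saliency(image, freqs=(0.05, 0.1, 0.2, 0.4), n_orient=4, keep=6):
    # Step 1: decompose into feature maps covering the frequency domain.
    maps = [fftconvolve(image, gabor_kernel(f, t), mode="same")
            for f in freqs for t in np.arange(n_orient) * np.pi / n_orient]
    # Step 2: whiten each map to highlight embedded saliency information.
    maps = [whiten(m) ** 2 for m in maps]
    # Step 3: keep the lowest-entropy maps (a crude proxy for nCRF-based
    # selection) and combine them into a single normalized saliency map.
    maps.sort(key=entropy)
    sal = np.sum(maps[:keep], axis=0)
    return (sal - sal.min()) / (np.ptp(sal) + 1e-8)
```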

2.
Recent studies have shown that human perception of body ownership is highly malleable. A well-known example is the rubber hand illusion (RHI), wherein ownership over a dummy hand is experienced; the illusion is generally believed to require synchronized stroking of the real and dummy hands. Our goal was to elucidate the computational principles governing this phenomenon. We adopted the Bayesian causal inference model of multisensory perception and applied it to visual, proprioceptive, and tactile stimuli. The model reproduced the RHI, predicted that it can occur without tactile stimulation, and predicted that synchronous stroking would enhance it. Various measures of ownership across two experiments confirmed the predictions: a large percentage of individuals experienced the illusion in the absence of any tactile stimulation, and synchronous stroking strengthened the illusion. Altogether, these findings suggest that perception of body ownership is governed by Bayesian causal inference, i.e., the same rule that appears to govern the perception of the outside world.
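A compact sketch of the core computation in this family of models, following the standard Gaussian causal-inference formulation; the reduction to two cues and all parameter values are illustrative assumptions, not the paper's fitted model:

```python
import numpy as np

def posterior_common_cause(x_v, x_p, sigma_v=1.0, sigma_p=2.0,
                           sigma_prior=10.0, p_common=0.5):
    """P(C=1 | x_v, x_p): probability that the seen (dummy-hand) and felt
    (own-hand) position signals were generated by one common source."""
    var_v, var_p, var_0 = sigma_v**2, sigma_p**2, sigma_prior**2
    # Likelihood under one common source (C=1), with the source position
    # s ~ N(0, var_0) integrated out analytically.
    var_sum = var_v * var_p + var_v * var_0 + var_p * var_0
    like_c1 = np.exp(-0.5 * ((x_v - x_p)**2 * var_0 +
                             x_v**2 * var_p + x_p**2 * var_v) / var_sum) \
              / (2 * np.pi * np.sqrt(var_sum))
    # Likelihood under two independent sources (C=2).
    def like_one(x, var):
        return np.exp(-0.5 * x**2 / (var + var_0)) / np.sqrt(2 * np.pi * (var + var_0))
    like_c2 = like_one(x_v, var_v) * like_one(x_p, var_p)
    return like_c1 * p_common / (like_c1 * p_common + like_c2 * (1 - p_common))

# Nearby hands -> high posterior of a common cause (illusion likely);
# widely separated hands -> low posterior.
print(posterior_common_cause(x_v=1.0, x_p=2.0))
print(posterior_common_cause(x_v=1.0, x_p=15.0))
```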

3.
A major issue in cortical physiology and computational neuroscience is understanding the interaction between extrinsic signals from feedforward connections and intracortical signals from lateral connections. We propose here a computational model for motion perception based on the assumption that the local cortical circuits in the medio-temporal area (area MT) implement a Bayesian inference principle. This approach establishes a functional balance between feedforward and lateral, excitatory and inhibitory, inputs. The model reproduces most of the known properties of the neurons in area MT in response to moving stimuli. It accounts for important motion perception phenomena including motion transparency, spatial and temporal integration/segmentation. While integrating several properties of previously proposed models, it makes specific testable predictions concerning, in particular, temporal properties of neurons and the architecture of lateral connections in area MT. In addition, the proposed mechanism is consistent with the known properties of local cortical circuits in area V1. This suggests that Bayesian inference may be a general feature of information processing in cortical neuron populations. Received: 3 December 1997 / Accepted in revised form: 21 July 1998

4.
A multilayer neural network model for the perception of rotational motion has been developed using Reichardt's correlation-type motion detector array, Kohonen's self-organized feature map, and Schuster-Wagner's oscillating neural network. It is shown that unsupervised learning can make the neurons in the second layer of the network self-organize into a form resembling the columnar organization of direction selectivity in area MT of the primate visual cortex. The output layer can interpret rotation information and give the directions and velocities of rotational motion. The computer simulation results agree with some psychophysical observations of rotational perception. It is demonstrated that temporal correlation between the oscillating neurons is powerful for solving the "binding problem" for the shear components of rotational motion.
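Since the front end named here is Reichardt's correlation-type detector, a minimal sketch of a single opponent Reichardt unit may help; the first-order low-pass delay and the time constant are illustrative assumptions:

```python
import numpy as np

def lowpass(signal, tau, dt=1.0):
    """First-order low-pass filter acting as the detector's delay line."""
    out = np.zeros_like(signal, dtype=float)
    alpha = dt / (tau + dt)
    for t in range(1, len(signal)):
        out[t] = out[t - 1] + alpha * (signal[t] - out[t - 1])
    return out

def reichardt(left, right, tau=10.0):
    """Opponent Reichardt output: delayed copies of each input are
    cross-multiplied with the other input; the difference is positive
    for left-to-right motion and negative for the reverse."""
    return lowpass(left, tau) * right - left * lowpass(right, tau)

# A bright edge passing the left input 5 time steps before the right one:
t = np.arange(200)
left = (t > 50).astype(float)
right = (t > 55).astype(float)
print(reichardt(left, right).sum())  # > 0: rightward motion detected
```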

5.

Background

Visual neglect is an attentional deficit typically resulting from parietal cortex lesion and sometimes frontal lesion. Patients fail to attend to objects and events in the visual hemifield contralateral to their lesion during visual search.

Methodology/Principal Findings

The aim of this work was to examine the effects of parietal and frontal lesion in an existing computational model of visual attention and search and simulate visual search behaviour under lesion conditions. We find that unilateral parietal lesion in this model leads to symptoms of visual neglect in simulated search scan paths, including an inhibition of return (IOR) deficit, while frontal lesion leads to milder neglect and to more severe deficits in IOR and perseveration in the scan path. During simulations of search under unilateral parietal lesion, the model's extrastriate ventral stream area exhibits lower activity for stimuli in the neglected hemifield compared to that for stimuli in the normally perceived hemifield. This could represent a computational correlate of differences observed in neuroimaging for unconscious versus conscious perception following parietal lesion.

Conclusions/Significance

Our results lead to the prediction, supported by effective connectivity evidence, that connections between the dorsal and ventral visual streams may be an important factor in the explanation of perceptual deficits in parietal lesion patients and of conscious perception in general.

6.

Background

The timing at which sensory input reaches the level of conscious perception is an intriguing question still awaiting an answer. It is often assumed that visual and auditory percepts each have a modality-specific processing delay, and that their difference determines the perceived temporal offset.

Methodology/Principal Findings

Here, we show that the perception of audiovisual simultaneity can change flexibly and fluctuates over a short period of time while subjects observe a constant stimulus. We investigated the mechanisms underlying the spontaneous alternations in this audiovisual illusion and found that attention plays a crucial role. When attention was distracted from the stimulus, the perceptual transitions disappeared. When attention was directed to a visual event, the perceived timing of an auditory event was attracted towards that event.

Conclusions/Significance

This multistable display illustrates how flexible perceived timing can be, and at the same time offers a paradigm to dissociate perceptual from stimulus-driven factors in crossmodal feature binding. Our findings suggest that the perception of crossmodal synchrony depends on perceptual binding of audiovisual stimuli as a common event.

7.
Watching a speaker's facial movements can dramatically enhance our ability to comprehend words, especially in noisy environments. From a general doctrine of combining information from different sensory modalities (the principle of inverse effectiveness), one would expect that the visual signals would be most effective at the highest levels of auditory noise. In contrast, we find, in accord with a recent paper, that visual information improves performance more at intermediate levels of auditory noise than at the highest levels, and we show that a novel visual stimulus containing only temporal information does the same. We present a Bayesian model of optimal cue integration that can explain these conflicts. In this model, words are regarded as points in a multidimensional space and word recognition is a probabilistic inference process. When the dimensionality of the feature space is low, the Bayesian model predicts inverse effectiveness; when the dimensionality is high, the enhancement is maximal at intermediate auditory noise levels. When the auditory and visual stimuli differ slightly in high noise, the model makes a counterintuitive prediction: as sound quality increases, the proportion of reported words corresponding to the visual stimulus should first increase and then decrease. We confirm this prediction in a behavioral experiment. We conclude that auditory-visual speech perception obeys the same notion of optimality previously observed only for simple multisensory stimuli.
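A toy version of the abstract's geometry: words as points in a d-dimensional feature space, audition observing all dimensions with noise, vision constraining only a few. All specifics (dimensionality, word count, noise levels, which dimensions vision sees) are illustrative assumptions, not the paper's fitted model:

```python
import numpy as np

rng = np.random.default_rng(0)

def recognition_rate(d, sigma_a, sigma_v=1.0, visual_dims=1,
                     n_words=50, n_trials=2000):
    words = rng.normal(size=(n_words, d))
    correct = 0
    for _ in range(n_trials):
        k = rng.integers(n_words)
        x_a = words[k] + rng.normal(scale=sigma_a, size=d)
        x_v = words[k, :visual_dims] + rng.normal(scale=sigma_v, size=visual_dims)
        # Bayesian decoder: log-posterior of each candidate word under a
        # flat prior and independent Gaussian likelihoods for each cue.
        logp = (-np.sum((words - x_a) ** 2, axis=1) / (2 * sigma_a**2)
                - np.sum((words[:, :visual_dims] - x_v) ** 2, axis=1) / (2 * sigma_v**2))
        correct += np.argmax(logp) == k
    return correct / n_trials

# Visual benefit = accuracy with vision minus audio-only accuracy; the
# paper's claim is that with high-dimensional words this benefit peaks
# at intermediate (not maximal) auditory noise.
for sigma_a in (0.5, 2.0, 8.0):
    benefit = (recognition_rate(10, sigma_a)
               - recognition_rate(10, sigma_a, visual_dims=0))
    print(f"sigma_a={sigma_a}: visual benefit {benefit:+.3f}")
```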

8.

Background

In the human visual system, different attributes of an object, such as shape, color, and motion, are processed separately in different areas of the brain. This raises a fundamental question of how these attributes are integrated to produce a unified perception and a specific response. This “binding problem” is computationally difficult because all attributes are assumed to be bound together to form a single object representation. However, there is no firm evidence to confirm that such representations exist for general objects.

Methodology/Principal Findings

Here we propose a paired-attribute model in which cognitive processes are based on multiple representations of paired attributes. In line with the model's prediction, we found that multiattribute stimuli can produce an illusory perception of a multiattribute object arising from erroneous integration of attribute pairs, implying that object recognition is based on parallel perception of paired attributes. Moreover, in a change-detection task, a feature change in a single attribute frequently caused an illusory perception of change in another attribute, suggesting that multiple pairs of attributes are stored in memory.

Conclusions/Significance

The paired-attribute model can account for some novel illusions and controversial findings on binocular rivalry and short-term memory. Our results suggest that many cognitive processes are performed at the level of paired attributes rather than integrated objects, which greatly eases the binding problem and admits simpler solutions to it.

9.
To form a veridical percept of the environment, the brain needs to integrate sensory signals from a common source but segregate those from independent sources. Thus, perception inherently relies on solving the “causal inference problem.” Behaviorally, humans solve this problem optimally as predicted by Bayesian Causal Inference; yet, the underlying neural mechanisms are unexplored. Combining psychophysics, Bayesian modeling, functional magnetic resonance imaging (fMRI), and multivariate decoding in an audiovisual spatial localization task, we demonstrate that Bayesian Causal Inference is performed by a hierarchy of multisensory processes in the human brain. At the bottom of the hierarchy, in auditory and visual areas, location is represented on the basis that the two signals are generated by independent sources (= segregation). At the next stage, in posterior intraparietal sulcus, location is estimated under the assumption that the two signals are from a common source (= forced fusion). Only at the top of the hierarchy, in anterior intraparietal sulcus, the uncertainty about the causal structure of the world is taken into account and sensory signals are combined as predicted by Bayesian Causal Inference. Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex. It unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition. Our results demonstrate how the brain combines information in the face of uncertainty about the underlying causal structure of the world.
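For reference, the final read-out in this family of models, under the common model-averaging strategy (the symbols follow the generic formulation, not anything specific to this paper), weights the forced-fusion and segregation estimates by the posterior over causal structures:

\[
\hat{S}_A \;=\; p(C{=}1 \mid x_A, x_V)\,\hat{S}_{\mathrm{fused}} \;+\; p(C{=}2 \mid x_A, x_V)\,\hat{S}_{A,\mathrm{seg}},
\qquad
\hat{S}_{\mathrm{fused}} \;=\; \frac{x_A/\sigma_A^2 + x_V/\sigma_V^2 + \mu_P/\sigma_P^2}{1/\sigma_A^2 + 1/\sigma_V^2 + 1/\sigma_P^2},
\]

so the three hierarchy levels described above correspond to reading out \(\hat{S}_{A,\mathrm{seg}}\), \(\hat{S}_{\mathrm{fused}}\), and \(\hat{S}_A\), respectively.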

10.
A generalized quantum theoretical framework, not restricted to the validity domain of standard quantum physics, is used to model the dynamics of the bistable perception of ambiguous visual stimuli such as the Necker cube. The central idea is to treat the perception process in terms of the evolution of an unstable two-state system. This gives rise to a Necker-Zeno effect, in analogy to the quantum Zeno effect. A quantitative relation between the involved time scales is theoretically derived. This relation is found to be satisfied by empirically obtained cognitive time scales relevant for bistable perception.
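The quantitative relation referred to, as it is commonly stated for the Necker-Zeno model (an assumption supplied here from the surrounding literature, not taken from the abstract itself), links an elementary processing time \(t_0\), a stimulus-integration time \(\Delta T\), and the mean dwell time \(T\) between perceptual reversals:

\[
T \;\approx\; \frac{(\Delta T)^2}{t_0},
\]

with \(t_0\) on the order of tens of milliseconds, \(\Delta T\) a few hundred milliseconds, and \(T\) a few seconds, the cognitive time scales the abstract reports as consistent with the model.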

11.
The projected pattern of retinal-image motion supplies the human visual system with valuable information about properties of the three-dimensional environment. How well three-dimensional properties can be recovered depends both on the accuracy with which the early motion system estimates retinal motion, and on the way later processes interpret this retinal motion. Here we combine both early and late stages of the computational process to account for the hitherto puzzling phenomenon of systematic biases in three-dimensional shape perception. We present data showing how the perceived depth of a hinged plane ('an open book') can be systematically biased by the extent over which it rotates. We then present a Bayesian model that combines early measurement noise with geometric reconstruction of the three-dimensional scene. Although this model has no in-built bias towards particular three-dimensional shapes, it accounts for the data well. Our analysis suggests that the biases stem largely from the geometric constraints imposed on what three-dimensional scenes are compatible with the (noisy) early motion measurements. Given these findings, we suggest that the visual system may act as an optimal estimator of three-dimensional structure-from-motion.

12.
13.
We are surrounded by surfaces that we perceive by visual means. Understanding the basic principles behind this perceptual process is a central theme in visual psychology, psychophysics, and computational vision. In many of the computational models employed in the past, it has been assumed that a metric representation of physical space can be derived by visual means. Psychophysical experiments, as well as computational considerations, can convince us that the perception of space and shape has a much more complicated nature, and that only a distorted version of actual, physical space can be computed. This paper develops a computational geometric model that explains why such distortion might take place. The basic idea is that, both in stereo and motion, we perceive the world from multiple views. Given the rigid transformation between the views and the properties of the image correspondence, the depth of the scene can be obtained. Even a slight error in the rigid transformation parameters causes distortion of the computed depth of the scene. The unified framework introduced here describes this distortion in computational terms. We characterize the space of distortions by its level sets, that is, we characterize the systematic distortion via a family of iso-distortion surfaces which describes the locus over which depths are distorted by some multiplicative factor. Given that humans' estimation of egomotion or estimation of the extrinsic parameters of the stereo apparatus is likely to be imprecise, the framework is used to explain a number of psychophysical experiments on the perception of depth from motion or stereo. Received: 9 January 1997 / Accepted in revised form: 8 July 1997
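As a minimal illustration of the iso-distortion idea (a textbook stereo case, not the paper's general derivation): calibrated stereo recovers depth from disparity \(d\) as \(Z = fB/d\), for focal length \(f\) and baseline \(B\). If the assumed baseline is wrong, say \(\hat{B} = \kappa B\), the computed depth becomes

\[
\hat{Z} \;=\; \frac{f\hat{B}}{d} \;=\; \kappa Z,
\]

so every depth is distorted by the same multiplicative factor \(\kappa\). With an error in the assumed rotation between the views, the factor varies across the scene, and the loci of constant factor are exactly the iso-distortion surfaces characterized above.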

14.
The perception of visual motion information spans processes from local motion detection to the perception of global pattern motion. Taking the neural circuitry for figure-ground relative-motion discrimination in the fly visual system as the basic framework, and using a hexagonal array of elementary motion detectors as the input layer, we constructed a simplified brain model for perceiving visual motion information and simulated the processing of motion information at each level of this neural computational model. The model correctly predicted the results of behavioral discrimination experiments. The neural mechanisms of spatial physiological integration are also discussed.

15.
People learn modality-independent, conceptual representations from modality-specific sensory signals. Here, we hypothesize that any system that accomplishes this feat will include three components: a representational language for characterizing modality-independent representations, a set of sensory-specific forward models for mapping from modality-independent representations to sensory signals, and an inference algorithm for inverting forward models—that is, an algorithm for using sensory signals to infer modality-independent representations. To evaluate this hypothesis, we instantiate it in the form of a computational model that learns object shape representations from visual and/or haptic signals. The model uses a probabilistic grammar to characterize modality-independent representations of object shape, uses a computer graphics toolkit and a human hand simulator to map from object representations to visual and haptic features, respectively, and uses a Bayesian inference algorithm to infer modality-independent object representations from visual and/or haptic signals. Simulation results show that the model infers identical object representations when an object is viewed, grasped, or both. That is, the model’s percepts are modality invariant. We also report the results of an experiment in which different subjects rated the similarity of pairs of objects in different sensory conditions, and show that the model provides a very accurate account of subjects’ ratings. Conceptually, this research significantly contributes to our understanding of modality invariance, an important type of perceptual constancy, by demonstrating how modality-independent representations can be acquired and used. Methodologically, it provides an important contribution to cognitive modeling, particularly an emerging probabilistic language-of-thought approach, by showing how symbolic and statistical approaches can be combined in order to understand aspects of human perception.
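A structural sketch of the three hypothesized components, with toy stand-ins throughout: the "grammar", both forward models, and the rejection-sampling inference below are hypothetical simplifications of the paper's probabilistic shape grammar, graphics toolkit, hand simulator, and Bayesian inference:

```python
import random

def sample_shape():
    """Representational language: a shape is an unordered bag of parts."""
    return sorted(random.choice(["cylinder", "box", "sphere"])
                  for _ in range(random.randint(1, 3)))

def render_visual(shape):
    """Visual forward model: what the shape looks like (toy stand-in)."""
    return tuple(shape)

def render_haptic(shape):
    """Haptic forward model: what the shape feels like (toy stand-in)."""
    return len(shape)

def infer(seen=None, felt=None, n_samples=10000):
    """Invert the forward models: keep sampled hypotheses whose predicted
    signals match the observed ones (rejection sampling)."""
    return [s for s in (sample_shape() for _ in range(n_samples))
            if (seen is None or render_visual(s) == seen)
            and (felt is None or render_haptic(s) == felt)]

true_shape = ["box", "cylinder"]
# Viewing, grasping, or both should yield consistent, modality-invariant
# hypotheses about the underlying shape:
print(set(map(tuple, infer(seen=render_visual(true_shape)))))
print(set(map(tuple, infer(felt=render_haptic(true_shape)))))
print(set(map(tuple, infer(seen=render_visual(true_shape),
                           felt=render_haptic(true_shape)))))
```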

16.
This report describes experimental measurements of threshold contrast as a function of the angle to the visual axis (peripheral threshold contrasts). The visual tasks consist of detection (perception of presence) and discrimination (perception of a form feature) of simple visual signs during a fixation period, under realistically chosen observing conditions. Based on the experimental findings, a model for predicting off-axis threshold-contrast functions under different visual conditions is developed, built upon spatial-frequency filters. In addition, visibility fields are calculated with the aid of a known model.

17.
Temporal information is often contained in multi-sensory stimuli, but it is currently unknown how the brain combines e.g. visual and auditory cues into a coherent percept of time. The existing studies of cross-modal time perception mainly support the "modality appropriateness hypothesis", i.e. the domination of auditory temporal cues over visual ones because of the higher precision of audition for time perception. However, these studies suffer from methodological problems and conflicting results. We introduce a novel experimental paradigm to examine cross-modal time perception by combining an auditory time perception task with a visually guided motor task, requiring participants to follow an elliptic movement on a screen with a robotic manipulandum. We find that subjective duration is distorted according to the speed of visually observed movement: The faster the visual motion, the longer the perceived duration. In contrast, the actual execution of the arm movement does not contribute to this effect, but impairs discrimination performance by dual-task interference. We also show that additional training of the motor task attenuates the interference, but does not affect the distortion of subjective duration. The study demonstrates direct influence of visual motion on auditory temporal representations, which is independent of attentional modulation. At the same time, it provides causal support for the notion that time perception and continuous motor timing rely on separate mechanisms, a proposal that was formerly supported by correlational evidence only. The results constitute a counterexample to the modality appropriateness hypothesis and are best explained by Bayesian integration of modality-specific temporal information into a centralized "temporal hub".
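The "temporal hub" account invoked here amounts to reliability-weighted fusion of modality-specific duration estimates; written in its generic Gaussian form (an assumption about the model's structure, not its exact equations):

\[
\hat{d} \;=\; \frac{d_A/\sigma_A^2 + d_V/\sigma_V^2}{1/\sigma_A^2 + 1/\sigma_V^2},
\]

where auditory dominance, the modality appropriateness hypothesis, falls out as the special case \(\sigma_A \ll \sigma_V\), while a finite visual weight yields the observed distortion of subjective duration by visual motion.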

18.
Much current vision research is predicated on the idea--and a rapidly growing body of evidence--that visual percepts are generated according to the empirical significance of light stimuli rather than their physical characteristics. As a result, an increasing number of investigators have asked how visual perception can be rationalized in these terms. Here, we compare two different theoretical frameworks for predicting what observers actually see in response to visual stimuli: Bayesian decision theory and empirical ranking theory. Deciding which of these approaches has greater merit is likely to determine how the statistical operations that apparently underlie visual perception are eventually understood.

19.
T. S. Meese, Spatial Vision, 1999, 12(3): 363-394
Visual neurons in the primary visual cortex 'look' at the retinal image through a four-dimensional array of spatial receptive fields (filter-elements): two spatial dimensions and, at each spatial location, two Fourier dimensions of spatial frequency and orientation. In general, visual objects activate filter-elements along each of these dimensions, suggesting a need for some kind of linking mechanism that determines whether two or more filter-elements are responding to the same or different contours or objects. In the spatial domain, a (spatial) association field between filter-elements, arranged to form first-order curves, has been inferred as a flexible method by which different parts of extended (luminance) contours become associated (Field et al., 1993). Linking has also been explored between filters selective for different regions in Fourier space (e.g. Georgeson and Meese, 1997). Perceived structure of stationary plaids suggests that spatial filtering is adaptive: synthetic filters can be created by the linear summation of basis-filters across orientation or spatial frequency in a stimulus-dependent way. For example, a plaid with a pair of sine-wave components at +/-45 deg looks like a blurred checkerboard; a structure that can be understood if features are derived after linear summation of spatial filters at different orientations. However, the addition of an oblique third-harmonic component causes the plaid to perceptually segment into overlapping oblique contours. This result can be understood if filters are summed across spatial frequency, but, in this case, treated independently across orientation. In the present paper, the architecture of an association field is proposed to permit linking and segmentation of filter-elements across spatial frequency and orientation. Three types of link are proposed: (1) A chain of constructive links around sites of common spatial frequency but different orientation, to promote binding of filters across orientation; (2) Constructive links between sites with common orientation but different spatial frequency, to promote binding of filters across spatial frequency; (3) Long-range links between sites of common spatial frequency but different orientation, whose activation and role are determined by activity in a higher spatial frequency band. A model employing the proposed network of links is consistent with at least six previously reported effects on the perception of briefly presented stationary plaids.

20.
A prevailing theory proposes that the brain's two visual pathways, the ventral and dorsal, lead to different visual processing and world representations for conscious perception than for action. Others have claimed that perception and action share much of their visual processing. But which of these two neural architectures is favored by evolution? Successful visual search is life-critical, and here we investigate the evolution and optimality of neural mechanisms mediating perception and eye movement actions for visual search in natural images. We implement an approximation to the ideal Bayesian searcher with two separate processing streams, one controlling the eye movements and the other determining the perceptual search decisions. We virtually evolved the neural mechanisms of the searchers' two separate pathways, built from linear combinations of primary visual cortex (V1) receptive fields, by making the simulated individuals' probability of survival depend on their perceptual accuracy in finding targets in cluttered backgrounds. We find that for a variety of targets, backgrounds, and dependences of target detectability on retinal eccentricity, the mechanisms of the searchers' two processing streams converge to similar representations, showing that mismatches in the mechanisms for perception and eye movements lead to suboptimal search. Three exceptions, which resulted in partial or no convergence, were an organism for which the targets are equally detectable across the retina, an organism with sufficient time to foveate all possible target locations, and a strict two-pathway model with no interconnections and differential pre-filtering based on parvocellular and magnocellular lateral geniculate cell properties. Thus, similar neural mechanisms for perception and eye movement actions during search are optimal and should be expected from the effects of natural selection on an organism with limited time to search for food that is not equi-detectable across its retina and with interconnected perception and action neural pathways.
