Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Visual saliency is a fundamental yet hard to define property of objects or locations in the visual world. In a context where objects and their representations compete to dominate our perception, saliency can be thought of as the "juice" that makes objects win the race. It is often assumed that saliency is extracted and represented in an explicit saliency map, which serves to determine the location of spatial attention at any given time. It is then by drawing attention to a salient object that it can be recognized or categorized. I argue against this classical view that visual "bottom-up" saliency automatically recruits the attentional system prior to object recognition. A number of visual processing tasks are clearly performed too fast for such a costly strategy to be employed. Rather, visual attention could simply act by biasing a saliency-based object recognition system. Under natural conditions of stimulation, saliency can be represented implicitly throughout the ventral visual pathway, independent of any explicit saliency map. At any given level, the most activated cells of the neural population simply represent the most salient locations. The notion of saliency itself grows increasingly complex throughout the system, mostly based on luminance contrast until information reaches visual cortex, gradually incorporating information about features such as orientation or color in primary visual cortex and early extrastriate areas, and finally the identity and behavioral relevance of objects in temporal cortex and beyond. Under these conditions the object that dominates perception, i.e. the object yielding the strongest (or the first) selective neural response, is by definition the one whose features are most "salient"--without the need for any external saliency map. In addition, I suggest that such an implicit representation of saliency can be best encoded in the relative times of the first spikes fired in a given neuronal population. 
In accordance with our subjective experience that saliency and attention do not modify the appearance of objects, the feed-forward propagation of this first spike wave could serve to trigger saliency-based object recognition outside the realm of awareness, while conscious perceptions could be mediated by the remaining discharges of longer neuronal spike trains.
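
The proposal that saliency is carried implicitly by the relative times of first spikes can be illustrated with a toy latency code. This is only a sketch and not the author's model: it assumes, purely for illustration, that a unit's first-spike latency is inversely proportional to its activation, and the function names are hypothetical.

```python
def first_spike_latencies(activations, t_min=1.0):
    """Map activation levels to first-spike latencies (arbitrary time units).

    Illustrative assumption: latency = t_min / activation, so stronger
    drive yields an earlier spike. Units with activation <= 0 never fire.
    """
    return [t_min / a if a > 0 else float("inf") for a in activations]


def most_salient(activations):
    """Index of the unit that fires first, read here as the most salient one."""
    latencies = first_spike_latencies(activations)
    return min(range(len(latencies)), key=latencies.__getitem__)


def rank_order(activations):
    """Unit indices sorted from earliest to latest first spike."""
    latencies = first_spike_latencies(activations)
    return sorted(range(len(latencies)), key=latencies.__getitem__)
```

Under this assumption no separate saliency map is needed: the order in which units fire already ranks locations by saliency, which is the point the abstract makes.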

2.
Huang X, Albright TD, Stoner GR. Neuron, 2007, 53(5): 761-770
Visual motion perception relies on two opposing operations: integration and segmentation. Integration overcomes motion ambiguity in the visual image by spatial pooling of motion signals, whereas segmentation identifies differences between adjacent moving objects. For visual motion area MT, previous investigations have reported that stimuli in the receptive field surround, which do not elicit a response when presented alone, can nevertheless modulate responses to stimuli in the receptive field center. The directional tuning of this "surround modulation" has been found to be mainly antagonistic and hence consistent with segmentation. Here, we report that surround modulation in area MT can be either antagonistic or integrative depending upon the visual stimulus. Both types of modulation were delayed relative to response onset. Our results suggest that the dominance of antagonistic modulation in previous MT studies was due to stimulus choice and that segmentation and integration are achieved, in part, via adaptive surround modulation.

3.
Brody CD, Hopfield JJ. Neuron, 2003, 37(5): 843-852
Spike synchronization across neurons can be selective for the situation where neurons are driven at similar firing rates, a "many are equal" computation. This can be achieved in the absence of synaptic interactions between neurons, through phase locking to a common underlying oscillatory potential. Based on this principle, we instantiate an algorithm for robust odor recognition into a model network of spiking neurons whose main features are taken from known properties of biological olfactory systems. Here, recognition of odors is signaled by spike synchronization of specific subsets of "mitral cells." This synchronization is highly odor selective and invariant to a wide range of odor concentrations. It is also robust to the presence of strong distractor odors, thus allowing odor segmentation within complex olfactory scenes. Information about odors is encoded in both the identity of glomeruli activated above threshold (1 bit of information per glomerulus) and in the analog degree of activation of the glomeruli (approximately 3 bits per glomerulus).

4.

Background

Optic flow is an important cue for object detection. Humans are able to perceive objects in a scene using only kinetic boundaries, and can perform the task even when other shape cues are not provided. These kinetic boundaries are characterized by the presence of motion discontinuities in a local neighbourhood. In addition, temporal occlusions appear along the boundaries as the object in front covers the background and the objects that are spatially behind it.

Methodology/Principal Findings

From a technical point of view, the detection of motion boundaries for segmentation based on optic flow is a difficult task, because flow detected along such boundaries is generally not reliable. We propose a model derived from mechanisms found in visual areas V1, MT, and MSTl of human and primate cortex that achieves robust detection along motion boundaries. It includes two separate mechanisms, one for the detection of motion discontinuities and one for the detection of occlusion regions, based on how neurons respond to spatial and temporal contrast, respectively. The mechanisms are embedded in a biologically inspired architecture that integrates information from different model components of visual processing via feedback connections. In particular, mutual interactions between the detection of motion discontinuities and temporal occlusions allow a considerable improvement of the kinetic boundary detection.

Conclusions/Significance

A new model is proposed that uses optic flow cues to detect motion discontinuities and object occlusion. We suggest that by combining these results for motion discontinuities and object occlusion, object segmentation within the model can be improved. This idea could also be applied in other models for object segmentation. In addition, we discuss how this model is related to neurophysiological findings. The model was successfully tested with both artificial and real sequences including self and object motion.
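
The notion of a motion discontinuity in a local neighbourhood can be made concrete with a very small sketch. It is not the model described in the abstract (which adds occlusion detection and feedback between cortical areas); it only assumes a dense flow field stored as rows of `(u, v)` tuples, and `motion_discontinuity`, `kinetic_boundary`, and the threshold value are hypothetical names and choices.

```python
import math


def motion_discontinuity(flow):
    """Per-pixel motion-discontinuity strength for a dense flow field.

    `flow` is a list of rows, each row a list of (u, v) velocity tuples.
    The strength at a pixel is the largest Euclidean difference between its
    flow vector and that of any 4-connected neighbour, a crude stand-in for
    a spatial-contrast mechanism.
    """
    rows, cols = len(flow), len(flow[0])
    strength = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            u, v = flow[r][c]
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    nu, nv = flow[rr][cc]
                    diff = math.hypot(u - nu, v - nv)
                    strength[r][c] = max(strength[r][c], diff)
    return strength


def kinetic_boundary(flow, threshold=0.5):
    """Binary mask of pixels whose discontinuity strength exceeds `threshold`."""
    return [[s > threshold for s in row] for row in motion_discontinuity(flow)]
```

For a field whose left half is static and whose right half moves rightward, the mask marks the two pixel columns on either side of the border.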

5.
The mechanisms of selective verbal attention were studied under conditions of simultaneous delivery of speech signals via the visual and auditory channels. The investigation was based on the comparison and synthesis of data obtained by two methods: positron emission tomography (PET) and brain evoked potentials (EPs). A new approach was developed: complementary tasks were constructed in such a way that, despite fundamental methodological problems, the same phenomenon could be investigated in one paradigm in EP and PET studies. The results obtained by the two methods are in rather good agreement with respect to topography: the secondary and tertiary areas, as well as the associative brain areas, are involved in attention concentration, that is, selection of verbal information occurs at the level of cognitive processes. The combination of two complementary methods, PET and EP, allowed sensory information processing and the brain mechanisms of selective attention to be investigated much more completely. The PET studies contributed to further understanding of brain mechanisms by showing where processing occurs, and the EP method provided insight into how this information is processed inside the corresponding cortical areas. The finding that the activation of primary areas of the visual cortex is accompanied by the inhibition of visual information deserves attention. This conclusion can be considered highly significant because of the concordance of the two independent methods. How to interpret it is not yet clear. It is possible that, in the case of primary importance of verbal information and priority of the visual channel for the repression from consciousness of artificially irrelevant information, a safety mechanism is activated: the amplified signal enters the brain cortex, where it is retained in the short-term iconic memory. This enables a reaction to this stimulus (if necessary), in the presence of any additional sign involving selective attention.

6.
7.
8.
Traditional stereo grouping models have focused on the problem of stereo correspondence between monocular inputs. Recent physiological data revealed that disparity-selective V2 cells increase their responses when random-dot stereogram stimuli within their receptive fields are at or near the boundary of a depth surface. Such highlighting of depth (non-luminance) edges is seemingly not computationally required for the correspondence problem. Computationally, these highlights make the boundaries of a depth surface more salient, serving pre-attentive segmentation (between depth planes) and attracting visual attention. In special cases, they enable the psychophysically observed perceptual pop-out of a target from a background of visually identical distractors at a different depth. To achieve the highlights, mutual inhibition between disparity-selective cells that are tuned to the same or similar depths is required. However, such mutual inhibition would impede the computation for the correspondence problem, which requires mutual excitation between the same cells. In this work, I introduce a computational model that, I believe, is the first to address both stereo correspondence and pre-attentive stereo segmentation. The computational mechanisms in the model are based on intracortical interactions in V2. I will demonstrate that the model captures the following physiological and psychophysical phenomena: (i) depth-edge highlighting; (ii) disparity capture; (iii) pop-out; and (iv) transparency.

9.
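Advances in multimodal cognitive functional imaging   Total citations: 4; self-citations: 1; citations by others: 4
In-depth study of brain structure and function requires cognitive functional imaging techniques that offer both high temporal resolution and high spatial resolution. Multimodal cognitive functional imaging combines different imaging techniques, fMRI/PET and EEG/MEG, so that the dynamics of cognitive brain activity can be studied in terms of both spatial localization and time course. Multimodal cognitive functional imaging has been applied successfully to the study of selective attention, the visual pathway, voluntary movement, and semantic processing, and has revealed the spatial and temporal characteristics of the associated brain activity. Future research will further improve the spatiotemporal resolution and accuracy of multimodal cognitive functional imaging in order to explore the neural mechanisms of cognitive function in greater depth.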

10.
Human beings have the capacity to recognize objects in natural visual scenes with high efficiency despite the complexity of such scenes, which usually contain multiple objects. One possible mechanism for dealing with this problem is selective attention. Psychophysical evidence strongly suggests that selective attention can enhance the spatial resolution in the input region corresponding to the focus of attention. In this work we adopt a computational neuroscience perspective to analyze the attentional enhancement of spatial resolution in the area containing the objects of interest. We extend and apply the computational model of Deco and Schürmann (2000), which consists of several modules with feedforward and feedback interconnections describing the mutual links between different areas of the visual cortex. Each module analyzes the visual input at a different spatial resolution and can be thought of as a hierarchical predictor at a given level of resolution. Moreover, each hierarchical predictor has a submodule that consists of a group of neurons performing a biologically based 2D Gabor wavelet transformation at a given resolution level. The attention control decides, in a serial fashion, in which local regions the spatial resolution should be enhanced. In this sense, the scene is first analyzed at a coarse resolution level, and the focus of attention iteratively enhances the resolution at the location of an object until the object is identified. We propose and simulate new psychophysical experiments where the effect of the attentional enhancement of spatial resolution can be demonstrated by predicting different reaction time profiles in visual search experiments where the target and distractors are defined at different levels of resolution.

11.
Journal of Physiology, 2013, 107(5): 338-348
Ganglion cells in the vertebrate retina integrate visual information over their receptive fields. They do so by pooling presynaptic excitatory inputs from typically many bipolar cells, which themselves collect inputs from several photoreceptors. In addition, inhibitory interactions mediated by horizontal cells and amacrine cells modulate the structure of the receptive field. In many models, this spatial integration is assumed to occur in a linear fashion. Yet, it has long been known that spatial integration by retinal ganglion cells also incurs nonlinear phenomena. Moreover, several recent examples have shown that nonlinear spatial integration is tightly connected to specific visual functions performed by different types of retinal ganglion cells. This work discusses these advances in understanding the role of nonlinear spatial integration and reviews recent efforts to quantitatively study the nature and mechanisms underlying spatial nonlinearities. These new insights point towards a critical role of nonlinearities within ganglion cell receptive fields for capturing responses of the cells to natural and behaviorally relevant visual stimuli. In the long run, nonlinear phenomena of spatial integration may also prove important for implementing the actual neural code of retinal neurons when designing visual prostheses for the eye.
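
The difference between linear and nonlinear spatial integration can be shown with two one-line models. This is a generic textbook-style contrast and not a model taken from the review: the half-wave rectification, the weights, and both function names are assumptions made for illustration.

```python
def linear_integration(subunit_inputs, weights):
    """Linear receptive field: weight and sum the inputs, then rectify the output."""
    total = sum(w * x for w, x in zip(weights, subunit_inputs))
    return max(0.0, total)


def nonlinear_integration(subunit_inputs, weights):
    """Subunit model: rectify each input first, then weight and sum."""
    return sum(w * max(0.0, x) for w, x in zip(weights, subunit_inputs))
```

With inputs `[1.0, -1.0]` (one half of the receptive field brightens while the other darkens) and equal weights, the linear model stays silent because the inputs cancel, whereas the subunit model still responds. Both models agree for a spatially uniform stimulus.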

12.
Spatial pooling methods such as spatial pyramid matching (SPM) are crucial in the bag-of-features model used in image classification. SPM partitions the image into a set of regular grids and assumes that the spatial layout of every visual word obeys a uniform distribution over these regular grids. In practice, however, we consider that different visual words should obey different spatial layout distributions. To improve SPM, we develop a novel spatial pooling method, namely spatial distribution pooling (SDP). The proposed SDP method uses an extension of the Gaussian mixture model to estimate the spatial layout distributions of the visual vocabulary. For each visual word type, SDP can generate a set of flexible grids rather than the regular grids of the traditional SPM. Furthermore, we can compute the grid weights for visual word tokens according to their spatial coordinates. The experimental results demonstrate that SDP outperforms the traditional spatial pooling methods, and is competitive with the state-of-the-art classification accuracy on several challenging image datasets.
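
For reference, the regular-grid baseline that SDP sets out to improve can be sketched as follows. This shows plain SPM-style pooling only, not the proposed SDP method; it assumes visual words are given as `(x, y, word_id)` with coordinates normalised to [0, 1), omits the per-level weights, and the function name is hypothetical.

```python
def spatial_pyramid_histogram(words, vocab_size, levels=2):
    """Concatenate visual-word histograms over a pyramid of regular grids.

    `words` is a list of (x, y, word_id) with x, y normalised to [0, 1).
    Level l splits the image into 2**l x 2**l cells; each cell contributes
    one histogram of length `vocab_size`. Level weights are omitted here.
    """
    feature = []
    for level in range(levels):
        n = 2 ** level
        cells = [[0] * vocab_size for _ in range(n * n)]
        for x, y, word_id in words:
            col = min(int(x * n), n - 1)
            row = min(int(y * n), n - 1)
            cells[row * n + col][word_id] += 1
        for hist in cells:
            feature.extend(hist)
    return feature
```

Every word type is binned with the same grid, which is the uniform-layout assumption the abstract criticises.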

13.
Neurons in the primary visual cortex are selective to orientation with various degrees of selectivity to the spatial phase, from high selectivity in simple cells to low selectivity in complex cells. Various computational models have suggested a possible link between the presence of phase-invariant cells and the existence of orientation maps in higher mammals' V1. These models, however, do not explain the emergence of complex cells in animals that do not show orientation maps. In this study, we build a theoretical model based on a convolutional network called Sparse Deep Predictive Coding (SDPC) and show that a single computational mechanism, pooling, allows the SDPC model to account for the emergence in V1 of complex cells with or without that of orientation maps, as observed in distinct species of mammals. In particular, we observed that pooling in the feature space is directly related to the formation of orientation maps, while pooling in the retinotopic space is responsible for the emergence of a complex-cell population. Introducing different forms of pooling in a predictive model of early visual processing as implemented in SDPC can therefore be viewed as a theoretical framework that explains the diversity of structural and functional phenomena observed in V1.
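
The two pooling axes named in the abstract can be distinguished with a minimal sketch. It is not the SDPC model: it assumes a one-dimensional retinotopic axis, max pooling, and responses stored as `responses[f][p]` (feature `f` at position `p`); both function names are hypothetical.

```python
def pool_retinotopic(responses, width=2):
    """Max-pool each feature map over neighbouring positions.

    `responses[f][p]` is the response of feature f at position p.
    """
    return [
        [max(fmap[p:p + width]) for p in range(0, len(fmap), width)]
        for fmap in responses
    ]


def pool_feature(responses, width=2):
    """Max-pool across neighbouring features at each position."""
    n_pos = len(responses[0])
    return [
        [max(fmap[p] for fmap in responses[f:f + width]) for p in range(n_pos)]
        for f in range(0, len(responses), width)
    ]
```

Pooling over positions keeps each feature distinct while discarding exact location, and pooling over features keeps location while merging neighbouring features.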

14.
15.
Bentley P, Husain M, Dolan RJ. Neuron, 2004, 41(6): 969-982
We compared behavioral and neural effects of cholinergic enhancement between spatial attention, spatial working memory (WM), and visual control tasks, using fMRI and the anticholinesterase physostigmine. Physostigmine speeded responses nonselectively but increased accuracy selectively for attention. Physostigmine also decreased activations to visual stimulation across all tasks within primary visual cortex, increased extrastriate occipital cortex activation selectively during maintained attention and WM encoding, and decreased parietal activation selectively during maintained attention. Finally, lateralization of occipital activation as a function of the visual hemifield toward which attention or memory was directed was decreased under physostigmine. In the case of attention, this effect correlated strongly with a decrease in a behavioral measure of selective spatial processing. Our results suggest that, while cholinergic enhancement facilitates visual attention by increasing activity in extrastriate cortex generally, it accomplishes this in a manner that reduces expectation-driven selective biasing of extrastriate cortex.

16.
Kuzmina M, Manykin E, Surina I. Bio Systems, 2004, 76(1-3): 43-53
An oscillatory network of columnar architecture located on a 3D spatial lattice was recently designed by the authors as an oscillatory model of the visual cortex. A single network oscillator is a relaxational neural oscillator with internal dynamics tunable by visual image characteristics: local brightness and elementary bar orientation. It can exhibit either an active state (stable undamped oscillations) or "silence" (quickly damped oscillations). Self-organized nonlocal dynamical connections between oscillators depend on oscillator activity levels and on the orientations of cortical receptive fields. The network operates by relaxing into a state of clustered synchronization. At the current stage, grey-level image segmentation tasks are carried out by a 2D oscillatory network obtained as a limit version of the source model. With added control of the network coupling strength, the reduced 2D network provides synchronization-based image segmentation. New results on segmentation of brightness and texture images presented in the paper demonstrate accurate network performance and informative visualization of segmentation results, inherent in the model.

17.
A computational theory of visual attention is presented. The basic theory (TVA) combines the biased-choice model for single-stimulus recognition with the fixed-capacity independent race model (FIRM) for selection from multi-element displays. TVA organizes a large body of experimental findings on performance in visual recognition and attention tasks. A recent development (CTVA) combines TVA with a theory of perceptual grouping by proximity. CTVA explains effects of perceptual grouping and spatial distance between items in multi-element displays. A new account of spatial focusing is proposed in this paper. The account provides a framework for understanding visual search as an interplay between serial and parallel processes.
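
The abstract does not reproduce the equations, but the way TVA combines biased choice with a race is usually summarised by two formulas. They are given here as a reminder in commonly used notation, which is an assumption with respect to this page: $v(x,i)$ is the rate at which the categorization "element $x$ belongs to category $i$" is processed in the race, $\eta(x,i)$ is the strength of the sensory evidence for that categorization, $\beta_i$ is the perceptual bias for category $i$, $\pi_j$ is the pertinence of category $j$, $S$ is the set of elements in the display, and $R$ is the set of categories.

```latex
\begin{align}
  v(x,i) &= \eta(x,i)\,\beta_i\,\frac{w_x}{\sum_{z \in S} w_z}, \\
  w_x    &= \sum_{j \in R} \eta(x,j)\,\pi_j .
\end{align}
```

The bias term $\beta_i$ carries the biased-choice component, and the normalised attentional weight $w_x / \sum_{z \in S} w_z$ divides a fixed processing capacity among the competing elements.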

18.
Columnar architecture is a well-established organizational principle for a variety of cortical systems. If two topographically mapped receptor systems, which receive slightly different views of the same physical stimulus, are interlaced as columns, then the difference map of the afferent inputs is coded within a spatial frequency channel of the resultant map. The difference map of the left and right retinal views of a three-dimensional scene contains cues for the binocular disparity of the objects in the scene. Physical objects that are located at a common distance from the observer will be represented by areas of difference mapping that possess common cortical textural values. Thus, segmentation of the cortical representation of the visual scene by values of positional disparity may be accomplished by conventional monocular segmentation techniques applied to the cortical representation. The difference map is carried by a spatial frequency modulation determined by the period of the columnar interlacing. Ocular dominance columns in human striate cortex suggest a spatial frequency carrier that is roughly equal to the inverse of Panum's area. Since the difference mapping is a global attribute of the cortical representation, and is not contingent on the existence of labeled single-cell feature extractors, the difference mapping algorithm represents a distinct alternative to conventional single-cell approaches to feature extraction. The difference mapping algorithm is briefly discussed in relation to other difference channels, such as color opponent segmentation and binocular orientation disparity. It is suggested that difference mapping may reflect a general synergistic mechanism relating topographic mapping and columnar architecture, which reduces the problem of feature extraction and segmentation for depth and color opponent channels to a single textural mechanism.
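
The claim that interlacing places the difference map on a spatial-frequency carrier follows from a simple identity: a map that alternates left and right columns equals (L + R)/2 plus a carrier (+1, -1, +1, ...) multiplied by (L - R)/2. The sketch below illustrates this in one dimension with columns one sample wide; these simplifications and the function names are assumptions for illustration and not the paper's algorithm.

```python
def interlace(left, right):
    """Interleave two equally long maps as alternating columns (L, R, L, R, ...)."""
    return [l if k % 2 == 0 else r for k, (l, r) in enumerate(zip(left, right))]


def demodulate(interlaced):
    """Recover mean and half-difference channels from the interlaced map.

    Multiplying by the carrier (+1, -1, +1, ...) and averaging over one
    column period isolates the difference; plain averaging gives the mean.
    Exact only when both inputs are constant within each column pair.
    """
    mean, half_diff = [], []
    for k in range(0, len(interlaced) - 1, 2):
        a, b = interlaced[k], interlaced[k + 1]  # left column, right column
        mean.append((a + b) / 2)
        half_diff.append((a - b) / 2)
    return mean, half_diff
```

Where the two views agree the half-difference channel is zero, and where they differ it is not, so regions of common disparity share a common value in that channel.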

19.
A review. Recently published articles concerning the problem of attention are discussed, the most popular psychophysiological concepts and neurophysiological models of attention are described, and the correlation between spatial attention and saccadic eye movements is shown. Evidence is given that attention mechanisms and saccade preparation are reflected in the intensity and topography of visual evoked potentials and event-related potentials. On the basis of the results obtained by the authors and literature data, the contribution of attention to the preparation and programming of a saccade is shown. Different kinds of attention are reflected in a complex of EEG potentials of various duration and polarity. Analysis of the parameters and topography of these potentials can serve as a tool for investigating attention mechanisms.

20.
The effects of spatial selective attention upon ERPs associated with the processing of word stimuli were investigated. While subjects maintained central eye fixation, ERPs were recorded to words presented to the left and right visual fields. In each of 6 runs, subjects focused attention on alternate fields to perform a category-detection task. Pairs of semantically related and repeated words were embedded in the word lists presented to the attended and unattended visual fields. Consistent with prior studies, the P1-N1 visual ERP was larger when elicited by words in attended spatial locations. A large negative slow wave identified as N400 was elicited by attended, but not unattended, words. For attended words, N400 was smaller for semantically primed or repeated words. We concluded that spatial selective attention can modulate the degree to which words are processed, and that the cognitive processes associated with N400 are not automatic.
