Similar literature
Found 20 similar articles (search time: 15 ms)
1.
An important requirement for vision is to identify interesting and relevant regions of the environment for further processing. Some models assume that salient locations from a visual scene are encoded in a dedicated spatial saliency map [1, 2]. Then, a winner-take-all (WTA) mechanism [1, 2] is often believed to threshold the graded saliency representation and identify the most salient position in the visual field. Here we aimed to assess whether neural representations of graded saliency and the subsequent WTA mechanism can be dissociated. We presented images of natural scenes while subjects were in a scanner performing a demanding fixation task, and thus their attention was directed away. Signals in early visual cortex and posterior intraparietal sulcus (IPS) correlated with graded saliency as defined by a computational saliency model. Multivariate pattern classification [3, 4] revealed that the most salient position in the visual field was encoded in anterior IPS and frontal eye fields (FEF), thus reflecting a potential WTA stage. Our results thus confirm that graded saliency and WTA-thresholded saliency are encoded in distinct neural structures. This could provide the neural representation required for rapid and automatic orientation toward salient events in natural environments.
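
The distinction between a graded saliency map and its winner-take-all readout can be illustrated with a minimal Python sketch. The map values and the function names `winner_take_all` and `threshold_to_winner` are hypothetical and chosen for illustration only; the sketch does not reproduce the computational saliency model used in the study.

```python
def winner_take_all(saliency_map):
    """Return (row, col, value) of the single most salient location."""
    best = None
    for r, row in enumerate(saliency_map):
        for c, value in enumerate(row):
            if best is None or value > best[2]:
                best = (r, c, value)
    return best


def threshold_to_winner(saliency_map):
    """Binary map: 1 at the winning location, 0 everywhere else."""
    win_r, win_c, _ = winner_take_all(saliency_map)
    return [[1 if (r, c) == (win_r, win_c) else 0
             for c in range(len(row))]
            for r, row in enumerate(saliency_map)]


# Hypothetical graded saliency values on a 3x3 grid.
graded = [
    [0.1, 0.3, 0.2],
    [0.4, 0.9, 0.5],
    [0.2, 0.6, 0.1],
]
winner = winner_take_all(graded)      # graded map -> one location
binary = threshold_to_winner(graded)  # WTA-thresholded representation
```

The graded map keeps a value for every location, whereas the thresholded map keeps only the position of the maximum; these are the two representations the abstract reports as being encoded in distinct neural structures.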

2.
Saliency detection is widely used in many visual applications such as image segmentation, object recognition and classification. In this paper, we introduce a new method to detect salient objects in natural images. The approach is based on a regional principal color contrast model, which incorporates low-level and medium-level visual cues. The method allows a simple computation of color features and two categories of spatial relationships to produce a saliency map, achieving higher F-measure rates. At the same time, we present an interpolation approach to evaluate the resulting curves, and analyze parameter selection. Our method enables the effective computation of arbitrary-resolution images. Experimental results on a saliency database show that our approach produces high-quality saliency maps and performs favorably against ten saliency detection algorithms.

3.
Our nervous system is confronted with a barrage of sensory stimuli, but neural resources are limited and not all stimuli can be processed to the same extent. Mechanisms exist to bias attention toward particularly salient events, thereby providing a weighted representation of our environment. Our understanding of these mechanisms is still limited, but theoretical models can replicate such a weighting of sensory inputs and provide a basis for understanding the underlying principles. Here, we describe such a model for the auditory system: an auditory saliency map. We experimentally validate the model on natural acoustical scenarios, demonstrating that it reproduces human judgments of auditory saliency and predicts the detectability of salient sounds embedded in noisy backgrounds. In addition, it predicts the natural orienting behavior of naive macaque monkeys to the same salient stimuli. The structure of the suggested model is identical to that of successfully used visual saliency maps. Hence, we conclude that saliency is determined either by implementing similar mechanisms in different unisensory pathways or by the same mechanism in multisensory areas. In any case, our results demonstrate that different primate sensory systems rely on common principles for extracting relevant sensory events.

4.
Many computational saliency models have been proposed to simulate the bottom-up visual attention mechanism of the human visual system. However, most of them deal only with certain kinds of images or target specific applications. In fact, human beings can correctly select attentional foci on objects of arbitrary size within any scene. This paper proposes a new bottom-up computational model from the perspective of the frequency domain, based on the biological discovery of the non-Classical Receptive Field (nCRF) in the retina. A saliency map can be obtained according to the idea of the Extended Classical Receptive Field. The model is composed of three major steps. First, decompose the input image into several feature maps representing different frequency bands that cover the whole frequency domain, using the Gabor wavelet. Second, whiten the feature maps to highlight the embedded saliency information. Third, select some optimal maps, simulating the response of the receptive field, especially the nCRF, to generate the saliency map. Experimental results show that the proposed algorithm performs stably and well in a variety of situations, as human beings do, and adapts to both psychological patterns and natural images. Beyond that, the biological plausibility of the nCRF and the Gabor wavelet transform makes this approach reliable.

5.
Saliency detection has attracted the attention of many researchers and has become a very active area of research. Recently, many saliency detection models have been proposed and have achieved excellent performance in various fields. However, most of these models consider only low-level features. This paper proposes a novel saliency detection model that uses both color and texture features and incorporates higher-level priors. The SLIC superpixel algorithm is applied to form an over-segmentation of the image. A color saliency map and a texture saliency map are calculated based on the region contrast method and adaptive weighting. Higher-level priors, including a location prior and a color prior, are incorporated into the model to achieve better performance, and a full-resolution saliency map is obtained by up-sampling. Experimental results on three datasets demonstrate that the proposed saliency detection model outperforms state-of-the-art models.
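
To make the idea of region contrast concrete, the following Python sketch scores each region by its color distance to every other region, weighted by the other region's size and by spatial proximity. This is a generic form of region contrast written under stated assumptions, not the model proposed in the paper: the region descriptors, the `sigma` value, and the function name `region_contrast` are hypothetical, and the superpixel segmentation, texture features, adaptive weighting and priors are all omitted. It requires Python 3.8 or later for `math.dist`.

```python
import math


def region_contrast(regions, sigma=0.4):
    """Score each region by its weighted color contrast to all other regions.

    regions: list of dicts with
      'color'  - mean color as a 3-tuple,
      'center' - (x, y) centroid in normalized [0, 1] coordinates,
      'size'   - pixel count.
    Returns scores normalized so that the largest score is 1.0.
    """
    scores = []
    for k, rk in enumerate(regions):
        total = 0.0
        for i, ri in enumerate(regions):
            if i == k:
                continue
            color_dist = math.dist(rk["color"], ri["color"])
            spatial_dist = math.dist(rk["center"], ri["center"])
            # Larger and nearer regions contribute more to the contrast.
            weight = ri["size"] * math.exp(-spatial_dist ** 2 / sigma ** 2)
            total += weight * color_dist
        scores.append(total)
    peak = max(scores)
    return [s / peak if peak > 0 else 0.0 for s in scores]


# Hypothetical regions: two large gray background regions, one small red one.
regions = [
    {"color": (0.50, 0.5, 0.5), "center": (0.25, 0.5), "size": 400},
    {"color": (0.55, 0.5, 0.5), "center": (0.75, 0.5), "size": 400},
    {"color": (0.90, 0.1, 0.1), "center": (0.50, 0.5), "size": 50},
]
scores = region_contrast(regions)
```

In this toy input the small red region receives the highest score because it differs in color from two large nearby regions, while the two gray regions differ little from each other.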

6.
The automatic computerized detection of regions of interest (ROI) is an important step in medical image processing and analysis. The reasons are many: the increasing amount of available medical imaging data, the existence of inter-observer and inter-scanner variability, and the need to improve the accuracy of automatic detection in order to help doctors diagnose faster and on time. A novel algorithm, based on visual saliency, is developed here for the identification of tumor regions from MR images of the brain. The GBM saliency detection model is designed by taking a cue from the concept of visual saliency in natural scenes. A visually salient region is typically rare in an image and contains highly discriminating information, with attention being immediately focused upon it. Although color is typically considered the most important feature in a bottom-up saliency detection model, we circumvent this issue in the inherently grayscale MR framework. We develop a novel pseudo-coloring scheme, based on the three MRI sequences, viz. FLAIR, T2 and T1C (contrast enhanced with Gadolinium). A bottom-up strategy, based on a new pseudo-color distance and spatial distance between image patches, is defined for highlighting the salient regions in the image. This multi-channel representation of the image and the saliency detection model help in automatically and quickly isolating the tumor region for subsequent delineation, as is necessary in medical diagnosis. The effectiveness of the proposed model is evaluated on MRI of 80 subjects from the BRATS database in terms of the saliency map values. Using ground truth of the tumor regions for both high- and low-grade gliomas, the results are compared with four highly cited saliency detection models from the literature. In all cases the AUC scores from the ROC analysis are found to be more than 0.999 ± 0.001 over different tumor grades, sizes and positions.

7.
Xu J, Yang Z, Tsien JZ. PLoS One, 2010, 5(12): e15796
Visual saliency is the perceptual quality that makes some items in visual scenes stand out from their immediate contexts. Visual saliency plays important roles in natural vision in that saliency can direct eye movements, deploy attention, and facilitate tasks like object detection and scene understanding. A central unsolved issue is: What features should be encoded in the early visual cortex for detecting salient features in natural scenes? To explore this important issue, we propose a hypothesis that visual saliency is based on efficient encoding of the probability distributions (PDs) of visual variables in specific contexts in natural scenes, referred to as context-mediated PDs in natural scenes. In this concept, computational units in the model of the early visual system do not act as feature detectors but rather as estimators of the context-mediated PDs of a full range of visual variables in natural scenes, which directly give rise to a measure of visual saliency of any input stimulus. To test this hypothesis, we developed a model of the context-mediated PDs in natural scenes using a modified algorithm for independent component analysis (ICA) and derived a measure of visual saliency based on these PDs estimated from a set of natural scenes. We demonstrated that visual saliency based on the context-mediated PDs in natural scenes effectively predicts human gaze in free-viewing of both static and dynamic natural scenes. This study suggests that the computation based on the context-mediated PDs of visual variables in natural scenes may underlie the neural mechanism in the early visual cortex for detecting salient features in natural scenes.

8.
White blood cell (WBC) detection plays a vital role in peripheral blood smear analysis. However, cell detection remains a challenging task due to multi-cell adhesion and differing staining and imaging conditions. Owing to the powerful feature extraction capability of deep learning, object detection methods based on convolutional neural networks (CNNs) have been widely applied in medical image analysis. Nevertheless, CNN training is time-consuming and inaccurate, especially for large-scale blood smear images, where most of each image is background. To address this problem, we propose a two-stage approach that treats WBC detection as a small salient object detection task. In the first stage, saliency detection, we use Itti's visual attention model to locate the regions of interest (ROIs), based on the proposed adaptive center-surround difference (ACSD) operator. In the second stage, WBC detection, a modified CenterNet model is applied to the ROI sub-images to obtain a more accurate localization and classification of each WBC. Experimental results show that our method exceeds the performance of several existing methods on two different data sets and achieves a state-of-the-art mAP of over 98.8%.

9.
Visual saliency is a fundamental yet hard to define property of objects or locations in the visual world. In a context where objects and their representations compete to dominate our perception, saliency can be thought of as the "juice" that makes objects win the race. It is often assumed that saliency is extracted and represented in an explicit saliency map, which serves to determine the location of spatial attention at any given time. It is then by drawing attention to a salient object that it can be recognized or categorized. I argue against this classical view that visual "bottom-up" saliency automatically recruits the attentional system prior to object recognition. A number of visual processing tasks are clearly performed too fast for such a costly strategy to be employed. Rather, visual attention could simply act by biasing a saliency-based object recognition system. Under natural conditions of stimulation, saliency can be represented implicitly throughout the ventral visual pathway, independent of any explicit saliency map. At any given level, the most activated cells of the neural population simply represent the most salient locations. The notion of saliency itself grows increasingly complex throughout the system, mostly based on luminance contrast until information reaches visual cortex, gradually incorporating information about features such as orientation or color in primary visual cortex and early extrastriate areas, and finally the identity and behavioral relevance of objects in temporal cortex and beyond. Under these conditions the object that dominates perception, i.e. the object yielding the strongest (or the first) selective neural response, is by definition the one whose features are most "salient"--without the need for any external saliency map. In addition, I suggest that such an implicit representation of saliency can be best encoded in the relative times of the first spikes fired in a given neuronal population. 
In accordance with our subjective experience that saliency and attention do not modify the appearance of objects, the feed-forward propagation of this first spike wave could serve to trigger saliency-based object recognition outside the realm of awareness, while conscious perceptions could be mediated by the remaining discharges of longer neuronal spike trains.

10.
A unique vertical bar among horizontal bars is salient and pops out perceptually. Physiological data have suggested that mechanisms in the primary visual cortex (V1) contribute to the high saliency of such a unique basic feature, but indicated little regarding whether V1 plays an essential or peripheral role in input-driven or bottom-up saliency. Meanwhile, a biologically based V1 model has suggested that V1 mechanisms can also explain bottom-up saliencies beyond the pop-out of basic features, such as the low saliency of a unique conjunction feature such as a red vertical bar among red horizontal and green vertical bars, under the hypothesis that the bottom-up saliency at any location is signaled by the activity of the most active cell responding to it regardless of the cell's preferred features such as color and orientation. The model can account for phenomena such as the difficulties in conjunction feature search, asymmetries in visual search, and how background irregularities affect ease of search. In this paper, we report nontrivial predictions from the V1 saliency hypothesis, and their psychophysical tests and confirmations. The prediction that most clearly distinguishes the V1 saliency hypothesis from other models is that task-irrelevant features could interfere in visual search or segmentation tasks which rely significantly on bottom-up saliency. For instance, irrelevant colors can interfere in an orientation-based task, and the presence of horizontal and vertical bars can impair performance in a task based on oblique bars. Furthermore, properties of the intracortical interactions and neural selectivities in V1 predict specific emergent phenomena associated with visual grouping. Our findings support the idea that a bottom-up saliency map can be at a lower visual area than traditionally expected, with implications for top-down selection mechanisms.
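
The hypothesis stated above, that saliency at a location is signaled by the most active cell responding to it regardless of that cell's preferred feature, amounts to taking a maximum across feature channels. The Python sketch below illustrates only this readout rule. The response values are hypothetical, the function name `max_rule_saliency` is an illustrative choice, and the intracortical interactions that would produce such responses in the V1 model are not simulated.

```python
def max_rule_saliency(responses):
    """Saliency per location = highest response among all feature channels.

    responses: dict mapping a feature name to a list of responses,
    one entry per location (all lists have the same length).
    """
    channels = list(responses.values())
    n_locations = len(channels[0])
    return [max(channel[loc] for channel in channels)
            for loc in range(n_locations)]


# Hypothetical responses at four locations. Location 2 holds the
# orientation target; location 0 holds a task-irrelevant color singleton.
responses = {
    "orientation": [0.2, 0.2, 0.8, 0.2],
    "color":       [0.9, 0.3, 0.3, 0.3],
}
saliency = max_rule_saliency(responses)
most_salient = saliency.index(max(saliency))
```

With these made-up numbers the irrelevant color response at location 0 exceeds the orientation response at the target, so the most salient location is no longer the target. This is the kind of interference by task-irrelevant features that the abstract describes as the distinguishing prediction.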

11.
Numerous studies have suggested that the deployment of attention is linked to saliency. In contrast, very little is known about how salient objects are perceived. To probe the perception of salient elements, observers compared two horizontally aligned stimuli in an array of eight elements. One of them was salient because of its orientation or direction of motion. We observed that the perceived luminance contrast or color saturation of the salient element increased: the salient stimulus looked even more salient. We explored the possibility that changes in appearance were caused by attention. We chose an event-related potential indexing attentional selection, the N2pc, to answer this question. The absence of an N2pc to the salient object provides preliminary evidence against involuntary attentional capture by the salient element. We suggest that signals from a master saliency map flow back into individual feature maps. These signals boost the perceived feature contrast of salient objects, even on perceptual dimensions different from the one that initially defined saliency.

12.
Attention is intrinsic to our perceptual representations of sensory inputs. Best characterized in the visual domain, it is typically depicted as a spotlight moving over a saliency map that topographically encodes strengths of visual features and feedback modulations over the visual scene. By introducing smells to two well-established attentional paradigms, the dot-probe and the visual-search paradigms, we find that a smell reflexively directs attention to the congruent visual image and facilitates visual search of that image without the mediation of visual imagery. Furthermore, this effect is independent of, and can override, top-down bias. We thus propose that smell quality acts as an object feature whose presence enhances the perceptual saliency of that object, thereby guiding the spotlight of visual attention. Our discoveries provide robust empirical evidence for a multimodal saliency map that weighs not only visual but also olfactory inputs.

13.
This work proposes a model of visual bottom-up attention for dynamic scene analysis. Our work adds motion saliency calculations to a neural network model with realistic temporal dynamics (e.g., building motion salience on top of De Brecht and Saiki, Neural Networks 19:1467–1474, 2006). The resulting network elicits strong transient responses to moving objects and reaches stability within a biologically plausible time interval. The responses differ statistically between earlier and later motion-related neural activity, and between moving and non-moving objects. We demonstrate the network on a number of synthetic and real dynamical movie examples. We show that the model captures the motion saliency asymmetry phenomenon. In addition, the motion salience computation enables sudden-onset moving objects that are less salient in the static scene to rise above others. Finally, we give strong consideration to the neural latencies, the Lyapunov stability, and the neural properties being reproduced by the model.

14.
During free-viewing of natural scenes, eye movements are guided by bottom-up factors inherent to the stimulus, as well as top-down factors inherent to the observer. The question of how these two different sources of information interact and contribute to fixation behavior has recently received a lot of attention. Here, a battery of 15 visual stimulus features was used to quantify the contribution of stimulus properties during free-viewing of 4 different categories of images (Natural, Urban, Fractal and Pink Noise). Behaviorally relevant information was estimated in the form of topographical interestingness maps by asking an independent set of subjects to click at image regions that they subjectively found most interesting. Using a Bayesian scheme, we computed saliency functions that described the probability of a given feature to be fixated. In the case of stimulus features, the precise shape of the saliency functions was strongly dependent upon image category and overall the saliency associated with these features was generally weak. When testing multiple features jointly, a linear additive integration model of individual saliencies performed satisfactorily. We found that the saliency associated with interesting locations was much higher than any low-level image feature and any pair-wise combination thereof. Furthermore, the low-level image features were found to be maximally salient at those locations that had already high interestingness ratings. Temporal analysis showed that regions with high interestingness ratings were fixated as early as the third fixation following stimulus onset. Paralleling these findings, fixation durations were found to be dependent mainly on interestingness ratings and to a lesser extent on the low-level image features. 
Our results suggest that both low- and high-level sources of information play a significant role during exploration of complex scenes with behaviorally relevant information being more effective compared to stimulus features.
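
One common way to express a saliency function of this kind is as a ratio of two histograms: the distribution of a feature at fixated locations divided by its distribution over all locations, which by Bayes' rule is proportional to the probability of fixation given the feature value. The Python sketch below follows that reading. It is an assumption about the general form, not the authors' exact procedure, and the function name `saliency_function`, the bin count and the data are hypothetical.

```python
def saliency_function(fixated_values, all_values, n_bins=4, lo=0.0, hi=1.0):
    """Estimate p(feature | fixated) / p(feature) for each feature bin.

    By Bayes' rule this ratio is proportional to p(fixated | feature).
    Feature values are expected to lie in [lo, hi].
    Bins that never occur in all_values yield None.
    """
    width = (hi - lo) / n_bins

    def histogram(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp the top edge (v == hi) into the last bin.
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
        return counts

    fix_counts = histogram(fixated_values)
    all_counts = histogram(all_values)
    n_fix, n_all = len(fixated_values), len(all_values)

    ratios = []
    for f, a in zip(fix_counts, all_counts):
        if a == 0:
            ratios.append(None)  # feature value never observed
        else:
            ratios.append((f / n_fix) / (a / n_all))
    return ratios


# Hypothetical feature values (e.g. local contrast) sampled over an image
# and at the subset of locations that observers fixated.
all_values = [0.1, 0.1, 0.3, 0.3, 0.6, 0.6, 0.9, 0.9]
fixated_values = [0.6, 0.9, 0.9, 0.9]
ratios = saliency_function(fixated_values, all_values)
```

A ratio above 1 means the feature value is over-represented at fixated locations, and a flat function near 1 corresponds to the weak feature-related saliency the abstract reports for low-level features.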

15.
Visual attention: the where, what, how and why of saliency   Total citations: 6 (self-citations: 0, citations by others: 6)
Attention influences the processing of visual information even in the earliest areas of primate visual cortex. There is converging evidence that the interaction of bottom-up sensory information and top-down attentional influences creates an integrated saliency map, that is, a topographic representation of relative stimulus strength and behavioral relevance across visual space. This map appears to be distributed across areas of the visual cortex, and is closely linked to the oculomotor system that controls eye movements and orients the gaze to locations in the visual scene characterized by a high salience.

16.
In this study we investigated visual attention properties of freely behaving barn owls, using a miniature wireless camera attached to their heads. The tubular eye structure of barn owls makes them ideal subjects for this research since it limits their eye movements. Video sequences recorded from the owl’s point of view capture part of the visual scene as seen by the owl. Automated analysis of video sequences revealed that during an active search task, owls repeatedly and consistently direct their gaze in a way that brings objects of interest to a specific retinal location (retinal fixation area). Using a projective model that captures the geometry between the eye and the camera, we recovered the corresponding location in the recorded images (image fixation area). Recording in various types of environments (aviary, office, outdoors) revealed significant statistical differences in low-level image properties at the image fixation area compared to values extracted at random image patches. These differences are in agreement with results obtained in primates in similar studies. To investigate the role of saliency and its contribution to drawing the owl’s attention, we used a popular bottom-up computational model. Saliency values at the image fixation area were typically greater than at random patches, yet were only 20% of the maximal saliency value, suggesting a top-down modulation of gaze control.

17.
By representing image content using probabilistic models of an object's appearance we can obtain semantics-preserving compression of the image data. Such compact representations of an image's salient features allow rapid computer searches of even large image databases. Examples are shown for databases of face images, a video of American Sign Language (ASL), and a video of facial expressions.

18.
During the development of the topographic map from vertebrate retina to superior colliculus (SC), EphA receptors are expressed in a gradient along the nasotemporal retinal axis. Their ligands, ephrin-As, are expressed in a gradient along the rostrocaudal axis of the SC. Countergradients of ephrin-As in the retina and EphAs in the SC are also expressed. Disruption of any of these gradients leads to mapping errors. Gierer's (1981) model, which uses well-matched pairs of gradients and countergradients to establish the mapping, can account for the formation of wild type maps, but not the double maps found in EphA knock-in experiments. I show that these maps can be explained by models, such as Gierer's (1983), which have gradients and no countergradients, together with a powerful compensatory mechanism that helps to distribute connections evenly over the target region. However, this type of model cannot explain mapping errors found when the countergradients are knocked out partially. I examine the relative importance of countergradients as against compensatory mechanisms by generalising Gierer's (1983) model so that the strength of compensation is adjustable. Either matching gradients and countergradients alone or poorly matching gradients and countergradients together with a strong compensatory mechanism are sufficient to establish an ordered mapping. With a weaker compensatory mechanism, gradients without countergradients lead to a poorer map, but the addition of countergradients improves the mapping. This model produces the double maps in simulated EphA knock-in experiments and a map consistent with the Math5 knock-out phenotype. Simulations of a set of phenotypes from the literature substantiate the finding that countergradients and compensation can be traded off against each other to give similar maps.
I conclude that a successful model of retinotopy should contain countergradients and some form of compensation mechanism, but not in the strong form put forward by Gierer.

19.
This work analyzed the perceptual attributes of natural dynamic audiovisual scenes. We presented thirty participants with 19 natural scenes in a similarity categorization task, followed by a semi-structured interview. The scenes were reproduced with an immersive audiovisual display. Natural scene perception has been studied mainly with unimodal settings, which have identified motion as one of the most salient attributes related to visual scenes, and sound intensity along with pitch trajectories related to auditory scenes. However, controlled laboratory experiments with natural multimodal stimuli are still scarce. Our results show that humans pay attention to similar perceptual attributes in natural scenes, and a two-dimensional perceptual map of the stimulus scenes and perceptual attributes was obtained in this work. The exploratory results show the amount of movement, perceived noisiness, and eventfulness of the scene to be the most important perceptual attributes in naturalistically reproduced real-world urban environments. We found the scene gist properties openness and expansion to remain as important factors in scenes with no salient auditory or visual events. We propose that the study of scene perception should move forward to understand better the processes behind multimodal scene processing in real-world environments. We publish our stimulus scenes as spherical video recordings and sound field recordings in a publicly available database.

20.
Saliency maps produced by different algorithms are often evaluated by comparing output to fixated image locations appearing in human eye tracking data. There are challenges in evaluation based on fixation data due to bias in the data. Properties of eye movement patterns that are independent of image content may limit the validity of evaluation results, including spatial bias in fixation data. To address this problem, we present modeling and evaluation results for data derived from different perceptual tasks related to the concept of saliency. We also present a novel approach to benchmarking to deal with some of the challenges posed by spatial bias. The results presented establish the value of alternatives to fixation data to drive improvement and development of models. We also demonstrate an approach to approximate the output of alternative perceptual tasks based on computational saliency and/or eye gaze data. As a whole, this work presents novel benchmarking results and methods, establishes a new performance baseline for perceptual tasks that provide an alternative window into visual saliency, and demonstrates the capacity for saliency to serve in approximating human behaviour for one visual task given data from another.
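
As an illustration of what comparing a saliency map to fixated locations can look like, the Python sketch below computes the normalized scanpath saliency (NSS), a widely used fixation-based score: the map is z-scored and then averaged at the fixated pixels. The abstract does not say which metrics were used, so NSS is an assumption chosen for brevity, and the map and fixation coordinates are made up. The sketch requires Python 3.8 or later for `statistics.fmean`.

```python
import statistics


def nss(saliency_map, fixations):
    """Normalized scanpath saliency.

    saliency_map: 2-D list of saliency values.
    fixations: list of (row, col) fixated locations.
    Returns the mean z-scored saliency at the fixated locations.
    """
    values = [v for row in saliency_map for v in row]
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("saliency map is constant; NSS is undefined")
    return statistics.fmean(
        (saliency_map[r][c] - mean) / std for r, c in fixations
    )


# Hypothetical 2x2 saliency map with a single salient pixel.
toy_map = [
    [0.0, 0.0],
    [0.0, 1.0],
]
score_hit = nss(toy_map, [(1, 1)])   # fixation on the salient pixel
score_miss = nss(toy_map, [(0, 0)])  # fixation on a non-salient pixel
```

A positive score means fixations fall on above-average saliency. Because a score like this rewards any map that matches where people tend to look, a map that simply favors the image center can score well, which is one form of the spatial bias the abstract identifies as a challenge.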
