Similar Documents
20 similar documents found.
1.
In its early stages, the visual system suffers from substantial ambiguity and noise that severely limit the performance of early vision algorithms. This article presents feedback mechanisms between early visual processes, such as perceptual grouping, stereopsis and depth reconstruction, that allow the system to reduce this ambiguity and improve the early representation of visual information. In the first part, the article proposes a local perceptual grouping algorithm that, in addition to commonly used geometric information, makes use of a novel multi-modal measure between local edge/line features. The grouping information is then used to: 1) disambiguate stereopsis by enforcing that stereo matches preserve groups; and 2) correct the reconstruction error due to image pixel sampling using linear interpolation over the groups. The integration of mutual feedback between early vision processes is shown to considerably reduce ambiguity and noise without the need for global constraints.
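The grouping-based disambiguation step can be sketched in a few lines of Python. This is an illustrative simplification, not the paper's algorithm: `candidates` maps each edge feature to its candidate disparities (hypothetical data), and within each perceptual group the candidate closest to the group's consensus is kept.

```python
from statistics import median

def disambiguate_by_groups(candidates, groups):
    """Prefer stereo matches whose disparities are consistent within a
    perceptual group (toy sketch; data structures are hypothetical)."""
    chosen = {}
    for group in groups:
        # consensus disparity: median of each feature's best-ranked candidate
        consensus = median(candidates[f][0] for f in group)
        for f in group:
            # keep the candidate disparity closest to the group consensus
            chosen[f] = min(candidates[f], key=lambda d: abs(d - consensus))
    return chosen

candidates = {"a": [5.0, 12.0], "b": [5.2], "c": [30.0, 4.8]}
groups = [["a", "b", "c"]]
# "a" keeps 5.0 and "c" keeps 4.8: the group vetoes the outlier matches
```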

2.
The appearance of faces can be strongly affected by the characteristics of faces viewed previously. These perceptual after-effects reflect processes of sensory adaptation that are found throughout the visual system, but which have been considered only relatively recently in the context of higher level perceptual judgements. In this review, we explore the consequences of adaptation for human face perception, and the implications of adaptation for understanding the neural-coding schemes underlying the visual representation of faces. The properties of face after-effects suggest that they, in part, reflect response changes at high and possibly face-specific levels of visual processing. Yet, the form of the after-effects and the norm-based codes that they point to show many parallels with the adaptations and functional organization that are thought to underlie the encoding of perceptual attributes like colour. The nature and basis for human colour vision have been studied extensively, and we draw on ideas and principles that have been developed to account for norms and normalization in colour vision to consider potential similarities and differences in the representation and adaptation of faces.

3.
Texture of various appearances, geometric distortions, spatial frequency content and densities is utilized by the human visual system to segregate items from background and to enable recognition of complex geometric forms. For automatic, or pre-attentive, segmentation of a visual scene, sophisticated analysis and comparison of surface properties over wide areas of the visual field are required. We investigated the neural substrate underlying human texture processing, particularly the computational mechanisms of texture boundary detection. We present a neural network model which uses as building blocks model cortical areas that are bi-directionally linked to implement cycles of feedforward and feedback interaction for signal detection, hypothesis generation and testing within the infero-temporal pathway of form processing. In the spirit of Jacob Beck's early investigations, our model builds in particular upon two key hypotheses, namely that (i) texture segregation is based on boundary detection, rather than clustering of homogeneous items, and (ii) texture boundaries are detected mainly on the basis of larger scenic contexts mediated by higher cortical areas, such as area V4. The latter constraint provides a basis for element grouping in accordance with the Gestalt laws of similarity and good continuation. It is shown through simulations that the model integrates a variety of psychophysical findings on texture processing and provides a link to the underlying physiology. The functional role of feedback processing is demonstrated by context-dependent modulation of V1 cell activation, leading to sharply localized detection of texture boundaries. It furthermore explains why pre-attentive processing in visual search tasks can be directly linked to texture boundary processing, as revealed by recent EEG studies on visual search.

4.
Recently we introduced a new version of the perceptual retouch model incorporating two interactive binding operations: binding features into objects, and binding the resulting feature-objects with a large-scale oscillatory system that acts as an intermediary allowing perceptual information to reach consciousness-level representation. The relative level of synchronized firing of the neurons representing the features of an object, obtained after the second-stage synchronizing modulation, is used as the equivalent of conscious perception of the corresponding object. Here, this model is used to simulate the interaction of two successive featured objects as a function of stimulus onset asynchrony (SOA). The model output reproduces typical results of mutual masking: at the shortest and longest SOAs, the correct-perception rates for the first and second objects are comparable, while at intermediate SOAs the second object dominates over the first. Additionally, at the shortest SOAs the model simulates the misbinding of features into illusory objects.

5.
Perceptual learning of complex stimuli refers to long-lasting, stable changes in the perception of complex visual stimuli, such as objects or faces, induced by training or experience; it is generally thought to reflect plasticity in high-level visual cortex. Studies of the properties of perceptual learning for simple stimuli have revealed partial plasticity in early visual cortex, but the neural mechanisms of perceptual learning for complex stimuli remain controversial. This article reviews theoretical models of perceptual learning and the relevant experimental evidence, focusing on the properties, neural mechanisms, and research methods of perceptual learning for complex stimuli such as objects and faces. Future work in this field should further investigate the durability of perceptual learning for complex stimuli, the mechanisms of perceptual learning for different facial attributes, and theoretical models of perceptual learning for complex stimuli.

6.
Both dorsal and ventral cortical visual streams contain neurons sensitive to binocular disparities, but the two streams may underlie different aspects of stereoscopic vision. Here we investigate stereopsis in the neurological patient D.F., whose ventral stream, specifically the lateral occipital cortex, has been damaged bilaterally, causing profound visual form agnosia. Despite her severe damage to cortical visual areas, we report that D.F.'s stereo vision is strikingly unimpaired. She is better than many control observers at using binocular disparity to judge whether an isolated object appears near or far, and to resolve ambiguous structure-from-motion. D.F. is, however, poor at using relative disparity between features at different locations across the visual field. This may stem from a difficulty in identifying the surface boundaries where relative disparity is available. We suggest that the ventral processing stream may play a critical role in enabling healthy observers to extract fine depth information from relative disparities within one surface or between surfaces located in different parts of the visual field.

7.
Visual perception is burdened with a highly discontinuous input stream arising from saccadic eye movements. For successful integration into a coherent representation, the visuomotor system needs to deal with these self-induced perceptual changes and distinguish them from external motion. Forward models are one way to solve this problem: the brain uses internal monitoring signals associated with oculomotor commands to predict the visual consequences of corresponding eye movements during active exploration. Visual scenes typically contain a rich structure of spatial relational information, providing additional cues that may help disambiguate self-induced from external changes of perceptual input. We reasoned that a weighted integration of these two inherently noisy sources of information should lead to better perceptual estimates. Volunteer subjects performed a simple perceptual decision on the apparent displacement of a visual target, which jumped unpredictably in sync with a saccadic eye movement. In a critical test condition, the target was presented together with a flanker object, so that perceptual decisions could take into account the spatial distance between target and flanker. Here, precision was better than in control conditions in which target displacements could be estimated only from extraretinal or only from visual relational information. Our findings suggest that under natural conditions, integration of visual space across eye movements is based upon close-to-optimal integration of both retinal and extraretinal information.
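The "weighted integration of two inherently noisy sources" described above is the standard reliability-weighted (maximum-likelihood) cue-combination rule, in which each cue is weighted by its inverse variance. A minimal sketch, with made-up numbers standing in for the extraretinal and visual relational cues:

```python
import numpy as np

def integrate_cues(estimates, variances):
    """Combine noisy cue estimates, weighting each by its reliability
    (inverse variance). The combined variance is never larger than the
    best single cue's variance, so precision can only improve."""
    w = 1.0 / np.asarray(variances, dtype=float)
    est = float(np.dot(w / w.sum(), np.asarray(estimates, dtype=float)))
    var = 1.0 / w.sum()
    return est, var

# hypothetical displacement estimates (deg): extraretinal vs. relational cue
est, var = integrate_cues([2.0, 1.0], [1.0, 0.25])
# est == 1.2 (pulled toward the more reliable cue), var == 0.2 < 0.25
```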

8.
Computations in the early visual cortex.
This paper reviews some of the recent neurophysiological studies that explore the variety of visual computations in the early visual cortex in relation to geometric inference, i.e. the inference of contours, surfaces and shapes. It attempts to draw connections between ideas from computational vision and findings from awake primate electrophysiology. In the classical feed-forward, modular view of visual processing, the early visual areas (LGN, V1 and V2) are modules that serve to extract local features, while higher extrastriate areas are responsible for shape inference and invariant object recognition. However, recent findings in primate early visual systems reveal that the computations in the early visual cortex are rather complex and dynamic, as well as interactive and plastic, subject to influence from global context, higher order perceptual inference, task requirement and behavioral experience. The evidence argues that the early visual cortex does not merely participate in the first stage of visual processing, but is involved in many levels of visual computation.

9.
We propose a strategy for early vision which tailors visual channels to the object-oriented characteristics of natural scenes. This strategy involves essentially two types of channel: one for encoding the locally dominant edges which form the boundaries of 'objects', and another for 'filling in' the regions within them. The selection of contrasts which characterize object boundaries rather than textural detail can be enhanced by making a local estimate of contrast and setting a threshold accordingly. This procedure and other aspects of the model were first suggested by observations of insect visual cells.
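The "local contrast estimate plus threshold" idea can be illustrated in one dimension: keep only those contrast values that clearly exceed the typical contrast in their neighbourhood. A toy sketch (the window size `win` and factor `k` are arbitrary illustrative choices, not from the paper):

```python
import numpy as np

def keep_boundaries(signal, win=9, k=2.0):
    """1-D sketch of thresholding against a local contrast estimate:
    a contrast sample survives only if it exceeds k times the mean
    contrast in a local window, so strong 'object boundary' edges are
    kept while fine texture is suppressed."""
    contrast = np.abs(np.diff(np.asarray(signal, dtype=float)))
    local_mean = np.convolve(contrast, np.ones(win) / win, mode="same")
    return contrast > k * local_mean

# a flat 'textured' signal with one large step in the middle
signal = np.zeros(100)
signal[50:] = 5.0
signal += 0.05 * (-1.0) ** np.arange(100)   # fine texture ripple
mask = keep_boundaries(signal)
# only the step edge (between samples 49 and 50) survives the threshold
```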

10.
From a few presentations of an object, perceptual systems are able to extract invariant properties such that novel presentations are immediately recognized. This may be enabled by inferring the set of all representations equivalent under certain transformations. We implemented this principle in a neurodynamic model that stores activity patterns representing transformed versions of the same object in a distributed fashion within maps, such that translation across the map corresponds to the relevant transformation. When a pattern on the map is activated, this causes activity to spread out as a wave across the map, activating all the transformed versions represented. Computational studies illustrate the efficacy of the proposed mechanism. The model rapidly learns and successfully recognizes rotated and scaled versions of a visual representation from a few prior presentations. For topographical maps such as primary visual cortex, the mechanism simultaneously represents identity and variation of visual percepts whose features change through time.

11.
A variety of similarities between visual and haptic object recognition suggests that the two modalities may share common representations. However, it is unclear whether such common representations preserve low-level perceptual features or whether transfer between vision and haptics is mediated by high-level, abstract representations. Two experiments used a sequential shape-matching task to examine the effects of size changes on unimodal and crossmodal visual and haptic object recognition. Participants felt or saw 3D plastic models of familiar objects. The two objects presented on a trial were either the same size or different sizes and were the same shape or different but similar shapes. Participants were told to ignore size changes and to match on shape alone. In Experiment 1, size changes on same-shape trials impaired performance similarly for both visual-to-visual and haptic-to-haptic shape matching. In Experiment 2, size changes impaired performance on both visual-to-haptic and haptic-to-visual shape matching and there was no interaction between the cost of size changes and direction of transfer. Together the unimodal and crossmodal matching results suggest that the same, size-specific perceptual representations underlie both visual and haptic object recognition, and indicate that crossmodal memory for objects must be at least partly based on common perceptual representations.

12.
After a cerebral infarction, some patients acutely demonstrate contralateral hemiplegia or aphasia; these are the obvious symptoms of a cerebral infarction. However, less visible but burdensome consequences may go unnoticed without closer investigation. The importance of a thorough clinical examination is exemplified by a single case study of a 72-year-old, right-handed male. Two years earlier he had suffered an ischemic stroke in the territory of the left posterior cerebral artery, with right homonymous hemianopia and global alexia (i.e., impairment in letter recognition and profound impairment of reading) without agraphia. Naming was impaired on visual presentation (20%-39% correct), but improved significantly after tactile presentation (87% correct) or verbal definition (89% correct). Pre-semantic visual processing was normal (correct matching of different views of the same object), as was his access to structural knowledge from vision (he reliably distinguished real objects from non-objects). On a colour decision task he reliably indicated which of two items was coloured correctly. Though he was unable to mime how visually presented objects were used, he was more reliable at matching pictures of objects with pictures of a mime artist gesturing the use of the object. He obtained normal scores on word definition (WAIS-III), synonym judgment and word-picture matching tasks with perceptual and semantic distractors. He failed, however, when he had to match physically dissimilar specimens of the same object or when he had to decide which two of five objects were related associatively (Pyramids and Palm Trees Test). The patient thus showed a striking contrast between his intact ability to access knowledge of object shape or colour from vision and his impaired functional and associative knowledge. As a result, he could not access a complete semantic representation, required for activating phonological representations to name visually presented objects. The pattern of impairments and preserved abilities is considered a specific difficulty in accessing a full semantic representation from an intact structural representation of visually presented objects, i.e., a form of visual object agnosia.

13.
Hung CC, Carlson ET, Connor CE. Neuron. 2012;74(6):1099-1113.
The basic, still unanswered question about visual object representation is this: what specific information is encoded by neural signals? Theorists have long predicted that neurons would encode medial axis or skeletal object shape, yet recent studies reveal instead neural coding of boundary or surface shape. Here, we addressed this theoretical/experimental disconnect, using adaptive shape sampling to demonstrate explicit coding of medial axis shape in high-level object cortex (macaque monkey inferotemporal cortex or IT). Our metric shape analyses revealed a coding continuum, along which most neurons represent a configuration of both medial axis and surface components. Thus, IT response functions embody a rich basis set for simultaneously representing skeletal and external shape of complex objects. This would be especially useful for representing biological shapes, which are often characterized by both complex, articulated skeletal structure and specific surface features.

14.
We investigated the geometric representations underlying the perception of 2-D contour curvature. 88 arcs were generated, representing the lower and upper halves of concentric circles, or halves of ellipses derived mathematically from the circles through planar projection by affinity (a special case of Newton's transform), producing curved line segments with negative and positive curvature and varying sagitta (sag) and/or aspect ratio. Aspect ratio is defined here as the ratio between the sagitta and the chord length of a given arc. The geometric properties of the arcs suggest a regrouping into four structural models. The 88 stimuli were presented in random order to 16 observers, eight of whom were experienced in the mathematical and visual analysis of 2-D curvature ('expert observers') and eight of whom were not ('non-expert observers'). Observers had to give a number on a psychophysical scale from 0 to 10 reflecting the magnitude of curvature they perceived in a given arc. The results show that the subjective magnitude of curvature increases exponentially with the aspect ratio and linearly with the sagitta of the arcs, for both experts and non-experts. Statistical analysis of the correlation coefficients of linear fits to individual data represented on a logarithmic scale reveals significantly higher correlation coefficients for aspect ratio than for sagitta. The difference is not significant when only the curves with the longest chords (7°-10°) are considered. The geometric model that produces the best psychometric functions is described by a combination of arcs of vertically and horizontally oriented ellipses, indicating that perceptual sensations of 2-D contour curvature are based on geometric representations that suggest properties of 3-D structures. A 'buckled bar' model is shown to optimally account for the perceptual data of all observers with the exception of one expert. His perceptual data can be linked to a more analytical, less 'naturalistic' representation originating from a specific perceptual experience, which is discussed. It is concluded that the structural properties of 'real' objects are likely to determine even the most basic geometric representations underlying the perception of curvature in 2-D images. A specific perceptual learning experience may engender changes in such representations.
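The aspect ratio defined in the abstract (sagitta over chord length) follows directly from circle geometry: for a circular arc, the sagitta is the height of the arc above the midpoint of its chord. A small helper, for illustration only:

```python
import math

def aspect_ratio(radius, chord):
    """Aspect ratio of a circular arc: sagitta / chord-length, where
    sagitta = r - sqrt(r^2 - (chord/2)^2) is the height of the arc
    above the midpoint of its chord."""
    sagitta = radius - math.sqrt(radius**2 - (chord / 2.0) ** 2)
    return sagitta / chord

# a semicircular arc (chord equal to the diameter) has aspect ratio 0.5;
# flatter arcs (larger radius, same chord) have smaller aspect ratios
```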

15.
The process of perception requires not only the brain's receipt of sensory data but also the meaningful organization of that data in relation to the perceptual experience held in memory. Although it typically results in a conscious percept, the process of perception is not fully conscious. Research on the neural substrates of human visual perception has suggested that regions of limbic cortex, including the medial orbital frontal cortex (mOFC), may contribute to intuitive judgments about perceptual events, such as guessing whether an object might be present in a briefly presented fragmented drawing. Examining dense-array measures of cortical electrical activity during a modified Waterloo Gestalt Closure Task, results show, as expected, that activity in medial orbital frontal electrical responses (~250 ms) was associated with intuitive judgments. Activity in the right temporal-parietal-occipital (TPO) region was found to predict mOFC activity (~150 ms) and, in turn, was subsequently influenced by the mOFC at a later time (~300 ms). The initial perception of gist or meaning of a visual stimulus in limbic networks may thus yield reentrant input to the visual areas to influence continued development of the percept. Before perception is completed, the initial representation of gist may support intuitive judgments about the ongoing perceptual process.

16.
One of the paradoxes of vision is that the world as it appears to us and the image on the retina at any moment are not much like each other. The visual world seems to be extensive and continuous across time. However, the manner in which we sample the visual environment is neither extensive nor continuous. How does the brain reconcile these differences? Here, we consider existing evidence from both static and dynamic viewing paradigms together with the logical requirements of any representational scheme that would be able to support active behaviour. While static scene viewing paradigms favour extensive, but perhaps abstracted, memory representations, dynamic settings suggest sparser and task-selective representation. We suggest that in dynamic settings where movement within extended environments is required to complete a task, the combination of visual input, egocentric and allocentric representations work together to allow efficient behaviour. The egocentric model serves as a coding scheme in which actions can be planned, but also offers a potential means of providing the perceptual stability that we experience.

17.
Xu J, Yang Z, Tsien JZ. PLoS ONE. 2010;5(12):e15796.
Visual saliency is the perceptual quality that makes some items in visual scenes stand out from their immediate contexts. Visual saliency plays important roles in natural vision in that saliency can direct eye movements, deploy attention, and facilitate tasks like object detection and scene understanding. A central unsolved issue is: What features should be encoded in the early visual cortex for detecting salient features in natural scenes? To explore this important issue, we propose a hypothesis that visual saliency is based on efficient encoding of the probability distributions (PDs) of visual variables in specific contexts in natural scenes, referred to as context-mediated PDs in natural scenes. In this concept, computational units in the model of the early visual system do not act as feature detectors but rather as estimators of the context-mediated PDs of a full range of visual variables in natural scenes, which directly give rise to a measure of visual saliency of any input stimulus. To test this hypothesis, we developed a model of the context-mediated PDs in natural scenes using a modified algorithm for independent component analysis (ICA) and derived a measure of visual saliency based on these PDs estimated from a set of natural scenes. We demonstrated that visual saliency based on the context-mediated PDs in natural scenes effectively predicts human gaze in free-viewing of both static and dynamic natural scenes. This study suggests that the computation based on the context-mediated PDs of visual variables in natural scenes may underlie the neural mechanism in the early visual cortex for detecting salient features in natural scenes.
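The core idea, that saliency falls out of a probability distribution over visual variables, can be illustrated with a toy self-information measure: score each feature sample by how improbable it is under a density estimated from the scene itself. This is a crude kernel-density stand-in for the paper's ICA-based model, for illustration only:

```python
import numpy as np

def self_information_saliency(features, bandwidth=0.5):
    """Score each feature sample by -log p(feature), where p is a crude
    Gaussian kernel-density estimate over the scene's own samples:
    rare (low-probability) features are the salient ones."""
    x = np.asarray(features, dtype=float)
    diffs = x[:, None] - x[None, :]
    density = np.exp(-0.5 * (diffs / bandwidth) ** 2).mean(axis=1)
    return -np.log(density)

vals = [0.0, 0.1, -0.1, 0.05, 5.0]          # one outlier feature value
scores = self_information_saliency(vals)
# the outlier (index 4) gets the highest saliency score
```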

18.
We are surrounded by surfaces that we perceive by visual means. Understanding the basic principles behind this perceptual process is a central theme in visual psychology, psychophysics, and computational vision. In many of the computational models employed in the past, it has been assumed that a metric representation of physical space can be derived by visual means. Psychophysical experiments, as well as computational considerations, can convince us that the perception of space and shape has a much more complicated nature, and that only a distorted version of actual, physical space can be computed. This paper develops a computational geometric model that explains why such distortion might take place. The basic idea is that, both in stereo and motion, we perceive the world from multiple views. Given the rigid transformation between the views and the properties of the image correspondence, the depth of the scene can be obtained. Even a slight error in the rigid transformation parameters causes distortion of the computed depth of the scene. The unified framework introduced here describes this distortion in computational terms. We characterize the space of distortions by its level sets, that is, we characterize the systematic distortion via a family of iso-distortion surfaces which describes the locus over which depths are distorted by some multiplicative factor. Given that humans' estimation of egomotion or estimation of the extrinsic parameters of the stereo apparatus is likely to be imprecise, the framework is used to explain a number of psychophysical experiments on the perception of depth from motion or stereo. Received: 9 January 1997 / Accepted in revised form: 8 July 1997
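The multiplicative nature of the distortion is easiest to see in the simplest stereo case: with triangulated depth Z = f·b/d, an erroneous assumed baseline rescales every computed depth by the ratio of assumed to true baseline. A minimal sketch (plain triangulation with made-up numbers, not the paper's full iso-distortion framework):

```python
def triangulated_depth(disparity, focal, baseline):
    """Standard stereo triangulation: Z = f * b / d."""
    return focal * baseline / disparity

# true scene: depth 2.0, focal length 1.0, true baseline 0.1
true_depth, focal, baseline = 2.0, 1.0, 0.1
disparity = focal * baseline / true_depth

# a 10% overestimate of the baseline inflates every computed depth by
# the same multiplicative factor (1.1), a trivial iso-distortion family
assumed_baseline = 0.11
distorted = triangulated_depth(disparity, focal, assumed_baseline)
# distorted == 2.2, i.e. true_depth scaled by assumed/true = 1.1
```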

19.
20.
Hierarchical generative models, such as Bayesian networks, and belief propagation have been shown to provide a theoretical framework that can account for perceptual processes, including feedforward recognition and feedback modulation. The framework explains both psychophysical and physiological experimental data and maps well onto the hierarchical distributed cortical anatomy. However, the complexity required to model cortical processes makes inference, even using approximate methods, very computationally expensive. Thus, existing object perception models based on this approach are typically limited to tree-structured networks with no loops, use small toy examples or fail to account for certain perceptual aspects such as invariance to transformations or feedback reconstruction. In this study we develop a Bayesian network with an architecture similar to that of HMAX, a biologically-inspired hierarchical model of object recognition, and use loopy belief propagation to approximate the model operations (selectivity and invariance). Crucially, the resulting Bayesian network extends the functionality of HMAX by including top-down recursive feedback. Thus, the proposed model not only achieves successful feedforward recognition invariant to noise, occlusions, and changes in position and size, but is also able to reproduce modulatory effects such as illusory contour completion and attention. Our novel and rigorous methodology covers key aspects such as learning using a layerwise greedy algorithm, combining feedback information from multiple parents and reducing the number of operations required. Overall, this work extends an established model of object recognition to include high-level feedback modulation, based on state-of-the-art probabilistic approaches. The methodology employed, consistent with evidence from the visual cortex, can be potentially generalized to build models of hierarchical perceptual organization that include top-down and bottom-up interactions, for example, in other sensory modalities.

Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号