Similar Literature (20 results)
1.
How are invariant representations of objects formed in the visual cortex? We describe a neurophysiological and computational approach that focuses on a feature hierarchy model in which invariant representations can be built by self-organizing learning based on the statistics of the visual input. The model can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or spatial continuity in Continuous Transformation learning. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size and, as we show in this paper, lighting. Extensions of the model account for invariant representations in the dorsal visual system of the global motion produced by objects, such as looming, rotation, and object-based movement; incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks; and explain how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene.
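The temporal-continuity mechanism mentioned above can be stated compactly. Below is a minimal sketch, in the style of Földiák's trace rule, of a learning step in which the postsynaptic term is an exponentially decaying trace of past activity, so that weights come to associate temporally adjacent transforms of the same object. Parameter names and values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def trace_learning_step(w, x, y, y_trace_prev, eta=0.05, lam=0.8):
    """One step of a trace learning rule (a sketch, not the paper's code).

    w            : (n_out, n_in) feed-forward weights
    x            : (n_in,) presynaptic rates for the current transform
    y            : (n_out,) current postsynaptic rates
    y_trace_prev : (n_out,) memory trace from the previous time step
    eta, lam     : learning rate and trace decay (assumed values)
    """
    # Exponentially decaying trace of postsynaptic activity.
    y_trace = (1.0 - lam) * y + lam * y_trace_prev
    # Hebbian update driven by the trace rather than the instantaneous rate,
    # so successive views of one object strengthen the same weights.
    w = w + eta * np.outer(y_trace, x)
    # Keep each output neuron's weight vector bounded (L2 normalisation).
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    return w / np.maximum(norms, 1e-12), y_trace
```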

2.
The internal representation of solid shape with respect to vision
It is argued that the internal model of any object must take the form of a function, such that for any intended action the resulting reafference is predictable. This function can be derived explicitly for the case of visual perception of rigid bodies by ambulant observers. The function depends on physical causation, not physiology; consequently, one can make a priori statements about possible internal models. A posteriori it seems likely that the orientation sensitive units described by Hubel and Wiesel constitute a physiological substrate subserving the extraction of the invariants of this function. The function is used to define a measure for the visual complexity of solid shape. Relations with Gestalt theories of perception are discussed.

3.
Several theories propose that the cortex implements an internal model to explain, predict, and learn about sensory data, but the nature of this model is unclear. One condition that could be highly informative here is Charles Bonnet syndrome (CBS), where loss of vision leads to complex, vivid visual hallucinations of objects, people, and whole scenes. CBS could be taken as an indication that there is a generative model in the brain, specifically one that can synthesise rich, consistent visual representations even in the absence of actual visual input. The processes that lead to CBS are poorly understood. Here, we argue that a model recently introduced in machine learning, the deep Boltzmann machine (DBM), could capture the relevant aspects of (hypothetical) generative processing in the cortex. The DBM carries both the semantics of a probabilistic generative model and of a neural network. The latter allows us to model a concrete neural mechanism that could underlie CBS, namely, homeostatic regulation of neuronal activity. We show that homeostatic plasticity could serve to make the learnt internal model robust against, for example, degradation of sensory input, but overcompensate in the case of CBS, leading to hallucinations. We demonstrate how a wide range of features of CBS can be explained in the model and suggest a potential role for the neuromodulator acetylcholine. This work constitutes the first concrete computational model of CBS and the first application of the DBM as a model in computational neuroscience. Our results lend further credence to the hypothesis of a generative model in the brain.
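The homeostatic mechanism invoked here can be illustrated independently of the full DBM. The sketch below (all names and constants are assumptions, not the paper's model) adjusts each unit's bias so that its average activation tracks a target rate; when input is removed, the compensating biases push units toward spontaneous activity, the analogue of hallucination in the argument above.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

n_units, target_rate, eps = 50, 0.1, 0.05   # assumed values
b = np.zeros(n_units)                        # unit biases
drive = rng.normal(0.0, 1.0, n_units)        # stand-in for bottom-up input

for step in range(4000):
    if step == 2000:
        drive = np.zeros(n_units)            # "loss of vision": input removed
    p = sigmoid(drive + b)                   # activation probability
    # Homeostatic plasticity: raise the bias of under-active units and
    # lower it for over-active ones, pulling each toward the target rate.
    b += eps * (target_rate - p)

# After input loss the biases have compensated, so units remain active near
# the target rate with no input at all: spontaneous, internally generated
# activity, the toy analogue of hallucination.
print("mean activation without input:", round(float(sigmoid(b).mean()), 3))
```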

4.
Over successive stages, the ventral visual system of the primate brain develops neurons that respond selectively to particular objects or faces with translation, size and view invariance. The powerful neural representations found in inferotemporal cortex form a remarkably rapid and robust basis for object recognition, which belies the difficulties faced by the system when learning in natural visual environments. A central issue in understanding biological object recognition is how these neurons learn to form separate representations of objects from complex visual scenes composed of multiple objects. We show how a one-layer competitive network of 'spiking' neurons is able to learn separate transformation-invariant representations (exemplified by one-dimensional translations) of visual objects that are always seen together, moving in lock-step but separated in space. This is achieved by combining 'Mexican hat' functional lateral connectivity with cell firing-rate adaptation to temporally segment input representations of competing stimuli through anti-phase oscillations (perceptual cycles). These spiking dynamics are quickly and reliably generated, enabling selective modification of the feed-forward connections to neurons in the next layer through spike-timing-dependent plasticity (STDP), resulting in separate translation-invariant representations of each stimulus. Variations in key properties of the model are investigated with respect to the network's ability to develop appropriate input representations and subsequently output representations through STDP. Contrary to earlier rate-coded models of this learning process, this work shows how spiking neural networks may learn about more than one stimulus together without suffering from the 'superposition catastrophe'. We take these results to suggest that spiking dynamics are key to understanding biological visual object recognition.
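A pair-based STDP update of the kind the model relies on can be written in a few lines. The following sketch uses conventional exponential windows with assumed time constants, not the authors' parameterisation: a presynaptic spike followed closely by a postsynaptic spike potentiates the synapse, while the reverse order depresses it.

```python
import numpy as np

def stdp_dw(t_pre, t_post, a_plus=0.01, a_minus=0.012,
            tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP weight change for one spike pair (times in ms).

    Constants are illustrative, not taken from the paper.
    """
    dt = t_post - t_pre
    if dt > 0:   # pre before post: potentiation
        return a_plus * np.exp(-dt / tau_plus)
    else:        # post before pre (or coincident): depression
        return -a_minus * np.exp(dt / tau_minus)

# During anti-phase oscillations each stimulus drives a distinct set of
# pre/post pairings, so STDP carves out separate output representations.
print(stdp_dw(10.0, 15.0))   # positive: causal pairing
print(stdp_dw(15.0, 10.0))   # negative: anti-causal pairing
```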

5.
People learn modality-independent, conceptual representations from modality-specific sensory signals. Here, we hypothesize that any system that accomplishes this feat will include three components: a representational language for characterizing modality-independent representations, a set of sensory-specific forward models for mapping from modality-independent representations to sensory signals, and an inference algorithm for inverting forward models—that is, an algorithm for using sensory signals to infer modality-independent representations. To evaluate this hypothesis, we instantiate it in the form of a computational model that learns object shape representations from visual and/or haptic signals. The model uses a probabilistic grammar to characterize modality-independent representations of object shape, uses a computer graphics toolkit and a human hand simulator to map from object representations to visual and haptic features, respectively, and uses a Bayesian inference algorithm to infer modality-independent object representations from visual and/or haptic signals. Simulation results show that the model infers identical object representations when an object is viewed, grasped, or both. That is, the model’s percepts are modality invariant. We also report the results of an experiment in which different subjects rated the similarity of pairs of objects in different sensory conditions, and show that the model provides a very accurate account of subjects’ ratings. Conceptually, this research significantly contributes to our understanding of modality invariance, an important type of perceptual constancy, by demonstrating how modality-independent representations can be acquired and used. Methodologically, it provides an important contribution to cognitive modeling, particularly an emerging probabilistic language-of-thought approach, by showing how symbolic and statistical approaches can be combined in order to understand aspects of human perception.
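The inference component can be illustrated with a toy discrete version: a posterior over candidate shape hypotheses is computed from visual evidence, haptic evidence, or both, and because both likelihoods factor through the same modality-independent representation, the winning hypothesis coincides across conditions. The hypothesis space and all numbers below are illustrative stand-ins, not the paper's grammar or forward models.

```python
import numpy as np

shapes = ["cube", "cylinder", "sphere"]   # hypothetical hypothesis space
prior = np.array([1 / 3, 1 / 3, 1 / 3])

# Likelihoods of the observed visual / haptic features under each shape.
# The paper renders graphics and simulates a hand; fixed numbers stand in.
lik_vision = np.array([0.70, 0.20, 0.10])
lik_haptic = np.array([0.60, 0.30, 0.10])

def posterior(*likelihoods):
    p = prior.copy()
    for lik in likelihoods:               # modalities assumed cond. independent
        p = p * lik
    return p / p.sum()

for name, post in [("vision", posterior(lik_vision)),
                   ("haptics", posterior(lik_haptic)),
                   ("both", posterior(lik_vision, lik_haptic))]:
    print(name, dict(zip(shapes, post.round(3))))
# The same shape wins in all three conditions: a toy form of modality invariance.
```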

6.
A key challenge underlying theories of vision is how spatially restricted, retinotopically represented feature analysis can be integrated to form abstract, coordinate-free object models. A resolution likely depends on the use of intermediate-level representations which can, on the one hand, be populated by local features and, on the other, be used as atomic units underlying the formation of, and interaction with, object hypotheses. The precise structure of this intermediate representation derives from the varied requirements of a range of visual tasks, which motivate a significant role for incorporating a geometry of visual form. The need to integrate input from features capturing surface properties such as texture, shading, motion, color, etc., as well as from features capturing surface discontinuities such as silhouettes, T-junctions, etc., implies a geometry which captures both regional and boundary aspects. Curves, as a geometric model of boundaries, have been extensively used as an intermediate representation in computational, perceptual, and physiological studies, while the medial axis (MA) has been popular mainly in computer vision as a geometric region-based model of the interior of closed boundaries. We extend the traditional model of the MA to represent images, where each MA segment represents a region of the image which we call a visual fragment. We present a unified theory of perceptual grouping and object recognition where, through various sequences of transformations of the MA representation, visual fragments are grouped in various configurations to form object hypotheses and are related to stored models. The mechanism underlying both the computation and the transformation of the MA is a lateral wave propagation model. Recent psychophysical experiments showing that contrast sensitivity maps peak at the medial axes of stimuli, together with experiments on perceptual filling-in and on brightness induction and modulation, are consistent with both the use of an MA representation and a propagation-based scheme. Also, recent neurophysiological recordings in V1 correlate with the MA hypothesis and a horizontal propagation scheme. This evidence supports a geometric computational paradigm for processing sensory data where both dynamic in-plane propagation and feedforward-feedback connections play an integral role.
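For readers who want to experiment with the region-based representation, the medial axis of a binary shape can be computed directly with standard tooling. The snippet below uses scikit-image's medial_axis, a distance-transform skeleton rather than the authors' wave-propagation scheme, on a toy shape; the "visual fragment" reading in the comments is our gloss.

```python
import numpy as np
from skimage.morphology import medial_axis

# Toy binary shape: a filled rectangle with a notch cut out.
img = np.zeros((60, 90), dtype=bool)
img[10:50, 10:80] = True
img[25:35, 40:50] = False

# Skeleton of the region plus the distance-to-boundary at each point;
# together these give an explicit MA description of the shape's interior.
skeleton, dist = medial_axis(img, return_distance=True)

# Each skeleton pixel paired with its radius defines a maximal inscribed
# disc; unions of such discs (fragments) reconstruct parts of the region.
radii = dist * skeleton
print("MA points:", int(skeleton.sum()), "max inscribed radius:", radii.max())
```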

7.
Visual perception is burdened with a highly discontinuous input stream arising from saccadic eye movements. For successful integration into a coherent representation, the visuomotor system needs to deal with these self-induced perceptual changes and distinguish them from external motion. Forward models are one way to solve this problem, whereby the brain uses internal monitoring signals associated with oculomotor commands to predict the visual consequences of corresponding eye movements during active exploration. Visual scenes typically contain a rich structure of spatial relational information, providing additional cues that may help disambiguate self-induced from external changes of perceptual input. We reasoned that a weighted integration of these two inherently noisy sources of information should lead to better perceptual estimates. Volunteer subjects performed a simple perceptual decision on the apparent displacement of a visual target that jumped unpredictably in sync with a saccadic eye movement. In a critical test condition, the target was presented together with a flanker object, so that perceptual decisions could take into account the spatial distance between target and flanker object. Here, precision was better compared to control conditions in which target displacements could only be estimated from either extraretinal or visual relational information alone. Our findings suggest that under natural conditions, integration of visual space across eye movements is based upon close to optimal integration of both retinal and extraretinal pieces of information.
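The "close to optimal integration" referred to here is conventionally modelled as reliability-weighted averaging. Under the standard Gaussian assumption, the maximum-likelihood combination weights each cue by its inverse variance, and the combined variance is never worse than that of either cue alone; a minimal sketch with illustrative numbers:

```python
def integrate_cues(x1, var1, x2, var2):
    """Reliability-weighted (maximum-likelihood) cue combination.

    x1, var1 : e.g. extraretinal displacement estimate and its variance
    x2, var2 : e.g. visual-relational estimate and its variance
    """
    w1 = (1.0 / var1) / (1.0 / var1 + 1.0 / var2)
    x_hat = w1 * x1 + (1.0 - w1) * x2
    var_hat = 1.0 / (1.0 / var1 + 1.0 / var2)   # smaller than either variance
    return x_hat, var_hat

# Illustrative values only: a noisy extraretinal estimate (variance 4)
# combined with a sharper relational estimate (variance 1).
print(integrate_cues(0.8, 4.0, 0.2, 1.0))   # -> (0.32, 0.8)
```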

8.
Spatiotemporal coding properties of complex cells in the visual cortex
To address how input is encoded and represented in the visual cortex, a spatiotemporal coding model of complex cells was constructed on the basis of spatiotemporal filtering windows. Coding simulations were carried out for several special classes of input function. The results show that the fine temporal structure of the spatiotemporal integrative coding sequences of complex cells in the visual cortex can serve as a neural representation of the visual input.

9.
Deciding what constitutes an object, and what background, is an essential task for the visual system. This presents a conundrum: averaging over the visual scene is required to obtain a precise signal for object segregation, but segregation is required to define the region over which averaging should take place. Depth, obtained via binocular disparity (the differences between two eyes’ views), could help with segregation by enabling identification of object and background via differences in depth. Here, we explore depth perception in disparity-defined objects. We show that a simple object segregation rule, followed by averaging over that segregated area, can account for depth estimation errors. To do this, we compared objects with smoothly varying depth edges to those with sharp depth edges, and found that perceived peak depth was reduced for the former. A computational model used a rule based on object shape to segregate and average over a central portion of the object, and was able to emulate the reduction in perceived depth. We also demonstrated that the segregated area is not predefined but is dependent on the object shape. We discuss how this segregation strategy could be employed by animals seeking to deter binocular predators. This article is part of the themed issue ‘Vision in our three-dimensional world’.
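The segregate-then-average idea can be sketched on a one-dimensional disparity profile: segregate the object crudely, then average disparity over a central portion of it. The profiles, the threshold, and the "central portion" fraction below are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def perceived_peak_depth(disparity, centre_frac=0.5, thresh=0.05):
    """Toy segregate-then-average model of perceived peak depth.

    disparity   : 1-D disparity profile across the display
    centre_frac : fraction of the segregated object averaged over (assumed)
    thresh      : disparity level treated as background (assumed)
    """
    obj = np.flatnonzero(np.abs(disparity) > thresh)    # crude segregation
    lo, hi = obj[0], obj[-1]
    pad = int((hi - lo + 1) * (1.0 - centre_frac) / 2.0)
    return disparity[lo + pad : hi + 1 - pad].mean()    # averaging step

x = np.linspace(-1.0, 1.0, 201)
sharp = np.where(np.abs(x) < 0.5, 1.0, 0.0)             # sharp depth edges
smooth = np.clip(3.0 * (0.5 - np.abs(x)), 0.0, 1.0)     # ramped depth edges

# Averaging over the ramped profile dilutes the peak: a toy version of the
# finding that smooth-edged objects appear shallower than sharp-edged ones.
print(perceived_peak_depth(sharp), perceived_peak_depth(smooth))
```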

10.
Visual object recognition and sensitivity to image features are largely influenced by contextual inputs. We study influences by contextual bars on the bias to perceive or infer the presence of a target bar, rather than on the sensitivity to image features. Human observers judged from a briefly presented stimulus whether a target bar of a known orientation and shape is present at the center of a display, given a weak or missing input contrast at the target location with or without a context of other bars. Observers are more likely to perceive a target when the context has a weaker rather than stronger contrast. When the context can perceptually group well with the would-be target, weak contrast contextual bars bias the observers to perceive a target relative to the condition without contexts, as if to fill in the target. Meanwhile, high-contrast contextual bars, regardless of whether they group well with the target, bias the observers to perceive no target. A Bayesian model of visual inference is shown to account for the data well, illustrating that the context influences the perception in two ways: (1) biasing observers' prior belief that a target should be present according to visual grouping principles, and (2) biasing observers' internal model of the likely input contrasts caused by a target bar. According to this model, our data suggest that the context does not influence the perceived target contrast despite its influence on the bias to perceive the target's presence, thereby suggesting that cortical areas beyond the primary visual cortex are responsible for the visual inferences.
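The two contextual pathways identified here, a prior over target presence set by grouping and a likelihood over the input contrasts a target would produce, fit into a one-line Bayes computation. The sketch below uses made-up Gaussian likelihoods and prior values purely for illustration; it is not the paper's fitted model.

```python
import numpy as np

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def p_target_present(observed_contrast, prior_present,
                     mu_present=0.3, mu_absent=0.0, sigma=0.15):
    """Posterior probability that a target bar is present (numbers assumed).

    prior_present : raised when the context groups well with the would-be
                    target, lowered otherwise (the model's first pathway).
    mu_present    : expected input contrast given a target (second pathway).
    """
    lik_p = gauss(observed_contrast, mu_present, sigma)
    lik_a = gauss(observed_contrast, mu_absent, sigma)
    post = lik_p * prior_present
    return post / (post + lik_a * (1.0 - prior_present))

# Same weak observed contrast, different grouping context: only the prior
# differs, yet the presence judgment flips.
print(p_target_present(0.1, prior_present=0.7))   # grouping context: fill-in
print(p_target_present(0.1, prior_present=0.3))   # non-grouping context
```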

11.
12.
The internal noise present in a linear system can be quantified by the equivalent noise method: by measuring the effect that applying external noise to the system’s input has on its output, one can estimate the variance of this internal noise. By applying this simple “linear amplifier” model to the human visual system, one can entirely explain an observer’s detection performance by a combination of the internal noise variance and their efficiency relative to an ideal observer. Studies using this method rely on two crucial factors: first, that the external noise in their stimuli behaves like the visual system’s internal noise in the dimension of interest, and second, that the assumptions underlying their model (e.g. linearity) are correct. Here we explore the effects of these two factors while applying the equivalent noise method to investigate the contrast sensitivity function (CSF). We compare the results at 0.5 and 6 c/deg from the equivalent noise method against those we would expect based on pedestal masking data collected from the same observers. We find that the loss of sensitivity with increasing spatial frequency results from changes in the saturation constant of the gain control nonlinearity, and that this only masquerades as a change in internal noise under the equivalent noise method. Part of the effect we find can be attributed to the optical transfer function of the eye. The remainder can be explained by either changes in effective input gain, divisive suppression, or a combination of the two. Given these effects, the efficiency of our observers approaches the ideal level. We show the importance of considering these factors in equivalent noise studies.
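For concreteness, the linear amplifier model whose assumptions this paper probes predicts that squared threshold is a straight line in external noise variance, so internal noise and efficiency fall out of an ordinary least-squares fit. A sketch of that baseline method, with illustrative symbols and synthetic numbers rather than the authors' data or fitting code:

```python
import numpy as np

def fit_linear_amplifier(sigma_ext_sq, c_thresh):
    """Fit the linear amplifier model  c^2 = (s_int^2 + s_ext^2) / eta.

    Squared threshold is linear in external noise variance, so the line's
    slope gives 1/efficiency and intercept/slope gives the internal noise
    variance. A sketch of the standard method, not the paper's analysis.
    """
    slope, intercept = np.polyfit(sigma_ext_sq, np.asarray(c_thresh) ** 2, 1)
    eta = 1.0 / slope
    sigma_int_sq = intercept / slope
    return eta, sigma_int_sq

# Synthetic thresholds generated from eta = 0.5, s_int^2 = 0.02 (assumed).
ext = np.array([0.0, 0.01, 0.02, 0.05, 0.1])
c = np.sqrt((0.02 + ext) / 0.5)
print(fit_linear_amplifier(ext, c))   # recovers (0.5, 0.02)
```

The paper's point is that a gain-control nonlinearity violates the linearity assumption baked into this fit, so an apparent change in sigma_int_sq across spatial frequency need not reflect internal noise at all.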

13.
Active exploration of large-scale environments leads to better learning of spatial layout than does passive observation [1] [2] [3]. But active exploration might also help us to remember the appearance of individual objects in a scene. In fact, when we encounter new objects, we often manipulate them so that they can be seen from a variety of perspectives. We present here the first evidence that active control of the visual input in this way facilitates later recognition of objects. Observers who actively rotated novel, three-dimensional objects on a computer screen later showed more efficient visual recognition than observers who passively viewed the exact same sequence of images of these virtual objects. During active exploration, the observers focused mainly on the 'side' or 'front' views of the objects (see also [4] [5] [6]). The results demonstrate that how an object is represented for later recognition is influenced by whether or not one controls the presentation of visual input during learning.

14.
Imaging techniques are a cornerstone of contemporary biology. Over the last decades, advances in microscale imaging techniques have allowed fascinating new insights into cell and tissue morphology and internal anatomy of organisms across kingdoms. However, most studies so far provided snapshots of given reference taxa, describing organs and tissues under “idealized” conditions. Surprisingly, there is an almost complete lack of studies investigating how an organism's internal morphology changes in response to environmental drivers. Consequently, ecology as a scientific discipline has so far almost neglected the possibilities arising from modern microscale imaging techniques. Here, we provide an overview of recent developments of X-ray computed tomography as an affordable, simple method of high spatial resolution, allowing insights into three-dimensional anatomy both in vivo and ex vivo. We review ecological studies using this technique to investigate the three-dimensional internal structure of organisms. In addition, we provide practical comparisons between different preparation techniques for maximum contrast and tissue differentiation. In particular, we consider the novel modality of phase contrast by self-interference of the X-ray wave behind an object (i.e., phase contrast by free space propagation). Using the cricket Acheta domesticus (L.) as model organism, we found that the combination of FAE fixative and iodine staining provided the best results across different tissues. The drying technique also affected contrast and prevented artifacts in specific cases. Overall, we found that for the interests of ecological studies, X-ray computed tomography is useful when the tissue or structure of interest has sufficient contrast that allows for an automatic or semiautomatic segmentation. In particular, we show that reconstruction schemes which exploit phase contrast can yield enhanced image quality. Combined with suitable specimen preparation and automated analysis, X-ray CT can therefore become a promising quantitative 3D imaging technique to study organisms' responses to environmental drivers, in both ecology and evolution.

15.
Sensory information from different modalities is processed in parallel, and then integrated in associative brain areas to improve object identification and the interpretation of sensory experiences. The Superior Colliculus (SC) is a midbrain structure that plays a critical role in integrating visual, auditory, and somatosensory input to assess saliency and promote action. Although the response properties of the individual SC neurons to visuoauditory stimuli have been characterized, little is known about the spatial and temporal dynamics of the integration at the population level. Here we recorded the response properties of SC neurons to spatially restricted visual and auditory stimuli using large-scale electrophysiology. We then created a general, population-level model that explains the spatial, temporal, and intensity requirements of stimuli needed for sensory integration. We found that the mouse SC contains topographically organized visual and auditory neurons that exhibit nonlinear multisensory integration. We show that nonlinear integration depends on properties of auditory but not visual stimuli. We also find that a heuristically derived nonlinear modulation function reveals conditions required for sensory integration that are consistent with previously proposed models of sensory integration such as spatial matching and the principle of inverse effectiveness.
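The nonlinearity at issue is conventionally quantified with a multisensory enhancement index that compares the combined response to the strongest unisensory response; inverse effectiveness is the observation that this index grows as the unisensory responses weaken. A sketch of the standard index with made-up firing rates, not the paper's heuristic modulation function:

```python
def enhancement_index(r_visual, r_auditory, r_combined):
    """Multisensory enhancement (%) relative to the best unisensory response.

    Values above 0 indicate integration beyond the best single modality;
    a combined response above r_visual + r_auditory would be superadditive.
    """
    best_unisensory = max(r_visual, r_auditory)
    return 100.0 * (r_combined - best_unisensory) / best_unisensory

# Illustrative rates only. Inverse effectiveness: the weak stimulus pair
# (second line) yields proportionally greater enhancement than the strong pair.
print(enhancement_index(40.0, 35.0, 50.0))   # strong stimuli -> +25%
print(enhancement_index(6.0, 5.0, 12.0))     # weak stimuli   -> +100%
```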

16.
Human beings have the capacity to recognize objects in natural visual scenes with high efficiency despite the complexity of such scenes, which usually contain multiple objects. One possible mechanism for dealing with this problem is selective attention. Psychophysical evidence strongly suggests that selective attention can enhance the spatial resolution in the input region corresponding to the focus of attention. In this work we adopt a computational neuroscience perspective to analyze the attentional enhancement of spatial resolution in the area containing the objects of interest. We extend and apply the computational model of Deco and Schürmann (2000), which consists of several modules with feedforward and feedback interconnections describing the mutual links between different areas of the visual cortex. Each module analyses the visual input at a different spatial resolution and can be thought of as a hierarchical predictor at that level of resolution. Moreover, each hierarchical predictor has a submodule consisting of a group of neurons performing a biologically based 2D Gabor wavelet transformation at a given resolution level. The attention control decides in which local regions the spatial resolution should be enhanced, in a serial fashion. In this sense, the scene is first analyzed at a coarse resolution level, and the focus of attention iteratively enhances the resolution at the location of an object until the object is identified. We propose and simulate new psychophysical experiments where the effect of the attentional enhancement of spatial resolution can be demonstrated by predicting different reaction time profiles in visual search experiments where the target and distractors are defined at different levels of resolution.
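The per-module feature extraction is a bank of 2D Gabor filters at a given resolution. A minimal sketch of one such filter in the conventional parameterisation; the specific values are illustrative, not those of Deco and Schürmann (2000):

```python
import numpy as np

def gabor_kernel(size=21, wavelength=6.0, theta=0.0, sigma=4.0, phase=0.0):
    """2-D Gabor filter: a sinusoidal carrier under a Gaussian envelope.

    wavelength sets the preferred spatial frequency (pixels per cycle),
    theta the orientation, sigma the envelope width; a module's resolution
    level corresponds to the carrier frequency its filter bank uses.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_theta = x * np.cos(theta) + y * np.sin(theta)    # rotated coordinate
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_theta / wavelength + phase)
    return envelope * carrier

# Convolving an image with kernels at several wavelengths and orientations
# yields the coarse-to-fine pyramid on which the model's modules operate.
k = gabor_kernel(theta=np.pi / 4)
print(k.shape, round(float(k.sum()), 3))
```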

17.
The unique organism project was designed as a culminating assessment for a biological classification unit in a middle school setting. Students developed a model to represent their unique organism. Using the model, students were required to demonstrate how their unique organism interacts with its environment, and how its internal and external structure and organization allowed it to carry out those interactions. The NGSS Crosscutting Concepts of structure and function, systems, and system models, along with the Science & Engineering Practice of constructing models, were integrated and emphasized throughout the unit.

18.
A visual model for object detection is proposed. In order to make its detection ability comparable with existing technical methods for object detection, an evolution equation for the neurons in the model is derived from the computational principle of active contours. The hierarchical structure of the model emerges naturally from the evolution equation. One drawback of active contours, their sensitivity to initial values, is alleviated by introducing and formulating convexity, which is a visual property. Numerical experiments show that the proposed model detects objects with complex topologies and that it is tolerant of noise. A visual attention model is introduced into the proposed model. Further simulations show that the visual properties of the model are consistent with the results of psychological experiments that disclose the relation between figure–ground reversal and visual attention. We also demonstrate that the model tends to perceive smaller regions as figures, a characteristic observed in human visual perception. This work was partially supported by Grants-in-Aid for Scientific Research (#14780254) from the Japan Society for the Promotion of Science.

19.
We present the elements of a mathematical computational model that reflects the experimental finding that the time-scale of a neuron is not fixed but varies with the history of its stimulus. Unlike most physiological models, there are no pre-determined rates associated with transitions between states of the system, nor pre-determined constants associated with adaptation rates; instead, the model is a kind of "modulating automaton" in which the rates emerge from the history of the system itself. We focus in this paper on the temporal dynamics of a neuron and show how a simple internal structure gives rise to complex temporal behavior. The internal structure modeled here is an abstraction of a reasonably well-understood physiological structure. We also suggest that this behavior can be used to transform a "rate" code into a "temporal" one.

20.