Similar Documents
20 similar documents retrieved (search time: 15 ms)
1.
Receptive fields of simple cells in the primate visual cortex were well fit in the space and time domains by the Gaussian Derivative (GD) model for spatio-temporal vision. All 23 fields in the data sample could be fit by one equation, varying only a single shape number and nine geometric transformation parameters. A difference-of-offset-Gaussians (DOOG) mechanism for the GD model also fit the data well. Other models tested did not fit the data as well or as succinctly, or failed to converge on a unique solution, indicating over-parameterization. An efficient computational algorithm was found for the GD model which produced robust estimates of the direction and speed of moving objects in real scenes.
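A minimal one-dimensional sketch of the relation between a Gaussian derivative receptive field and its difference-of-offset-Gaussians (DOOG) approximation; the width and offset values below are illustrative assumptions, not the fitted parameters reported in the study.

```python
import numpy as np

# Gaussian Derivative (GD) profile and its DOOG approximation in 1D space.
x = np.linspace(-3.0, 3.0, 601)    # spatial position (arbitrary units)
sigma = 1.0                        # Gaussian width

gaussian = np.exp(-x**2 / (2 * sigma**2))
gd1 = -x / sigma**2 * gaussian     # first Gaussian derivative: odd, edge-like profile

# DOOG: two identical Gaussians offset in opposite directions; their scaled
# difference approximates the derivative when the offset is small.
delta = 0.1
doog = (np.exp(-(x + delta)**2 / (2 * sigma**2))
        - np.exp(-(x - delta)**2 / (2 * sigma**2))) / (2 * delta)

print("max abs difference between GD and DOOG:", np.max(np.abs(gd1 - doog)))
```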

2.
Perception of objects and motions in the visual scene is one of the basic problems in the visual system. There exist ‘What’ and ‘Where’ pathways in the higher visual cortex, starting from the simple cells in the primary visual cortex. The former perceives object attributes such as form, color, and texture, and the latter perceives ‘where’ information, for example, the velocity and direction of spatial movement of objects. This paper explores brain-like computational architectures of visual information processing. We propose a visual perceptual model and a computational mechanism for training the perceptual model. The computational model is a three-layer network. The first layer is the input layer, which receives stimuli from natural environments. The second layer is designed for representing the internal neural information. The connections between the first layer and the second layer, called the receptive fields of neurons, are self-adaptively learned based on the principle of sparse neural representation. To this end, we introduce Kullback-Leibler divergence as the measure of independence between neural responses and derive the learning algorithm by minimizing the cost function. The proposed algorithm is applied to train the basis functions, namely receptive fields, which are localized, oriented, and bandpass. The resultant receptive fields of neurons in the second layer have characteristics resembling those of simple cells in the primary visual cortex. Based on these basis functions, we further construct the third layer for the perception of ‘what’ and ‘where’ in the higher visual cortex. The proposed model is able to perceive objects and their motions with high accuracy and strong robustness against additive noise. Computer simulation results in the final section show the feasibility of the proposed perceptual model and the high efficiency of the learning algorithm.
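A generic sparse-coding sketch of the second-layer learning step: basis functions (receptive fields) are adapted so that patches are represented by a few active units. This stands in for, but does not reproduce, the KL-divergence-based independence measure described above; the data matrix, dimensions, and step sizes are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_units, n_patches = 64, 64, 5000      # 8x8 patches, hypothetical data
X = rng.standard_normal((n_pixels, n_patches))   # stand-in for whitened natural patches

W = rng.standard_normal((n_pixels, n_units)) * 0.1   # basis functions (columns)
lam, lr = 0.1, 0.01                                   # sparsity and learning rate

for it in range(200):
    idx = rng.integers(0, n_patches, size=100)        # mini-batch
    x = X[:, idx]
    a = W.T @ x                                       # linear responses
    a = np.sign(a) * np.maximum(np.abs(a) - lam, 0)   # soft threshold -> sparse code
    residual = x - W @ a                              # reconstruction error
    W += lr * residual @ a.T / idx.size               # gradient step on the cost
    W /= np.linalg.norm(W, axis=0, keepdims=True) + 1e-12   # keep basis norms bounded

# With real whitened natural-image patches, the columns of W tend to become
# localized, oriented, bandpass filters resembling simple-cell receptive fields.
```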

3.
Perception of objects and motions in the visual scene is one of the basic problems in the visual system. There exist 'What' and 'Where' pathways in the higher visual cortex, starting from the simple cells in the primary visual cortex. The former perceives object attributes such as form, color, and texture, and the latter perceives 'where' information, for example, the velocity and direction of spatial movement of objects. This paper explores brain-like computational architectures of visual information processing. We propose a visual perceptual model and a computational mechanism for training the perceptual model. The computational model is a three-layer network. The first layer is the input layer, which receives stimuli from natural environments. The second layer is designed for representing the internal neural information. The connections between the first layer and the second layer, called the receptive fields of neurons, are self-adaptively learned based on the principle of sparse neural representation. To this end, we introduce Kullback-Leibler divergence as the measure of independence between neural responses and derive the learning algorithm by minimizing the cost function. The proposed algorithm is applied to train the basis functions, namely receptive fields, which are localized, oriented, and bandpass. The resultant receptive fields of neurons in the second layer have characteristics resembling those of simple cells in the primary visual cortex. Based on these basis functions, we further construct the third layer for the perception of 'what' and 'where' in the higher visual cortex. The proposed model is able to perceive objects and their motions with high accuracy and strong robustness against additive noise. Computer simulation results in the final section show the feasibility of the proposed perceptual model and the high efficiency of the learning algorithm.

4.
Some computational theories of motion perception assume that the first stage en route to this perception is the local estimate of image velocity. However, this assumption is not supported by data from the primary visual cortex. Its motion-sensitive cells are not selective to velocity, but rather are directionally selective and tuned to spatio-temporal frequencies. Accordingly, physiologically based theories start with filters selective to oriented spatio-temporal frequencies. This paper shows that computational and physiological theories do not necessarily conflict, because such filters may, as a population, compute velocity locally. To prove this point, we show how to combine the outputs of a class of frequency-tuned filters to detect local image velocity. Furthermore, we show that the combination of filters may simulate 'Pattern' cells in the middle temporal area (MT), whereas each filter simulates primary visual cortex cells. These simulations include three properties of the primary cortex. First, the spatio-temporal frequency tuning curves of the individual filters display approximate space-time separability. Secondly, their direction-of-motion tuning curves depend on the distribution of orientations of the components of the Fourier decomposition and on the speed of the stimulus. Thirdly, the filters show facilitation and suppression for responses to apparent motions in the preferred and null directions, respectively. It is suggested that MT's role is not to solve the aperture problem, but to estimate velocities from primary cortex information. The spatial integration that accounts for motion coherence may be postponed to a later cortical stage.
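A toy illustration of the central idea that a population of velocity-labelled, frequency-tuned filters can signal local velocity. The response-weighted vector average used for the MT-like read-out below is a common simplification, not the specific combination rule derived in the paper; the tuning function and velocity are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
true_velocity = np.array([1.5, -0.5])          # hypothetical local image velocity

# Preferred directions and speeds of the filter bank (V1-like units).
dirs = np.linspace(0, 2 * np.pi, 16, endpoint=False)
speeds = np.array([0.5, 1.0, 2.0, 4.0])
pref = np.array([[s * np.cos(d), s * np.sin(d)] for s in speeds for d in dirs])

# Toy tuning: response falls off with distance between preferred and true velocity.
responses = np.exp(-np.sum((pref - true_velocity) ** 2, axis=1) / 0.5)
responses += 0.01 * rng.random(len(pref))      # small amount of noise

# MT-like "pattern" read-out: population vector over the filter bank.
estimate = responses @ pref / responses.sum()
print("estimated velocity:", estimate)
```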

5.
Simple cells in the primary visual cortex process incoming visual information with receptive fields localized in space and time, bandpass in spatial and temporal frequency, tuned in orientation, and commonly selective for the direction of movement. It is shown that performing independent component analysis (ICA) on video sequences of natural scenes produces results with qualitatively similar spatio-temporal properties. Whereas the independent components of video resemble moving edges or bars, the independent component filters, i.e. the analogues of receptive fields, resemble moving sinusoids windowed by steady Gaussian envelopes. Contrary to earlier ICA results on static images, which gave only filters at the finest possible spatial scale, the spatio-temporal analysis yields filters at a range of spatial and temporal scales. Filters centred at low spatial frequencies are generally tuned to faster movement than those at high spatial frequencies.
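A sketch of the spatio-temporal ICA analysis: space-time blocks are cut from video and FastICA is applied, with the rows of the unmixing matrix playing the role of receptive-field-like filters. The `video` array here is a random placeholder; only with real natural video do the learned filters become localized, oriented, and tuned to a range of spatial and temporal scales.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
video = rng.standard_normal((200, 64, 64))      # frames x height x width (placeholder)

px, pt, n_blocks = 8, 8, 2000                    # 8x8 pixels by 8 frames per block
blocks = np.empty((n_blocks, pt * px * px))
for i in range(n_blocks):
    t = rng.integers(0, video.shape[0] - pt)
    y = rng.integers(0, video.shape[1] - px)
    x = rng.integers(0, video.shape[2] - px)
    blocks[i] = video[t:t + pt, y:y + px, x:x + px].ravel()

ica = FastICA(n_components=64, max_iter=500, random_state=0)
sources = ica.fit_transform(blocks)              # independent components per block
filters = ica.components_                        # each row: one space-time filter
print(filters.shape)                             # (64, 512), reshapeable to (8, 8, 8)
```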

6.
A model of motion sensitivity as observed in some cells of area V1 of the visual cortex is proposed. Motion sensitivity is achieved by a combination of different spatiotemporal receptive fields, in particular, spatial and temporal differentiators. The receptive fields emerge if a Hebbian learning rule is applied to the network. Similar to a Linsker model, the network has a spatially convergent, linear feedforward structure. Additionally, however, delays omnipresent in the brain are incorporated in the model. The emerging spatiotemporal receptive fields are derived explicitly by extending the approach of MacKay and Miller. The response characteristic of the network is calculated in frequency space and shows that the network can be considered a space-time filter for motion in one direction. The emergence of different types of receptive field requires certain structural constraints regarding the spatial and temporal arborisation. These requirements can be derived from the theoretical analysis and might be compared with neuroanatomical data. In this way an explicit link between structure and function of the network is established.

7.
A receptive field constitutes a region in the visual field where a visual cell or a visual operator responds to visual stimuli. This paper presents a theory for what types of receptive field profiles can be regarded as natural for an idealized vision system, given a set of structural requirements on the first stages of visual processing that reflect symmetry properties of the surrounding world. These symmetry properties include (i) covariance properties under scale changes, affine image deformations, and Galilean transformations of space–time as occur for real-world image data as well as specific requirements of (ii) temporal causality implying that the future cannot be accessed and (iii) a time-recursive updating mechanism of a limited temporal buffer of the past as is necessary for a genuine real-time system. Fundamental structural requirements are also imposed to ensure (iv) mutual consistency and a proper handling of internal representations at different spatial and temporal scales. It is shown how a set of families of idealized receptive field profiles can be derived by necessity regarding spatial, spatio-chromatic, and spatio-temporal receptive fields in terms of Gaussian kernels, Gaussian derivatives, or closely related operators. Such image filters have been successfully used as a basis for expressing a large number of visual operations in computer vision, regarding feature detection, feature classification, motion estimation, object recognition, spatio-temporal recognition, and shape estimation. Hence, the associated so-called scale-space theory constitutes a framework for expressing visual operations that is both theoretically well-founded and general. There are very close similarities between receptive field profiles predicted from this scale-space theory and receptive field profiles found by cell recordings in biological vision. Among the family of receptive field profiles derived by necessity from the assumptions, idealized models with very good qualitative agreement are obtained for (i) spatial on-center/off-surround and off-center/on-surround receptive fields in the fovea and the LGN, (ii) simple cells with spatial directional preference in V1, (iii) spatio-chromatic double-opponent neurons in V1, (iv) space–time separable spatio-temporal receptive fields in the LGN and V1, and (v) non-separable space–time tilted receptive fields in V1, all within the same unified theory. In addition, the paper presents a more general framework for relating and interpreting these receptive fields conceptually and possibly predicting new receptive field profiles as well as for pre-wiring covariance under scaling, affine, and Galilean transformations into the representations of visual stimuli. This paper describes the basic structure of the necessity results concerning receptive field profiles regarding the mathematical foundation of the theory and outlines how the proposed theory could be used in further studies and modelling of biological vision. It is also shown how receptive field responses can be interpreted physically, as the superposition of relative variations of surface structure and illumination variations, given a logarithmic brightness scale, and how receptive field measurements will be invariant under multiplicative illumination variations and exposure control mechanisms.
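A minimal sketch of the purely spatial part of such a receptive field family: first-order directional Gaussian derivative kernels over a few scales and orientations. The scales, orientations, and kernel size are arbitrary example values, and the temporal and chromatic dimensions of the theory are not covered here.

```python
import numpy as np

def gaussian_derivative_kernel(sigma, theta, size=21):
    """First-order directional Gaussian derivative at scale sigma, direction theta."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    gx = -xx / sigma**2 * g                       # derivative along x
    gy = -yy / sigma**2 * g                       # derivative along y
    return np.cos(theta) * gx + np.sin(theta) * gy   # steered to direction theta

kernels = [gaussian_derivative_kernel(s, th)
           for s in (1.0, 2.0, 4.0)
           for th in np.linspace(0, np.pi, 4, endpoint=False)]
print(len(kernels), kernels[0].shape)             # 12 kernels, each 21x21
```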

8.
The brain is able to maintain a stable perception although the visual stimuli vary substantially on the retina due to geometric transformations and lighting variations in the environment. This paper presents a theory for achieving basic invariance properties already at the level of receptive fields. Specifically, the presented framework comprises (i) local scaling transformations caused by objects of different size and at different distances to the observer, (ii) locally linearized image deformations caused by variations in the viewing direction in relation to the object, (iii) locally linearized relative motions between the object and the observer and (iv) local multiplicative intensity transformations caused by illumination variations. The receptive field model can be derived by necessity from symmetry properties of the environment and leads to predictions about receptive field profiles in good agreement with receptive field profiles measured by cell recordings in mammalian vision. Indeed, the receptive field profiles in the retina, LGN and V1 are close to the ideal profiles motivated by the idealized requirements. By complementing receptive field measurements with selection mechanisms over the parameters in the receptive field families, it is shown how true invariance of receptive field responses can be obtained under scaling transformations, affine transformations and Galilean transformations. Thereby, the framework provides a mathematically well-founded and biologically plausible model for how basic invariance properties can be achieved already at the level of receptive fields and support invariant recognition of objects and events under variations in viewpoint, retinal size, object motion and illumination. The theory can explain the different shapes of receptive field profiles found in biological vision, which are tuned to different sizes and orientations in the image domain as well as to different image velocities in space-time, from a requirement that the visual system should be invariant to the natural types of image transformations that occur in its environment.

9.
Kandil FI, Lappe M. PLoS ONE 2007, 2(2): e264
Spatio-temporal interpolation describes the ability of the visual system to perceive shapes as whole figures (Gestalts), even if they are moving behind narrow apertures, so that only thin slices of them meet the eye at any given point in time. The interpolation process requires registration of the form slices, as well as perception of the shape's global motion, in order to reassemble the slices in the correct order. The commonly proposed mechanism is a spatio-temporal motion detector with a receptive field, for which spatial distance and temporal delays are interchangeable, and which has generally been regarded as monocular. Here we investigate separately the nature of the motion and the form detection involved in spatio-temporal interpolation, using dichoptic masking and interocular presentation tasks. The results clearly demonstrate that the associated mechanisms for both motion and form are binocular rather than monocular. Hence, we question the traditional view according to which spatio-temporal interpolation is achieved by monocular first-order motion-energy detectors in favour of models featuring binocular motion and form detection.

10.
The visual cortex analyzes motion information along hierarchically arranged visual areas that interact through bidirectional interconnections. This work suggests a bio-inspired visual model focusing on the interactions of the cortical areas, in which new mechanisms of feedforward and feedback processing are introduced. The model uses a neuromorphic vision sensor (a Dynamic Vision Sensor, DVS, or silicon retina) that simulates the spike-generation functionality of the biological retina. Our model takes into account two main model visual areas, namely V1 and MT, with different feature selectivities. The initial motion is estimated in model area V1 using spatiotemporal filters to locally detect the direction of motion. Here, we adapt the filtering scheme originally suggested by Adelson and Bergen to make it consistent with the spike representation of the DVS. The responses of area V1 are weighted and pooled by area MT cells which are selective to different velocities, i.e. direction and speed. Such feature selectivity is here derived from compositions of activities in the spatio-temporal domain and from integration over larger space-time regions (receptive fields). In order to account for the bidirectional coupling of cortical areas, we match properties of the feature selectivity in both areas for feedback processing. For such linkage we integrate the responses over different speeds along a particular preferred direction. Normalization of activities is carried out over the spatial as well as the feature domains to balance the activities of individual neurons in model areas V1 and MT. Our model was tested using different stimuli that moved in different directions. The results reveal that the error margin between the estimated motion and the synthetic ground truth is decreased in area MT compared with the initial estimate in area V1. In addition, the modulated V1 cell activations show an enhancement of the initial motion estimate that is steered by feedback signals from MT cells.
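A compact sketch of Adelson-Bergen-style motion energy on an ordinary space-time signal (one spatial dimension by time). The adaptation to DVS spike streams and the MT pooling and feedback stages of the model above are not reproduced; filter parameters and the test stimulus are illustrative.

```python
import numpy as np
from scipy.signal import fftconvolve

x = np.arange(-8, 9)[:, None]          # space (pixels)
t = np.arange(0, 9)[None, :]           # time (frames), causal kernel

sigma_x, f_x = 3.0, 0.25               # spatial envelope width and frequency
tau, f_t = 2.5, 0.15                   # temporal envelope and frequency

env = np.exp(-x**2 / (2 * sigma_x**2)) * np.exp(-t / tau)
even_x, odd_x = np.cos(2*np.pi*f_x*x), np.sin(2*np.pi*f_x*x)
even_t, odd_t = np.cos(2*np.pi*f_t*t), np.sin(2*np.pi*f_t*t)

# Oriented quadrature pairs built from separable components.
right_even = env * (even_x*even_t + odd_x*odd_t)   # cos(wx*x - wt*t): rightward
right_odd  = env * (odd_x*even_t - even_x*odd_t)   # sin(wx*x - wt*t)
left_even  = env * (even_x*even_t - odd_x*odd_t)   # cos(wx*x + wt*t): leftward
left_odd   = env * (odd_x*even_t + even_x*odd_t)   # sin(wx*x + wt*t)

# Test stimulus: a bar drifting rightward in a space-time image stim[x, t].
stim = np.zeros((128, 128))
for frame in range(128):
    stim[(20 + frame) % 128, frame] = 1.0

def energy(f1, f2):
    r1 = fftconvolve(stim, f1, mode="same")
    r2 = fftconvolve(stim, f2, mode="same")
    return np.mean(r1**2 + r2**2)      # phase-invariant motion energy

opponent = energy(right_even, right_odd) - energy(left_even, left_odd)
print("opponent motion energy (positive = rightward):", opponent)
```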

11.
Biphasic neural response properties, where the optimal stimulus for driving a neural response changes from one stimulus pattern to the opposite stimulus pattern over short periods of time, have been described in several visual areas, including lateral geniculate nucleus (LGN), primary visual cortex (V1), and middle temporal area (MT). We describe a hierarchical model of predictive coding and simulations that capture these temporal variations in neuronal response properties. We focus on the LGN-V1 circuit and find that after training on natural images the model exhibits the brain's LGN-V1 connectivity structure, in which the structure of V1 receptive fields is linked to the spatial alignment and properties of center-surround cells in the LGN. In addition, the spatio-temporal response profile of LGN model neurons is biphasic in structure, resembling the biphasic response structure of neurons in cat LGN. Moreover, the model displays a specific pattern of influence of feedback, where LGN receptive fields that are aligned over a simple cell receptive field zone of the same polarity decrease their responses while neurons of opposite polarity increase their responses with feedback. This phase-reversed pattern of influence was recently observed in neurophysiology. These results corroborate the idea that predictive feedback is a general coding strategy in the brain.
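A minimal Rao-Ballard-style predictive-coding sketch for a single level of such a hierarchy: generative weights predict the lower-level activity, the prediction error drives a fast update of the representation and a slow update of the weights. The data, layer sizes, prior, and step sizes are placeholders, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
n_input, n_latent, n_samples = 64, 32, 2000
images = rng.standard_normal((n_samples, n_input))   # stand-in for image patches

U = rng.standard_normal((n_input, n_latent)) * 0.1   # generative (feedback) weights
lr_r, lr_U, alpha = 0.05, 0.002, 0.1                 # alpha: sparse prior strength

for i in range(n_samples):
    I = images[i]
    r = np.zeros(n_latent)
    for _ in range(30):                               # fast inference loop
        err = I - U @ r                               # prediction error
        r += lr_r * (U.T @ err - alpha * r)           # error-driven update of r
    U += lr_U * np.outer(I - U @ r, r)                # slow weight (learning) update
    U /= np.linalg.norm(U, axis=0, keepdims=True) + 1e-12

print("mean reconstruction error on last patch:", np.mean((images[-1] - U @ r) ** 2))
```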

12.

Background

Optic flow is an important cue for object detection. Humans are able to perceive objects in a scene using only kinetic boundaries, and can perform the task even when other shape cues are not provided. These kinetic boundaries are characterized by the presence of motion discontinuities in a local neighbourhood. In addition, temporal occlusions appear along the boundaries as the object in front covers the background and the objects that are spatially behind it.

Methodology/Principal Findings

From a technical point of view, the detection of motion boundaries for segmentation based on optic flow is a difficult task. This is because the flow estimated along such boundaries is generally not reliable. We propose a model derived from mechanisms found in visual areas V1, MT, and MSTl of human and primate cortex that achieves robust detection along motion boundaries. It includes two separate mechanisms, one for the detection of motion discontinuities and one for occlusion regions, based on how neurons respond to spatial and temporal contrast, respectively. The mechanisms are embedded in a biologically inspired architecture that integrates information from different model components of visual processing via feedback connections. In particular, mutual interactions between the detection of motion discontinuities and temporal occlusions allow a considerable improvement of kinetic boundary detection.
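A toy illustration of the motion-discontinuity cue on a synthetic flow field: locations where the flow changes abruptly within a local neighbourhood are marked as candidate kinetic boundaries. The occlusion mechanism and the V1/MT/MSTl feedback circuitry of the model are not reproduced, and the flow field and threshold are invented for the example.

```python
import numpy as np

h, w = 64, 64
flow = np.zeros((h, w, 2))
flow[..., 0] = 1.0                          # background moves right
flow[20:40, 20:40] = (-1.0, 0.5)            # object patch moves differently

def local_flow_contrast(flow, radius=1):
    """Sum of flow differences to all neighbours within the given radius."""
    pad = np.pad(flow, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    contrast = np.zeros(flow.shape[:2])
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = pad[radius + dy:radius + dy + flow.shape[0],
                          radius + dx:radius + dx + flow.shape[1]]
            contrast += np.linalg.norm(flow - shifted, axis=-1)
    return contrast

boundaries = local_flow_contrast(flow) > 1.0   # threshold is arbitrary
print("candidate boundary pixels:", int(boundaries.sum()))
```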

Conclusions/Significance

A new model is proposed that uses optic flow cues to detect motion discontinuities and object occlusion. We suggest that by combining these results for motion discontinuities and object occlusion, object segmentation within the model can be improved. This idea could also be applied in other models for object segmentation. In addition, we discuss how this model is related to neurophysiological findings. The model was successfully tested with both artificial and real sequences, including self-motion and object motion.

13.
A spatio-temporal model of ganglion cell receptive fields is proposed on the basis of receptive field characteristics of cat retinal ganglion cells reported in our previous paper. The model consists of the linear and nonlinear mechanisms in the ganglion cell receptive field. The linear mechanism is assumed to be composed of antagonistic center and surround mechanisms. Then, by integrating these mechanisms we construct a spatio-temporal impulse response function of the ganglion cell receptive field. Here we assume that the spatio-temporal impulse response function may be factored into spatial and temporal terms. By Fourier-transforming the spatio-temporal impulse response function, we can obtain the spatio-temporal transfer function. Contrast sensitivity characteristics of X- and Y-cells in the cat retina may be explained by this transfer function.
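A small sketch of a space-time separable impulse response of this general kind (a difference-of-Gaussians centre-surround spatial profile times a biphasic temporal profile) and its spatio-temporal transfer function obtained by Fourier transform. All parameter values are illustrative, not the fitted values of the model.

```python
import numpy as np

x = np.linspace(-2.0, 2.0, 256)                    # visual angle (deg)
t = np.linspace(0.0, 0.5, 256)                     # time (s)

sc, ss, ks = 0.15, 0.6, 0.6                        # centre/surround widths, surround gain
spatial = (np.exp(-x**2 / (2 * sc**2)) / sc
           - ks * np.exp(-x**2 / (2 * ss**2)) / ss)

tau1, tau2 = 0.03, 0.06                            # biphasic temporal response
temporal = (t / tau1**2) * np.exp(-t / tau1) - 0.6 * (t / tau2**2) * np.exp(-t / tau2)

rf = np.outer(spatial, temporal)                   # separable impulse response h(x, t)
transfer = np.fft.fftshift(np.abs(np.fft.fft2(rf)))  # |H(fx, ft)|: bandpass in both axes
print(rf.shape, transfer.shape)
```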

14.
A theory of early motion processing in the human and primate visual system is presented which is based on the idea that spatio-temporal retinal image data is represented in primary visual cortex by a truncated 3D Taylor expansion that we refer to as a jet vector. This representation allows all the concepts of differential geometry to be applied to the analysis of visual information processing. We show in particular how the generalised Stokes theorem can be used to move from the calculation of derivatives of image brightness at a point to the calculation of image brightness differences on the boundary of a volume in space-time and how this can be generalised to apply to integrals of products of derivatives. We also provide novel interpretations of the roles of direction selective, bi-directional and pan-directional cells and of type I and type II cells in V5/MT.
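A minimal sketch of what a jet vector can look like in practice: Gaussian-smoothed derivatives of a space-time volume up to second order, i.e. the coefficients of a truncated 3D Taylor expansion of the blurred brightness at each point. The video volume and the smoothing scale below are placeholders.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
volume = rng.standard_normal((32, 64, 64))          # (t, y, x) placeholder data
sigma = 2.0                                         # smoothing scale

orders = [(0, 0, 0),
          (0, 0, 1), (0, 1, 0), (1, 0, 0),          # first derivatives: x, y, t
          (0, 0, 2), (0, 2, 0), (2, 0, 0),          # pure second derivatives
          (0, 1, 1), (1, 0, 1), (1, 1, 0)]          # mixed second derivatives

jet = np.stack([gaussian_filter(volume, sigma, order=o) for o in orders], axis=-1)
print(jet.shape)                                    # (32, 64, 64, 10): one jet vector per voxel
```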

15.
Increasingly systematic approaches to quantifying receptive fields in primary visual cortex, combined with inspired ideas about functional circuitry, non-linearities, and visual stimuli, are bringing new interest to classical problems. These include the distinction and hierarchy between simple and complex cells, the mechanisms underlying the receptive field surround, and debates about optimal stimuli for mapping receptive fields. An important new problem arises from recent observations of stimulus-dependent spatial and temporal summation in primary visual cortex. It appears that the receptive field can no longer be considered unique, and we might have to relinquish this cherished notion as the embodiment of neuronal function in primary visual cortex.

16.
Visual neurons have spatial receptive fields that encode the positions of objects relative to the fovea. Because foveate animals execute frequent saccadic eye movements, this position information is constantly changing, even though the visual world is generally stationary. Interestingly, visual receptive fields in many brain regions have been found to exhibit changes in strength, size, or position around the time of each saccade, and these changes have often been suggested to be involved in the maintenance of perceptual stability. Crucial to the circuitry underlying perisaccadic changes in visual receptive fields is the superior colliculus (SC), a brainstem structure responsible for integrating visual and oculomotor signals. In this work we have studied the time-course of receptive field changes in the SC. We find that the distribution of the latencies of SC responses to stimuli placed outside the fixation receptive field is bimodal: The first mode consists of early responses that are temporally locked to the onset of the visual probe stimulus and stronger for probes placed closer to the classical receptive field. We suggest that such responses are therefore consistent with a perisaccadic rescaling, or enhancement, of weak visual responses within a fixed spatial receptive field. The second mode is more similar to the remapping that has been reported in the cortex, as responses are time-locked to saccade onset and stronger for stimuli placed in the postsaccadic receptive field location. We suggest that these two temporal phases of spatial updating may represent different sources of input to the SC.

17.
Pack CC, Born RT, Livingstone MS. Neuron 2003, 37(3): 525-535
The analyses of object motion and stereoscopic depth are important tasks that begin at early stages of the primate visual system. Using sparse white noise, we mapped the receptive field substructure of motion and disparity interactions in neurons in V1 and MT of alert monkeys. Interactions in both regions revealed subunits similar in structure to V1 simple cells. For both motion and stereo, the scale and shape of the receptive field substructure could be predicted from conventional tuning for bars or dot-field stimuli, indicating that the small-scale interactions were repeated across the receptive fields. We also found neurons in V1 and in MT that were tuned to combinations of spatial and temporal binocular disparities, suggesting a possible neural substrate for the perceptual Pulfrich phenomenon. Our observations constrain computational and developmental models of motion-stereo integration.
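A toy sketch of receptive-field mapping with sparse white noise: frames containing a few bright or dark dots are presented, and the spike-triggered average of the stimulus at a fixed delay recovers the linear receptive-field substructure. The simulated "neuron", its receptive field, and the noise parameters are stand-ins, not the recorded cells.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, h, w, delay = 20000, 16, 16, 2

stimulus = np.zeros((n_frames, h, w))
for f in range(n_frames):                           # sparse noise: 4 dots per frame
    ys, xs = rng.integers(0, h, 4), rng.integers(0, w, 4)
    stimulus[f, ys, xs] = rng.choice([-1.0, 1.0], 4)

true_rf = np.zeros((h, w))
true_rf[6:10, 4:12] = 1.0                           # hypothetical ON subregion
drive = np.tensordot(stimulus, true_rf, axes=([1, 2], [0, 1]))
spikes = (np.roll(drive, delay) + 0.5 * rng.standard_normal(n_frames)) > 1.0

sta = stimulus[np.where(spikes)[0] - delay].mean(axis=0)   # spike-triggered average
print("correlation with true RF:", np.corrcoef(sta.ravel(), true_rf.ravel())[0, 1])
```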

18.
Luminance is the most basic form of visual information. Compared with other visual features, the neural mechanisms of luminance coding are poorly understood, because visual neurons respond only weakly to luminance stimuli and many neurons do not respond to uniform luminance at all. In some neurons of the primary visual cortex, the response to luminance is slower than the response to contrast, and such responses have been regarded as the neural basis of brightness perception induced by border contrast. Our studies show that in many primary visual cortex neurons the luminance response is faster than the contrast response, and that these neurons prefer low spatial frequencies, high temporal frequencies, and high motion speeds, suggesting that subcortical inputs from a pathway with low spatial frequency and high speed preferences contribute to the luminance responses of primary visual cortex neurons. It is known that the temporal course of spatial frequency responses in the primary visual cortex proceeds from low to high spatial frequencies; the early luminance responses we found are responses to extremely low spatial frequencies, which is consistent with this time course and constitutes the first step of this coarse-to-fine visual processing, revealing the neural basis for processing the earliest, coarse visual information. In addition, the primary visual cortex contains neurons that prefer luminance decrements and high motion speeds; the activity of this neuronal population helps detect fast-moving, low-luminance objects in poorly lit environments.

19.
How are invariant representations of objects formed in the visual cortex? We describe a neurophysiological and computational approach which focusses on a feature hierarchy model in which invariant representations can be built by self-organizing learning based on the statistics of the visual input. The model can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in Continuous Transformation learning. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and, as we show in this paper, lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects, such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks. The model has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene.
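A minimal sketch of a trace learning rule of the kind used in this family of models: the postsynaptic term is a short-term memory trace of recent activity, so inputs that follow one another in time (for example, transformed views of the same object) become associated onto the same output neurons. Layer sizes, constants, and the toy "view sequence" are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, eta, decay = 100, 10, 0.01, 0.8

W = rng.random((n_out, n_in)) * 0.01               # feedforward weights
trace = np.zeros(n_out)                            # memory trace of output activity

def present(x, W, trace):
    y = W @ x                                      # feedforward activation
    trace = decay * trace + (1 - decay) * y        # update the trace
    W += eta * np.outer(trace, x)                  # trace-based Hebbian-like update
    W /= np.linalg.norm(W, axis=1, keepdims=True) + 1e-12   # weight normalisation
    return W, trace

# A "sequence" of transformed views of one object: correlated random patterns.
base = rng.random(n_in)
for step in range(50):
    view = np.clip(base + 0.1 * rng.standard_normal(n_in), 0, None)
    W, trace = present(view, W, trace)
print(W.shape)
```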

20.
Computational models of primary visual cortex have demonstrated that principles of efficient coding and neuronal sparseness can explain the emergence of neurones with localised, oriented receptive fields. Yet, existing models have failed to predict the diverse shapes of receptive fields that occur in nature. The existing models used a particular "soft" form of sparseness that limits average neuronal activity. Here we study models of efficient coding in a broader context by comparing soft and "hard" forms of neuronal sparseness. As a result of our analyses, we propose a novel network model for visual cortex. The model forms efficient visual representations in which the number of active neurones, rather than mean neuronal activity, is limited. This form of hard sparseness also economises cortical resources like synaptic memory and metabolic energy. Furthermore, our model accurately predicts the distribution of receptive field shapes found in the primary visual cortex of cat and monkey.
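A small sketch of the contrast between the two constraints: a soft-sparse code shrinks all responses towards zero (limiting mean activity), whereas a hard-sparse code keeps only a fixed number k of active neurons. This illustrates the constraint itself, not the paper's network; the weights, input, and values of lambda and k are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((128, 64))        # 128 model neurons, 64-pixel inputs
x = rng.standard_normal(64)               # one input patch (placeholder)

a = W @ x                                 # linear responses of all neurons

# Soft sparseness: shrink every response towards zero.
lam = 1.0
a_soft = np.sign(a) * np.maximum(np.abs(a) - lam, 0)

# Hard sparseness: exactly k neurons may stay active.
k = 8
a_hard = np.zeros_like(a)
top = np.argsort(np.abs(a))[-k:]          # indices of the k largest responses
a_hard[top] = a[top]

print("active units (soft):", int(np.count_nonzero(a_soft)))
print("active units (hard):", int(np.count_nonzero(a_hard)))
```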
