Similar Literature
20 similar documents retrieved.
1.
The vast majority of work in machine vision emphasizes the representation of perceived objects and events: it is these internal representations that incorporate the "knowledge" in knowledge-based vision or form the "models" in model-based vision. In this paper, we discuss simple machine vision systems developed by artificial evolution rather than traditional engineering design techniques, and note that the task of identifying internal representations within such systems is made difficult by the lack of an operational definition of representation at the causal mechanistic level. Consequently, we question the nature and indeed the existence of representations posited to be used within natural vision systems (i.e. animals). We conclude that representations argued for on a priori grounds by external observers of a particular vision system may well be illusory, and are at best place-holders for yet-to-be-identified causal mechanistic interactions. That is, applying the knowledge-based vision approach in the understanding of evolved systems (machines or animals) may well lead to theories and models that are internally consistent, computationally plausible, and entirely wrong.

2.
In order to quantitatively study object perception, be it perception by biological systems or by machines, one needs to create objects and object categories with precisely definable, preferably naturalistic, properties [1]. Furthermore, for studies on perceptual learning, it is useful to create novel objects and object categories (or object classes) with such properties [2]. Many innovative and useful methods currently exist for creating novel objects and object categories [3-6] (also see refs. 7, 8). However, generally speaking, the existing methods have three broad types of shortcomings. First, shape variations are generally imposed by the experimenter [5,9,10], and may therefore be different from the variability in natural categories, and optimized for a particular recognition algorithm. It would be desirable to have the variations arise independently of the externally imposed constraints. Second, the existing methods have difficulty capturing the shape complexity of natural objects [11-13]. If the goal is to study natural object perception, it is desirable for objects and object categories to be naturalistic, so as to avoid possible confounds and special cases. Third, it is generally hard to quantitatively measure the available information in the stimuli created by conventional methods. It would be desirable to create objects and object categories where the available information can be precisely measured and, where necessary, systematically manipulated (or "tuned"). This allows one to formulate the underlying object recognition tasks in quantitative terms. Here we describe a set of algorithms, or methods, that meet all three of the above criteria. Virtual morphogenesis (VM) creates novel, naturalistic virtual 3-D objects called "digital embryos" by simulating the biological process of embryogenesis [14]. Virtual phylogenesis (VP) creates novel, naturalistic object categories by simulating the evolutionary process of natural selection [9,12,13]. Objects and object categories created by these simulations can be further manipulated by various morphing methods to generate systematic variations of shape characteristics [15,16]. The VP and morphing methods can also be applied, in principle, to novel virtual objects other than digital embryos, or to virtual versions of real-world objects [9,13]. Virtual objects created in this fashion can be rendered as visual images using a conventional graphical toolkit, with desired manipulations of surface texture, illumination, size, viewpoint and background. The virtual objects can also be "printed" as haptic objects using a conventional 3-D prototyper. We also describe some implementations of these computational algorithms to help illustrate the potential utility of the algorithms. It is important to distinguish the algorithms from their implementations. The implementations are demonstrations offered solely as a "proof of principle" of the underlying algorithms. It is important to note that, in general, an implementation of a computational algorithm often has limitations that the algorithm itself does not have. Together, these methods represent a set of powerful and flexible tools for studying object recognition and perceptual learning by biological and computational systems alike. With appropriate extensions, these methods may also prove useful in the study of morphogenesis and phylogenesis.
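As an illustration of the category-generation idea (simulated descent with mutation), the toy sketch below grows a tree of abstract shape-parameter vectors; each subtree then plays the role of an object category. It is only an analogy to the VM/VP algorithms described above — the shape "genome", the mutation model and all parameter values are assumptions, not the authors' implementation.

import numpy as np

def virtual_phylogenesis_toy(n_generations=4, n_children=3, n_params=16,
                             mutation_sd=0.1, rng=None):
    """Toy analogue of virtual phylogenesis (VP): grow a tree of shape-parameter
    vectors by repeated mutation, so that each subtree forms a 'category' whose
    members share a common ancestor.  Illustrative only, not the authors' VM/VP code."""
    rng = np.random.default_rng(rng)
    ancestor = rng.normal(size=n_params)          # stand-in for an embryo's shape genome
    tree = {0: (None, ancestor)}                  # node id -> (parent id, parameter vector)
    frontier, next_id = [0], 1
    for _ in range(n_generations):
        new_frontier = []
        for parent in frontier:
            for _ in range(n_children):
                child = tree[parent][1] + rng.normal(scale=mutation_sd, size=n_params)
                tree[next_id] = (parent, child)
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return tree, frontier                         # leaves of one subtree = one object category

tree, leaves = virtual_phylogenesis_toy(rng=0)
print(len(leaves), "leaf objects generated")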

3.
To recognize a previously seen object, the visual system must overcome the variability in the object's appearance caused by factors such as illumination and pose. Developments in computer vision suggest that it may be possible to counter the influence of these factors, by learning to interpolate between stored views of the target object, taken under representative combinations of viewing conditions. Daily life situations, however, typically require categorization, rather than recognition, of objects. Due to the open-ended character of both natural and artificial categories, categorization cannot rely on interpolation between stored examples. Nonetheless, knowledge of several representative members, or prototypes, of each of the categories of interest can still provide the necessary computational substrate for the categorization of new instances. The resulting representational scheme based on similarities to prototypes appears to be computationally viable, and is readily mapped onto the mechanisms of biological vision revealed by recent psychophysical and physiological studies.
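A minimal sketch of similarity-to-prototype categorization as described above: a new instance is assigned to the category whose stored prototype it resembles most. The feature space, the Euclidean similarity measure and the category names are placeholders, not details taken from the reviewed studies.

import numpy as np

def categorize_by_prototypes(x, prototypes):
    """Assign a feature vector x to the category whose prototype it is most
    similar to (here: smallest Euclidean distance)."""
    names = list(prototypes)
    dists = [np.linalg.norm(x - prototypes[n]) for n in names]
    return names[int(np.argmin(dists))]

prototypes = {"cat_A": np.array([1.0, 0.0]), "cat_B": np.array([0.0, 1.0])}
print(categorize_by_prototypes(np.array([0.8, 0.3]), prototypes))  # -> cat_A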

4.
This paper reviews some of the contributions that work in computational vision has made to the study of biological vision systems. We concentrate on two areas where there has been strong interaction between computational and experimental studies: the use of binocular stereo to recover the distances to surfaces in space, and the recovery of the three-dimensional shape of objects from relative motion in the image. With regard to stereo, we consider models proposed for solving the stereo correspondence problem, focussing on the way in which physical properties of the world constrain possible methods of solution. We also show how critical observations regarding human stereo vision have helped to shape these models. With regard to the recovery of structure from motion, we focus on how the constraint of object rigidity has been used in computational models of this process.

5.
This paper evaluates the degree of saliency of texts in natural scenes using visual saliency models. A large-scale scene image database with pixel-level ground truth is created for this purpose. Using this scene image database and five state-of-the-art models, visual saliency maps that represent the degree of saliency of the objects are calculated. The receiver operating characteristic curve is employed to evaluate the saliency of scene texts as computed by the visual saliency models. A visualization is given of the distribution of scene texts and non-texts in the space constructed by three kinds of saliency maps, calculated using Itti's visual saliency model with intensity, color and orientation features. This visualization indicates that text characters are more salient than their non-text neighbors and can be captured from the background; therefore, scene texts can be extracted from the scene images. With this in mind, a new visual saliency architecture, named the hierarchical visual saliency model, is proposed. The hierarchical visual saliency model is based on Itti's model and consists of two stages. In the first stage, Itti's model is used to calculate the saliency map, and Otsu's global thresholding algorithm is applied to extract the salient region of interest. In the second stage, Itti's model is applied to the salient region to calculate the final saliency map. An experimental evaluation demonstrates that the proposed model outperforms Itti's model in terms of captured scene texts.
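The two-stage structure of the proposed hierarchical model can be sketched as follows. The sketch substitutes a simple centre-surround (difference-of-Gaussians) intensity map for Itti's full intensity/colour/orientation model, so it is an assumption-laden stand-in rather than the authors' implementation; only the overall pipeline (saliency map, Otsu threshold, recompute saliency inside the salient region) follows the description above.

import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.filters import threshold_otsu

def simple_saliency(gray, sigma_c=2, sigma_s=16):
    """Center-surround (difference-of-Gaussians) intensity saliency.
    A stand-in for Itti's full multi-feature model."""
    s = np.abs(gaussian_filter(gray, sigma_c) - gaussian_filter(gray, sigma_s))
    return (s - s.min()) / (np.ptp(s) + 1e-9)

def hierarchical_saliency(gray):
    """Stage 1: global saliency map, Otsu-thresholded to a salient region.
    Stage 2: recompute saliency inside that region only."""
    s1 = simple_saliency(gray)
    mask = s1 > threshold_otsu(s1)                       # stage 1: salient region
    ys, xs = np.where(mask)
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    s2 = np.zeros_like(gray, dtype=float)
    s2[y0:y1, x0:x1] = simple_saliency(gray[y0:y1, x0:x1])   # stage 2
    return s2

gray = np.random.rand(128, 128)
gray[40:60, 50:90] += 1.0                                # bright "text-like" patch
final_map = hierarchical_saliency(gray)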

6.
Honeybees (Apis mellifera) discriminate multiple object features such as colour, pattern and 2D shape, but it remains unknown whether and how bees recover three-dimensional shape. Here we show that bees can recognize objects by their three-dimensional form, whereby they employ an active strategy to uncover the depth profiles. We trained individual, free-flying honeybees to collect sugar water from small three-dimensional objects made of styrofoam (sphere, cylinder, cuboids) or folded paper (convex, concave, planar) and found that bees can easily discriminate between these stimuli. We also tested possible strategies employed by the bees to uncover the depth profiles. For the card stimuli, we excluded overall shape and pictorial features (shading, texture gradients) as cues for discrimination. Lacking sufficient stereo vision, bees are known to use speed gradients in optic flow to detect edges; could the bees apply this strategy also to recover the fine details of a surface depth profile? Analysing the bees' flight tracks in front of the stimuli revealed specific combinations of flight maneuvers (lateral translations in combination with yaw rotations), which are particularly suitable for extracting depth cues from motion parallax. We modelled the generated optic flow and found characteristic patterns of angular displacement corresponding to the depth profiles of our stimuli: optic flow patterns from pure translations successfully recovered depth relations from the magnitude of angular displacements, while additional rotation provided robust depth information based on the direction of the displacements; thus, the bees' flight maneuvers may reflect an optimized visuo-motor strategy to extract depth structure from motion signals. The robustness and simplicity of this strategy offer an efficient solution for 3D object recognition without stereo vision, and could be employed by other flying insects, or mobile robots.
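The depth cue exploited here can be summarised by the standard motion-parallax relation for a laterally translating, yawing observer; this is textbook optic-flow geometry rather than the paper's specific flow model.

% Apparent angular velocity of a point at distance d and bearing \theta
% (measured from the translation direction) for an observer translating at
% speed v while yawing at rate \omega_{yaw}:
\dot{\theta} \;=\; \frac{v\,\sin\theta}{d} \;-\; \omega_{yaw}
% Pure translation (\omega_{yaw}=0): |\dot{\theta}| scales with 1/d, so nearer
% surface points slip faster across the eye (depth from flow magnitude).
% With added yaw, the sign of \dot{\theta} flips at the pivot distance
% d_p = v\sin\theta/\omega_{yaw}: points nearer than d_p move opposite to points
% beyond it (depth from flow direction).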

7.
In short, the model consists of a two-dimensional set of edge-detecting units, modelled according to the zero-crossing detectors first introduced by Marr and Ullman (1981). These detectors are located peripherally in our synthetic vision system and are the input elements for an intelligent recurrent network. The purpose of that network is to recognize and categorize the previously detected contrast changes in a multi-resolution representation of the original image in such a manner that the original information is decomposed into a relatively small number N of well-defined edge primitives. The advantage of such a construction is that time-consuming pattern recognition no longer has to be done on the originally complex, motion-blurred images of moving objects, but on a limited number of categorized forms. Based on a number M of elementary feature attributes for each individual edge primitive, the model is then able to decompose each edge pattern into certain features. In this way an M-dimensional vector can be constructed for each edge. For each sequence of two successive frames a tensor can be calculated containing the distances (measured in M-dimensional feature space) between all features in both images. This procedure yields a set of K−1 tensors for a sequence of K images. After cross-correlation of all N × M feature attributes from image (i) with those from image (i+1), where i = 1, ..., K−1, probability distributions can be computed. The final step is to search for maxima in these probability functions and then to construct from these extrema an optimal motion field. A number of simulation examples will be presented.
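A simplified sketch of the correspondence step: build the distance "tensor" between the M-dimensional feature vectors of the edge primitives in two successive frames and extract a best matching. The paper's own pipeline cross-correlates the N × M feature attributes and searches probability maxima; the Hungarian-assignment shortcut below is a stand-in to make the data flow concrete.

import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def match_edge_primitives(feat_i, feat_j):
    """feat_i, feat_j: (N, M) arrays of M feature attributes for the N edge
    primitives found in frames i and i+1.  Build the N x N distance matrix in
    feature space and pick the one-to-one matching of minimum total distance."""
    D = cdist(feat_i, feat_j)                 # pairwise distances in M-dim feature space
    rows, cols = linear_sum_assignment(D)     # optimal assignment (Hungarian algorithm)
    return list(zip(rows, cols)), D[rows, cols].sum()

rng = np.random.default_rng(0)
frame_i = rng.normal(size=(5, 8))                     # 5 edge primitives, 8 attributes each
frame_j = frame_i + 0.05 * rng.normal(size=(5, 8))    # slightly displaced edges
pairs, cost = match_edge_primitives(frame_i, frame_j)
print(pairs)                                          # mostly the identity matching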

8.
The interpretation of structure from motion.
The interpretation of structure from motion is examined from a computational point of view. The question addressed is how the three-dimensional structure and motion of objects can be inferred from the two-dimensional transformations of their projected images when no three-dimensional information is conveyed by the individual projections. The following scheme is proposed: (i) divide the image into groups of four elements each; (ii) test each group for a rigid interpretation; (iii) combine the results obtained in (ii). It is shown that this scheme will correctly decompose scenes containing arbitrary rigid objects in motion, recovering their three-dimensional structure and motion. The analysis is based primarily on the "structure from motion" theorem, which states that the structure of four non-coplanar points is recoverable from three orthographic projections. The interpretation scheme is extended to cover perspective projections, and its psychological relevance is discussed.
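A small numerical companion to the structure-from-motion theorem. Rather than Ullman's original four-point construction, the sketch uses the later Tomasi–Kanade observation that, for rigid motion under orthographic projection, the registered 2F × P measurement matrix has rank at most three — a convenient way to test a group of tracked points for a rigid interpretation.

import numpy as np

def rigidity_rank_check(tracks, tol=1e-6):
    """tracks: (F, P, 2) image coordinates of P points in F orthographic views.
    Rows of the measurement matrix are the per-frame x and y coordinates,
    centred per row to remove image translation; rigid motion implies rank <= 3."""
    F, P, _ = tracks.shape
    W = np.concatenate([tracks[:, :, 0], tracks[:, :, 1]], axis=0)   # (2F, P)
    W = W - W.mean(axis=1, keepdims=True)
    s = np.linalg.svd(W, compute_uv=False)
    return s[3:].max() < tol * s[0] if s.size > 3 else True

# three orthographic views of four rigidly rotating, non-coplanar points
rng = np.random.default_rng(1)
X = rng.normal(size=(4, 3))
def rot_x(a): return np.array([[1, 0, 0], [0, np.cos(a), -np.sin(a)], [0, np.sin(a), np.cos(a)]])
views = np.stack([(X @ rot_x(a).T)[:, :2] for a in (0.0, 0.3, 0.6)])  # (F=3, P=4, 2)
print(rigidity_rank_check(views))   # True: consistent with a rigid interpretation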

9.
This paper presents a model-based method to efficiently simulate dynamic magnetic resonance imaging signals. Using an analytical spatiotemporal object model, the method can approximate time-varying k-space signals such as those from objects in motion and/or during dynamic contrast enhancement. Both rigid-body and non-rigid-body motions can be simulated using the proposed method. In addition, it can simulate data with arbitrary data sampling order and/or non-uniform k-space trajectory. A set of simulated images was compared with real data acquired from a rat model on a 4.7 T scanner to verify the model. The efficient simulation method is expected to be useful for rapid testing of various imaging and image analysis algorithms such as image reconstruction, image registration, motion compensation, and kinetic parameter mapping.
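The flavour of such model-based simulation can be shown with a toy example: an analytical (Gaussian) object whose k-space signal is written in closed form, with the Fourier shift theorem supplying the time-varying phase caused by motion during a line-by-line acquisition. The object model, sampling order and parameter values are illustrative assumptions, not the paper's implementation.

import numpy as np

def gaussian_kspace(kx, ky, sigma=0.05, amp=1.0):
    """Analytical 2-D Fourier transform of an isotropic Gaussian object of width
    sigma (in field-of-view units): no discretisation of the object is needed."""
    return amp * 2 * np.pi * sigma**2 * np.exp(-2 * np.pi**2 * sigma**2 * (kx**2 + ky**2))

def simulate_moving_object(n=128, v=(0.2, 0.0), line_time=5e-3):
    """k-space is acquired one phase-encode line at a time while the object
    centre drifts with velocity v (FOV units per second); the Fourier shift
    theorem turns that motion into a line-dependent phase on the analytic spectrum."""
    k = np.fft.fftfreq(n) * n               # integer cycles per field of view
    kx, ky = np.meshgrid(k, k)              # each row is one phase-encode line
    t = np.arange(n)[:, None] * line_time   # acquisition time of each line
    x0, y0 = v[0] * t, v[1] * t             # object centre at that time
    phase = np.exp(-2j * np.pi * (kx * x0 + ky * y0))
    return gaussian_kspace(kx, ky) * phase  # (n, n) simulated k-space signal

recon = np.abs(np.fft.ifft2(simulate_moving_object()))
# motion during the acquisition produces the familiar ghosting/blurring artifacts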

10.
The electric sense of mormyrids is often regarded as an adaptation to conditions unfavourable for vision and in these fish it has become the dominant sense for active orientation and communication tasks. With this sense, fish can detect and distinguish the electrical properties of the close environment, measure distance, perceive the 3-D shape of objects and discriminate objects according to distance or size and shape, irrespective of conductivity, thus showing a degree of abstraction regarding the interpretation of sensory stimuli. The physical properties of images projected on the sensory surface by the fish's own discharge reveal a "Mexican hat" opposing centre-surround profile. It is likely that computation of the image amplitude to slope ratio is used to measure distance, while peak width and slope give measures of shape and contrast. Modelling has been used to explore how the images of multiple objects superimpose in a complex manner. While electric images are by nature distributed, or 'blurred', behavioural strategies orienting sensory surfaces and the neural architecture of sensory processing networks both contribute to resolving potential ambiguities. Rostral amplification is produced by current funnelling in the head and chin appendage regions, where high density electroreceptor distributions constitute foveal regions. Central magnification of electroreceptive pathways from these regions particularly favours the detection of capacitive properties intrinsic to potential living prey. Swimming movements alter the amplitude and contrast of pre-receptor object-images but image modulation is normalised by central gain-control mechanisms that maintain excitatory and inhibitory balance, removing the contrast-ambiguity introduced by self-motion in much the same way that contrast gain-control is achieved in vision.
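Why the amplitude-to-slope ratio can act as a contrast-independent distance cue is easy to see in a toy one-dimensional version of the image (not the fish-specific field model): if the image centre is a scaled template whose width grows with object distance, the contrast factor cancels out of the ratio.

% Toy illustration, assuming the image centre is a scaled template
% I(x) = A f(x/w), with amplitude A set by object contrast/size and
% width w growing with object distance:
I_{\max} = A\,f_{\max}, \qquad
\left|\frac{dI}{dx}\right|_{\max} = \frac{A}{w}\,|f'|_{\max}
\;\;\Longrightarrow\;\;
\frac{I_{\max}}{\left|dI/dx\right|_{\max}} = \frac{f_{\max}}{|f'|_{\max}}\, w .
% The contrast factor A cancels, so the amplitude/slope ratio tracks the image
% width w (and hence distance) irrespective of the object's conductivity contrast.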

11.
12.
A fundamental tenet of visual science is that the detailed properties of visual systems are not capricious accidents, but are closely matched by evolution and neonatal experience to the environments and lifestyles in which those visual systems must work. This has been shown most convincingly for fish and insects. For mammalian vision, however, this tenet is based more upon theoretical arguments than upon direct observations. Here, we describe experiments that require human observers to discriminate between pictures of slightly different faces or objects. These are produced by a morphing technique that allows small, quantifiable changes to be made in the stimulus images. The independent variable is designed to give increasing deviation from natural visual scenes, and is a measure of the Fourier composition of the image (its second-order statistics). Performance in these tests was best when the pictures had natural second-order spatial statistics, and degraded when the images were made less natural. Furthermore, performance can be explained with a simple model of contrast coding, based upon the properties of simple cells in the mammalian visual cortex. The findings thus provide direct empirical support for the notion that human spatial vision is optimised to the second-order statistics of the optical environment.
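The stimulus manipulation described — varying how natural the second-order statistics of an image are — amounts to reweighting its Fourier amplitude spectrum while keeping the phase. The sketch below shows that kind of manipulation in a generic form; the exponent values and normalisation are assumptions, not the authors' exact morphing procedure.

import numpy as np

def adjust_spectral_slope(img, delta_alpha):
    """Change an image's second-order statistics by reweighting its amplitude
    spectrum: natural images have amplitude spectra falling roughly as 1/f^alpha
    (alpha ~ 1); multiplying by f^(-delta_alpha) steepens (delta_alpha > 0) or
    whitens (delta_alpha < 0) the spectrum while leaving the phase, and hence the
    rough layout of the picture, untouched."""
    F = np.fft.fft2(img)
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    f = np.hypot(fx, fy)
    f[0, 0] = 1.0                                     # keep the DC term unchanged
    out = np.real(np.fft.ifft2(F * f**(-delta_alpha)))
    return (out - out.mean()) / (out.std() + 1e-9)    # re-normalise contrast

img = np.random.rand(256, 256)
whitened = adjust_spectral_slope(img, -0.5)   # flatter spectrum: less "natural"
steepened = adjust_spectral_slope(img, +0.5)  # steeper spectrum: blurrier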

13.
Shape from texture
A central goal for visual perception is the recovery of the three-dimensional structure of the surfaces depicted in an image. Crucial information about three-dimensional structure is provided by the spatial distribution of surface markings, particularly for static monocular views: projection distorts texture geometry in a manner that depends systematically on surface shape and orientation. To isolate and measure this projective distortion in an image is to recover the three-dimensional structure of the textured surface. For natural textures, we show that the uniform density assumption (texels are uniformly distributed) is enough to recover the orientation of a single textured plane in view, under perspective projection. Furthermore, when the texels cannot be found, the edges of the image are enough to determine shape, under a more general assumption, that the sum of the lengths of the contours on the world plane is about the same everywhere. Finally, several experimental results for synthetic and natural images are presented.
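The uniform-density cue can be illustrated with a small Monte-Carlo experiment: texels scattered uniformly on a slanted world plane and viewed through a pinhole camera no longer look uniformly dense in the image, and the direction in which their density rises indicates how the plane recedes. This is a demonstration of the projective distortion being measured, not the paper's estimator; the camera and plane parameters are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
f = 1.0                                   # focal length
slant = np.deg2rad(60)                    # plane slanted about the x-axis

# world plane through (0, 0, 4), texels uniform in the plane's own coordinates (u, v)
u = rng.uniform(-2, 2, 20000)
v = rng.uniform(-2, 2, 20000)
X, Y, Z = u, v * np.cos(slant), 4.0 + v * np.sin(slant)

x, y = f * X / Z, f * Y / Z               # perspective projection
inside = (np.abs(x) < 0.2) & (np.abs(y) < 0.2)
counts, _ = np.histogram(y[inside], bins=8)
print(np.round(counts / counts.max(), 2))
# Apart from sampling noise, texel density rises toward larger image y, i.e. toward
# the side where the plane recedes; the direction and strength of this density
# gradient are what a shape-from-texture estimator would measure.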

14.
The amplitude contrast of images of weak phase and amplitude objects can be nearly doubled, compared with standard bright-field contrast, by means of the shadow method of image formation, without any special contrasting of the objects. In spite of the contour effects (electron-microscopic attenuation) that appear when images are formed by the asymmetrical shadow method, quantitative interpretation of such images is quite possible. The symmetrical shadow image (an image formed with cone illumination) is more difficult to realize than the asymmetrical one, but it has some additional advantages; in particular, especially efficient suppression of background noise is possible. Several symmetrical shadow images can be synthesized by the differential method, with or without colour coding, into a final image whose signal-to-noise ratio is increased by an order of magnitude.

15.
乌恩  程静琦 《生物信息学》2019,26(10):54-59
In Europe, the United States, Japan and the Hong Kong and Taiwan regions of China, environmental interpretation has become an important component of landscape planning and design. Environmental interpretation has now also drawn the attention of the landscape architecture academic and professional communities in mainland China, but judging from the level of recognition of its importance and from planning and design practice, environmental interpretation remains only a "new language" in Chinese landscape planning and design. Starting from the importance of environmental interpretation in landscape planning and design, and drawing on actual cases, this paper explains why it matters: an environmental interpretation system can make visitors' understanding of a landscape deeper and richer; for the planning and design of protected areas and of natural parks such as national parks and country parks, environmental interpretation is core work for expressing a park's character and individuality; for natural spaces such as wetlands that, viewed from a traditional scenic perspective, lack variety and richness of scenery, environmental interpretation can even become the main thread in planning a park's visitor-experience system; environmental interpretation can help manage visitor behaviour and improve the effectiveness of resource and environmental protection; and high-quality environmental interpretation planning and design can raise the aesthetic quality of a landscape.

16.
Multiwavelength spectroscopy is a rapid analytical technique that can be applied to detect, identify, and quantify microorganisms such as Karenia brevis, the species known for frequent red-tide blooms in Florida's coastal waters. This research reports on a model-based interpretation of UV–vis spectra of K. brevis. The spectroscopy models are based on light scattering and absorption theories, and on approximations of the frequency-dependent optical properties of the basic constituents of living organisms. Absorption and scattering properties of K. brevis, determined by cell size/shape, internal structure, and chemical composition, are shown to predict the spectral features observed in the measured spectra. The parameters for the interpretation model were based upon both reported literature values and experimental values obtained from live cultures and pigment standards. Measured and mathematically derived spectra were compared to determine the adequacy of the model, contribute new spectral information, and establish the proposed spectral interpretation approach as a new detection method for K. brevis.

17.
A brain-damaged patient (D.F.) with visual form agnosia is described and discussed. D.F. has a profound inability to recognize objects, places and people, in large part because of her inability to make perceptual discriminations of size, shape or orientation, despite having good visual acuity. Yet she is able to perform skilled actions that depend on that very same size, shape and orientation information that is missing from her perceptual awareness. It is suggested that her intact vision can best be understood within the framework of a dual processing model, according to which there are two cortical processing streams operating on different coding principles, for perception and for action, respectively. These may be expected to have different degrees of dependence on top-down information. One possibility is that D.F.'s lack of explicit awareness of the visual cues that guide her behaviour may result from her having to rely on a processing system which is not knowledge-based in a broad sense. Conversely, it may be that the perceptual system can provide conscious awareness of its products in normal individuals by virtue of the fact that it does interact with a stored base of visual knowledge.

18.
Panoramic image differences can be used for view-based homing under natural outdoor conditions, because they increase smoothly with distance from a reference location (Zeil et al., J Opt Soc Am A 20(3):450–469, 2003). The particular shape, slope and depth of such image difference functions (IDFs) recorded at any one place, however, depend on a number of factors that so far have only been qualitatively identified. Here we show how the shape of difference functions depends on the depth structure and the contrast of natural scenes, by quantifying the depth distribution of different outdoor scenes and by comparing it to the difference functions calculated with differently processed panoramic images, which were recorded at the same locations. We find (1) that IDFs and catchment areas become systematically wider as the average distance of objects increases, (2) that simple image processing operations—like subtracting the local mean, difference-of-Gaussian filtering and local contrast normalization—make difference functions robust against changes in illumination and the spurious effects of shadows, and (3) by comparing depth-dependent translational and depth-independent rotational difference functions, we show that IDFs of contrast-normalized snapshots are predominantly determined by the depth-structure and possibly also by occluding contours in a scene. We propose a model for the shape of IDFs as a tool for quantitative comparisons between the shapes of these functions in different scenes.
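A minimal sketch of how such image difference functions are computed, including the local contrast normalisation reported to make them robust: root-mean-square pixel differences between a contrast-normalised reference panorama and panoramas taken at other positions. The window size and the toy "panoramas" in the usage example are assumptions for illustration, not the paper's data or code.

import numpy as np
from scipy.ndimage import uniform_filter

def contrast_normalise(img, size=15, eps=1e-6):
    """Subtract the local mean and divide by the local standard deviation --
    one of the simple pre-processing steps reported to make IDFs robust to
    illumination changes and shadows."""
    mu = uniform_filter(img, size)
    sd = np.sqrt(np.maximum(uniform_filter(img**2, size) - mu**2, 0.0))
    return (img - mu) / (sd + eps)

def image_difference_function(reference, snapshots):
    """Root-mean-square pixel difference between a reference (home) panorama and
    panoramas recorded at other positions; ordered by distance from the reference
    location, the returned values sketch one transect of an IDF."""
    ref = contrast_normalise(reference)
    return [np.sqrt(np.mean((contrast_normalise(s) - ref) ** 2)) for s in snapshots]

rng = np.random.default_rng(0)
home = rng.random((60, 360))
away = [home + d * rng.random((60, 360)) for d in (0.1, 0.2, 0.4, 0.8)]
print(np.round(image_difference_function(home, away), 3))   # grows with "distance"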

19.
Sparse coding is a popular approach to model natural images but has faced two main challenges: modelling low-level image components (such as edge-like structures and their occlusions) and modelling varying pixel intensities. Traditionally, images are modelled as a sparse linear superposition of dictionary elements, where the probabilistic view of this problem is that the coefficients follow a Laplace or Cauchy prior distribution. We propose a novel model that instead uses a spike-and-slab prior and nonlinear combination of components. With the prior, our model can easily represent exact zeros for, for example, the absence of an image component, such as an edge, and a distribution over non-zero pixel intensities. With the nonlinearity (the nonlinear max combination rule), the idea is to target occlusions; dictionary elements correspond to image components that can occlude each other. There are major consequences of the model assumptions made by both (non)linear approaches, thus the main goal of this paper is to isolate and highlight differences between them. Parameter optimization is analytically and computationally intractable in our model, thus as a main contribution we design an exact Gibbs sampler for efficient inference which we can apply to higher dimensional data using latent variable preselection. Results on natural and artificial occlusion-rich data with controlled forms of sparse structure show that our model can extract a sparse set of edge-like components that closely match the generating process, which we refer to as interpretable components. Furthermore, the sparseness of the solution closely follows the ground-truth number of components/edges in the images. The linear model did not learn such edge-like components with any level of sparsity. This suggests that our model can adaptively approximate and characterize the meaningful generating process.
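The generative side of the model class described above can be sketched as follows: Bernoulli "spikes" switch dictionary elements on, continuous "slabs" give the active elements their intensities, and active components are combined with a pointwise max rather than a sum, so one component can occlude another. The parameterisation (half-normal slabs, Gaussian pixel noise) is an assumption for illustration; the paper's exact model and its Gibbs sampler are not reproduced.

import numpy as np

def sample_spike_and_slab_max(W, pi=0.2, slab_sd=1.0, noise_sd=0.05, rng=None):
    """W: (H, D) dictionary of H components over D pixels.  Each component is
    switched on by a Bernoulli spike with probability pi, gets a non-negative
    slab intensity, and active components combine via a pointwise max (occlusion)."""
    rng = np.random.default_rng(rng)
    H, D = W.shape
    s = rng.random(H) < pi                           # spikes: which components are present
    z = np.abs(rng.normal(scale=slab_sd, size=H))    # slabs: their intensities
    contributions = (s * z)[:, None] * W             # (H, D); inactive rows are exactly zero
    clean = contributions.max(axis=0)                # nonlinear max combination rule
    return clean + rng.normal(scale=noise_sd, size=D), s, z

# toy dictionary of two overlapping "edge" templates on a 1-D image of 8 pixels
W = np.array([[1, 1, 1, 1, 0, 0, 0, 0],
              [0, 0, 1, 1, 1, 1, 0, 0]], dtype=float)
x, spikes, slabs = sample_spike_and_slab_max(W, pi=0.7, rng=3)
print(spikes, np.round(x, 2))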

20.
Electric images of two low resistance objects in weakly electric fish
Electroreceptive fish detect nearby objects by processing the information contained in the pattern of electric currents through their skin. In weakly electric fish, these currents arise from a self-generated field (the electric organ discharge), depending on the electrical properties of the surrounding medium. The electric image can be defined as the pattern of transepidermal voltage distributed over the receptive surface. To understand electrolocation it is necessary to know how the electric images of objects are generated. In pulse mormyrids, the electric organ is localized at the tail, far from the receptors, and fires a short biphasic pulse. Consequently, if all the elements in the environment are resistive, the stimulus at every point on the skin has the same waveform. Then, any measure of the amplitude (for example, the peak-to-peak amplitude) could be the unique parameter of the stimulus at any point of the skin. We have developed a model to calculate the image, corroborating that images are spread over the whole sensory surface and have an opposing center-surround, "Mexican hat" shape. As a consequence, the images of different objects superimpose. We show theoretically and by simulation that the image of a pair of objects is not the simple addition of the individual images of these objects.
