Similar Articles
20 similar articles retrieved (search time: 31 ms)
1.
Bayesian modeling of dynamic motion integration
The quality of the representation of an object's motion is limited by noise in the sensory input as well as by an intrinsic ambiguity due to the spatial limitation of the visual motion analyzers (the aperture problem). Perceptual and oculomotor data demonstrate that motion processing of extended objects is initially dominated by local 1D motion cues, related to the object's edges and orthogonal to them, whereas 2D information, related to terminators (or edge-endings), progressively takes over and leads to the final correct representation of global motion. A Bayesian framework accounting for the sensory noise and general expectancies for object velocities has proven successful in explaining several experimental findings concerning early motion processing [Weiss, Y., Adelson, E., 1998. Slow and smooth: a Bayesian theory for the combination of local motion signals in human vision. MIT Technical Report, A.I. Memo 1624]. In particular, these models provide a qualitative account of the initial bias induced by the 1D motion cue. However, a complete functional model encompassing the dynamical evolution of object motion perception, including the integration of different motion cues, is still lacking. Here we outline several experimental observations concerning human smooth pursuit of moving objects, and more particularly the time course of its initiation phase, which reflects the ongoing motion integration process. In addition, we propose a recursive extension of the Bayesian model, motivated and constrained by our oculomotor data, to describe the dynamical integration of 1D and 2D motion information. We compare the model predictions for object motion tracking with human oculomotor recordings.
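As a toy illustration of the recursive Bayesian scheme described above (a minimal sketch with made-up numbers, not the authors' fitted model), the 1D and 2D velocity cues can be fused as Gaussians whose precisions set their weights, with the posterior at each step serving as the next prior; the estimate then starts near the edge-driven 1D cue and drifts toward the terminator-driven 2D cue as the latter becomes more reliable:

```python
def fuse(mu_a, var_a, mu_b, var_b):
    """Precision-weighted combination of two Gaussian estimates."""
    w_a, w_b = 1.0 / var_a, 1.0 / var_b
    var = 1.0 / (w_a + w_b)
    return var * (w_a * mu_a + w_b * mu_b), var

v_1d, v_2d = 5.0, 8.0    # velocities signalled by the 1D (edge) and 2D (terminator) cues
mu, var = 0.0, 100.0     # slow-speed prior: zero mean, broad variance

for t in range(10):
    var_2d = 50.0 / (t + 1)                 # the 2D cue grows more reliable over time
    mu, var = fuse(mu, var, v_1d, 4.0)      # integrate the 1D cue
    mu, var = fuse(mu, var, v_2d, var_2d)   # integrate the 2D cue
    print(f"t={t}: velocity estimate = {mu:.2f}")
```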

2.
Flexible representations of dynamics are used in object manipulation
To manipulate an object skillfully, the brain must learn its dynamics, specifying the mapping between applied force and motion. A fundamental issue in sensorimotor control is whether such dynamics are represented in an extrinsic frame of reference tied to the object or an intrinsic frame of reference linked to the arm. Although previous studies have suggested that objects are represented in arm-centered coordinates [1-6], all of these studies have used objects with unusual and complex dynamics. Thus, it is not known how objects with natural dynamics are represented. Here we show that objects with simple (or familiar) dynamics and those with complex (or unfamiliar) dynamics are represented in object- and arm-centered coordinates, respectively. We also show that objects with simple dynamics are represented with an intermediate coordinate frame when vision of the object is removed. These results indicate that object dynamics can be flexibly represented in different coordinate frames by the brain. We suggest that with experience, the representation of the dynamics of a manipulated object may shift from a coordinate frame tied to the arm toward one that is linked to the object. The additional complexity required to represent dynamics in object-centered coordinates would be economical for familiar objects because such a representation allows object use regardless of the orientation of the object in hand.

3.
This article deals with the role of the fish's body and the object's geometry in determining the spatial shape of electric images in pulse Gymnotiforms. The problem was explored by measuring local electric fields along a line on the skin in the presence and absence of objects. We depicted the objects' electric images at different regions of the electrosensory mosaic, paying particular attention to the perioral region, where a fovea has been described. When the curvature of the sensory surface increases relative to the object's curvature, the image details that depend on the object's shape are blurred and finally disappear. The remaining effect of the object on the stimulus profile depends on the strength of its global polarization, which depends on the length of the object's axis aligned with the field, in turn determined by the geometry of the fish's body. Thus, the geometries of the fish's body and its self-generated electric field are embodied in this "global effect" of the object. The presence of edges or local changes in impedance at the nearest surface of closely located objects adds peaks to the image profiles (the "local effect", or the object's "electric texture"). It is concluded that active electroreceptive animals may use two cues for object recognition: global effects (informing about the object's dimension along the field lines, conductance, and position) and local effects (informing about the object's surface). Since the field is in fish-centered coordinates, and the electrosensory fovea is used for the exploration of surfaces, fine movements of the fish are essential for electric perception. We conclude that fish may explore adjacent objects by combining active movements and electrogenesis to represent them using electrosensory information.

4.
Studies have shown that internal representations of manipulations of objects with asymmetric mass distributions that are generated within a specific orientation are not generalizable to novel orientations; i.e., subjects fail to prevent object roll on their first grasp-lift attempt following 180° object rotation. This suggests that representations of these manipulations are specific to the reference frame in which they are formed. However, it is unknown whether that reference frame is specific to the hand, the body, or both, because rotating the object 180° modifies the relation between object and body as well as between object and hand. An alternative, untested explanation for the above failure to generalize learned manipulations is that any rotation will disrupt grasp performance, regardless of whether the reference frame in which the manipulation was learned is maintained or modified. We examined the effect of rotations that (1) maintain and (2) modify the relations between object and body, and between object and hand, on the generalizability of learned two-digit manipulation of an object with an asymmetric mass distribution. Following rotations that maintained the relations between object and body and between object and hand (e.g., rotating the object and subject 180°), subjects continued to use appropriate digit placement and load force distributions, thus generating compensatory moments sufficient to minimize object roll. In contrast, following rotations that modified the relation between (1) object and hand (e.g., rotating the hand around to the opposite side of the object), (2) object and body (e.g., rotating subject and hand 180°), or (3) both (e.g., rotating the subject 180°), subjects used the same, yet now inappropriate, digit placement and load force distribution as those used prior to the rotation. Consequently, the compensatory moments were insufficient to prevent large object rolls. These findings suggest that representations of learned manipulation of objects with asymmetric mass distributions are specific to the body- and hand-centered reference frames in which they were learned.
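The compensatory moment at issue can be sketched with elementary statics (hypothetical numbers; the object parameters below are not taken from the study): the roll moment produced by the off-center mass must be cancelled by an asymmetry in digit load forces and placement.

```python
# Torque balance for a grasped object with an asymmetric mass distribution.
# All values are illustrative.
g = 9.81
m = 0.4        # object mass (kg)
d_cm = 0.03    # horizontal offset of the center of mass from the grip axis (m)
m_ext = m * g * d_cm   # external roll moment to be compensated (N*m)

# Digits separated vertically by d_y generate a counter-moment of dF * d_y
# when their tangential (load) forces differ by dF.
d_y = 0.05             # vertical separation between thumb and finger (m)
dF = m_ext / d_y       # required load-force asymmetry (N)
print(f"moment {m_ext:.3f} N*m requires a load-force difference of {dF:.2f} N")
```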

5.
Recognizing depth-rotated objects: a review of recent research and theory
Biederman I. Spatial Vision, 2000, 13(2-3): 241-253.

6.
Bayesian multisensory integration and cross-modal spatial links.
Our perception of the world is the result of combining information from several senses, such as vision, audition and proprioception. These sensory modalities use widely different frames of reference to represent the properties and locations of objects. Moreover, multisensory cues come with different degrees of reliability, and the reliability of a given cue can change in different contexts. The Bayesian framework, which we describe in this review, provides an optimal solution for combining cues that are not equally reliable. However, this approach does not address the issue of frames of reference. We show that this problem can be solved by creating cross-modal spatial links in basis function networks. Finally, we show how the basis function approach can be combined with the Bayesian framework to yield networks that can perform optimal multisensory combination. On the basis of this theory, we argue that multisensory integration is a dialogue between sensory modalities rather than the convergence of all sensory information onto a supra-modal area.
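For independent Gaussian cues, the optimal combination rule referred to in this review has a simple closed form: each cue is weighted by its inverse variance (its reliability). A minimal sketch with illustrative numbers:

```python
# Maximum-likelihood cue combination: weights are inverse variances.
s_vis, var_vis = 10.0, 1.0   # visual location estimate (more reliable)
s_aud, var_aud = 14.0, 4.0   # auditory location estimate (less reliable)

w_vis = (1 / var_vis) / (1 / var_vis + 1 / var_aud)
s_hat = w_vis * s_vis + (1 - w_vis) * s_aud    # 10.8, pulled toward vision
var_hat = 1.0 / (1 / var_vis + 1 / var_aud)    # 0.8, lower than either cue alone
print(s_hat, var_hat)
```

Note that the combined variance is smaller than that of either single cue, which is the signature of optimal integration.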

7.
Four experiments with human subjects examined cue-interaction effects using a computer-controlled predictive learning task. In Phase 1, subjects learned that cue P was consistently associated with the occurrence of an outcome (P+), whereas cue N was never followed by the outcome (N−). In Phase 2, two neutral cues, R and I, were compounded with P and N, respectively. Each compound was followed by the outcome (PR+ and NI+). Thus, cue R was compounded with the already predictive cue P, whereas cue I was compounded with the non-predictive cue N. In each phase, subjects rated the contingency between the different cues and the outcome. In experiments 1 and 2, the spatial position of the cues was fixed, whereas it was variable in experiments 3, 4a and 4b. Verbal cues were used in experiments 1–3, whereas the cues consisted of geometrical figures in experiments 4a and 4b. Evidence for cue interaction, as indicated by cue I receiving a higher contingency rating than cue R after or during Phase 2, was found only under the conditions of experiments 1 and 2. The results indicate that the use of positional cues facilitates the occurrence of cue-interaction effects. Possible reasons for this finding are discussed.

8.

Background

Optic flow is an important cue for object detection. Humans are able to perceive objects in a scene using only kinetic boundaries, and can perform the task even when other shape cues are not provided. These kinetic boundaries are characterized by the presence of motion discontinuities in a local neighbourhood. In addition, temporal occlusions appear along the boundaries as the object in front covers the background and the objects that are spatially behind it.

Methodology/Principal Findings

From a technical point of view, detecting motion boundaries for segmentation based on optic flow is a difficult task, because the flow detected along such boundaries is generally unreliable. We propose a model derived from mechanisms found in visual areas V1, MT, and MSTl of the human and primate cortex that achieves robust detection along motion boundaries. It includes two separate mechanisms for detecting motion discontinuities and occlusion regions, based on how neurons respond to spatial and temporal contrast, respectively. The mechanisms are embedded in a biologically inspired architecture that integrates information from different model components of visual processing via feedback connections. In particular, mutual interactions between the detection of motion discontinuities and temporal occlusions considerably improve kinetic boundary detection.

Conclusions/Significance

A new model is proposed that uses optic flow cues to detect motion discontinuities and object occlusion. We suggest that by combining these results for motion discontinuities and object occlusion, object segmentation within the model can be improved; this idea could also be applied in other models of object segmentation. In addition, we discuss how the model relates to neurophysiological findings. The model was successfully tested with both artificial and real sequences, including self- and object motion.
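As a rough computational analogue of the model's spatial-contrast mechanism (a sketch of the general idea, not the authors' neural implementation), kinetic boundaries can be marked wherever the local spatial gradient of the flow field is large:

```python
import numpy as np

def kinetic_boundaries(u, v, thresh=1.0):
    """Mark motion discontinuities as points of high spatial flow contrast.
    u, v: 2D arrays holding the horizontal/vertical optic-flow components."""
    du_y, du_x = np.gradient(u)
    dv_y, dv_x = np.gradient(v)
    contrast = np.sqrt(du_x**2 + du_y**2 + dv_x**2 + dv_y**2)
    return contrast > thresh

# Toy flow field: a square region moves rightward over a static background.
u = np.zeros((32, 32)); u[8:24, 8:24] = 2.0
v = np.zeros((32, 32))
print(kinetic_boundaries(u, v).sum(), "boundary pixels found")
```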

9.
The central problems of vision are often divided into object identification and localization. Object identification, at least at fine levels of discrimination, may require the application of top-down knowledge to resolve ambiguous image information. Utilizing top-down knowledge, however, may require the initial rapid access of abstract object categories based on low-level image cues. Does object localization require a different set of operating principles than object identification, or is category determination also part of the perception of depth and spatial layout? Three-dimensional graphics movies of objects and their cast shadows are used to argue that identifying perceptual categories is important for determining the relative depths of objects. Processes that can identify the causal class (e.g. the kind of material) that generates the image data can provide information to determine the spatial relationships between surfaces. Changes in the blurriness of an edge may be characteristically associated with shadows caused by relative motion between two surfaces. The early identification of abstract events such as moving object/shadow pairs may also be important for depth from shadows. Knowledge of how correlated motion in the image relates to an object and its shadow may provide a reliable cue to access such event categories.

10.

Background

A key aspect of representations for object recognition and scene analysis in the ventral visual stream is the spatial frame of reference, be it a viewer-centered, object-centered, or scene-based coordinate system. Coordinate transforms from retinocentric space to other reference frames involve combining neural visual responses with extraretinal postural information.

Methodology/Principal Findings

We examined whether such spatial information is available to anterior inferotemporal (AIT) neurons in the macaque monkey by measuring the effect of eye position on responses to a set of simple 2D shapes. We report, for the first time, a significant eye position effect in over 40% of recorded neurons with small gaze angle shifts from central fixation. Although eye position modulates responses, it does not change shape selectivity.

Conclusions/Significance

These data demonstrate that spatial information is available in AIT for the representation of objects and scenes within a non-retinocentric frame of reference. More generally, the availability of spatial information in AIT calls into question the classic dichotomy in visual processing that associates object shape processing with ventral structures such as AIT but places spatial processing in a separate anatomical stream projecting to dorsal structures.
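A standard way to capture "modulation without a change in selectivity" is a multiplicative gain field, in which eye position scales a fixed shape-tuning curve rather than reshaping it; the sketch below is hypothetical and is not fitted to the recorded data:

```python
import numpy as np

shapes = np.arange(8)                      # indices of eight test shapes
tuning = np.exp(-0.5 * (shapes - 3) ** 2)  # fixed shape selectivity

def response(shape_idx, gaze_deg, k=0.05):
    """Eye position multiplies the response but leaves the tuning shape intact."""
    gain = 1.0 + k * gaze_deg
    return gain * tuning[shape_idx]

# The ratio between responses to two shapes is identical at both gaze angles,
# i.e. selectivity is preserved while the overall response is modulated.
print(response(3, -10) / response(5, -10), response(3, 10) / response(5, 10))
```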

11.

Background

Barn owls integrate spatial information across frequency channels to localize sounds in space.

Methodology/Principal Findings

We presented barn owls with synchronous sounds that contained different bands of frequencies (3–5 kHz and 7–9 kHz) from different locations in space. When the owls were confronted with the conflicting localization cues from two synchronous sounds of equal level, their orienting responses were dominated by one of the sounds: they oriented toward the location of the low frequency sound when the sources were separated in azimuth; in contrast, they oriented toward the location of the high frequency sound when the sources were separated in elevation. We identified neural correlates of this behavioral effect in the optic tectum (OT, superior colliculus in mammals), which contains a map of auditory space and is involved in generating orienting movements to sounds. We found that low frequency cues dominate the representation of sound azimuth in the OT space map, whereas high frequency cues dominate the representation of sound elevation.

Conclusions/Significance

We argue that the dominance hierarchy of localization cues reflects several factors: 1) the relative amplitude of the sound providing the cue, 2) the resolution with which the auditory system measures the value of a cue, and 3) the spatial ambiguity in interpreting the cue. These same factors may contribute to the relative weighting of sound localization cues in other species, including humans.

12.
1. Olfactory predator search processes differ fundamentally from those based on vision, particularly when odour cues are deposited rather than airborne or emanating from a point source. When searching for visually cryptic prey that may have moved some distance from a deposited odour cue, cue context and spatial variability are the most likely sources of information about prey location available to an olfactory predator. 2. We tested whether the house mouse (Mus domesticus), a model olfactory predator, would use cue context and spatial variability when searching for buried food items; specifically, we tested the effect of varying cue patchiness, odour strength, and cue-prey association on mouse foraging success. 3. Within mouse- and predator-proof enclosures, we created grids of 100 sand-filled Petri dishes and buried peanut pieces in a set number of these patches to represent visually cryptic 'prey'. By adding peanut oil to selected dishes, we varied the spatial distribution of prey odour relative to the distribution of prey patches in each grid, to reflect different levels of cue patchiness (Experiment 1), odour strength (Experiment 2) and cue-prey association (Experiment 3). We measured the overnight foraging success of individual mice (percentage of searched patches containing prey), as well as their foraging activity (percentage of patches searched), and prey survival (percentage of unsearched prey patches). 4. Mouse foraging success was highest where odour cues were patchy rather than uniform (Experiment 1), and where cues were tightly associated with prey location, rather than randomly or uniformly distributed (Experiment 3). However, when cues at prey patches were ten times stronger than a uniformly distributed weak background odour, mice did not improve their foraging success over that experienced when cues were of uniform strength and distribution (Experiment 2). 5. These results suggest that spatial variability and cue context are important means by which olfactory predators can use deposited odour cues to locate visually cryptic prey. They also indicate that chemical crypsis can disrupt these search processes as effectively as background matching in visually based predator-prey systems.

13.

Background

Can hearing a word change what one sees? Although visual sensitivity is known to be enhanced by attending to the location of a target, perceptual enhancements following cues to the identity of an object have been difficult to find. Here, we show that perceptual sensitivity is enhanced by verbal, but not visual, cues.

Methodology/Principal Findings

Participants completed an object detection task in which they made an object-presence or -absence decision about briefly presented letters. Hearing the letter name prior to the detection task increased perceptual sensitivity (d′); a visual cue in the form of a preview of the to-be-detected letter did not. Follow-up experiments found that the auditory cuing effect was specific to validly cued stimuli. The magnitude of the cuing effect correlated positively with an individual measure of the vividness of mental imagery; introducing uncertainty into the position of the stimulus did not reduce the magnitude of the cuing effect, but it eliminated the correlation with mental imagery.

Conclusions/Significance

Hearing a word made otherwise invisible objects visible. Interestingly, seeing a preview of the target stimulus did not similarly enhance detection of the target. These results are compatible with an account in which auditory verbal labels modulate lower-level visual processing. The findings show that a verbal cue in the form of hearing a word can influence even the most elementary visual processing and inform our understanding of how language affects perception.
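The sensitivity measure d′ used here is conventionally computed as the difference between the z-transformed hit and false-alarm rates; a minimal example with hypothetical rates (the paper's actual values are not reproduced):

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate):
    """d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Illustrative pattern: the verbal cue raises hits without raising false alarms.
print(d_prime(0.80, 0.20))  # cued:   ~1.68
print(d_prime(0.65, 0.20))  # uncued: ~1.23
```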

14.
Detecting agents     
This paper reviews a recent set of behavioural studies that examine the scope and nature of the representational system underlying theory-of-mind development. Studies with typically developing infants, adults and children with autism all converge on the claim that there is a specialized input system that uses not only morphological cues, but also behavioural cues to categorize novel objects as agents. Evidence is reviewed in which 12- to 15-month-old infants treat certain non-human objects as if they have perceptual/attentional abilities, communicative abilities and goal-directed behaviour. They will follow the attentional orientation of an amorphously shaped novel object if it interacts contingently with them or with another person. They also seem to use a novel object's environmentally directed behaviour to determine its perceptual/attentional orientation and object-oriented goals. Results from adults and children with autism are strikingly similar, despite adults' contradictory beliefs about the objects in question and the failure of children with autism to ultimately develop more advanced theory-of-mind reasoning. The implications for a general theory-of-mind development are discussed.

15.
Visually recognizing objects at different orientations and distances has been assumed to depend either on extracting from the retinal image a viewpoint-invariant, typically three-dimensional (3D) structure, such as object parts, or on mentally transforming two-dimensional (2D) views. To test how these processes might interact with each other, an experiment was performed in which observers discriminated images of novel, computer-generated, 3D objects, differing by rotations in 3D space and in the number of parts (in principle, a viewpoint-invariant, 'non-accidental' property) or in the curvature, length or angle of join of their parts (in principle, each a viewpoint-dependent, metric property), such that the discriminatory cue varied along a common physical scale. Although differences in the number of parts were more readily discriminated than differences in metric properties, they showed almost exactly the same orientation dependence. Overall, visual performance proved remarkably lawful: for both long (2 s) and short (100 ms) display durations, it could be summarized by a simple, compact equation with one term representing generalized viewpoint-invariant parts-based processing of 3D object structure, including metric structure, and another term representing structure-invariant processing of 2D views. Object discriminability was determined by summing signals from these two independent processes.

16.
The aim of this study was to assess the contribution of haptic and auditory cues to the rapid discrimination of an object's mass. Ten subjects had to use the right hand to brake the movement of a cup caused by the impact of a falling object that could have one of two different masses. They were asked to perform a quick left-hand movement if the object was of the prescribed mass, according to the proprioceptive and auditory cues they received from the object's contact with the cup, and not to react to the other object. Three conditions were established: with both proprioceptive and auditory cues, with only the proprioceptive cue, or with only the auditory cue. When proprioceptive information was available, subjects responded earlier to the impact of the heavy object than to that of the light object. The addition of an auditory cue did not further advance responses for the heavy object. We conclude that when a motor response has to be chosen according to different combinations of auditory and proprioceptive load-related information, subjects rely mainly on haptic information to respond quickly, and that auditory cues add no relevant information that could improve the speed of a correct response.

17.
How the brain combines information from different sensory modalities and of differing reliability is an important and still-unanswered question. Using the head direction (HD) system as a model, we explored the resolution of conflicts between landmarks and background cues. Sensory cue integration models predict averaging of the two cues, whereas attractor models predict capture of the signal by the dominant cue. We found that a visual landmark mostly captured the HD signal at low conflicts; however, there was an increasing propensity for the cells to integrate the cues thereafter. A large conflict presented to naive rats resulted in greater visual cue capture (less integration) than in experienced rats, revealing an effect of experience. We propose that weighted cue integration in HD cells arises from dynamic plasticity of the feed-forward inputs to the network, causing within-trial spatial redistribution of the visual inputs onto the ring. This suggests that an attractor network can implement decision processes about cue reliability using simple architecture and learning rules, thus providing a potential neural substrate for weighted cue integration.
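The contrast between capture and weighted integration can be pictured with a simple weighted circular mean over the two cue directions (an illustrative sketch only, not the attractor-network model itself): as the weight on one cue approaches 1, the estimate is captured by that cue, while intermediate weights yield integration.

```python
import numpy as np

def integrate_hd(theta_landmark, theta_background, w):
    """Weighted circular mean of two head-direction cues.
    w is the weight on the landmark cue (w -> 1 means visual capture)."""
    x = w * np.cos(theta_landmark) + (1 - w) * np.cos(theta_background)
    y = w * np.sin(theta_landmark) + (1 - w) * np.sin(theta_background)
    return np.arctan2(y, x)

small = np.deg2rad(30)    # small conflict: the landmark nearly captures the signal
large = np.deg2rad(120)   # large conflict: the cues are partially integrated
print(np.rad2deg(integrate_hd(small, 0.0, 0.9)))  # ~27 deg, close to the landmark
print(np.rad2deg(integrate_hd(large, 0.0, 0.6)))  # ~79 deg, an intermediate direction
```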

18.
The Visual Association Test (VAT) is a brief learning task that consists of six line drawings of pairs of interacting objects (association cards). Subjects are asked to name or identify each object and later are presented with one object from the pair (the cue) and asked to name the other (the target). The VAT was administered to a consecutive sample of 174 psychogeriatric day care participants with mild to major neurocognitive disorder. Comparison of test performance with normative data from non-demented subjects revealed that 69% scored within the range of a major deficit (0–8 over two recall trials), 14% a minor deficit, and 17% no deficit (9–10 and ≥10, respectively). VAT scores correlated with another test of memory function, the Cognitive Screening Test (CST), based on the Short Portable Mental Status Questionnaire (r = 0.53). Tests of executive functioning (Expanded Mental Control Test, Category Fluency, Clock Drawing) did not add significantly to the explanation of variance in VAT scores. Fifty-five participants (31.6%) had initial problems in naming or identifying one or more objects on the cue or association cards; when necessary, naming was aided by the investigator. Initial difficulties in identifying cue objects were associated with lower VAT scores, but this did not hold for difficulties in identifying target objects. A hierarchical multiple regression analysis examined whether linear or quadratic trends best fitted VAT performance across the range of CST scores; the model revealed a linear but not a quadratic trend. The best-fitting linear model implied that VAT scores differentiated between CST scores in the lower as well as the upper range, indicating the absence of floor and ceiling effects, respectively. Moreover, the VAT compares favourably to word list-learning tasks, being more attractive in its presentation of interacting visual objects and in its cued recall based on incidental learning of the association between cues and targets. For practical purposes, and based on documented sensitivity and specificity, Bayesian probability tables give the predictive power of age-specific VAT cutoff scores for the presence or absence of a major neurocognitive disorder across a range of a priori probabilities or base rates.
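The Bayesian probability tables mentioned at the end follow from applying Bayes' theorem to a cutoff's sensitivity and specificity across base rates; a small worked sketch with hypothetical figures (the VAT's published sensitivity and specificity are not reproduced here):

```python
def predictive_values(sens, spec, base_rate):
    """Positive and negative predictive value via Bayes' theorem."""
    ppv = sens * base_rate / (sens * base_rate + (1 - spec) * (1 - base_rate))
    npv = spec * (1 - base_rate) / (spec * (1 - base_rate) + (1 - sens) * base_rate)
    return ppv, npv

# Hypothetical cutoff with sensitivity 0.85 and specificity 0.90:
for base in (0.2, 0.5, 0.8):
    ppv, npv = predictive_values(0.85, 0.90, base)
    print(f"base rate {base:.1f}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
```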

19.

Background

How do people sustain a visual representation of the environment? Currently, many researchers argue that a single visual working memory system sustains non-spatial object information such as colors and shapes. However, previous studies tested visual working memory for two-dimensional objects only. In consequence, the nature of visual working memory for three-dimensional (3D) object representation remains unknown.

Methodology/Principal Findings

Here, I show that when sustaining information about 3D objects, visual working memory clearly divides into two separate, specialized memory systems, rather than one system, as was previously thought. One memory system gradually accumulates sensory information, forming an increasingly precise view-dependent representation of the scene over the course of several seconds. A second memory system sustains view-invariant representations of 3D objects. The view-dependent memory system has a storage capacity of 3–4 representations and the view-invariant memory system has a storage capacity of 1–2 representations. These systems can operate independently from one another and do not compete for working memory storage resources.

Conclusions/Significance

These results provide evidence that visual working memory sustains object information in two separate, specialized memory systems. One memory system sustains view-dependent representations of the scene, akin to the view-specific representations that guide place recognition during navigation in humans, rodents and insects. The second memory system sustains view-invariant representations of 3D objects, akin to the object-based representations that underlie object cognition.
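Capacity figures such as "3–4 representations" are often derived from change-detection data with Cowan's K; whether this study used that estimator is an assumption on our part, but the formula itself is standard:

```python
def cowan_k(hit_rate, fa_rate, set_size):
    """Cowan's K: estimated number of items held in working memory."""
    return set_size * (hit_rate - fa_rate)

# Hypothetical change-detection performance at set size 6:
print(cowan_k(0.75, 0.15, 6))  # 3.6 items, within the reported 3-4 range
```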

20.
The human visual system utilizes depth information as a major cue to group together visual items constituting an object and to segregate them from items belonging to other objects in the visual scene. Depth information can be inferred from a variety of different visual cues, such as disparity, occlusions and perspective. Many of these cues provide only local and relative information about the depth of objects. For example, at occlusions, T-junctions indicate the local relative depth precedence of surface patches. However, in order to obtain a globally consistent interpretation of the depth relations between the surfaces and objects in a visual scene, a mechanism is necessary that globally propagates such local and relative information. We present a computational framework in which depth information derived from T-junctions is propagated along surface contours using local recurrent interactions between neighboring neurons. We demonstrate that within this framework a globally consistent depth sorting of overlapping surfaces can be obtained on the basis of local interactions. Unlike previous approaches in which locally restricted cell interactions could merely distinguish between two depths (figure and ground), our model can also represent several intermediate depth positions. Our approach is an extension of a previous model of recurrent V1–V2 interaction for contour processing and illusory contour formation. Based on the contour representation created by this model, a recursive scheme of local interactions subsequently achieves a globally consistent depth sorting of several overlapping surfaces. Within this framework, the induction of illusory contours by the model of recurrent V1–V2 interaction gives rise to the figure-ground segmentation of illusory figures such as a Kanizsa square.
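Outside the neural implementation, the global depth-sorting idea can be caricatured as propagating local occlusion constraints from T-junctions into a consistent global order by iterative relaxation (a hypothetical sketch, not the authors' recurrent V1–V2 circuit):

```python
# Each T-junction yields a local constraint: surface a occludes surface b.
# Repeatedly raising the rank of occluders until all constraints hold yields
# a globally consistent sorting with intermediate depth positions.
def depth_sort(surfaces, in_front_of):
    depth = {s: 0 for s in surfaces}
    for _ in range(len(surfaces)):       # enough passes if constraints are acyclic
        for a, b in in_front_of:         # a occludes b
            depth[a] = max(depth[a], depth[b] + 1)
    return depth

# Three overlapping surfaces: A over B, B over C, so A ends up at depth 2.
print(depth_sort(["A", "B", "C"], [("A", "B"), ("B", "C")]))
# {'A': 2, 'B': 1, 'C': 0}
```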

