Similar Literature
20 similar articles found (search time: 15 ms)
1.
The present study investigated the influence of an auditory tone on the localization of visual objects in the stream/bounce display (SBD). In this display, two identical visual objects move toward each other, overlap, and then return to their original positions. These objects can be perceived as either streaming through or bouncing off each other. In this study, the closest distance between object centers on opposing trajectories and tone presentation timing (none, 0 ms, ±90 ms, and ±390 ms relative to the instant of the closest distance) were manipulated. Observers were asked to judge whether the two objects overlapped with each other and whether the objects appeared to stream through, bounce off each other, or reverse their direction of motion. A tone presented at or around the instant of the objects’ closest distance biased judgments toward “non-overlapping,” and observers overestimated the physical distance between objects. A similar bias toward direction change judgments (bounce and reverse, not stream judgments) was also observed, which was always stronger than the non-overlapping bias. Thus, these two types of judgments were not always identical. Moreover, another experiment showed that it was unlikely that this observed mislocalization could be explained by other previously known mislocalization phenomena (i.e., representational momentum, the Fröhlich effect, and a turn-point shift). These findings indicate a new example of crossmodal mislocalization, which can be obtained without temporal offsets between audiovisual stimuli. The mislocalization effect is also specific to a more complex stimulus configuration of objects on opposing trajectories, with a tone that is presented simultaneously. The present study promotes an understanding of relatively complex audiovisual interactions beyond the simple one-to-one audiovisual stimuli used in previous studies.

2.
Audiovisual integration of letters in the human brain (total citations: 5; self-citations: 0; citations by others: 5)
Raij T, Uutela K, Hari R. Neuron, 2000, 28(2): 617-625
Letters of the alphabet have auditory (phonemic) and visual (graphemic) qualities. To investigate the neural representations of such audiovisual objects, we recorded neuromagnetic cortical responses to auditorily, visually, and audiovisually presented single letters. The auditory and visual brain activations first converged around 225 ms after stimulus onset and then interacted predominantly in the right temporo-occipito-parietal junction (280-345 ms) and the left (380-540 ms) and right (450-535 ms) superior temporal sulci. These multisensory brain areas, playing a role in audiovisual integration of phonemes and graphemes, participate in the neural network supporting the supramodal concept of a "letter." The dynamics of these functions bring new insight into the interplay between sensory and association cortices during object recognition.

3.
Auditory streaming and visual plaids have been used extensively to study perceptual organization in each modality. Both stimuli can produce bistable alternations between grouped (one object) and split (two objects) interpretations. They also share two peculiar features: (i) at the onset of stimulus presentation, organization starts with a systematic bias towards the grouped interpretation; (ii) this first percept has 'inertia'; it lasts longer than the subsequent ones. As a result, the probability of forming different objects builds up over time, a hallmark of both behavioural and neurophysiological data on auditory streaming. Here we show that the first percept bias and inertia are independent. In plaid perception, inertia is due to a depth-ordering ambiguity in the transparent (split) interpretation that makes plaid perception tristable rather than bistable: experimental manipulations removing the depth ambiguity suppressed inertia. However, the first percept bias persisted. We attempted a similar manipulation for auditory streaming by introducing level differences between streams, to bias which stream would appear in the perceptual foreground. Here both inertia and the first percept bias persisted. We thus argue that the critical common feature of the onset of perceptual organization is the grouping bias, which may be related to the transition from temporally/spatially local to temporally/spatially global computation.

4.

Background

The timing at which sensory input reaches the level of conscious perception is an intriguing question still awaiting an answer. It is often assumed that both visual and auditory percepts have a modality-specific processing delay and that their difference determines the perceptual temporal offset.

Methodology/Principal Findings

Here, we show that the perception of audiovisual simultaneity changes flexibly and fluctuates over a short period of time while subjects observe a constant stimulus. We investigated the mechanisms underlying the spontaneous alternations in this audiovisual illusion and found that attention plays a crucial role. When attention was distracted from the stimulus, the perceptual transitions disappeared. When attention was directed to a visual event, the perceived timing of an auditory event was attracted towards that event.

Conclusions/Significance

This multistable display illustrates how flexible perceived timing can be, and at the same time offers a paradigm to dissociate perceptual from stimulus-driven factors in crossmodal feature binding. Our findings suggest that the perception of crossmodal synchrony depends on perceptual binding of audiovisual stimuli as a common event.

5.
Hasson U, Skipper JI, Nusbaum HC, Small SL. Neuron, 2007, 56(6): 1116-1126
Is there a neural representation of speech that transcends its sensory properties? Using fMRI, we investigated whether there are brain areas where neural activity during observation of sublexical audiovisual input corresponds to a listener's speech percept (what is "heard") independent of the sensory properties of the input. A target audiovisual stimulus was preceded by stimuli that (1) shared the target's auditory features (auditory overlap), (2) shared the target's visual features (visual overlap), or (3) shared neither the target's auditory nor visual features but were perceived as the target (perceptual overlap). In two left-hemisphere regions (pars opercularis, planum polare), the target evoked less activity when it was preceded by the perceptually overlapping stimulus than when preceded by stimuli that shared one of its sensory components. This pattern of neural facilitation indicates that these regions code sublexical speech at an abstract level corresponding to that of the speech percept.

6.
This article aims to investigate whether auditory stimuli in the horizontal plane, particularly those originating from behind the participant, affect audiovisual integration by using behavioral and event-related potential (ERP) measurements. In this study, visual stimuli were presented directly in front of the participants, auditory stimuli were presented at one location in an equidistant horizontal plane at the front (0°, the fixation point), right (90°), back (180°), or left (270°) of the participants, and audiovisual stimuli that include both visual stimuli and auditory stimuli originating from one of the four locations were simultaneously presented. These stimuli were presented randomly with equal probability; during this time, participants were asked to attend to the visual stimulus and respond promptly only to visual target stimuli (a unimodal visual target stimulus and the visual target of the audiovisual stimulus). A significant facilitation of reaction times and hit rates was obtained following audiovisual stimulation, irrespective of whether the auditory stimuli were presented in front of or behind the participant. However, no significant interactions were found between visual stimuli and auditory stimuli from the right or left. Two main ERP components related to audiovisual integration were found: first, auditory stimuli from the front location produced an ERP reaction over the right temporal area and right occipital area at approximately 160–200 milliseconds; second, auditory stimuli from the back produced a reaction over the parietal and occipital areas at approximately 360–400 milliseconds. Our results confirmed that audiovisual integration was elicited even when auditory stimuli were presented behind the participant, but no integration occurred when auditory stimuli were presented in the right or left spaces, suggesting that the human brain may be more sensitive to information received from behind than from either side.

7.
Beauchamp MS, Lee KE, Argall BD, Martin A. Neuron, 2004, 41(5): 809-823
Two categories of objects in the environment, animals and man-made manipulable objects (tools), are easily recognized by either their auditory or visual features. Although these features differ across modalities, the brain integrates them into a coherent percept. In three separate fMRI experiments, the posterior superior temporal sulcus and middle temporal gyrus (pSTS/MTG) fulfilled objective criteria for an integration site. pSTS/MTG showed signal increases in response to either auditory or visual stimuli and responded more to auditory or visual objects than to meaningless (but complex) control stimuli. pSTS/MTG showed an enhanced response when auditory and visual object features were presented together, relative to presentation in a single modality. Finally, pSTS/MTG responded more to object identification than to other components of the behavioral task. We suggest that pSTS/MTG is specialized for integrating different types of information both within modalities (e.g., visual form, visual motion) and across modalities (auditory and visual).

8.
When dealing with natural scenes, sensory systems have to process an often messy and ambiguous flow of information. A stable perceptual organization nevertheless has to be achieved in order to guide behavior. The neural mechanisms involved can be highlighted by intrinsically ambiguous situations. In such cases, bistable perception occurs: distinct interpretations of the unchanging stimulus alternate spontaneously in the mind of the observer. Bistable stimuli have been used extensively for more than two centuries to study visual perception. Here we demonstrate that bistable perception also occurs in the auditory modality. We compared the temporal dynamics of percept alternations observed during auditory streaming with those observed for visual plaids and the susceptibilities of both modalities to volitional control. Strong similarities indicate that auditory and visual alternations share common principles of perceptual bistability. The absence of correlation across modalities for subject-specific biases, however, suggests that these common principles are implemented at least partly independently across sensory modalities. We propose that visual and auditory perceptual organization could rely on distributed but functionally similar neural competition mechanisms aimed at resolving sensory ambiguities.

9.
Research on the neural basis of speech-reading implicates a network of auditory language regions involving inferior frontal cortex, premotor cortex and sites along superior temporal cortex. In audiovisual speech studies, neural activity is consistently reported in the posterior superior temporal sulcus (pSTS), and this site has been implicated in multimodal integration. Traditionally, multisensory interactions are considered high-level processing that engages heteromodal association cortices (such as STS). Recent work, however, challenges this notion and suggests that multisensory interactions may occur in low-level unimodal sensory cortices. While previous audiovisual speech studies demonstrate that high-level multisensory interactions occur in pSTS, what remains unclear is how early in the processing hierarchy these multisensory interactions may occur. The goal of the present fMRI experiment is to investigate how visual speech can influence activity in auditory cortex above and beyond its response to auditory speech. In an audiovisual speech experiment, subjects were presented with auditory speech with and without congruent visual input. Holding the auditory stimulus constant across the experiment, we investigated how the addition of visual speech influences activity in auditory cortex. We demonstrate that congruent visual speech increases the activity in auditory cortex.

10.
In 1958 MacKay showed that a rigidly moving object becomes visually fragmented when part of it is continuously visible but the rest is illuminated intermittently. For example, the glowing tip of a lit cigarette moving under stroboscopic illumination appeared to move ahead of the intermittently lit body. Later rediscovered as "the flash-lag effect" (FLE), this illusion is now typically demonstrated on a computer monitor showing two spots of light, one translating across the screen and another briefly flashed in vertical alignment with it. Despite being physically aligned, the brief flash is seen to lag behind the moving spot. This effect has recently motivated much fruitful research, prompting a variety of potential explanations, including those based on motion extrapolation, differential latency, attention, postdiction, and temporal integration (for review, see ). With no consensus on which theory is most plausible, we have broadened the scope of enquiry to include audition and have found that the FLE is not confined to vision. Whether the auditory motion stimulus is a frequency sweep or a translating sound source, briefly presented auditory stimuli lag behind auditory movement. In addition, when we used spatial motion, we found that the FLE can occur cross-modally. Together, these findings challenge several FLE theories and point to a discrepancy between internal brain timing and external stimulus timing.
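Of the candidate accounts listed above, the differential-latency one makes an especially simple quantitative prediction, which a toy back-of-the-envelope sketch can illustrate (all numbers here are illustrative assumptions, not values from the studies reviewed):

```python
# Predicted flash-lag under a differential-latency account (toy numbers):
# if the moving stimulus is processed `latency_diff_s` seconds faster than
# the flash, it appears displaced by speed * latency difference at the
# moment the flash is perceived.
speed_deg_per_s = 10.0   # speed of the moving spot (assumed)
latency_diff_s = 0.05    # extra processing latency of the flash (assumed)

lag_deg = speed_deg_per_s * latency_diff_s
print(lag_deg)           # apparent misalignment, in degrees of visual angle
```

The same arithmetic applies unchanged to the auditory versions of the effect described below (a frequency sweep or a translating sound source), which is part of what makes the cross-modal findings informative.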

11.
A combination of signals across modalities can facilitate sensory perception. The audiovisual facilitative effect strongly depends on the features of the stimulus. Here, we investigated how sound frequency, one of the basic features of an auditory signal, modulates audiovisual integration. In this study, the task of the participant was to respond to a visual target stimulus by pressing a key while ignoring auditory stimuli comprising tones of different frequencies (0.5, 1, 2.5 and 5 kHz). A significant facilitation of reaction times was obtained following audiovisual stimulation, irrespective of whether the task-irrelevant sounds were low or high frequency. Using event-related potentials (ERPs), audiovisual integration was found over the occipital area for 0.5 kHz auditory stimuli from 190–210 ms, for 1 kHz stimuli from 170–200 ms, for 2.5 kHz stimuli from 140–200 ms, and for 5 kHz stimuli from 100–200 ms. These findings suggest that a higher frequency sound signal paired with visual stimuli might be processed or integrated earlier, despite the auditory stimuli being task-irrelevant information. Furthermore, audiovisual integration in late-latency (300–340 ms) ERPs with fronto-central topography was found for auditory stimuli of lower frequencies (0.5, 1 and 2.5 kHz). Our results confirmed that audiovisual integration is affected by the frequency of an auditory stimulus. Taken together, the neurophysiological results provide unique insight into how the brain processes a multisensory visual signal and auditory stimuli of different frequencies.

12.
BACKGROUND: In anorthoscopic viewing conditions, observers can perceive a moving object through a narrow slit even when only portions of its contour are visible at any time. We used fMRI to examine the contribution of early and later visual cortical areas to dynamic shape integration. Observers' success at integrating the shape of the slit-viewed object was manipulated by varying the degree to which the stimulus was dynamically distorted. Line drawings of common objects were either moderately distorted, strongly distorted, or shown undistorted. Phenomenologically, increasing the stimulus distortion made both object shape and motion more difficult to perceive.

RESULTS: We found that bilateral cortical activity in portions of the ventral occipital cortex, corresponding to known object areas within the lateral occipital complex (LOC), was inversely correlated with the degree of stimulus distortion. We found that activity in left MT+, the human cortical area specialized for motion, showed a similar pattern as the ventral occipital region. The LOC also showed greater activity to a fully visible moving object than to the undistorted slit-viewed object. Area MT+, however, showed more equivalent activity to both the slit-viewed and fully visible moving objects.

CONCLUSIONS: In early retinotopic cortex, the distorted and undistorted stimuli elicited the same amount of activity. Higher visual areas, however, were correlated with the percept of the coherent object, and this correlation suggests that the shape integration is mediated by later visual cortical areas. Motion information from the dorsal stream may project to the LOC to produce the shape percept.

13.
Given that both auditory and visual systems have anatomically separate object identification ("what") and spatial ("where") pathways, it is of interest whether attention-driven cross-sensory modulations occur separately within these feature domains. Here, we investigated how auditory "what" vs. "where" attention tasks modulate activity in visual pathways using cortically constrained source estimates of magnetoencephalographic (MEG) oscillatory activity. In the absence of visual stimuli or tasks, subjects were presented with a sequence of auditory-stimulus pairs and instructed to selectively attend to phonetic ("what") vs. spatial ("where") aspects of these sounds, or to listen passively. To investigate sustained modulatory effects, oscillatory power was estimated from time periods between sound-pair presentations. In comparison to attention to sound locations, phonetic auditory attention was associated with stronger alpha (7-13 Hz) power in several visual areas (primary visual cortex; lingual, fusiform, and inferior temporal gyri, lateral occipital cortex), as well as in higher-order visual/multisensory areas including lateral/medial parietal and retrosplenial cortices. Region-of-interest (ROI) analyses of dynamic changes, from which the sustained effects had been removed, suggested further power increases during Attend Phoneme vs. Location centered in the alpha range 400-600 ms after the onset of the second sound of each stimulus pair. These results suggest distinct modulations of visual system oscillatory activity during auditory attention to sound object identity ("what") vs. sound location ("where"). The alpha modulations could be interpreted as reflecting enhanced crossmodal inhibition of feature-specific visual pathways and adjacent audiovisual association areas during "what" vs. "where" auditory attention.
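The band-power measure at the core of this design can be illustrated with a minimal sketch on synthetic data (our own illustration, not the authors' MEG source-estimation pipeline): estimate power in the alpha band (7-13 Hz) of a signal segment via the FFT and compare it with a control band.

```python
import numpy as np

def band_power(signal, fs, f_lo, f_hi):
    """Mean power spectral density of `signal` within [f_lo, f_hi] Hz."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return psd[band].mean()

# Synthetic "inter-stimulus" segment: a 10 Hz alpha rhythm buried in noise.
fs, dur = 250.0, 2.0                      # sampling rate (Hz), duration (s)
t = np.arange(0, dur, 1.0 / fs)
rng = np.random.default_rng(0)
seg = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(t.size)

alpha = band_power(seg, fs, 7, 13)        # alpha band, as in the study
beta = band_power(seg, fs, 18, 24)        # control band for comparison
print(alpha > beta)                       # alpha dominates this segment
```

In the actual study this computation is applied per cortical source location, and the contrast is between attention conditions rather than frequency bands; the sketch shows only the power estimate itself.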

14.

Background

Visual perception is usually stable and accurate. However, when the two eyes are simultaneously presented with conflicting stimuli, perception falls into a sequence of spontaneous alternations, switching between one stimulus and the other every few seconds. Known as binocular rivalry, this visual illusion decouples subjective experience from physical stimulation and provides a unique opportunity to study the neural correlates of consciousness. The temporal properties of this alternating perception have been intensively investigated for decades, yet the relationship between two fundamental properties - the sequence of percepts and the duration of each percept - remains largely unexplored.

Methodology/Principal Findings

Here we examine the relationship between the percept sequence and the percept duration by quantifying their sensitivity to the strength imbalance between two monocular stimuli. We found that the percept sequence is far more susceptible to the stimulus imbalance than the percept duration is. The percept sequence always begins with the stronger stimulus, even when the stimulus imbalance is too weak to cause a significant bias in the percept duration. Therefore, introducing a small stimulus imbalance affects the percept sequence, whereas increasing the imbalance affects the percept duration, but not vice versa. To investigate why the percept sequence is so vulnerable to the stimulus imbalance, we further measured the interval between the stimulus onset and the first percept, during which subjects experienced the fusion of the two monocular stimuli. We found that this interval is dramatically shortened with increased stimulus imbalance.

Conclusions/Significance

Our study shows that in binocular rivalry, the strength imbalance between monocular stimuli has a much greater impact on the percept sequence than on the percept duration, and increasing this imbalance can accelerate the process responsible for the percept sequence.

15.
Visible persistence refers to the continuation of visual perception after the physical termination of a stimulus. We studied an extreme case of visible persistence by presenting two matrices of randomly distributed black and white pixels in succession. On the transition from one matrix to the second, the luminance polarity of all pixels within a disk- or annulus-shaped area reversed, physically creating a single second-order transient signal. This transient signal produces the percept of a disk or an annulus with an abrupt onset and a gradual offset. To study the nature of this fading percept we varied spatial parameters, such as the inner and the outer diameter of annuli (Experiment I) and the radius and eccentricity of disks (Experiment III), and measured the duration of visible persistence by having subjects adjust the synchrony of the onset of a reference stimulus with the onset or the offset of the fading percept. We validated this method by comparing two modalities of the reference stimuli (Experiment I) and by comparing the judgments of fading percepts with the judgments of stimuli that actually fade in luminance contrast (Experiment II). The results show that (i) irrespective of the reference modality, participants are able to precisely judge the on- and the offsets of the fading percepts, (ii) auditory reference stimuli lead to higher visible persistence durations than visual ones, (iii) visible persistence duration increases with the thickness of annuli and the diameter of disks, but decreases with the diameter of annuli, irrespective of stimulus eccentricity. These effects cannot be explained by stimulus energy, which suggests that more complex processing mechanisms are involved. Seemingly contradictory effects of disk and annulus diameter can be unified by assuming an abstract filling-in mechanism that speeds up with the strength of the edge signal and takes more time the larger the stimulus area is.

16.
Sequences of higher frequency A and lower frequency B tones repeating in an ABA- triplet pattern are widely used to study auditory streaming. One may experience either an integrated percept, a single ABA-ABA- stream, or a segregated percept, separate but simultaneous streams A-A-A-A- and -B---B--. During minutes-long presentations, subjects may report irregular alternations between these interpretations. We combine neuromechanistic modeling and psychoacoustic experiments to study these persistent alternations and to characterize the effects of manipulating stimulus parameters. Unlike many phenomenological models with abstract, percept-specific competition and fixed inputs, our network model comprises neuronal units with sensory-feature-dependent inputs that mimic the pulsatile-like A1 responses to tones in the ABA- triplets. It embodies a neuronal computation for percept competition thought to occur beyond primary auditory cortex (A1). Mutual inhibition, adaptation and noise are implemented. We include slow NMDA recurrent excitation for local temporal memory that enables linkage across sound gaps from one triplet to the next. Percepts in our model are identified in the firing patterns of the neuronal units. We predict with the model that manipulations of the frequency difference between tones A and B should affect the dominance durations of the stronger percept, the one dominant for a larger fraction of the time, more than those of the weaker percept, a property that has been previously established and generalized across several visual bistable paradigms. We confirm the qualitative prediction with our psychoacoustic experiments and use the behavioral data to further constrain and improve the model, achieving quantitative agreement between experimental and modeling results. Our work and model provide a platform that can be extended to consider other stimulus conditions, including the effects of context and volition.
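The competition ingredients named above (mutual inhibition, adaptation, noise) can be sketched with a generic two-unit firing-rate model. This is a minimal illustration with hand-picked parameters, not the study's fitted neuromechanistic model (which additionally includes the pulsatile A1-like inputs and NMDA-mediated memory):

```python
import numpy as np

def simulate_rivalry(T=60.0, dt=0.002, seed=0):
    """Two mutually inhibiting units with slow adaptation and noise.

    Returns an array giving which unit (0 or 1) is dominant at each time
    step. All parameter values are illustrative choices, not fitted.
    """
    rng = np.random.default_rng(seed)
    # Sigmoidal firing-rate nonlinearity with threshold 0.2 and gain 1/0.1.
    f = lambda x: 1.0 / (1.0 + np.exp(-(x - 0.2) / 0.1))
    r = np.array([0.6, 0.4])    # firing rates of the two percept units
    a = np.zeros(2)             # slow adaptation variables
    I, beta, g = 1.0, 1.5, 2.0  # input, mutual inhibition, adaptation gain
    tau_r, tau_a, sigma = 0.02, 2.0, 0.1
    dominant = np.empty(int(T / dt), dtype=int)
    for i in range(dominant.size):
        noise = sigma * rng.standard_normal(2)
        drive = f(I - beta * r[::-1] - g * a + noise)  # cross-inhibition
        r += dt / tau_r * (-r + drive)
        a += dt / tau_a * (-a + r)   # adaptation tracks the firing rate
        dominant[i] = int(r[1] > r[0])
    return dominant

dom = simulate_rivalry()
switches = int(np.sum(np.diff(dom) != 0))
print(switches)  # spontaneous percept alternations over the 60 s run
```

Because adaptation slowly erodes the dominant unit's drive while the suppressed unit recovers, dominance alternates every few seconds, which is the basic dynamic the paper's model builds on.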

17.

Background

Audition provides important cues with regard to stimulus motion although vision may provide the most salient information. It has been reported that a sound of fixed intensity tends to be judged as decreasing in intensity after adaptation to looming visual stimuli or as increasing in intensity after adaptation to receding visual stimuli. This audiovisual interaction in motion aftereffects indicates that there are multimodal contributions to motion perception at early levels of sensory processing. However, there has been no report that sounds can induce the perception of visual motion.

Methodology/Principal Findings

A visual stimulus blinking at a fixed location was perceived to be moving laterally when the flash onset was synchronized to an alternating left-right sound source. This illusory visual motion was strengthened with increasing retinal eccentricity (2.5 deg to 20 deg) and occurred more frequently when the onsets of the audio and visual stimuli were synchronized.

Conclusions/Significance

We clearly demonstrated that the alternation of sound location induces illusory visual motion when vision cannot provide accurate spatial information. The present findings strongly suggest that the neural representations of auditory and visual motion processing can bias each other, which yields the best estimates of external events in a complementary manner.

18.
When human subjects hear a sequence of two alternating pure tones, they often perceive it in one of two ways: as one integrated sequence (a single "stream" consisting of the two tones), or as two segregated sequences, one sequence of low tones perceived separately from another sequence of high tones (two "streams"). Perception of this stimulus is thus bistable. Moreover, subjects report ongoing switching between the two percepts: unless the frequency separation is large, initial perception tends to be of integration, followed by toggling between integration and segregation phases. The process of stream formation is loosely named "auditory streaming". Auditory streaming is believed to be a manifestation of the human ability to analyze an auditory scene, i.e. to attribute portions of the incoming sound sequence to distinct sound-generating entities. Previous studies suggested that the durations of the successive integration and segregation phases are statistically independent. This independence plays an important role in current models of bistability. Contrary to this, we show here, by analyzing a large set of data, that subsequent phase durations are positively correlated. To account together for bistability and the positive correlation between subsequent durations, we suggest that streaming is a consequence of an evidence accumulation process. Evidence for segregation is accumulated during the integration phase and vice versa; a switch to the opposite percept occurs stochastically based on this evidence. During a long phase, a large amount of evidence for the opposite percept is accumulated, resulting in a long subsequent phase. In contrast, a short phase is followed by another short phase. We implement these concepts using a probabilistic model that shows both bistability and correlations similar to those observed experimentally.
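The proposed mechanism can be caricatured in a few lines: noisy evidence for the opposite percept accumulates during each phase, and the evidence carried over at a switch sets the inertia of the next phase, yielding positively correlated successive durations. The parameter values and the linear threshold-update rule below are our illustrative assumptions, not the paper's probabilistic model:

```python
import numpy as np

def simulate_streaming(n_phases=500, drift=1.0, sigma=0.5, dt=0.01, seed=1):
    """Alternating percept phases from a toy evidence-accumulation process.

    During each phase, noisy evidence for the opposite percept builds up;
    the phase ends when the evidence crosses a threshold. The evidence
    accumulated during a phase sets the next phase's threshold, so a long
    phase tends to be followed by another long phase.
    """
    rng = np.random.default_rng(seed)
    durations = np.empty(n_phases)
    threshold = 1.0
    for n in range(n_phases):
        e, t = 0.0, 0.0
        while e < threshold:
            e += drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()
            e = max(e, 0.0)          # evidence cannot go negative
            t += dt
        durations[n] = t
        # Evidence carried over from this phase sets the inertia of the
        # next one (illustrative linear mapping, an assumption of ours).
        threshold = 0.5 + 0.5 * drift * t
    return durations

d = simulate_streaming()
serial_corr = np.corrcoef(d[:-1], d[1:])[0, 1]
print(round(serial_corr, 2))  # positive: long phases follow long phases
```

The positive serial correlation produced by this sketch is exactly the statistical signature the paper reports, in contrast to the independence assumed by earlier bistability models.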

19.

Background

Synesthesia is a condition in which the stimulation of one sense elicits an additional experience, often in a different (i.e., unstimulated) sense. Although only a small proportion of the population is synesthetic, there is growing evidence to suggest that neurocognitively normal individuals also experience some form of synesthetic association between the stimuli presented to different sensory modalities (e.g., between auditory pitch and visual size, where lower frequency tones are associated with large objects and higher frequency tones with small objects). While previous research has highlighted crossmodal interactions between synesthetically corresponding dimensions, the possible role of synesthetic associations in multisensory integration has not been considered previously.

Methodology

Here we investigate the effects of synesthetic associations by presenting pairs of asynchronous or spatially discrepant visual and auditory stimuli that were either synesthetically matched or mismatched. In a series of three psychophysical experiments, participants reported the relative temporal order of presentation or the relative spatial locations of the two stimuli.

Principal Findings

The reliability of non-synesthetic participants' estimates of both audiovisual temporal asynchrony and spatial discrepancy was lower for pairs of synesthetically matched as compared to synesthetically mismatched audiovisual stimuli.

Conclusions

Recent studies of multisensory integration have shown that the reduced reliability of perceptual estimates regarding intersensory conflicts constitutes the marker of a stronger coupling between the unisensory signals. Our results therefore indicate a stronger coupling of synesthetically matched vs. mismatched stimuli and provide the first psychophysical evidence that synesthetic congruency can promote multisensory integration. Synesthetic crossmodal correspondences therefore appear to play a crucial (if unacknowledged) role in the multisensory integration of auditory and visual information.
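The coupling logic invoked here comes from standard reliability-weighted (maximum-likelihood) cue combination; the sketch below shows that standard rule with made-up values, not a computation from the study itself. When two signals are partially fused, each cue is weighted by its inverse variance, which pulls the two estimates together and thereby degrades discrimination of any conflict between them:

```python
def fuse(est_a, var_a, est_v, var_v):
    """Reliability-weighted (maximum-likelihood) fusion of two cues.

    Each cue is weighted by its inverse variance; the fused estimate is
    more reliable (lower variance) than either cue alone.
    """
    w_a = (1.0 / var_a) / (1.0 / var_a + 1.0 / var_v)
    fused = w_a * est_a + (1.0 - w_a) * est_v
    fused_var = 1.0 / (1.0 / var_a + 1.0 / var_v)
    return fused, fused_var

# Conflicting auditory (10 deg, noisy) and visual (0 deg, precise) cues:
fused, fused_var = fuse(10.0, 4.0, 0.0, 1.0)
print(fused, fused_var)  # fused estimate sits nearer the more reliable cue
```

Under this rule, stronger coupling compresses the perceived audiovisual discrepancy toward zero, which is why less reliable conflict judgments (as found for synesthetically matched pairs) are read as a signature of stronger binding.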

20.
Visual inputs can distort auditory perception, and accurate auditory processing requires the ability to detect and ignore visual input that is simultaneous and incongruent with auditory information. However, the neural basis of this auditory selection from audiovisual information is unknown, whereas the integration of audiovisual inputs has been intensively researched. Here, we tested the hypothesis that the inferior frontal gyrus (IFG) and superior temporal sulcus (STS) are involved in top-down and bottom-up processing, respectively, of target auditory information from audiovisual inputs. We recorded high gamma activity (HGA), which is associated with neuronal firing in local brain regions, using electrocorticography while patients with epilepsy judged the syllable spoken by a voice while looking at a voice-congruent or -incongruent lip movement from the speaker. The STS exhibited stronger HGA when the patient was presented with information of large audiovisual incongruence than of small incongruence, especially if the auditory information was correctly identified. On the other hand, the IFG exhibited stronger HGA in trials with small audiovisual incongruence when patients correctly perceived the auditory information than when patients incorrectly perceived the auditory information due to the mismatched visual information. These results indicate that the IFG and STS have dissociable roles in selective auditory processing, and suggest that the neural basis of selective auditory processing changes dynamically in accordance with the degree of incongruity between auditory and visual information.
