首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
To form a percept of the multisensory world, the brain needs to integrate signals from common sources weighted by their reliabilities and segregate those from independent sources. Previously, we have shown that anterior parietal cortices combine sensory signals into representations that take into account the signals’ causal structure (i.e., common versus independent sources) and their sensory reliabilities as predicted by Bayesian causal inference. The current study asks to what extent and how attentional mechanisms can actively control how sensory signals are combined for perceptual inference. In a pre- and postcueing paradigm, we presented observers with audiovisual signals at variable spatial disparities. Observers were precued to attend to auditory or visual modalities prior to stimulus presentation and postcued to report their perceived auditory or visual location. Combining psychophysics, functional magnetic resonance imaging (fMRI), and Bayesian modelling, we demonstrate that the brain moulds multisensory inference via two distinct mechanisms. Prestimulus attention to vision enhances the reliability and influence of visual inputs on spatial representations in visual and posterior parietal cortices. Poststimulus report determines how parietal cortices flexibly combine sensory estimates into spatial representations consistent with Bayesian causal inference. Our results show that distinct neural mechanisms control how signals are combined for perceptual inference at different levels of the cortical hierarchy.

A combination of psychophysics, computational modelling and fMRI reveals novel insights into how the brain controls the binding of information across the senses, such as the voice and lip movements of a speaker.  相似文献   

2.
To form a veridical percept of the environment, the brain needs to integrate sensory signals from a common source but segregate those from independent sources. Thus, perception inherently relies on solving the “causal inference problem.” Behaviorally, humans solve this problem optimally as predicted by Bayesian Causal Inference; yet, the underlying neural mechanisms are unexplored. Combining psychophysics, Bayesian modeling, functional magnetic resonance imaging (fMRI), and multivariate decoding in an audiovisual spatial localization task, we demonstrate that Bayesian Causal Inference is performed by a hierarchy of multisensory processes in the human brain. At the bottom of the hierarchy, in auditory and visual areas, location is represented on the basis that the two signals are generated by independent sources (= segregation). At the next stage, in posterior intraparietal sulcus, location is estimated under the assumption that the two signals are from a common source (= forced fusion). Only at the top of the hierarchy, in anterior intraparietal sulcus, the uncertainty about the causal structure of the world is taken into account and sensory signals are combined as predicted by Bayesian Causal Inference. Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex. It unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition. Our results demonstrate how the brain combines information in the face of uncertainty about the underlying causal structure of the world.  相似文献   

3.
Temporal aspects of the perceptual integration of audiovisual information were investigated by utilizing the visual "streaming-bouncing" phenomenon. When two identical visual objects move towards each other, coincide, and then move away from each other, the objects can either be seen as streaming past one another or bouncing off each other. Although the streaming percept is dominant, the bouncing percept can be induced by presenting an auditory stimulus during the visual coincidence of the moving objects. Here we show that the bounce-inducing effect of the auditory stimulus is paramount when its onset and offset occur in temporal proximity of the onset and offset of the period of visual coincidence of the moving objects. When the duration of the auditory stimulus exceeded this period, visual bouncing disappears. Implications for a temporal window of audiovisual integration and the design of effective audiovisual warning signals are discussed.  相似文献   

4.
Rapid integration of biologically relevant information is crucial for the survival of an organism. Most prominently, humans should be biased to attend and respond to looming stimuli that signal approaching danger (e.g. predator) and hence require rapid action. This psychophysics study used binocular rivalry to investigate the perceptual advantage of looming (relative to receding) visual signals (i.e. looming bias) and how this bias can be influenced by concurrent auditory looming/receding stimuli and the statistical structure of the auditory and visual signals.Subjects were dichoptically presented with looming/receding visual stimuli that were paired with looming or receding sounds. The visual signals conformed to two different statistical structures: (1) a ‘simple’ random-dot kinematogram showing a starfield and (2) a “naturalistic” visual Shepard stimulus. Likewise, the looming/receding sound was (1) a simple amplitude- and frequency-modulated (AM-FM) tone or (2) a complex Shepard tone. Our results show that the perceptual looming bias (i.e. the increase in dominance times for looming versus receding percepts) is amplified by looming sounds, yet reduced and even converted into a receding bias by receding sounds. Moreover, the influence of looming/receding sounds on the visual looming bias depends on the statistical structure of both the visual and auditory signals. It is enhanced when audiovisual signals are Shepard stimuli.In conclusion, visual perception prioritizes processing of biologically significant looming stimuli especially when paired with looming auditory signals. Critically, these audiovisual interactions are amplified for statistically complex signals that are more naturalistic and known to engage neural processing at multiple levels of the cortical hierarchy.  相似文献   

5.
6.
To obtain a coherent perception of the world, our senses need to be in alignment. When we encounter misaligned cues from two sensory modalities, the brain must infer which cue is faulty and recalibrate the corresponding sense. We examined whether and how the brain uses cue reliability to identify the miscalibrated sense by measuring the audiovisual ventriloquism aftereffect for stimuli of varying visual reliability. To adjust for modality-specific biases, visual stimulus locations were chosen based on perceived alignment with auditory stimulus locations for each participant. During an audiovisual recalibration phase, participants were presented with bimodal stimuli with a fixed perceptual spatial discrepancy; they localized one modality, cued after stimulus presentation. Unimodal auditory and visual localization was measured before and after the audiovisual recalibration phase. We compared participants’ behavior to the predictions of three models of recalibration: (a) Reliability-based: each modality is recalibrated based on its relative reliability—less reliable cues are recalibrated more; (b) Fixed-ratio: the degree of recalibration for each modality is fixed; (c) Causal-inference: recalibration is directly determined by the discrepancy between a cue and its estimate, which in turn depends on the reliability of both cues, and inference about how likely the two cues derive from a common source. Vision was hardly recalibrated by audition. Auditory recalibration by vision changed idiosyncratically as visual reliability decreased: the extent of auditory recalibration either decreased monotonically, peaked at medium visual reliability, or increased monotonically. The latter two patterns cannot be explained by either the reliability-based or fixed-ratio models. Only the causal-inference model of recalibration captures the idiosyncratic influences of cue reliability on recalibration. We conclude that cue reliability, causal inference, and modality-specific biases guide cross-modal recalibration indirectly by determining the perception of audiovisual stimuli.  相似文献   

7.
Diverse a nimal species use multimodal communica tion signals to coordina te reproductive behavior.Despite active research in this field,the brain mechanisms underlying multimodal communication remain poorly understood.Similar to humans and many mammalian species,anurans often produce auditory signals accompanied by conspicuous visual cues(e.g.,vocal sac inflation).In this study,we used video playbacks to determine the role of vocal-sac inflation in little torrent frogs(Amolops torrentis).Then we exposed females to blank,visual,auditory,and audiovisual stimuli and analyzed whole brain tissue gene expression changes using RNAseq.The results showed that both auditory cues(i.e.,male advertisement calls)and visual cues were attractive to female frogs,although auditory cues were more attractive than visual cues.Females preferred simultaneous bimodal cues to unimodal cues.The hierarchical clustering of differentially expressed genes showed a close relationship between neurogenomic states and momentarily expressed sexual signals.We also found that the Gene Ontology terms and KEGG pathways involved in energy metabolism were mostly increased in blank contrast versus visual,acoustic,or audiovisual stimuli,indicating that brain energy use may play an important role in response to these stimuli.In sum,behavioral and neurogenomic responses to acoustic and visual cues are correlated in female little torrent frogs.  相似文献   

8.
A combination of signals across modalities can facilitate sensory perception. The audiovisual facilitative effect strongly depends on the features of the stimulus. Here, we investigated how sound frequency, which is one of basic features of an auditory signal, modulates audiovisual integration. In this study, the task of the participant was to respond to a visual target stimulus by pressing a key while ignoring auditory stimuli, comprising of tones of different frequencies (0.5, 1, 2.5 and 5 kHz). A significant facilitation of reaction times was obtained following audiovisual stimulation, irrespective of whether the task-irrelevant sounds were low or high frequency. Using event-related potential (ERP), audiovisual integration was found over the occipital area for 0.5 kHz auditory stimuli from 190–210 ms, for 1 kHz stimuli from 170–200 ms, for 2.5 kHz stimuli from 140–200 ms, 5 kHz stimuli from 100–200 ms. These findings suggest that a higher frequency sound signal paired with visual stimuli might be early processed or integrated despite the auditory stimuli being task-irrelevant information. Furthermore, audiovisual integration in late latency (300–340 ms) ERPs with fronto-central topography was found for auditory stimuli of lower frequencies (0.5, 1 and 2.5 kHz). Our results confirmed that audiovisual integration is affected by the frequency of an auditory stimulus. Taken together, the neurophysiological results provide unique insight into how the brain processes a multisensory visual signal and auditory stimuli of different frequencies.  相似文献   

9.
Research on the neural basis of speech-reading implicates a network of auditory language regions involving inferior frontal cortex, premotor cortex and sites along superior temporal cortex. In audiovisual speech studies, neural activity is consistently reported in posterior superior temporal Sulcus (pSTS) and this site has been implicated in multimodal integration. Traditionally, multisensory interactions are considered high-level processing that engages heteromodal association cortices (such as STS). Recent work, however, challenges this notion and suggests that multisensory interactions may occur in low-level unimodal sensory cortices. While previous audiovisual speech studies demonstrate that high-level multisensory interactions occur in pSTS, what remains unclear is how early in the processing hierarchy these multisensory interactions may occur. The goal of the present fMRI experiment is to investigate how visual speech can influence activity in auditory cortex above and beyond its response to auditory speech. In an audiovisual speech experiment, subjects were presented with auditory speech with and without congruent visual input. Holding the auditory stimulus constant across the experiment, we investigated how the addition of visual speech influences activity in auditory cortex. We demonstrate that congruent visual speech increases the activity in auditory cortex.  相似文献   

10.
Seitz AR  Kim R  Shams L 《Current biology : CB》2006,16(14):1422-1427
Numerous studies show that practice can result in performance improvements on low-level visual perceptual tasks [1-5]. However, such learning is characteristically difficult and slow, requiring many days of training [6-8]. Here, we show that a multisensory audiovisual training procedure facilitates visual learning and results in significantly faster learning than unisensory visual training. We trained one group of subjects with an audiovisual motion-detection task and a second group with a visual motion-detection task, and compared performance on trials containing only visual signals across ten days of training. Whereas observers in both groups showed improvements of visual sensitivity with training, subjects trained with multisensory stimuli showed significantly more learning both within and across training sessions. These benefits of multisensory training are particularly surprising given that the learning of visual motion stimuli is generally thought to be mediated by low-level visual brain areas [6, 9, 10]. Although crossmodal interactions are ubiquitous in human perceptual processing [11-13], the contribution of crossmodal information to perceptual learning has not been studied previously. Our results show that multisensory interactions can be exploited to yield more efficient learning of sensory information and suggest that multisensory training programs would be most effective for the acquisition of new skills.  相似文献   

11.
Visual inputs can distort auditory perception, and accurate auditory processing requires the ability to detect and ignore visual input that is simultaneous and incongruent with auditory information. However, the neural basis of this auditory selection from audiovisual information is unknown, whereas integration process of audiovisual inputs is intensively researched. Here, we tested the hypothesis that the inferior frontal gyrus (IFG) and superior temporal sulcus (STS) are involved in top-down and bottom-up processing, respectively, of target auditory information from audiovisual inputs. We recorded high gamma activity (HGA), which is associated with neuronal firing in local brain regions, using electrocorticography while patients with epilepsy judged the syllable spoken by a voice while looking at a voice-congruent or -incongruent lip movement from the speaker. The STS exhibited stronger HGA if the patient was presented with information of large audiovisual incongruence than of small incongruence, especially if the auditory information was correctly identified. On the other hand, the IFG exhibited stronger HGA in trials with small audiovisual incongruence when patients correctly perceived the auditory information than when patients incorrectly perceived the auditory information due to the mismatched visual information. These results indicate that the IFG and STS have dissociated roles in selective auditory processing, and suggest that the neural basis of selective auditory processing changes dynamically in accordance with the degree of incongruity between auditory and visual information.  相似文献   

12.
Neural basis of the ventriloquist illusion   总被引:1,自引:0,他引:1  
The ventriloquist creates the illusion that his or her voice emerges from the visibly moving mouth of the puppet [1]. This well-known illusion exemplifies a basic principle of how auditory and visual information is integrated in the brain to form a unified multimodal percept. When auditory and visual stimuli occur simultaneously at different locations, the more spatially precise visual information dominates the perceived location of the multimodal event. Previous studies have examined neural interactions between spatially disparate auditory and visual stimuli [2-5], but none has found evidence for a visual influence on the auditory cortex that could be directly linked to the illusion of a shifted auditory percept. Here we utilized event-related brain potentials combined with event-related functional magnetic resonance imaging to demonstrate on a trial-by-trial basis that a precisely timed biasing of the left-right balance of auditory cortex activity by the discrepant visual input underlies the ventriloquist illusion. This cortical biasing may reflect a fundamental mechanism for integrating the auditory and visual components of environmental events, which ensures that the sounds are adaptively localized to the more reliable position provided by the visual input.  相似文献   

13.
This article aims to investigate whether auditory stimuli in the horizontal plane, particularly originating from behind the participant, affect audiovisual integration by using behavioral and event-related potential (ERP) measurements. In this study, visual stimuli were presented directly in front of the participants, auditory stimuli were presented at one location in an equidistant horizontal plane at the front (0°, the fixation point), right (90°), back (180°), or left (270°) of the participants, and audiovisual stimuli that include both visual stimuli and auditory stimuli originating from one of the four locations were simultaneously presented. These stimuli were presented randomly with equal probability; during this time, participants were asked to attend to the visual stimulus and respond promptly only to visual target stimuli (a unimodal visual target stimulus and the visual target of the audiovisual stimulus). A significant facilitation of reaction times and hit rates was obtained following audiovisual stimulation, irrespective of whether the auditory stimuli were presented in the front or back of the participant. However, no significant interactions were found between visual stimuli and auditory stimuli from the right or left. Two main ERP components related to audiovisual integration were found: first, auditory stimuli from the front location produced an ERP reaction over the right temporal area and right occipital area at approximately 160–200 milliseconds; second, auditory stimuli from the back produced a reaction over the parietal and occipital areas at approximately 360–400 milliseconds. Our results confirmed that audiovisual integration was also elicited, even though auditory stimuli were presented behind the participant, but no integration occurred when auditory stimuli were presented in the right or left spaces, suggesting that the human brain might be particularly sensitive to information received from behind than both sides.  相似文献   

14.
The brain is able to realign asynchronous signals that approximately coincide in both space and time. Given that many experience-based links between visual and auditory stimuli are established in the absence of spatiotemporal proximity, we investigated whether or not temporal realignment arises in these conditions. Participants received a 3-min exposure to visual and auditory stimuli that were separated by 706 ms and appeared either from the same (Experiment 1) or from different spatial positions (Experiment 2). A simultaneity judgment task (SJ) was administered right afterwards. Temporal realignment between vision and audition was observed, in both Experiment 1 and 2, when comparing the participants’ SJs after this exposure phase with those obtained after a baseline exposure to audiovisual synchrony. However, this effect was present only when the visual stimuli preceded the auditory stimuli during the exposure to asynchrony. A similar pattern of results (temporal realignment after exposure to visual-leading asynchrony but not after exposure to auditory-leading asynchrony) was obtained using temporal order judgments (TOJs) instead of SJs (Experiment 3). Taken together, these results suggest that temporal recalibration still occurs for visual and auditory stimuli that fall clearly outside the so-called temporal window for multisensory integration and appear from different spatial positions. This temporal realignment may be modulated by long-term experience with the kind of asynchrony (vision-leading) that we most frequently encounter in the outside world (e.g., while perceiving distant events).  相似文献   

15.
We performed a systematic study to check whether neurons in the area TE (the anterior part of inferotemporal cortex) in rhesus monkey, regarded as the last stage of the ventral visual pathway, could be modulated by auditory stimuli. Two fixating rhesus monkeys were presented with visual, auditory or combined audiovisual stimuli while neuronal responses were recorded. We have found that the visually sensitive neurons are also modulated by audiovisual stimuli. This modulation is manifested as the change of response rate. Our results have shown also that the visual neurons were responsive to the sole auditory stimuli. Therefore, the concept of inferotemporal cortex unimodality in information processing should be re-evaluated.  相似文献   

16.
In humans, emotions from music serve important communicative roles. Despite a growing interest in the neural basis of music perception, action and emotion, the majority of previous studies in this area have focused on the auditory aspects of music performances. Here we investigate how the brain processes the emotions elicited by audiovisual music performances. We used event-related functional magnetic resonance imaging, and in Experiment 1 we defined the areas responding to audiovisual (musician's movements with music), visual (musician's movements only), and auditory emotional (music only) displays. Subsequently a region of interest analysis was performed to examine if any of the areas detected in Experiment 1 showed greater activation for emotionally mismatching performances (combining the musician's movements with mismatching emotional sound) than for emotionally matching music performances (combining the musician's movements with matching emotional sound) as presented in Experiment 2 to the same participants. The insula and the left thalamus were found to respond consistently to visual, auditory and audiovisual emotional information and to have increased activation for emotionally mismatching displays in comparison with emotionally matching displays. In contrast, the right thalamus was found to respond to audiovisual emotional displays and to have similar activation for emotionally matching and mismatching displays. These results suggest that the insula and left thalamus have an active role in detecting emotional correspondence between auditory and visual information during music performances, whereas the right thalamus has a different role.  相似文献   

17.
At any given moment, our brain processes multiple inputs from its different sensory modalities (vision, hearing, touch, etc.). In deciphering this array of sensory information, the brain has to solve two problems: (1) which of the inputs originate from the same object and should be integrated and (2) for the sensations originating from the same object, how best to integrate them. Recent behavioural studies suggest that the human brain solves these problems using optimal probabilistic inference, known as Bayesian causal inference. However, how and where the underlying computations are carried out in the brain have remained unknown. By combining neuroimaging-based decoding techniques and computational modelling of behavioural data, a new study now sheds light on how multisensory causal inference maps onto specific brain areas. The results suggest that the complexity of neural computations increases along the visual hierarchy and link specific components of the causal inference process with specific visual and parietal regions.  相似文献   

18.
An increasing number of neuroscience papers capitalize on the assumption published in this journal that visual speech would be typically 150 ms ahead of auditory speech. It happens that the estimation of audiovisual asynchrony in the reference paper is valid only in very specific cases, for isolated consonant-vowel syllables or at the beginning of a speech utterance, in what we call “preparatory gestures”. However, when syllables are chained in sequences, as they are typically in most parts of a natural speech utterance, asynchrony should be defined in a different way. This is what we call “comodulatory gestures” providing auditory and visual events more or less in synchrony. We provide audiovisual data on sequences of plosive-vowel syllables (pa, ta, ka, ba, da, ga, ma, na) showing that audiovisual synchrony is actually rather precise, varying between 20 ms audio lead and 70 ms audio lag. We show how more complex speech material should result in a range typically varying between 40 ms audio lead and 200 ms audio lag, and we discuss how this natural coordination is reflected in the so-called temporal integration window for audiovisual speech perception. Finally we present a toy model of auditory and audiovisual predictive coding, showing that visual lead is actually not necessary for visual prediction.  相似文献   

19.
The brain is adaptive. The speed of propagation through air, and of low-level sensory processing, differs markedly between auditory and visual stimuli; yet the brain can adapt to compensate for the resulting cross-modal delays. Studies investigating temporal recalibration to audiovisual speech have used prolonged adaptation procedures, suggesting that adaptation is sluggish. Here, we show that adaptation to asynchronous audiovisual speech occurs rapidly. Participants viewed a brief clip of an actor pronouncing a single syllable. The voice was either advanced or delayed relative to the corresponding lip movements, and participants were asked to make a synchrony judgement. Although we did not use an explicit adaptation procedure, we demonstrate rapid recalibration based on a single audiovisual event. We find that the point of subjective simultaneity on each trial is highly contingent upon the modality order of the preceding trial. We find compelling evidence that rapid recalibration generalizes across different stimuli, and different actors. Finally, we demonstrate that rapid recalibration occurs even when auditory and visual events clearly belong to different actors. These results suggest that rapid temporal recalibration to audiovisual speech is primarily mediated by basic temporal factors, rather than higher-order factors such as perceived simultaneity and source identity.  相似文献   

20.
It has traditionally been assumed that cochlear implant users de facto perform atypically in audiovisual tasks. However, a recent study that combined an auditory task with visual distractors suggests that only those cochlear implant users that are not proficient at recognizing speech sounds might show abnormal audiovisual interactions. The present study aims at reinforcing this notion by investigating the audiovisual segregation abilities of cochlear implant users in a visual task with auditory distractors. Speechreading was assessed in two groups of cochlear implant users (proficient and non-proficient at sound recognition), as well as in normal controls. A visual speech recognition task (i.e. speechreading) was administered either in silence or in combination with three types of auditory distractors: i) noise ii) reverse speech sound and iii) non-altered speech sound. Cochlear implant users proficient at speech recognition performed like normal controls in all conditions, whereas non-proficient users showed significantly different audiovisual segregation patterns in both speech conditions. These results confirm that normal-like audiovisual segregation is possible in highly skilled cochlear implant users and, consequently, that proficient and non-proficient CI users cannot be lumped into a single group. This important feature must be taken into account in further studies of audiovisual interactions in cochlear implant users.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号