Similar Articles
20 similar articles found.
1.
A combination of signals across modalities can facilitate sensory perception. The audiovisual facilitative effect strongly depends on the features of the stimulus. Here, we investigated how sound frequency, one of the basic features of an auditory signal, modulates audiovisual integration. In this study, the task of the participant was to respond to a visual target stimulus by pressing a key while ignoring auditory stimuli, comprising tones of different frequencies (0.5, 1, 2.5 and 5 kHz). A significant facilitation of reaction times was obtained following audiovisual stimulation, irrespective of whether the task-irrelevant sounds were low or high frequency. Using event-related potentials (ERPs), audiovisual integration was found over the occipital area for 0.5 kHz auditory stimuli from 190–210 ms, for 1 kHz stimuli from 170–200 ms, for 2.5 kHz stimuli from 140–200 ms, and for 5 kHz stimuli from 100–200 ms. These findings suggest that a higher-frequency sound paired with a visual stimulus might be processed or integrated earlier, even though the auditory stimuli are task-irrelevant. Furthermore, audiovisual integration in late-latency (300–340 ms) ERPs with a fronto-central topography was found for auditory stimuli of lower frequencies (0.5, 1 and 2.5 kHz). Our results confirm that audiovisual integration is affected by the frequency of an auditory stimulus. Taken together, the neurophysiological results provide unique insight into how the brain processes a multisensory visual signal and auditory stimuli of different frequencies.
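ERP audiovisual integration of this kind is often quantified with the additive criterion, comparing the response to audiovisual stimulation against the sum of the unimodal responses. The sketch below is only a minimal illustration of that comparison, not the authors' exact pipeline; the array shapes, sampling rate, and time window are assumptions.

```python
import numpy as np

# Minimal sketch of the additive criterion AV vs. (A + V), assuming
# epoched data as (n_trials, n_channels, n_times) arrays sampled at 500 Hz.
fs = 500
times = np.arange(-0.1, 0.5, 1 / fs)          # epoch from -100 to 500 ms

def erp(epochs):
    """Average across trials to obtain the ERP (n_channels, n_times)."""
    return epochs.mean(axis=0)

def integration_effect(av_epochs, a_epochs, v_epochs):
    """Difference wave AV - (A + V); non-zero values suggest integration."""
    return erp(av_epochs) - (erp(a_epochs) + erp(v_epochs))

# Example with simulated data for a small occipital channel group.
rng = np.random.default_rng(0)
shape = (60, 4, times.size)                    # 60 trials, 4 channels
a, v, av = (rng.normal(0, 1, shape) for _ in range(3))
diff = integration_effect(av, a, v)

# Mean effect in the 100-200 ms window reported for the 5 kHz condition.
win = (times >= 0.10) & (times <= 0.20)
print(diff[:, win].mean())
```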

2.
In humans, the emotions conveyed by music serve important communicative roles. Despite growing interest in the neural basis of music perception, action and emotion, the majority of previous studies in this area have focused on the auditory aspects of music performances. Here we investigate how the brain processes the emotions elicited by audiovisual music performances. We used event-related functional magnetic resonance imaging; in Experiment 1 we defined the areas responding to audiovisual (musician's movements with music), visual (musician's movements only), and auditory (music only) emotional displays. Subsequently, a region-of-interest analysis was performed to examine whether any of the areas detected in Experiment 1 showed greater activation for emotionally mismatching performances (combining the musician's movements with mismatching emotional sound) than for emotionally matching performances (combining the musician's movements with matching emotional sound), as presented in Experiment 2 to the same participants. The insula and the left thalamus were found to respond consistently to visual, auditory and audiovisual emotional information and to show increased activation for emotionally mismatching displays in comparison with emotionally matching displays. In contrast, the right thalamus was found to respond to audiovisual emotional displays and to show similar activation for emotionally matching and mismatching displays. These results suggest that the insula and left thalamus play an active role in detecting emotional correspondence between auditory and visual information during music performances, whereas the right thalamus has a different role.

3.
Visual inputs can distort auditory perception, and accurate auditory processing requires the ability to detect and ignore visual input that is simultaneous with, but incongruent with, the auditory information. However, the neural basis of this auditory selection from audiovisual information is unknown, whereas the integration of audiovisual inputs has been intensively researched. Here, we tested the hypothesis that the inferior frontal gyrus (IFG) and superior temporal sulcus (STS) are involved in top-down and bottom-up processing, respectively, of target auditory information from audiovisual inputs. We recorded high gamma activity (HGA), which is associated with neuronal firing in local brain regions, using electrocorticography while patients with epilepsy judged the syllable spoken by a voice while looking at a voice-congruent or -incongruent lip movement from the speaker. The STS exhibited stronger HGA when patients were presented with stimuli of large rather than small audiovisual incongruence, especially when the auditory information was correctly identified. On the other hand, the IFG exhibited stronger HGA in trials with small audiovisual incongruence when patients correctly perceived the auditory information than when they misperceived it because of the mismatched visual information. These results indicate that the IFG and STS have dissociable roles in selective auditory processing, and suggest that the neural basis of selective auditory processing changes dynamically in accordance with the degree of incongruence between auditory and visual information.
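High gamma activity is commonly estimated by band-pass filtering the intracranial signal and taking its analytic amplitude. The sketch below is a generic version of that procedure under an assumed band (70–150 Hz) and sampling rate; it is not the authors' exact analysis.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def high_gamma_envelope(ecog, fs, band=(70.0, 150.0)):
    """Band-pass one electrode's trace and return the high-gamma amplitude envelope.

    ecog : 1-D voltage trace
    fs   : sampling rate in Hz (assumed, e.g. 1000 Hz)
    band : high-gamma band edges in Hz (a common but assumed choice)
    """
    nyq = fs / 2.0
    b, a = butter(4, [band[0] / nyq, band[1] / nyq], btype="band")
    filtered = filtfilt(b, a, ecog)
    return np.abs(hilbert(filtered))

# Example: compare mean HGA across two simulated trials (large vs. small incongruence).
fs = 1000
rng = np.random.default_rng(1)
trial_large = rng.normal(0, 1, fs)   # 1 s of simulated ECoG
trial_small = rng.normal(0, 1, fs)
print(high_gamma_envelope(trial_large, fs).mean(),
      high_gamma_envelope(trial_small, fs).mean())
```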

4.
This article investigates whether auditory stimuli in the horizontal plane, particularly those originating from behind the participant, affect audiovisual integration, using behavioral and event-related potential (ERP) measurements. In this study, visual stimuli were presented directly in front of the participants; auditory stimuli were presented at one location in an equidistant horizontal plane at the front (0°, the fixation point), right (90°), back (180°), or left (270°) of the participants; and audiovisual stimuli combining the visual stimulus with an auditory stimulus from one of the four locations were presented simultaneously. These stimuli were presented randomly with equal probability, and participants were asked to attend to the visual stimulus and respond promptly only to visual target stimuli (a unimodal visual target stimulus or the visual target of an audiovisual stimulus). A significant facilitation of reaction times and hit rates was obtained following audiovisual stimulation, irrespective of whether the auditory stimuli were presented in front of or behind the participant. However, no significant interactions were found between visual stimuli and auditory stimuli from the right or left. Two main ERP components related to audiovisual integration were found: first, auditory stimuli from the front location produced an effect over the right temporal and right occipital areas at approximately 160–200 ms; second, auditory stimuli from the back produced an effect over the parietal and occipital areas at approximately 360–400 ms. Our results confirm that audiovisual integration was elicited even when auditory stimuli were presented behind the participant, but that no integration occurred when auditory stimuli were presented to the right or left, suggesting that the human brain may be more sensitive to information received from behind than from either side.

5.
Li Y, Wang G, Long J, Yu Z, Huang B, Li X, Yu T, Liang C, Li Z, Sun P. PLoS ONE 2011, 6(6): e20801
One of the central questions in cognitive neuroscience concerns the precise neural representation, or brain pattern, associated with a semantic category. In this study, we explored the influence of audiovisual stimuli on the brain patterns of concepts or semantic categories in a functional magnetic resonance imaging (fMRI) experiment. We used a pattern search method to extract brain patterns corresponding to two semantic categories, "old people" and "young people." These brain patterns were elicited by semantically congruent audiovisual, semantically incongruent audiovisual, unimodal visual, and unimodal auditory stimuli belonging to the two semantic categories. We calculated the reproducibility index, which measures the similarity of the patterns within the same category, and we decoded the semantic categories from these brain patterns; the decoding accuracy reflects the discriminability of the brain patterns between the two categories. The results showed that both the reproducibility index and the decoding accuracy were significantly higher for semantically congruent audiovisual stimuli than for unimodal visual and unimodal auditory stimuli, whereas the semantically incongruent stimuli did not elicit brain patterns with a significantly higher reproducibility index or decoding accuracy. Thus, semantically congruent audiovisual stimuli enhanced the within-class reproducibility and between-class discriminability of brain patterns, and facilitated neural representations of semantic categories or concepts. Furthermore, we analyzed the brain activity in the superior temporal sulcus and middle temporal gyrus (STS/MTG): both the strength of the fMRI signal and the reproducibility index were enhanced by the semantically congruent audiovisual stimuli. Our results support the use of the reproducibility index as a potential tool to supplement fMRI signal amplitude when evaluating multimodal integration.
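The abstract does not give the exact formula for the reproducibility index, so the sketch below uses one plausible definition (mean pairwise correlation of within-category patterns) alongside a simple cross-validated classifier for the decoding accuracy. The data shapes and the correlation-based definition are assumptions made only for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def reproducibility_index(patterns):
    """Mean pairwise Pearson correlation of within-category patterns.

    patterns : (n_trials, n_voxels) array for one semantic category
    (an assumed definition, used here only for illustration)
    """
    corr = np.corrcoef(patterns)
    upper = corr[np.triu_indices_from(corr, k=1)]
    return upper.mean()

rng = np.random.default_rng(0)
old = rng.normal(0, 1, (30, 200))     # simulated "old people" patterns
young = rng.normal(0, 1, (30, 200))   # simulated "young people" patterns
print(reproducibility_index(old), reproducibility_index(young))

# Decoding accuracy: cross-validated classification of the two categories.
X = np.vstack([old, young])
y = np.r_[np.zeros(30), np.ones(30)]
acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
print(acc)
```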

6.
Hasson U, Skipper JI, Nusbaum HC, Small SL. Neuron 2007, 56(6): 1116-1126
Is there a neural representation of speech that transcends its sensory properties? Using fMRI, we investigated whether there are brain areas where neural activity during observation of sublexical audiovisual input corresponds to a listener's speech percept (what is "heard") independent of the sensory properties of the input. A target audiovisual stimulus was preceded by stimuli that (1) shared the target's auditory features (auditory overlap), (2) shared the target's visual features (visual overlap), or (3) shared neither the target's auditory nor its visual features but were perceived as the target (perceptual overlap). In two left-hemisphere regions (pars opercularis, planum polare), the target evoked less activity when it was preceded by the perceptually overlapping stimulus than when it was preceded by stimuli that shared one of its sensory components. This pattern of neural facilitation indicates that these regions code sublexical speech at an abstract level corresponding to that of the speech percept.

7.
An increasing number of neuroscience papers capitalize on the assumption, published in this journal, that visual speech is typically 150 ms ahead of auditory speech. However, the estimate of audiovisual asynchrony in the reference paper is valid only in very specific cases: for isolated consonant-vowel syllables or at the beginning of a speech utterance, in what we call "preparatory gestures". When syllables are chained in sequences, as they typically are in most parts of a natural speech utterance, asynchrony should be defined differently, in terms of what we call "comodulatory gestures", which provide auditory and visual events more or less in synchrony. We provide audiovisual data on sequences of plosive-vowel syllables (pa, ta, ka, ba, da, ga, ma, na) showing that audiovisual synchrony is actually rather precise, varying between 20 ms audio lead and 70 ms audio lag. We show how more complex speech material should result in a range typically varying between 40 ms audio lead and 200 ms audio lag, and we discuss how this natural coordination is reflected in the so-called temporal integration window for audiovisual speech perception. Finally, we present a toy model of auditory and audiovisual predictive coding, showing that visual lead is actually not necessary for visual prediction.
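As an illustration of the kind of asynchrony measurement discussed here, the sketch below computes audio lead/lag from per-syllable auditory and visual event times and checks the values against an assumed audiovisual temporal integration window. The onset times and window bounds are made up for the example, not taken from the paper.

```python
# Minimal sketch: audio lead/lag per syllable, checked against an assumed
# temporal integration window (all numbers here are illustrative only).
visual_onsets_ms = [120.0, 420.0, 710.0, 1005.0]   # hypothetical lip-gesture events
audio_onsets_ms = [140.0, 415.0, 730.0, 1000.0]    # hypothetical acoustic events

# Positive values = audio lag (visual event first), negative = audio lead.
asynchronies = [a - v for a, v in zip(audio_onsets_ms, visual_onsets_ms)]

WINDOW_MS = (-40.0, 200.0)   # assumed window: 40 ms audio lead to 200 ms audio lag

for syllable, delta in zip(["pa", "ta", "ka", "ba"], asynchronies):
    inside = WINDOW_MS[0] <= delta <= WINDOW_MS[1]
    print(f"{syllable}: audio {'lag' if delta >= 0 else 'lead'} {abs(delta):.0f} ms, "
          f"within window: {inside}")
```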

8.
Research on the neural basis of speech-reading implicates a network of auditory language regions involving inferior frontal cortex, premotor cortex, and sites along superior temporal cortex. In audiovisual speech studies, neural activity is consistently reported in the posterior superior temporal sulcus (pSTS), and this site has been implicated in multimodal integration. Traditionally, multisensory interactions are considered high-level processing that engages heteromodal association cortices (such as STS). Recent work, however, challenges this notion and suggests that multisensory interactions may occur in low-level unimodal sensory cortices. While previous audiovisual speech studies demonstrate that high-level multisensory interactions occur in pSTS, what remains unclear is how early in the processing hierarchy these multisensory interactions may occur. The goal of the present fMRI experiment was to investigate how visual speech can influence activity in auditory cortex above and beyond its response to auditory speech. In an audiovisual speech experiment, subjects were presented with auditory speech with and without congruent visual input. Holding the auditory stimulus constant across the experiment, we investigated how the addition of visual speech influences activity in auditory cortex. We demonstrate that congruent visual speech increases activity in auditory cortex.

9.
Jessen S, Obleser J, Kotz SA. PLoS ONE 2012, 7(4): e36070
Successful social communication draws strongly on the correct interpretation of others' body and vocal expressions. Both can provide emotional information and often occur simultaneously, yet their interplay has hardly been studied. Using electroencephalography, we investigated the temporal dynamics underlying their neural interaction in auditory and visual perception. In particular, we tested whether this interaction qualifies as true integration according to multisensory integration principles such as inverse effectiveness. Emotional vocalizations were embedded in either low or high levels of noise and presented with or without video clips of matching emotional body expressions. In both high- and low-noise conditions, a reduction in auditory N100 amplitude was observed for audiovisual stimuli. However, only under high noise did the N100 peak earlier in the audiovisual than in the auditory condition, suggesting facilitatory effects as predicted by the inverse effectiveness principle. Similarly, we observed earlier N100 peaks in response to emotional compared to neutral audiovisual stimuli; this was not the case in the unimodal auditory condition. Furthermore, suppression of beta-band oscillations (15–25 Hz), primarily reflecting biological motion perception, was modulated 200–400 ms after the vocalization. While, for emotional stimuli, the difference in suppression between audiovisual and auditory stimuli was larger under high than under low noise, no such difference was observed for neutral stimuli. This observation is in accordance with the inverse effectiveness principle and suggests a modulation of integration by emotional content. Overall, the results show that ecologically valid, complex stimuli such as joint body and vocal expressions are effectively integrated very early in processing.
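Inverse effectiveness is often checked by comparing the multisensory benefit across stimulus-effectiveness levels. The sketch below computes a simple latency-gain index from hypothetical N100 peak latencies in the high- and low-noise conditions; the numbers and the specific index are illustrative assumptions, not values from the study.

```python
# Illustrative check of inverse effectiveness on N100 peak latencies (ms).
# All numbers are hypothetical; the principle predicts a larger audiovisual
# benefit when the auditory signal is degraded (high noise) than when it is clear.
latencies = {
    "low_noise":  {"A": 112.0, "AV": 110.0},
    "high_noise": {"A": 124.0, "AV": 115.0},
}

for noise, cond in latencies.items():
    gain_ms = cond["A"] - cond["AV"]          # latency facilitation from adding vision
    print(f"{noise}: audiovisual latency gain = {gain_ms:.1f} ms")

# Inverse effectiveness holds (for this index) if the gain is larger under high noise.
print(latencies["high_noise"]["A"] - latencies["high_noise"]["AV"] >
      latencies["low_noise"]["A"] - latencies["low_noise"]["AV"])
```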

10.
Debate currently exists regarding the interplay between multisensory processes and bottom-up and top-down influences. However, few studies have examined neural responses to newly paired audiovisual stimuli that differ in their prescribed relevance. Here, for such newly associated audiovisual stimuli, optimal facilitation of motor actions was observed only when both components of the audiovisual stimuli were targets. Relevant auditory stimuli significantly increased the amplitudes of the event-related potentials at the occipital pole during the first 100 ms post-stimulus onset, though this early integration was not predictive of multisensory facilitation. Activity related to multisensory behavioral facilitation was observed approximately 166 ms post-stimulus, at left central and occipital sites. Furthermore, optimal multisensory facilitation was associated with a latency shift of induced oscillations in the beta range (14–30 Hz) over right-hemisphere parietal scalp regions. These findings demonstrate the importance of stimulus relevance to multisensory processing by providing the first evidence that the neural processes underlying multisensory integration are modulated by the relevance of the stimuli being combined. We also provide evidence that such facilitation may be mediated by changes in neural synchronization in occipital and centro-parietal neural populations at early and late stages of neural processing that coincide with stimulus selection and with the preparation and initiation of motor action.

11.
The brain is able to realign asynchronous signals that approximately coincide in both space and time. Given that many experience-based links between visual and auditory stimuli are established in the absence of spatiotemporal proximity, we investigated whether temporal realignment also arises under these conditions. Participants received a 3-min exposure to visual and auditory stimuli that were separated by 706 ms and appeared either from the same (Experiment 1) or from different spatial positions (Experiment 2). A simultaneity judgment (SJ) task was administered immediately afterwards. Temporal realignment between vision and audition was observed in both Experiments 1 and 2 when comparing the participants' SJs after this exposure phase with those obtained after a baseline exposure to audiovisual synchrony. However, this effect was present only when the visual stimuli preceded the auditory stimuli during the exposure to asynchrony. A similar pattern of results (temporal realignment after exposure to visual-leading asynchrony but not after exposure to auditory-leading asynchrony) was obtained using temporal order judgments (TOJs) instead of SJs (Experiment 3). Taken together, these results suggest that temporal recalibration still occurs for visual and auditory stimuli that fall clearly outside the so-called temporal window for multisensory integration and that appear from different spatial positions. This temporal realignment may be modulated by long-term experience with the kind of asynchrony (vision-leading) that we most frequently encounter in the outside world (e.g., while perceiving distant events).
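Temporal realignment in SJ tasks is usually expressed as a shift of the point of subjective simultaneity (PSS). The sketch below fits a Gaussian to the proportion of "simultaneous" responses to estimate the PSS before and after exposure; the SOAs and response proportions are illustrative, not data from the study.

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative SOAs (audio lead < 0) and "simultaneous" response proportions.
soas_ms = np.array([-300, -200, -100, 0, 100, 200, 300], dtype=float)
p_simultaneous_baseline = np.array([0.05, 0.25, 0.70, 0.95, 0.75, 0.30, 0.08])
p_simultaneous_adapted  = np.array([0.04, 0.15, 0.55, 0.90, 0.88, 0.45, 0.12])

def gaussian(soa, peak, pss, width):
    return peak * np.exp(-0.5 * ((soa - pss) / width) ** 2)

def fit_pss(p_simultaneous):
    params, _ = curve_fit(gaussian, soas_ms, p_simultaneous, p0=[1.0, 0.0, 100.0])
    return params[1]   # the fitted centre of the curve is the PSS

shift = fit_pss(p_simultaneous_adapted) - fit_pss(p_simultaneous_baseline)
print(f"PSS shift after exposure to vision-leading asynchrony: {shift:.1f} ms")
```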

12.
Diverse animal species use multimodal communication signals to coordinate reproductive behavior. Despite active research in this field, the brain mechanisms underlying multimodal communication remain poorly understood. Like humans and many mammalian species, anurans often produce auditory signals accompanied by conspicuous visual cues (e.g., vocal sac inflation). In this study, we used video playbacks to determine the role of vocal-sac inflation in little torrent frogs (Amolops torrentis). We then exposed females to blank, visual, auditory, and audiovisual stimuli and analyzed gene expression changes in whole-brain tissue using RNA-seq. The results showed that both auditory cues (i.e., male advertisement calls) and visual cues were attractive to female frogs, although auditory cues were more attractive than visual cues. Females preferred simultaneous bimodal cues to unimodal cues. Hierarchical clustering of differentially expressed genes showed a close relationship between neurogenomic states and the momentarily expressed sexual signals. We also found that Gene Ontology terms and KEGG pathways involved in energy metabolism were mostly increased in the blank condition relative to the visual, acoustic, or audiovisual stimuli, indicating that brain energy use may play an important role in the response to these stimuli. In sum, behavioral and neurogenomic responses to acoustic and visual cues are correlated in female little torrent frogs.

13.
To obtain a coherent perception of the world, our senses need to be in alignment. When we encounter misaligned cues from two sensory modalities, the brain must infer which cue is faulty and recalibrate the corresponding sense. We examined whether and how the brain uses cue reliability to identify the miscalibrated sense by measuring the audiovisual ventriloquism aftereffect for stimuli of varying visual reliability. To adjust for modality-specific biases, visual stimulus locations were chosen based on perceived alignment with auditory stimulus locations for each participant. During an audiovisual recalibration phase, participants were presented with bimodal stimuli with a fixed perceptual spatial discrepancy; they localized one modality, cued after stimulus presentation. Unimodal auditory and visual localization was measured before and after the audiovisual recalibration phase. We compared participants' behavior to the predictions of three models of recalibration: (a) Reliability-based: each modality is recalibrated based on its relative reliability, with less reliable cues recalibrated more; (b) Fixed-ratio: the degree of recalibration for each modality is fixed; (c) Causal-inference: recalibration is directly determined by the discrepancy between a cue and its estimate, which in turn depends on the reliability of both cues and on inference about how likely the two cues derive from a common source. Vision was hardly recalibrated by audition. Auditory recalibration by vision changed idiosyncratically as visual reliability decreased: the extent of auditory recalibration either decreased monotonically, peaked at medium visual reliability, or increased monotonically. The latter two patterns cannot be explained by either the reliability-based or fixed-ratio models. Only the causal-inference model of recalibration captures the idiosyncratic influences of cue reliability on recalibration. We conclude that cue reliability, causal inference, and modality-specific biases guide cross-modal recalibration indirectly by determining the perception of audiovisual stimuli.
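To make the model contrast concrete, the sketch below computes how a fixed audiovisual discrepancy would be apportioned between the two senses under the reliability-based and fixed-ratio accounts. The learning rate, reliabilities, fixed ratio, and discrepancy are hypothetical, and the full causal-inference model is not implemented here.

```python
# Hypothetical illustration of two of the recalibration accounts compared above.
# A fixed audiovisual spatial discrepancy is split between audition and vision.
discrepancy_deg = 10.0          # visual minus auditory location (assumed)
sigma_a, sigma_v = 6.0, 2.0     # assumed localization noise (deg): audition less reliable
learning_rate = 0.5             # assumed overall amount of recalibration

# (a) Reliability-based: each modality shifts in proportion to its relative variance.
w_a = sigma_a**2 / (sigma_a**2 + sigma_v**2)
shift_a_reliability = learning_rate * w_a * discrepancy_deg
shift_v_reliability = learning_rate * (1 - w_a) * -discrepancy_deg

# (b) Fixed-ratio: the split between modalities does not depend on reliability.
fixed_ratio_a = 0.9             # assumed: audition absorbs most of the discrepancy
shift_a_fixed = learning_rate * fixed_ratio_a * discrepancy_deg
shift_v_fixed = learning_rate * (1 - fixed_ratio_a) * -discrepancy_deg

print(shift_a_reliability, shift_v_reliability)
print(shift_a_fixed, shift_v_fixed)
```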

14.
Given that both the auditory and visual systems have anatomically separate object identification ("what") and spatial ("where") pathways, it is of interest whether attention-driven cross-sensory modulations occur separately within these feature domains. Here, we investigated how auditory "what" vs. "where" attention tasks modulate activity in visual pathways using cortically constrained source estimates of magnetoencephalographic (MEG) oscillatory activity. In the absence of visual stimuli or tasks, subjects were presented with a sequence of auditory-stimulus pairs and instructed to selectively attend to phonetic ("what") vs. spatial ("where") aspects of these sounds, or to listen passively. To investigate sustained modulatory effects, oscillatory power was estimated from the time periods between sound-pair presentations. In comparison to attention to sound locations, phonetic auditory attention was associated with stronger alpha (7-13 Hz) power in several visual areas (primary visual cortex; lingual, fusiform, and inferior temporal gyri; lateral occipital cortex), as well as in higher-order visual/multisensory areas including lateral/medial parietal and retrosplenial cortices. Region-of-interest (ROI) analyses of dynamic changes, from which the sustained effects had been removed, suggested further power increases during Attend Phoneme vs. Attend Location, centered in the alpha range 400-600 ms after the onset of the second sound of each stimulus pair. These results suggest distinct modulations of visual-system oscillatory activity during auditory attention to sound object identity ("what") vs. sound location ("where"). The alpha modulations could be interpreted as reflecting enhanced crossmodal inhibition of feature-specific visual pathways and adjacent audiovisual association areas during "what" vs. "where" auditory attention.
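As a generic illustration of the measure used here, oscillatory power in the alpha band (7-13 Hz) estimated from the interval between sound pairs, the sketch below applies Welch's method to a simulated single-channel trace. The sampling rate, window parameters, and signal are assumptions, not the authors' source-space analysis.

```python
import numpy as np
from scipy.signal import welch

fs = 600                                 # assumed sampling rate in Hz
rng = np.random.default_rng(2)
t = np.arange(0, 2.0, 1 / fs)            # 2 s between sound-pair presentations
signal = np.sin(2 * np.pi * 10 * t) + rng.normal(0, 1, t.size)  # 10 Hz + noise

# Welch power spectrum, then average power inside the 7-13 Hz alpha band.
freqs, psd = welch(signal, fs=fs, nperseg=fs)
alpha_mask = (freqs >= 7) & (freqs <= 13)
alpha_power = psd[alpha_mask].mean()
print(f"mean alpha-band power: {alpha_power:.3f}")
```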

15.
When correlation implies causation in multisensory integration
Inferring which signals have a common underlying cause, and hence should be integrated, represents a primary challenge for a perceptual system dealing with multiple sensory inputs [1-3]. This challenge is often referred to as the correspondence problem or causal inference. Previous research has demonstrated that spatiotemporal cues, along with prior knowledge, are exploited by the human brain to solve this problem [4-9]. Here we explore the role of correlation between the fine temporal structure of auditory and visual signals in causal inference. Specifically, we investigated whether correlated signals are inferred to originate from the same distal event and hence are integrated optimally [10]. In a localization task with visual, auditory, and combined audiovisual targets, the improvement in precision for combined relative to unimodal targets was statistically optimal only when audiovisual signals were correlated. This result demonstrates that humans use the similarity in the temporal structure of multisensory signals to solve the correspondence problem, hence inferring causation from correlation.
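The benchmark for "statistically optimal" integration referred to here is typically the maximum-likelihood cue-combination prediction (presumably the model cited as [10]); under that assumption, the bimodal location estimate and its variance are:

$$\hat{s}_{AV} = \frac{\sigma_V^{2}\,\hat{s}_A + \sigma_A^{2}\,\hat{s}_V}{\sigma_A^{2}+\sigma_V^{2}}, \qquad \sigma_{AV}^{2} = \frac{\sigma_A^{2}\,\sigma_V^{2}}{\sigma_A^{2}+\sigma_V^{2}} \le \min\!\left(\sigma_A^{2},\,\sigma_V^{2}\right)$$

The predicted combined variance is never larger than the better unimodal variance, which is the precision improvement that bimodal localization performance is tested against.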

16.
Ku Y, Hong B, Zhou W, Bodner M, Zhou YD. PLoS ONE 2012, 7(5): e36410
Abacus experts are able to mentally calculate multi-digit numbers rapidly. Some behavioral and neuroimaging studies have suggested a visuospatial and visuomotor strategy during abacus mental calculation. However, no study up to now has attempted to temporally dissociate the visuospatial neural process from the visuomotor neural process during abacus mental calculation. In the present study, an abacus expert performed mental addition tasks (8-digit and 4-digit addends presented in visual or auditory modes) swiftly and accurately; the expert's accuracy (100% correct) was significantly higher than that of ordinary subjects performing 1-digit and 2-digit addition tasks. ERPs, EEG source localizations, and fMRI results taken together suggested that visuospatial and visuomotor processes were sequentially arranged during abacus mental addition with visual addends and could be dissociated from each other in time. The visuospatial transformation of the numbers, which most likely involved the superior parietal lobule, might occur first (around 380 ms after the onset of the stimuli). The visuomotor processing, which most likely involved the superior/middle frontal gyri, might occur later (around 440 ms). Meanwhile, the fMRI results suggested that the neural networks involved in abacus mental addition with auditory stimuli were similar to those involved in visual abacus mental addition. The most prominently activated brain areas in both conditions included the bilateral superior parietal lobules (BA 7) and bilateral middle frontal gyri (BA 6). These results suggest a supra-modal brain network for abacus mental addition, which may develop from normal mental calculation networks.

17.
It has traditionally been assumed that cochlear implant users de facto perform atypically in audiovisual tasks. However, a recent study that combined an auditory task with visual distractors suggests that only those cochlear implant users who are not proficient at recognizing speech sounds might show abnormal audiovisual interactions. The present study aims to reinforce this notion by investigating the audiovisual segregation abilities of cochlear implant users in a visual task with auditory distractors. Speechreading was assessed in two groups of cochlear implant users (proficient and non-proficient at sound recognition), as well as in normal controls. A visual speech recognition task (i.e., speechreading) was administered either in silence or in combination with one of three types of auditory distractors: (i) noise, (ii) reversed speech sound, and (iii) non-altered speech sound. Cochlear implant users proficient at speech recognition performed like normal controls in all conditions, whereas non-proficient users showed significantly different audiovisual segregation patterns in both speech conditions. These results confirm that normal-like audiovisual segregation is possible in highly skilled cochlear implant users and, consequently, that proficient and non-proficient CI users cannot be lumped into a single group. This important feature must be taken into account in further studies of audiovisual interactions in cochlear implant users.

18.
The brain is adaptive. The speed of propagation through air, and of low-level sensory processing, differs markedly between auditory and visual stimuli; yet the brain can adapt to compensate for the resulting cross-modal delays. Studies investigating temporal recalibration to audiovisual speech have used prolonged adaptation procedures, suggesting that adaptation is sluggish. Here, we show that adaptation to asynchronous audiovisual speech occurs rapidly. Participants viewed a brief clip of an actor pronouncing a single syllable. The voice was either advanced or delayed relative to the corresponding lip movements, and participants were asked to make a synchrony judgement. Although we did not use an explicit adaptation procedure, we demonstrate rapid recalibration based on a single audiovisual event: the point of subjective simultaneity on each trial is highly contingent upon the modality order of the preceding trial. We find compelling evidence that this rapid recalibration generalizes across different stimuli and different actors. Finally, we demonstrate that rapid recalibration occurs even when the auditory and visual events clearly belong to different actors. These results suggest that rapid temporal recalibration to audiovisual speech is primarily mediated by basic temporal factors, rather than by higher-order factors such as perceived simultaneity and source identity.

19.
Rapid integration of biologically relevant information is crucial for the survival of an organism. Most prominently, humans should be biased to attend and respond to looming stimuli that signal approaching danger (e.g., a predator) and hence require rapid action. This psychophysics study used binocular rivalry to investigate the perceptual advantage of looming (relative to receding) visual signals (i.e., the looming bias) and how this bias can be influenced by concurrent auditory looming/receding stimuli and by the statistical structure of the auditory and visual signals. Subjects were dichoptically presented with looming/receding visual stimuli that were paired with looming or receding sounds. The visual signals conformed to two different statistical structures: (1) a "simple" random-dot kinematogram showing a starfield and (2) a "naturalistic" visual Shepard stimulus. Likewise, the looming/receding sound was (1) a simple amplitude- and frequency-modulated (AM-FM) tone or (2) a complex Shepard tone. Our results show that the perceptual looming bias (i.e., the increase in dominance times for looming versus receding percepts) is amplified by looming sounds, yet reduced and even converted into a receding bias by receding sounds. Moreover, the influence of looming/receding sounds on the visual looming bias depends on the statistical structure of both the visual and auditory signals: it is enhanced when the audiovisual signals are Shepard stimuli. In conclusion, visual perception prioritizes processing of biologically significant looming stimuli, especially when they are paired with looming auditory signals. Critically, these audiovisual interactions are amplified for statistically complex signals that are more naturalistic and known to engage neural processing at multiple levels of the cortical hierarchy.

20.
Poghosyan V, Ioannides AA. Neuron 2008, 58(5): 802-813
A fundamental question about the neural correlates of attention concerns the earliest sensory processing stage that it can affect. We addressed this issue by recording magnetoencephalography (MEG) signals while subjects performed detection tasks, which required the employment of spatial or nonspatial attention, in the auditory or visual modality. Using distributed source analysis of the MEG signals, we found that, contrary to previous studies that used equivalent current dipole (ECD) analysis, spatial attention enhanced the initial feedforward response in the primary visual cortex (V1) at 55-90 ms. We also found attentional modulation of putative primary auditory cortex (A1) activity at 30-50 ms. Furthermore, we reproduced our findings using ECD modeling guided by the results of the distributed source analysis, and we suggest a reason why earlier studies using ECD analysis failed to identify the modulation of the earliest V1 activity.
