Similar Documents
 20 similar documents retrieved (search time: 15 ms)
1.
Research on the neural basis of speech-reading implicates a network of auditory language regions involving inferior frontal cortex, premotor cortex and sites along superior temporal cortex. In audiovisual speech studies, neural activity is consistently reported in posterior superior temporal sulcus (pSTS), and this site has been implicated in multimodal integration. Traditionally, multisensory interactions are considered high-level processing that engages heteromodal association cortices (such as STS). Recent work, however, challenges this notion and suggests that multisensory interactions may occur in low-level unimodal sensory cortices. While previous audiovisual speech studies demonstrate that high-level multisensory interactions occur in pSTS, it remains unclear how early in the processing hierarchy these multisensory interactions may occur. The goal of the present fMRI experiment was to investigate how visual speech can influence activity in auditory cortex above and beyond its response to auditory speech. In an audiovisual speech experiment, subjects were presented with auditory speech with and without congruent visual input. Holding the auditory stimulus constant across the experiment, we investigated how the addition of visual speech influences activity in auditory cortex. We demonstrate that congruent visual speech increases activity in auditory cortex.

2.
Auditory feedback is required to maintain fluent speech. At present, it is unclear how attention modulates auditory feedback processing during ongoing speech. In this event-related potential (ERP) study, participants vocalized /a/ while they heard their vocal pitch suddenly shifted downward by half a semitone in both single- and dual-task conditions. During the single-task condition, participants passively viewed a visual stream for cues to start and stop vocalizing. In the dual-task condition, participants vocalized while they identified target stimuli in a visual stream of letters. The presentation rate of the visual stimuli was manipulated in the dual-task condition in order to produce low, intermediate, and high attentional loads. Visual target identification accuracy was lowest in the high attentional load condition, indicating that attentional load was successfully manipulated. Results further showed that participants who were exposed to the single-task condition prior to the dual-task condition produced larger vocal compensations during the single-task condition. Thus, when participants' attention was divided, less attention was available for the monitoring of their auditory feedback, resulting in smaller compensatory vocal responses. However, P1-N1-P2 ERP responses were not affected by divided attention, suggesting that attentional load did not affect the auditory processing of pitch-altered feedback, but instead interfered with the integration of auditory and motor information, or with motor control itself.
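As a back-of-the-envelope check on the perturbation described above, a half-semitone downward shift corresponds to a fixed frequency ratio of 2**(-0.5/12), independent of the speaker. The sketch below computes it for an assumed fundamental of 220 Hz (an illustrative value; the abstract does not report speakers' actual pitch):

```python
# A downward shift of s semitones multiplies F0 by 2**(s/12).
BASE_F0_HZ = 220.0       # assumed, illustrative fundamental frequency
SHIFT_SEMITONES = -0.5   # the half-semitone downward perturbation

ratio = 2 ** (SHIFT_SEMITONES / 12)
shifted_f0_hz = BASE_F0_HZ * ratio

print(round(ratio, 4))          # 0.9715
print(round(shifted_f0_hz, 1))  # 213.7
```

The ratio applies regardless of baseline F0, which is why the same perturbation can be used across speakers.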

3.
Humans, like other animals, are exposed to a continuous stream of signals, which are dynamic, multimodal, extended, and time varying in nature. This complex input space must be transduced and sampled by our sensory systems and transmitted to the brain, where it can guide the selection of appropriate actions. To simplify this process, it has been suggested that the brain exploits statistical regularities in the stimulus space. Tests of this idea have largely been confined to unimodal signals and natural scenes. One important class of multisensory signals for which a quantitative input space characterization is unavailable is human speech. We do not understand what signals our brain has to actively piece together from an audiovisual speech stream to arrive at a percept, versus what is already embedded in the signal structure of the stream itself. In essence, we do not have a clear understanding of the natural statistics of audiovisual speech. In the present study, we identified the following major statistical features of audiovisual speech. First, we observed robust correlations and close temporal correspondence between the area of the mouth opening and the acoustic envelope. Second, we found the strongest correlation between the area of the mouth opening and vocal tract resonances. Third, we observed that both the area of the mouth opening and the voice envelope are temporally modulated in the 2–7 Hz frequency range. Finally, we show that the timing of mouth movements relative to the onset of the voice is consistently between 100 and 300 ms. We interpret these data in the context of recent neural theories of speech which suggest that speech communication is a reciprocally coupled, multisensory event, whereby the outputs of the signaler are matched to the neural processes of the receiver.
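The first statistical observation above (the correlation between mouth-opening area and the acoustic envelope) can be illustrated with a minimal sketch. The two signals below are synthetic, sharing a 4 Hz modulation inside the reported 2–7 Hz range; the sampling rate, noise level, and variable names are illustrative assumptions, not values from the study:

```python
import numpy as np

fs = 100.0                    # sampling rate in Hz (assumed)
t = np.arange(0, 10, 1 / fs)  # 10 s of simulated "speech"
rng = np.random.default_rng(0)

# Shared 4 Hz rhythm driving both the articulator and the acoustics,
# each observed with independent measurement noise.
modulation = 0.5 * (1 + np.sin(2 * np.pi * 4 * t))
mouth_area = modulation + 0.1 * rng.standard_normal(t.size)
envelope = modulation + 0.1 * rng.standard_normal(t.size)

# Pearson correlation between the two traces.
r = np.corrcoef(mouth_area, envelope)[0, 1]
print(round(r, 2))
```

With real data the analysis would additionally band-pass both traces to the 2–7 Hz range and examine the lag of the peak cross-correlation, which is where the 100–300 ms mouth-lead figure comes from.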

4.
Jessen S, Obleser J, Kotz SA. PLoS ONE 2012, 7(4): e36070
Successful social communication draws strongly on the correct interpretation of others' body and vocal expressions. Both can provide emotional information and often occur simultaneously, yet their interplay has hardly been studied. Using electroencephalography, we investigated the temporal development underlying their neural interaction in auditory and visual perception. In particular, we tested whether this interaction qualifies as true integration following multisensory integration principles such as inverse effectiveness. Emotional vocalizations were embedded in either low or high levels of noise and presented with or without video clips of matching emotional body expressions. In both high and low noise conditions, a reduction in auditory N100 amplitude was observed for audiovisual stimuli. However, only under high noise did the N100 peak earlier in the audiovisual than in the auditory condition, suggesting facilitatory effects as predicted by the inverse effectiveness principle. Similarly, we observed earlier N100 peaks in response to emotional compared to neutral audiovisual stimuli. This was not the case in the unimodal auditory condition. Furthermore, suppression of beta-band oscillations (15–25 Hz), primarily reflecting biological motion perception, was modulated 200–400 ms after the vocalization. While larger differences in suppression between audiovisual and audio stimuli in high compared to low noise levels were found for emotional stimuli, no such difference was observed for neutral stimuli. This observation is in accordance with the inverse effectiveness principle and suggests a modulation of integration by emotional content. Overall, the results show that ecologically valid, complex stimuli such as combined body and vocal expressions are effectively integrated very early in processing.

5.
Successful integration of various simultaneously perceived perceptual signals is crucial for social behavior. Recent findings indicate that this multisensory integration (MSI) can be modulated by attention. Theories of Autism Spectrum Disorders (ASDs) suggest that MSI is affected in this population, while it remains unclear to what extent this is related to impairments in attentional capacity. In the present study, event-related potentials (ERPs) following emotionally congruent and incongruent face-voice pairs were measured in 23 high-functioning adult individuals with ASD and 24 age- and IQ-matched controls. MSI was studied while the attention of the participants was manipulated. ERPs were measured at typical auditory and visual processing peaks, namely the P2 and N170. While controls showed MSI during divided attention and easy selective attention tasks, individuals with ASD showed MSI during easy selective attention tasks only. It was concluded that individuals with ASD are able to process multisensory emotional stimuli, but that this processing is differently modulated by attention mechanisms in these participants, especially those associated with divided attention. This atypical interaction between attention and MSI is also relevant to treatment strategies, with training of multisensory attentional control possibly being more beneficial than conventional sensory integration therapy.

6.
Speech perception often benefits from vision of the speaker's lip movements when they are available. One potential mechanism underlying this reported gain in perception arising from audio-visual integration is on-line prediction. In this study we address whether preceding speech context in a single modality can improve audiovisual processing and whether this improvement is based on on-line information transfer across sensory modalities. In the experiments presented here, during each trial a speech fragment (context) presented in a single sensory modality (voice or lips) was immediately continued by an audiovisual target fragment. Participants made speeded judgments about whether voice and lips were in agreement in the target fragment. The leading single-sensory context and the subsequent audiovisual target fragment could be continuous in one modality only, in both modalities (context in one modality continues into both modalities in the target fragment), or in neither modality (i.e., discontinuous). The results showed quicker audiovisual matching responses when context was continuous with the target within either the visual or auditory channel (Experiment 1). Critically, prior visual context also provided an advantage when it was cross-modally continuous (with the auditory channel in the target), but auditory-to-visual cross-modal continuity resulted in no advantage (Experiment 2). This suggests that visual speech information can provide an on-line benefit for processing the upcoming auditory input through the use of predictive mechanisms. We hypothesize that this benefit is expressed at an early level of speech analysis.

7.
Town SM, McCabe BJ. PLoS ONE 2011, 6(3): e17777
Many organisms sample their environment through multiple sensory systems, and the integration of multisensory information enhances learning. However, the mechanisms underlying multisensory memory formation and their similarity to unisensory mechanisms remain unclear. Filial imprinting is one example in which experience is multisensory, and the mechanisms of unisensory neuronal plasticity are well established. We investigated the storage of audiovisual information through experience by comparing the activity of neurons in the intermediate and medial mesopallium (IMM) of imprinted and naïve domestic chicks (Gallus gallus domesticus) in response to an audiovisual imprinting stimulus, a novel object, and their auditory and visual components. We find that imprinting enhanced the mean response magnitude of neurons to unisensory but not multisensory stimuli. Furthermore, imprinting enhanced responses to incongruent audiovisual stimuli comprised of mismatched auditory and visual components. Our results suggest that the effects of imprinting on the unisensory and multisensory responsiveness of IMM neurons differ, and that IMM neurons may function to detect unexpected deviations from the audiovisual imprinting stimulus.

8.

Background

Animals' ability for cross-modal recognition has recently received much interest. Captive and domestic animals seem able to perceive cues of human attention and appear to have a multisensory perception of humans.

Methodology/Principal Findings

Here, we used a task in which horses have to remain immobile under a vocal order to test whether they are sensitive to the attentional state of the experimenter, and also whether they behave and respond differently to the familiar order when tested by a familiar or an unknown person. Horses' responses varied according to the person's attentional state when the order was given by an unknown person: obedience levels were higher when the person giving the order was looking at the horse than when he was not attentive. More interestingly, whatever the condition, horses monitored the unknown person much more and for longer times, as if they were surprised to hear the familiar order given by an unknown voice.

Conclusion/Significance

These results suggest that recognition of humans may rely on a global, integrated, multisensory representation of specific individuals that includes visual and vocal identity, but also expectations about the individual's behaviour in a familiar situation.

9.
In addition to impairments in social communication and the presence of restricted interests and repetitive behaviors, deficits in sensory processing are now recognized as a core symptom in autism spectrum disorder (ASD). Our ability to perceive and interact with the external world is rooted in sensory processing. For example, listening to a conversation entails processing the auditory cues coming from the speaker (speech content, prosody, syntax) as well as the associated visual information (facial expressions, gestures). Collectively, the “integration” of these multisensory (i.e., combined audiovisual) pieces of information results in better comprehension. Such multisensory integration has been shown to be strongly dependent upon the temporal relationship of the paired stimuli. Thus, stimuli that occur in close temporal proximity are highly likely to result in behavioral and perceptual benefits – gains believed to reflect the perceptual system's judgment of the likelihood that the two stimuli came from the same source. Changes in this temporal integration are expected to strongly alter perceptual processes and are likely to diminish the ability to accurately perceive and interact with our world. Here, a battery of tasks designed to characterize various aspects of sensory and multisensory temporal processing in children with ASD is described. In addition to its utility in autism, this battery has great potential for characterizing changes in sensory function in other clinical populations, as well as for examining changes in these processes across the lifespan.

10.
An increasing number of neuroscience papers capitalize on the assumption, published in this journal, that visual speech is typically 150 ms ahead of auditory speech. However, the estimation of audiovisual asynchrony in the reference paper is valid only in very specific cases: for isolated consonant-vowel syllables or at the beginning of a speech utterance, in what we call “preparatory gestures”. When syllables are chained in sequences, as they typically are in most parts of a natural speech utterance, asynchrony should be defined in a different way. This is what we call “comodulatory gestures”, which provide auditory and visual events more or less in synchrony. We provide audiovisual data on sequences of plosive-vowel syllables (pa, ta, ka, ba, da, ga, ma, na) showing that audiovisual synchrony is actually rather precise, varying between 20 ms audio lead and 70 ms audio lag. We show how more complex speech material should result in a range typically varying between 40 ms audio lead and 200 ms audio lag, and we discuss how this natural coordination is reflected in the so-called temporal integration window for audiovisual speech perception. Finally, we present a toy model of auditory and audiovisual predictive coding, showing that visual lead is actually not necessary for visual prediction.

11.
Debate currently exists regarding the interplay between multisensory processes and bottom-up and top-down influences. However, few studies have looked at neural responses to newly paired audiovisual stimuli that differ in their prescribed relevance. For such newly associated audiovisual stimuli, optimal facilitation of motor actions was observed only when both components of the audiovisual stimuli were targets. Relevant auditory stimuli were found to significantly increase the amplitudes of the event-related potentials at the occipital pole during the first 100 ms post-stimulus onset, though this early integration was not predictive of multisensory facilitation. Activity related to multisensory behavioral facilitation was observed approximately 166 ms post-stimulus, at left central and occipital sites. Furthermore, optimal multisensory facilitation was found to be associated with a latency shift of induced oscillations in the beta range (14–30 Hz) at right-hemisphere parietal scalp regions. These findings demonstrate the importance of stimulus relevance to multisensory processing by providing the first evidence that the neural processes underlying multisensory integration are modulated by the relevance of the stimuli being combined. We also provide evidence that such facilitation may be mediated by changes in neural synchronization in occipital and centro-parietal neural populations at early and late stages of neural processing that coincided with stimulus selection and the preparation and initiation of motor action.

12.
Social animals learn to perceive their social environment, and their social skills and preferences are thought to emerge from greater exposure to and hence familiarity with some social signals rather than others. Familiarity appears to be tightly linked to multisensory integration. The ability to differentiate and categorize familiar and unfamiliar individuals and to build a multisensory representation of known individuals emerges from successive social interactions, in particular with adult, experienced models. In different species, adults have been shown to shape the social behavior of young by promoting selective attention to multisensory cues. The question of what representation of known conspecifics adult-deprived animals may build therefore arises. Here we show that starlings raised with no experience with adults fail to develop a multisensory representation of familiar and unfamiliar starlings. Electrophysiological recordings of neuronal activity throughout the primary auditory area of these birds, while they were exposed to audio-only or audiovisual familiar and unfamiliar cues, showed that visual stimuli did, as in wild-caught starlings, modulate auditory responses but that, unlike what was observed in wild-caught birds, this modulation was not influenced by familiarity. Thus, adult-deprived starlings seem to fail to discriminate between familiar and unfamiliar individuals. This suggests that adults may shape multisensory representation of known individuals in the brain, possibly by focusing the young's attention on relevant, multisensory cues. Multisensory stimulation by experienced, adult models may thus be ubiquitously important for the development of social skills (and of the neural properties underlying such skills) in a variety of species.

13.
Seitz AR, Kim R, Shams L. Current Biology 2006, 16(14): 1422-1427
Numerous studies show that practice can result in performance improvements on low-level visual perceptual tasks [1-5]. However, such learning is characteristically difficult and slow, requiring many days of training [6-8]. Here, we show that a multisensory audiovisual training procedure facilitates visual learning and results in significantly faster learning than unisensory visual training. We trained one group of subjects with an audiovisual motion-detection task and a second group with a visual motion-detection task, and compared performance on trials containing only visual signals across ten days of training. Whereas observers in both groups showed improvements of visual sensitivity with training, subjects trained with multisensory stimuli showed significantly more learning both within and across training sessions. These benefits of multisensory training are particularly surprising given that the learning of visual motion stimuli is generally thought to be mediated by low-level visual brain areas [6, 9, 10]. Although crossmodal interactions are ubiquitous in human perceptual processing [11-13], the contribution of crossmodal information to perceptual learning has not been studied previously. Our results show that multisensory interactions can be exploited to yield more efficient learning of sensory information and suggest that multisensory training programs would be most effective for the acquisition of new skills.

14.
Visual motion information from dynamic environments is important in multisensory temporal perception. However, it is unclear how visual motion information influences the integration of multisensory temporal perceptions. We investigated whether visual apparent motion affects audiovisual temporal perception. Visual apparent motion is a phenomenon in which two flashes presented in sequence in different positions are perceived as continuous motion. Across three experiments, participants performed temporal order judgment (TOJ) tasks. Experiment 1 was a TOJ task conducted in order to assess audiovisual simultaneity during perception of apparent motion. The results showed that the point of subjective simultaneity (PSS) was shifted toward a sound-lead stimulus, and the just noticeable difference (JND) was reduced compared with a normal TOJ task with a single flash. This indicates that visual apparent motion affects audiovisual simultaneity and improves temporal discrimination in audiovisual processing. Experiment 2 was a TOJ task conducted in order to remove the influence of the amount of flash stimulation from Experiment 1. The PSS and JND during perception of apparent motion were almost identical to those in Experiment 1, but differed from those for successive perception when long temporal intervals were included between two flashes without motion. This showed that the result obtained under the apparent motion condition was unaffected by the amount of flash stimulation. Because apparent motion was produced by a constant interval between two flashes, the results may be accounted for by specific prediction. In Experiment 3, we eliminated the influence of prediction by randomizing the intervals between the two flashes. However, the PSS and JND did not differ from those in Experiment 1. It became clear that the results obtained for the perception of visual apparent motion were not attributable to prediction. Our findings suggest that visual apparent motion changes temporal simultaneity perception and improves temporal discrimination in audiovisual processing.
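The PSS and JND reported in TOJ experiments like the one above are conventionally estimated by fitting a cumulative Gaussian psychometric function to the proportion of "visual first" responses across stimulus onset asynchronies. A minimal sketch of that fit, using made-up data points rather than the study's, might look like this:

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

# Illustrative TOJ data: negative SOAs mean the sound led the flash.
# These proportions are invented for the sketch, not from the study.
soa_ms = np.array([-240, -120, -60, 0, 60, 120, 240])
p_visual_first = np.array([0.05, 0.15, 0.35, 0.55, 0.80, 0.93, 0.99])

def cum_gauss(soa, pss, sigma):
    # Psychometric function: P("visual first") as a cumulative Gaussian.
    return norm.cdf(soa, loc=pss, scale=sigma)

(pss, sigma), _ = curve_fit(cum_gauss, soa_ms, p_visual_first, p0=(0.0, 80.0))

# JND as the SOA difference between the 50% and 75% points of the curve.
jnd = sigma * norm.ppf(0.75)
print(round(pss, 1), round(jnd, 1))
```

A shifted PSS (here slightly sound-leading) and a smaller JND are exactly the two effects the abstract reports for the apparent-motion condition.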

15.

Background

Synesthesia is a condition in which the stimulation of one sense elicits an additional experience, often in a different (i.e., unstimulated) sense. Although only a small proportion of the population is synesthetic, there is growing evidence to suggest that neurocognitively normal individuals also experience some form of synesthetic association between stimuli presented to different sensory modalities (e.g., between auditory pitch and visual size, where lower-frequency tones are associated with large objects and higher-frequency tones with small objects). While previous research has highlighted crossmodal interactions between synesthetically corresponding dimensions, the possible role of synesthetic associations in multisensory integration has not been considered previously.

Methodology

Here we investigate the effects of synesthetic associations by presenting pairs of asynchronous or spatially discrepant visual and auditory stimuli that were either synesthetically matched or mismatched. In a series of three psychophysical experiments, participants reported the relative temporal order of presentation or the relative spatial locations of the two stimuli.

Principal Findings

The reliability of non-synesthetic participants' estimates of both audiovisual temporal asynchrony and spatial discrepancy was lower for pairs of synesthetically matched as compared to synesthetically mismatched audiovisual stimuli.

Conclusions

Recent studies of multisensory integration have shown that the reduced reliability of perceptual estimates regarding intersensory conflicts constitutes the marker of a stronger coupling between the unisensory signals. Our results therefore indicate a stronger coupling of synesthetically matched vs. mismatched stimuli and provide the first psychophysical evidence that synesthetic congruency can promote multisensory integration. Synesthetic crossmodal correspondences therefore appear to play a crucial (if unacknowledged) role in the multisensory integration of auditory and visual information.

16.
Visual search is markedly improved when a target color change is synchronized with a spatially non-informative auditory signal. This "pip and pop" effect is an automatic process, as even a distractor captures attention when accompanied by a tone. Previous studies investigating visual attention have indicated that automatic capture is susceptible to the size of the attentional window. The present study investigated whether the pip and pop effect is modulated by the extent to which participants divide their attention across the visual field. We show that participants were better at detecting a synchronized audiovisual event when they divided their attention across the visual field relative to a condition in which they focused their attention. We argue that audiovisual capture is reduced under focused conditions relative to distributed settings.

17.
It has traditionally been assumed that cochlear implant users de facto perform atypically in audiovisual tasks. However, a recent study that combined an auditory task with visual distractors suggests that only those cochlear implant users who are not proficient at recognizing speech sounds might show abnormal audiovisual interactions. The present study aims to reinforce this notion by investigating the audiovisual segregation abilities of cochlear implant users in a visual task with auditory distractors. Speechreading was assessed in two groups of cochlear implant users (proficient and non-proficient at sound recognition), as well as in normal controls. A visual speech recognition task (i.e. speechreading) was administered either in silence or in combination with three types of auditory distractors: i) noise, ii) reversed speech, and iii) unaltered speech. Cochlear implant users proficient at speech recognition performed like normal controls in all conditions, whereas non-proficient users showed significantly different audiovisual segregation patterns in both speech conditions. These results confirm that normal-like audiovisual segregation is possible in highly skilled cochlear implant users and, consequently, that proficient and non-proficient CI users cannot be lumped into a single group. This important feature must be taken into account in further studies of audiovisual interactions in cochlear implant users.

18.
A combination of signals across modalities can facilitate sensory perception. The audiovisual facilitative effect depends strongly on the features of the stimulus. Here, we investigated how sound frequency, one of the basic features of an auditory signal, modulates audiovisual integration. In this study, the participant's task was to respond to a visual target stimulus by pressing a key while ignoring auditory stimuli comprising tones of different frequencies (0.5, 1, 2.5 and 5 kHz). A significant facilitation of reaction times was obtained following audiovisual stimulation, irrespective of whether the task-irrelevant sounds were low or high frequency. Using event-related potentials (ERPs), audiovisual integration was found over the occipital area for 0.5 kHz auditory stimuli from 190–210 ms, for 1 kHz stimuli from 170–200 ms, for 2.5 kHz stimuli from 140–200 ms, and for 5 kHz stimuli from 100–200 ms. These findings suggest that a higher-frequency sound signal paired with visual stimuli might be processed or integrated earlier, despite the auditory stimuli being task-irrelevant information. Furthermore, audiovisual integration in late-latency (300–340 ms) ERPs with fronto-central topography was found for auditory stimuli of lower frequencies (0.5, 1 and 2.5 kHz). Our results confirm that audiovisual integration is affected by the frequency of an auditory stimulus. Taken together, the neurophysiological results provide unique insight into how the brain processes a visual signal paired with auditory stimuli of different frequencies.
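The occipital integration windows reported above can be collected into a small table to make the frequency-latency relationship explicit. The window values come from the abstract; the data structure and the monotonicity check are my own illustration:

```python
# ERP audiovisual-integration windows over the occipital area, per tone
# frequency: kHz -> (window onset, window offset) in ms post-stimulus.
windows_ms = {
    0.5: (190, 210),
    1.0: (170, 200),
    2.5: (140, 200),
    5.0: (100, 200),
}

# The reported pattern: integration onset gets earlier as frequency rises.
onsets = [start for start, _ in windows_ms.values()]
assert onsets == sorted(onsets, reverse=True)
print(onsets)  # [190, 170, 140, 100]
```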

19.
Bishop CW, Miller LM. PLoS ONE 2011, 6(8): e24016
Speech is the most important form of human communication, but ambient sounds and competing talkers often degrade its acoustics. Fortunately, the brain can use visual information, especially its highly precise spatial information, to improve speech comprehension in noisy environments. Previous studies have demonstrated that audiovisual integration depends strongly on spatiotemporal factors. However, some integrative phenomena such as McGurk interference persist even with gross spatial disparities, suggesting that spatial alignment is not necessary for robust integration of audiovisual place-of-articulation cues. It is therefore unclear how speech cues interact with audiovisual spatial integration mechanisms. Here, we combine two well established psychophysical phenomena, the McGurk effect and the ventriloquist's illusion, to explore this dependency. Our results demonstrate that conflicting spatial cues may not interfere with audiovisual integration of speech, but conflicting speech cues can impede integration in space. This suggests a direct but asymmetrical influence between ventral 'what' and dorsal 'where' pathways.

20.
Given that multiple senses are often stimulated at the same time, perceptual awareness is most likely to take place in multisensory situations. However, theories of awareness are based on studies and models established for a single sense (mostly vision). Here, we consider the methodological and theoretical challenges raised by taking a multisensory perspective on perceptual awareness. First, we consider how well tasks designed to study unisensory awareness perform when used in multisensory settings, stressing that studies using binocular rivalry, bistable figure perception, continuous flash suppression, the attentional blink, repetition blindness and backward masking can demonstrate multisensory influences on unisensory awareness, but fall short of tackling multisensory awareness directly. Studies interested in the latter phenomenon rely on a method of subjective contrast and can, at best, delineate conditions under which individuals report experiencing a multisensory object or two unisensory objects. As there is not a perfect match between these conditions and those in which multisensory integration and binding occur, the link between awareness and binding advocated for visual information processing needs to be revised for multisensory cases. These challenges point at the need to question the very idea of multisensory awareness.


Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号