首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
The perception and production of biological movements is characterized by the 1/3 power law, a relation linking the curvature and the velocity of an intended action. In particular, motions are perceived and reproduced distorted when their kinematics deviate from this biological law. Whereas most studies dealing with this perceptual-motor relation focused on visual or kinaesthetic modalities in a unimodal context, in this paper we show that auditory dynamics strikingly biases visuomotor processes. Biologically consistent or inconsistent circular visual motions were used in combination with circular or elliptical auditory motions. Auditory motions were synthesized friction sounds mimicking those produced by the friction of the pen on a paper when someone is drawing. Sounds were presented diotically and the auditory motion velocity was evoked through the friction sound timbre variations without any spatial cues. Remarkably, when subjects were asked to reproduce circular visual motion while listening to sounds that evoked elliptical kinematics without seeing their hand, they drew elliptical shapes. Moreover, distortion induced by inconsistent elliptical kinematics in both visual and auditory modalities added up linearly. These results bring to light the substantial role of auditory dynamics in the visuo-motor coupling in a multisensory context.  相似文献   

2.
In humans, emotions from music serve important communicative roles. Despite a growing interest in the neural basis of music perception, action and emotion, the majority of previous studies in this area have focused on the auditory aspects of music performances. Here we investigate how the brain processes the emotions elicited by audiovisual music performances. We used event-related functional magnetic resonance imaging, and in Experiment 1 we defined the areas responding to audiovisual (musician's movements with music), visual (musician's movements only), and auditory emotional (music only) displays. Subsequently a region of interest analysis was performed to examine if any of the areas detected in Experiment 1 showed greater activation for emotionally mismatching performances (combining the musician's movements with mismatching emotional sound) than for emotionally matching music performances (combining the musician's movements with matching emotional sound) as presented in Experiment 2 to the same participants. The insula and the left thalamus were found to respond consistently to visual, auditory and audiovisual emotional information and to have increased activation for emotionally mismatching displays in comparison with emotionally matching displays. In contrast, the right thalamus was found to respond to audiovisual emotional displays and to have similar activation for emotionally matching and mismatching displays. These results suggest that the insula and left thalamus have an active role in detecting emotional correspondence between auditory and visual information during music performances, whereas the right thalamus has a different role.  相似文献   

3.
Events encoded in separate sensory modalities, such as audition and vision, can seem to be synchronous across a relatively broad range of physical timing differences. This may suggest that the precision of audio-visual timing judgments is inherently poor. Here we show that this is not necessarily true. We contrast timing sensitivity for isolated streams of audio and visual speech, and for streams of audio and visual speech accompanied by additional, temporally offset, visual speech streams. We find that the precision with which synchronous streams of audio and visual speech are identified is enhanced by the presence of additional streams of asynchronous visual speech. Our data suggest that timing perception is shaped by selective grouping processes, which can result in enhanced precision in temporally cluttered environments. The imprecision suggested by previous studies might therefore be a consequence of examining isolated pairs of audio and visual events. We argue that when an isolated pair of cross-modal events is presented, they tend to group perceptually and to seem synchronous as a consequence. We have revealed greater precision by providing multiple visual signals, possibly allowing a single auditory speech stream to group selectively with the most synchronous visual candidate. The grouping processes we have identified might be important in daily life, such as when we attempt to follow a conversation in a crowded room.  相似文献   

4.
Whereas extensive neuroscientific and behavioral evidence has confirmed a role of auditory-visual integration in representing space [1-6], little is known about the role of auditory-visual integration in object perception. Although recent neuroimaging results suggest integrated auditory-visual object representations [7-11], substantiating behavioral evidence has been lacking. We demonstrated auditory-visual integration in the perception of face gender by using pure tones that are processed in low-level auditory brain areas and that lack the spectral components that characterize human vocalization. When androgynous faces were presented together with pure tones in the male fundamental-speaking-frequency range, faces were more likely to be judged as male, whereas when faces were presented with pure tones in the female fundamental-speaking-frequency range, they were more likely to be judged as female. Importantly, when participants were explicitly asked to attribute gender to these pure tones, their judgments were primarily based on relative pitch and were uncorrelated with the male and female fundamental-speaking-frequency ranges. This perceptual dissociation of absolute-frequency-based crossmodal-integration effects from relative-pitch-based explicit perception of the tones provides evidence for a sensory integration of auditory and visual signals in representing human gender. This integration probably develops because of concurrent neural processing of visual and auditory features of gender.  相似文献   

5.
Watching a speaker''s facial movements can dramatically enhance our ability to comprehend words, especially in noisy environments. From a general doctrine of combining information from different sensory modalities (the principle of inverse effectiveness), one would expect that the visual signals would be most effective at the highest levels of auditory noise. In contrast, we find, in accord with a recent paper, that visual information improves performance more at intermediate levels of auditory noise than at the highest levels, and we show that a novel visual stimulus containing only temporal information does the same. We present a Bayesian model of optimal cue integration that can explain these conflicts. In this model, words are regarded as points in a multidimensional space and word recognition is a probabilistic inference process. When the dimensionality of the feature space is low, the Bayesian model predicts inverse effectiveness; when the dimensionality is high, the enhancement is maximal at intermediate auditory noise levels. When the auditory and visual stimuli differ slightly in high noise, the model makes a counterintuitive prediction: as sound quality increases, the proportion of reported words corresponding to the visual stimulus should first increase and then decrease. We confirm this prediction in a behavioral experiment. We conclude that auditory-visual speech perception obeys the same notion of optimality previously observed only for simple multisensory stimuli.  相似文献   

6.
To form a percept of the multisensory world, the brain needs to integrate signals from common sources weighted by their reliabilities and segregate those from independent sources. Previously, we have shown that anterior parietal cortices combine sensory signals into representations that take into account the signals’ causal structure (i.e., common versus independent sources) and their sensory reliabilities as predicted by Bayesian causal inference. The current study asks to what extent and how attentional mechanisms can actively control how sensory signals are combined for perceptual inference. In a pre- and postcueing paradigm, we presented observers with audiovisual signals at variable spatial disparities. Observers were precued to attend to auditory or visual modalities prior to stimulus presentation and postcued to report their perceived auditory or visual location. Combining psychophysics, functional magnetic resonance imaging (fMRI), and Bayesian modelling, we demonstrate that the brain moulds multisensory inference via two distinct mechanisms. Prestimulus attention to vision enhances the reliability and influence of visual inputs on spatial representations in visual and posterior parietal cortices. Poststimulus report determines how parietal cortices flexibly combine sensory estimates into spatial representations consistent with Bayesian causal inference. Our results show that distinct neural mechanisms control how signals are combined for perceptual inference at different levels of the cortical hierarchy.

A combination of psychophysics, computational modelling and fMRI reveals novel insights into how the brain controls the binding of information across the senses, such as the voice and lip movements of a speaker.  相似文献   

7.
Integrating information across sensory domains to construct a unified representation of multi-sensory signals is a fundamental characteristic of perception in ecological contexts. One provocative hypothesis deriving from neurophysiology suggests that there exists early and direct cross-modal phase modulation. We provide evidence, based on magnetoencephalography (MEG) recordings from participants viewing audiovisual movies, that low-frequency neuronal information lies at the basis of the synergistic coordination of information across auditory and visual streams. In particular, the phase of the 2–7 Hz delta and theta band responses carries robust (in single trials) and usable information (for parsing the temporal structure) about stimulus dynamics in both sensory modalities concurrently. These experiments are the first to show in humans that a particular cortical mechanism, delta-theta phase modulation across early sensory areas, plays an important “active” role in continuously tracking naturalistic audio-visual streams, carrying dynamic multi-sensory information, and reflecting cross-sensory interaction in real time.  相似文献   

8.
In natural environments, sensory information is embedded in temporally contiguous streams of events. This is typically the case when seeing and listening to a speaker or when engaged in scene analysis. In such contexts, two mechanisms are needed to single out and build a reliable representation of an event (or object): the temporal parsing of information and the selection of relevant information in the stream. It has previously been shown that rhythmic events naturally build temporal expectations that improve sensory processing at predictable points in time. Here, we asked to which extent temporal regularities can improve the detection and identification of events across sensory modalities. To do so, we used a dynamic visual conjunction search task accompanied by auditory cues synchronized or not with the color change of the target (horizontal or vertical bar). Sounds synchronized with the visual target improved search efficiency for temporal rates below 1.4 Hz but did not affect efficiency above that stimulation rate. Desynchronized auditory cues consistently impaired visual search below 3.3 Hz. Our results are interpreted in the context of the Dynamic Attending Theory: specifically, we suggest that a cognitive operation structures events in time irrespective of the sensory modality of input. Our results further support and specify recent neurophysiological findings by showing strong temporal selectivity for audiovisual integration in the auditory-driven improvement of visual search efficiency.  相似文献   

9.
The ability to integrate information across multiple sensory systems offers several behavioral advantages, from quicker reaction times and more accurate responses to better detection and more robust learning. At the neural level, multisensory integration requires large-scale interactions between different brain regions--the convergence of information from separate sensory modalities, represented by distinct neuronal populations. The interactions between these neuronal populations must be fast and flexible, so that behaviorally relevant signals belonging to the same object or event can be immediately integrated and integration of unrelated signals can be prevented. Looming signals are a particular class of signals that are behaviorally relevant for animals and that occur in both the auditory and visual domain. These signals indicate the rapid approach of objects and provide highly salient warning cues about impending impact. We show here that multisensory integration of auditory and visual looming signals may be mediated by functional interactions between auditory cortex and the superior temporal sulcus, two areas involved in integrating behaviorally relevant auditory-visual signals. Audiovisual looming signals elicited increased gamma-band coherence between these areas, relative to unimodal or receding-motion signals. This suggests that the neocortex uses fast, flexible intercortical interactions to mediate multisensory integration.  相似文献   

10.
Ninety-two chicks were exposed to two different sounds, one of which was paired during training with a conspicuous visual stimulus. Visual stimuli maintained subsequent responses to the auditory stimuli, whereas responses to discriminably different sounds that were not paired with visual stimuli habituated. Auditory discriminations were learned at 1 day of age both by chicks that were previously ‘imprinted’ to the visual stimulus and by those that were not. Only the visually imprinted chicks performed significantly better than random when trained at 2 or 3 days, indicating an early critical period for visual, but not auditory, stimuli. It is suggested that visual stimuli enhance responses to species-typical and individually distinctive vocalizations.  相似文献   

11.
Acoustic sequences such as speech and music are generally perceived as coherent auditory "streams," which can be individually attended to and followed over time. Although the psychophysical stimulus parameters governing this "auditory streaming" are well established, the brain mechanisms underlying the formation of auditory streams remain largely unknown. In particular, an essential feature of the phenomenon, which corresponds to the fact that the segregation of sounds into streams typically takes several seconds to build up, remains unexplained. Here, we show that this and other major features of auditory-stream formation measured in humans using alternating-tone sequences can be quantitatively accounted for based on single-unit responses recorded in the primary auditory cortex (A1) of awake rhesus monkeys listening to the same sound sequences.  相似文献   

12.
The measurement of time is fundamental to the perception of complex, temporally structured acoustic signals such as speech and music, yet the mechanisms of temporal sensitivity in the auditory system remain largely unknown. Recently, temporal feature detectors have been discovered in several vertebrate auditory systems. For example, midbrain neurons in the fish Pollimyrus are activated by specific rhythms contained in the simple sounds they use for communication. This poses the significant challenge of uncovering the neuro-computational mechanisms that underlie temporal feature detection. Here we describe a model network that responds selectively to temporal features of communication sounds, yielding temporal selectivity in output neurons that matches the selectivity functions found in the auditory system of Pollimyrus. The output of the network depends upon the timing of excitatory and inhibitory input and post-inhibitory rebound excitation. Interval tuning is achieved in a behaviorally relevant range (10 to 40 ms) using a biologically constrained model, providing a simple mechanism that is suitable for the neural extraction of the relatively long duration temporal cues (i.e. tens to hundreds of ms) that are important in animal communication and human speech.  相似文献   

13.
Multisensory integration was once thought to be the domain of brain areas high in the cortical hierarchy, with early sensory cortical fields devoted to unisensory processing of inputs from their given set of sensory receptors. More recently, a wealth of evidence documenting visual and somatosensory responses in auditory cortex, even as early as the primary fields, has changed this view of cortical processing. These multisensory inputs may serve to enhance responses to sounds that are accompanied by other sensory cues, effectively making them easier to hear, but may also act more selectively to shape the receptive field properties of auditory cortical neurons to the location or identity of these events. We discuss the new, converging evidence that multiplexing of neural signals may play a key role in informatively encoding and integrating signals in auditory cortex across multiple sensory modalities. We highlight some of the many open research questions that exist about the neural mechanisms that give rise to multisensory integration in auditory cortex, which should be addressed in future experimental and theoretical studies.  相似文献   

14.
Auditory selective attention enables task-relevant auditory events to be enhanced and irrelevant ones suppressed. In the present study we used a frequency tagging paradigm to investigate the effects of attention on auditory steady state responses (ASSR). The ASSR was elicited by simultaneously presenting two different streams of white noise, amplitude modulated at either 16 and 23.5 Hz or 32.5 and 40 Hz. The two different frequencies were presented to each ear and participants were instructed to selectively attend to one ear or the other (confirmed by behavioral evidence). The results revealed that modulation of ASSR by selective attention depended on the modulation frequencies used and whether the activation was contralateral or ipsilateral. Attention enhanced the ASSR for contralateral activation from either ear for 16 Hz and suppressed the ASSR for ipsilateral activation for 16 Hz and 23.5 Hz. For modulation frequencies of 32.5 or 40 Hz attention did not affect the ASSR. We propose that the pattern of enhancement and inhibition may be due to binaural suppressive effects on ipsilateral stimulation and the dominance of contralateral hemisphere during dichotic listening. In addition to the influence of cortical processing asymmetries, these results may also reflect a bias towards inhibitory ipsilateral and excitatory contralateral activation present at the level of inferior colliculus. That the effect of attention was clearest for the lower modulation frequencies suggests that such effects are likely mediated by cortical brain structures or by those in close proximity to cortex.  相似文献   

15.
Although infant speech perception in often studied in isolated modalities, infants'' experience with speech is largely multimodal (i.e., speech sounds they hear are accompanied by articulating faces). Across two experiments, we tested infants’ sensitivity to the relationship between the auditory and visual components of audiovisual speech in their native (English) and non-native (Spanish) language. In Experiment 1, infants’ looking times were measured during a preferential looking task in which they saw two simultaneous visual speech streams articulating a story, one in English and the other in Spanish, while they heard either the English or the Spanish version of the story. In Experiment 2, looking times from another group of infants were measured as they watched single displays of congruent and incongruent combinations of English and Spanish audio and visual speech streams. Findings demonstrated an age-related increase in looking towards the native relative to non-native visual speech stream when accompanied by the corresponding (native) auditory speech. This increase in native language preference did not appear to be driven by a difference in preference for native vs. non-native audiovisual congruence as we observed no difference in looking times at the audiovisual streams in Experiment 2.  相似文献   

16.
既往研究发现听觉感知包括对声音信号的觉察、感觉、注意和知觉等多个认知过程,但依然不清楚大脑如何对不同类型的复杂声音信号(如同种鸣声和其他声音)进行解码和处理,以及在感知不同类型声音信号时大脑活动的动态特征.本研究记录了在随机播放白噪声和洞内鸣叫声音刺激时仙琴蛙Nidirana daunchina的左右端脑、间脑和中脑的...  相似文献   

17.
In our daily lives, auditory stream segregation allows us to differentiate concurrent sound sources and to make sense of the scene we are experiencing. However, a combination of segregation and the concurrent integration of auditory streams is necessary in order to analyze the relationship between streams and thus perceive a coherent auditory scene. The present functional magnetic resonance imaging study investigates the relative role and neural underpinnings of these listening strategies in multi-part musical stimuli. We compare a real human performance of a piano duet and a synthetic stimulus of the same duet in a prioritized integrative attention paradigm that required the simultaneous segregation and integration of auditory streams. In so doing, we manipulate the degree to which the attended part of the duet led either structurally (attend melody vs. attend accompaniment) or temporally (asynchronies vs. no asynchronies between parts), and thus the relative contributions of integration and segregation used to make an assessment of the leader-follower relationship. We show that perceptually the relationship between parts is biased towards the conventional structural hierarchy in western music in which the melody generally dominates (leads) the accompaniment. Moreover, the assessment varies as a function of both cognitive load, as shown through difficulty ratings and the interaction of the temporal and the structural relationship factors. Neurally, we see that the temporal relationship between parts, as one important cue for stream segregation, revealed distinct neural activity in the planum temporale. By contrast, integration used when listening to both the temporally separated performance stimulus and the temporally fused synthetic stimulus resulted in activation of the intraparietal sulcus. These results support the hypothesis that the planum temporale and IPS are key structures underlying the mechanisms of segregation and integration of auditory streams, respectively.  相似文献   

18.
Rapid integration of biologically relevant information is crucial for the survival of an organism. Most prominently, humans should be biased to attend and respond to looming stimuli that signal approaching danger (e.g. predator) and hence require rapid action. This psychophysics study used binocular rivalry to investigate the perceptual advantage of looming (relative to receding) visual signals (i.e. looming bias) and how this bias can be influenced by concurrent auditory looming/receding stimuli and the statistical structure of the auditory and visual signals.Subjects were dichoptically presented with looming/receding visual stimuli that were paired with looming or receding sounds. The visual signals conformed to two different statistical structures: (1) a ‘simple’ random-dot kinematogram showing a starfield and (2) a “naturalistic” visual Shepard stimulus. Likewise, the looming/receding sound was (1) a simple amplitude- and frequency-modulated (AM-FM) tone or (2) a complex Shepard tone. Our results show that the perceptual looming bias (i.e. the increase in dominance times for looming versus receding percepts) is amplified by looming sounds, yet reduced and even converted into a receding bias by receding sounds. Moreover, the influence of looming/receding sounds on the visual looming bias depends on the statistical structure of both the visual and auditory signals. It is enhanced when audiovisual signals are Shepard stimuli.In conclusion, visual perception prioritizes processing of biologically significant looming stimuli especially when paired with looming auditory signals. Critically, these audiovisual interactions are amplified for statistically complex signals that are more naturalistic and known to engage neural processing at multiple levels of the cortical hierarchy.  相似文献   

19.
Evidence from human neuroimaging and animal electrophysiological studies suggests that signals from different sensory modalities interact early in cortical processing, including in primary sensory cortices. The present study aimed to test whether functional near-infrared spectroscopy (fNIRS), an emerging, non-invasive neuroimaging technique, is capable of measuring such multisensory interactions. Specifically, we tested for a modulatory influence of sounds on activity in visual cortex, while varying the temporal synchrony between trains of transient auditory and visual events. Related fMRI studies have consistently reported enhanced activation in response to synchronous compared to asynchronous audiovisual stimulation. Unexpectedly, we found that synchronous sounds significantly reduced the fNIRS response from visual cortex, compared both to asynchronous sounds and to a visual-only baseline. It is possible that this suppressive effect of synchronous sounds reflects the use of an efficacious visual stimulus, chosen for consistency with previous fNIRS studies. Discrepant results may also be explained by differences between studies in how attention was deployed to the auditory and visual modalities. The presence and relative timing of sounds did not significantly affect performance in a simultaneously conducted behavioral task, although the data were suggestive of a positive relationship between the strength of the fNIRS response from visual cortex and the accuracy of visual target detection. Overall, the present findings indicate that fNIRS is capable of measuring multisensory cortical interactions. In multisensory research, fNIRS can offer complementary information to the more established neuroimaging modalities, and may prove advantageous for testing in naturalistic environments and with infant and clinical populations.  相似文献   

20.
Vocalizations have been elucidated in previous songbird studies, whereas less attention has been paid to non-vocal sounds. In the blue-capped cordon-bleu (Uraeginthus cyanocephalus), both sexes perform courtship displays that are accompanied by singing and distinct body movements (i.e. dance). Our previous study revealed that their courtship bobbing includes multiple rapid steps. This behaviour is quite similar to human tap dancing, because it can function as both visual and acoustic signals. To examine the acoustic signal value of such steps, we tested if their high-speed step movements produce non-vocal sounds that have amplitudes similar to vocal sounds. We found that step behaviour affected step sound amplitude. Additionally, the dancing step sounds were substantially louder than feet movement sounds in a non-courtship context, and the amplitude range overlapped with that of song notes. These results support the idea that in addition to song cordon-bleus produce acoustic signals with their feet.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号