首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
This paper focuses on what electrical and magnetic recordings of human brain activity reveal about spoken language understanding. Based on the high temporal resolution of these recordings, a fine-grained temporal profile of different aspects of spoken language comprehension can be obtained. Crucial aspects of speech comprehension are lexical access, selection and semantic integration. Results show that for words spoken in context, there is no 'magic moment' when lexical selection ends and semantic integration begins. Irrespective of whether words have early or late recognition points, semantic integration processing is initiated before words can be identified on the basis of the acoustic information alone. Moreover, for one particular event-related brain potential (ERP) component (the N400), equivalent impact of sentence- and discourse-semantic contexts is observed. This indicates that in comprehension, a spoken word is immediately evaluated relative to the widest interpretive domain available. In addition, this happens very quickly. Findings are discussed that show that often an unfolding word can be mapped onto discourse-level representations well before the end of the word. Overall, the time course of the ERP effects is compatible with the view that the different information types (lexical, syntactic, phonological, pragmatic) are processed in parallel and influence the interpretation process incrementally, that is as soon as the relevant pieces of information are available. This is referred to as the immediacy principle.  相似文献   

2.
In a natural setting, speech is often accompanied by gestures. As language, speech-accompanying iconic gestures to some extent convey semantic information. However, if comprehension of the information contained in both the auditory and visual modality depends on same or different brain-networks is quite unknown. In this fMRI study, we aimed at identifying the cortical areas engaged in supramodal processing of semantic information. BOLD changes were recorded in 18 healthy right-handed male subjects watching video clips showing an actor who either performed speech (S, acoustic) or gestures (G, visual) in more (+) or less (−) meaningful varieties. In the experimental conditions familiar speech or isolated iconic gestures were presented; during the visual control condition the volunteers watched meaningless gestures (G−), while during the acoustic control condition a foreign language was presented (S−). The conjunction of the visual and acoustic semantic processing revealed activations extending from the left inferior frontal gyrus to the precentral gyrus, and included bilateral posterior temporal regions. We conclude that proclaiming this frontotemporal network the brain''s core language system is to take too narrow a view. Our results rather indicate that these regions constitute a supramodal semantic processing network.  相似文献   

3.
Relationship between song characters and morphology in New World pigeons   总被引:1,自引:0,他引:1  
We studied the pattern of variation in song characters among 16 New World pigeon species belonging to different taxonomic groups defined by morphological characters. Structural, temporal and frequency characters of the song were analysed. Principal components analyses showed that species belonging to the same taxonomic group were also grouped together by their song characters. In addition, individuals were correctly assigned into taxonomic groups by discriminant function analyses in more than 87.8% of cases. These analyses also showed that more than 87.5% of the individuals could be correctly classified by species when all song characters were included. Correct classification of individuals by species and taxonomic groups dropped when character types were analysed separately, thus showing that structural, as well as temporal and frequency characters are fundamental to define species- and group-specific identities of New World pigeon's songs. Correspondence between patterns of vocal and morphological variation found in this study can be a consequence of evolutionary changes in morphology affecting song production, as for example body size changes that constrain the syrinx to produce certain acoustic frequencies.  相似文献   

4.
As we speak, we use not only the arbitrary form–meaning mappings of the speech channel but also motivated form–meaning correspondences, i.e. iconic gestures that accompany speech (e.g. inverted V-shaped hand wiggling across gesture space to demonstrate walking). This article reviews what we know about processing of semantic information from speech and iconic gestures in spoken languages during comprehension of such composite utterances. Several studies have shown that comprehension of iconic gestures involves brain activations known to be involved in semantic processing of speech: i.e. modulation of the electrophysiological recording component N400, which is sensitive to the ease of semantic integration of a word to previous context, and recruitment of the left-lateralized frontal–posterior temporal network (left inferior frontal gyrus (IFG), medial temporal gyrus (MTG) and superior temporal gyrus/sulcus (STG/S)). Furthermore, we integrate the information coming from both channels recruiting brain areas such as left IFG, posterior superior temporal sulcus (STS)/MTG and even motor cortex. Finally, this integration is flexible: the temporal synchrony between the iconic gesture and the speech segment, as well as the perceived communicative intent of the speaker, modulate the integration process. Whether these findings are special to gestures or are shared with actions or other visual accompaniments to speech (e.g. lips) or other visual symbols such as pictures are discussed, as well as the implications for a multimodal view of language.  相似文献   

5.
In reverberant rooms with multiple-people talking, spatial separation between speech sources improves recognition of attended speech, even though both the head-shadowing and interaural-interaction unmasking cues are limited by numerous reflections. It is the perceptual integration between the direct wave and its reflections that bridges the direct-reflection temporal gaps and results in the spatial unmasking under reverberant conditions. This study further investigated (1) the temporal dynamic of the direct-reflection-integration-based spatial unmasking as a function of the reflection delay, and (2) whether this temporal dynamic is correlated with the listeners’ auditory ability to temporally retain raw acoustic signals (i.e., the fast decaying primitive auditory memory, PAM). The results showed that recognition of the target speech against the speech-masker background is a descending exponential function of the delay of the simulated target reflection. In addition, the temporal extent of PAM is frequency dependent and markedly longer than that for perceptual fusion. More importantly, the temporal dynamic of the speech-recognition function is significantly correlated with the temporal extent of the PAM of low-frequency raw signals. Thus, we propose that a chain process, which links the earlier-stage PAM with the later-stage correlation computation, perceptual integration, and attention facilitation, plays a role in spatially unmasking target speech under reverberant conditions.  相似文献   

6.
In two experiments, we explored attention deployment during the reading of Chinese words using a probe detection task. In both experiments, Chinese readers saw four simplified Chinese characters briefly, and then a probe was presented at one of the character positions. The four characters constituted either one word or two words of two characters each. Reaction time was shorter when the probe was at the character 2 position than the character 3 position in the two-word condition, but not in the one-word condition. In Experiment 2, there were more trials and the materials were more carefully controlled, and the results replicated that of Experiment 1. These results suggest that word boundary information affects attentional deployment in Chinese reading.  相似文献   

7.
We systematically determined which spectrotemporal modulations in speech are necessary for comprehension by human listeners. Speech comprehension has been shown to be robust to spectral and temporal degradations, but the specific relevance of particular degradations is arguable due to the complexity of the joint spectral and temporal information in the speech signal. We applied a novel modulation filtering technique to recorded sentences to restrict acoustic information quantitatively and to obtain a joint spectrotemporal modulation transfer function for speech comprehension, the speech MTF. For American English, the speech MTF showed the criticality of low modulation frequencies in both time and frequency. Comprehension was significantly impaired when temporal modulations <12 Hz or spectral modulations <4 cycles/kHz were removed. More specifically, the MTF was bandpass in temporal modulations and low-pass in spectral modulations: temporal modulations from 1 to 7 Hz and spectral modulations <1 cycles/kHz were the most important. We evaluated the importance of spectrotemporal modulations for vocal gender identification and found a different region of interest: removing spectral modulations between 3 and 7 cycles/kHz significantly increases gender misidentifications of female speakers. The determination of the speech MTF furnishes an additional method for producing speech signals with reduced bandwidth but high intelligibility. Such compression could be used for audio applications such as file compression or noise removal and for clinical applications such as signal processing for cochlear implants.  相似文献   

8.
Shen D  Alain C 《PloS one》2012,7(4):e36031
Attentional blink (AB) describes a phenomenon whereby correct identification of a first target impairs the processing of a second target (i.e., probe) nearby in time. Evidence suggests that explicit attention orienting in the time domain can attenuate the AB. Here, we used scalp-recorded, event-related potentials to examine whether auditory AB is also sensitive to implicit temporal attention orienting. Expectations were set up implicitly by varying the probability (i.e., 80% or 20%) that the probe would occur at the +2 or +8 position following target presentation. Participants showed a significant AB, which was reduced with the increased probe probability at the +2 position. The probe probability effect was paralleled by an increase in P3b amplitude elicited by the probe. The results suggest that implicit temporal attention orienting can facilitate short-term consolidation of the probe and attenuate auditory AB.  相似文献   

9.
Cognitive functions that rely on accurate sequencing of events, such as action planning and execution, verbal and nonverbal communication, and social interaction rely on well-tuned coding of temporal event-structure. Visual temporal event-structure coding was tested in 17 high-functioning adolescents and adults with autism spectrum disorder (ASD) and mental- and chronological-age matched typically-developing (TD) individuals using a perceptual simultaneity paradigm. Visual simultaneity thresholds were lower in individuals with ASD compared to TD individuals, suggesting that autism may be characterised by increased parsing of temporal event-structure, with a decreased capability for integration over time. Lower perceptual simultaneity thresholds in ASD were also related to increased developmental communication difficulties. These results are linked to detail-focussed and local processing bias.  相似文献   

10.
The study of the production of co-speech gestures (CSGs), i.e., meaningful hand movements that often accompany speech during everyday discourse, provides an important opportunity to investigate the integration of language, action, and memory because of the semantic overlap between gesture movements and speech content. Behavioral studies of CSGs and speech suggest that they have a common base in memory and predict that overt production of both speech and CSGs would be preceded by neural activity related to memory processes. However, to date the neural correlates and timing of CSG production are still largely unknown. In the current study, we addressed these questions with magnetoencephalography and a semantic association paradigm in which participants overtly produced speech or gesture responses that were either meaningfully related to a stimulus or not. Using spectral and beamforming analyses to investigate the neural activity preceding the responses, we found a desynchronization in the beta band (15–25 Hz), which originated 900 ms prior to the onset of speech and was localized to motor and somatosensory regions in the cortex and cerebellum, as well as right inferior frontal gyrus. Beta desynchronization is often seen as an indicator of motor processing and thus reflects motor activity related to the hand movements that gestures add to speech. Furthermore, our results show oscillations in the high gamma band (50–90 Hz), which originated 400 ms prior to speech onset and were localized to the left medial temporal lobe. High gamma oscillations have previously been found to be involved in memory processes and we thus interpret them to be related to contextual association of semantic information in memory. The results of our study show that high gamma oscillations in medial temporal cortex play an important role in the binding of information in human memory during speech and CSG production.  相似文献   

11.
This study aimed to characterize the linguistic interference that occurs during speech-in-speech comprehension by combining offline and online measures, which included an intelligibility task (at a −5 dB Signal-to-Noise Ratio) and 2 lexical decision tasks (at a −5 dB and 0 dB SNR) that were performed with French spoken target words. In these 3 experiments we always compared the masking effects of speech backgrounds (i.e., 4-talker babble) that were produced in the same language as the target language (i.e., French) or in unknown foreign languages (i.e., Irish and Italian) to the masking effects of corresponding non-speech backgrounds (i.e., speech-derived fluctuating noise). The fluctuating noise contained similar spectro-temporal information as babble but lacked linguistic information. At −5 dB SNR, both tasks revealed significantly divergent results between the unknown languages (i.e., Irish and Italian) with Italian and French hindering French target word identification to a similar extent, whereas Irish led to significantly better performances on these tasks. By comparing the performances obtained with speech and fluctuating noise backgrounds, we were able to evaluate the effect of each language. The intelligibility task showed a significant difference between babble and fluctuating noise for French, Irish and Italian, suggesting acoustic and linguistic effects for each language. However, the lexical decision task, which reduces the effect of post-lexical interference, appeared to be more accurate, as it only revealed a linguistic effect for French. Thus, although French and Italian had equivalent masking effects on French word identification, the nature of their interference was different. This finding suggests that the differences observed between the masking effects of Italian and Irish can be explained at an acoustic level but not at a linguistic level.  相似文献   

12.
Effects of background speech on reading were examined by playing aloud different types of background speech, while participants read long, syntactically complex and less complex sentences embedded in text. Readers’ eye movement patterns were used to study online sentence comprehension. Effects of background speech were primarily seen in rereading time. In Experiment 1, foreign-language background speech did not disrupt sentence processing. Experiment 2 demonstrated robust disruption in reading as a result of semantically and syntactically anomalous scrambled background speech preserving normal sentence-like intonation. Scrambled speech that was constructed from the text to-be read did not disrupt reading more than scrambled speech constructed from a different, semantically unrelated text. Experiment 3 showed that scrambled speech exacerbated the syntactic complexity effect more than coherent background speech, which also interfered with reading. Experiment 4 demonstrated that both semantically and syntactically anomalous speech produced no more disruption in reading than semantically anomalous but syntactically correct background speech. The pattern of results is best explained by a semantic account that stresses the importance of similarity in semantic processing, but not similarity in semantic content, between the reading task and background speech.  相似文献   

13.
Many behaviourally relevant sensory events such as motion stimuli and speech have an intrinsic spatio-temporal structure. This will engage intentional and most likely unintentional (automatic) prediction mechanisms enhancing the perception of upcoming stimuli in the event stream. Here we sought to probe the anticipatory processes that are automatically driven by rhythmic input streams in terms of their spatial and temporal components. To this end, we employed an apparent visual motion paradigm testing the effects of pre-target motion on lateralized visual target discrimination. The motion stimuli either moved towards or away from peripheral target positions (valid vs. invalid spatial motion cueing) at a rhythmic or arrhythmic pace (valid vs. invalid temporal motion cueing). Crucially, we emphasized automatic motion-induced anticipatory processes by rendering the motion stimuli non-predictive of upcoming target position (by design) and task-irrelevant (by instruction), and by creating instead endogenous (orthogonal) expectations using symbolic cueing. Our data revealed that the apparent motion cues automatically engaged both spatial and temporal anticipatory processes, but that these processes were dissociated. We further found evidence for lateralisation of anticipatory temporal but not spatial processes. This indicates that distinct mechanisms may drive automatic spatial and temporal extrapolation of upcoming events from rhythmic event streams. This contrasts with previous findings that instead suggest an interaction between spatial and temporal attention processes when endogenously driven. Our results further highlight the need for isolating intentional from unintentional processes for better understanding the various anticipatory mechanisms engaged in processing behaviourally relevant stimuli with predictable spatio-temporal structure such as motion and speech.  相似文献   

14.
15.
Using the event-related optical signal (EROS) technique, this study investigated the dynamics of semantic brain activation during sentence comprehension. Participants read sentences constituent-by-constituent and made a semantic judgment at the end of each sentence. The EROSs were recorded simultaneously with ERPs and time-locked to expected or unexpected sentence-final target words. The unexpected words evoked a larger N400 and a late positivity than the expected ones. Critically, the EROS results revealed activations first in the left posterior middle temporal gyrus (LpMTG) between 128 and 192 ms, then in the left anterior inferior frontal gyrus (LaIFG), the left middle frontal gyrus (LMFG), and the LpMTG in the N400 time window, and finally in the left posterior inferior frontal gyrus (LpIFG) between 832 and 864 ms. Also, expected words elicited greater activation than unexpected words in the left anterior temporal lobe (LATL) between 192 and 256 ms. These results suggest that the early lexical-semantic retrieval reflected by the LpMTG activation is followed by two different semantic integration processes: a relatively rapid and transient integration in the LATL and a relatively slow but enduring integration in the LaIFG/LMFG and the LpMTG. The late activation in the LpIFG, however, may reflect cognitive control.  相似文献   

16.
17.
Speech perception often benefits from vision of the speaker's lip movements when they are available. One potential mechanism underlying this reported gain in perception arising from audio-visual integration is on-line prediction. In this study we address whether the preceding speech context in a single modality can improve audiovisual processing and whether this improvement is based on on-line information-transfer across sensory modalities. In the experiments presented here, during each trial, a speech fragment (context) presented in a single sensory modality (voice or lips) was immediately continued by an audiovisual target fragment. Participants made speeded judgments about whether voice and lips were in agreement in the target fragment. The leading single sensory context and the subsequent audiovisual target fragment could be continuous in either one modality only, both (context in one modality continues into both modalities in the target fragment) or neither modalities (i.e., discontinuous). The results showed quicker audiovisual matching responses when context was continuous with the target within either the visual or auditory channel (Experiment 1). Critically, prior visual context also provided an advantage when it was cross-modally continuous (with the auditory channel in the target), but auditory to visual cross-modal continuity resulted in no advantage (Experiment 2). This suggests that visual speech information can provide an on-line benefit for processing the upcoming auditory input through the use of predictive mechanisms. We hypothesize that this benefit is expressed at an early level of speech analysis.  相似文献   

18.
Speech processing inherently relies on the perception of specific, rapidly changing spectral and temporal acoustic features. Advanced acoustic perception is also integral to musical expertise, and accordingly several studies have demonstrated a significant relationship between musical training and superior processing of various aspects of speech. Speech and music appear to overlap in spectral and temporal features; however, it remains unclear which of these acoustic features, crucial for speech processing, are most closely associated with musical training. The present study examined the perceptual acuity of musicians to the acoustic components of speech necessary for intra-phonemic discrimination of synthetic syllables. We compared musicians and non-musicians on discrimination thresholds of three synthetic speech syllable continua that varied in their spectral and temporal discrimination demands, specifically voice onset time (VOT) and amplitude envelope cues in the temporal domain. Musicians demonstrated superior discrimination only for syllables that required resolution of temporal cues. Furthermore, performance on the temporal syllable continua positively correlated with the length and intensity of musical training. These findings support one potential mechanism by which musical training may selectively enhance speech perception, namely by reinforcing temporal acuity and/or perception of amplitude rise time, and implications for the translation of musical training to long-term linguistic abilities.  相似文献   

19.
Auditory information is processed in a fine-to-crude hierarchical scheme, from low-level acoustic information to high-level abstract representations, such as phonological labels. We now ask whether fine acoustic information, which is not retained at high levels, can still be used to extract speech from noise. Previous theories suggested either full availability of low-level information or availability that is limited by task difficulty. We propose a third alternative, based on the Reverse Hierarchy Theory (RHT), originally derived to describe the relations between the processing hierarchy and visual perception. RHT asserts that only the higher levels of the hierarchy are immediately available for perception. Direct access to low-level information requires specific conditions, and can be achieved only at the cost of concurrent comprehension. We tested the predictions of these three views in a series of experiments in which we measured the benefits from utilizing low-level binaural information for speech perception, and compared it to that predicted from a model of the early auditory system. Only auditory RHT could account for the full pattern of the results, suggesting that similar defaults and tradeoffs underlie the relations between hierarchical processing and perception in the visual and auditory modalities.  相似文献   

20.
Auditory information is processed in a fine-to-crude hierarchical scheme, from low-level acoustic information to high-level abstract representations, such as phonological labels. We now ask whether fine acoustic information, which is not retained at high levels, can still be used to extract speech from noise. Previous theories suggested either full availability of low-level information or availability that is limited by task difficulty. We propose a third alternative, based on the Reverse Hierarchy Theory (RHT), originally derived to describe the relations between the processing hierarchy and visual perception. RHT asserts that only the higher levels of the hierarchy are immediately available for perception. Direct access to low-level information requires specific conditions, and can be achieved only at the cost of concurrent comprehension. We tested the predictions of these three views in a series of experiments in which we measured the benefits from utilizing low-level binaural information for speech perception, and compared it to that predicted from a model of the early auditory system. Only auditory RHT could account for the full pattern of the results, suggesting that similar defaults and tradeoffs underlie the relations between hierarchical processing and perception in the visual and auditory modalities.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号