Similar Articles
 20 similar articles found.
1.
Effects of background speech on reading were examined by playing aloud different types of background speech while participants read long, syntactically complex and less complex sentences embedded in text. Readers' eye-movement patterns were used to study online sentence comprehension. Effects of background speech were seen primarily in rereading time. In Experiment 1, foreign-language background speech did not disrupt sentence processing. Experiment 2 demonstrated robust disruption in reading caused by semantically and syntactically anomalous scrambled background speech that preserved normal sentence-like intonation. Scrambled speech constructed from the to-be-read text did not disrupt reading more than scrambled speech constructed from a different, semantically unrelated text. Experiment 3 showed that scrambled speech exacerbated the syntactic complexity effect more than coherent background speech did, although coherent speech also interfered with reading. Experiment 4 demonstrated that speech that was both semantically and syntactically anomalous produced no more disruption in reading than semantically anomalous but syntactically correct background speech. The pattern of results is best explained by a semantic account that stresses the importance of similarity in semantic processing, rather than similarity in semantic content, between the reading task and the background speech.

2.
Automatic speech recognition (ASR) is currently used in many assistive technologies, such as helping individuals with speech impairment in their communication. One challenge in ASR for speech-impaired individuals is the difficulty of obtaining a good speech database of impaired speakers for building an effective acoustic model. Because the few existing databases of impaired speech are limited in size, the obvious way to build an acoustic model of impaired speech is to employ adaptation techniques. However, two issues have not been addressed in existing studies of adaptation for speech impairment: (1) identifying the most effective adaptation technique for impaired speech; and (2) the choice of a suitable source model for building an effective impaired-speech acoustic model. This research investigates both issues for dysarthria, a type of speech impairment affecting millions of people. We applied both unimpaired and impaired speech as the source model, with well-known adaptation techniques such as maximum likelihood linear regression (MLLR) and constrained MLLR (C-MLLR). The recognition accuracy of each impaired-speech acoustic model was measured in terms of word error rate (WER), with further assessment of phoneme insertion, substitution and deletion rates. Unimpaired speech, when combined with limited high-quality speech-impaired data, improves the performance of ASR systems in recognising severely impaired dysarthric speech. C-MLLR was also found to be better than MLLR in recognising mildly and moderately impaired speech, based on statistical analysis of the WER. Phoneme substitution was the biggest contributor to WER in dysarthric speech at all levels of severity. The results show that acoustic models derived from suitable adaptation techniques improve the performance of ASR systems in recognising impaired speech with limited adaptation data.
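The word error rate used above to score each adapted model is the Levenshtein edit distance between the reference and hypothesis word sequences, normalised by reference length: WER = (S + D + I) / N. A minimal sketch (not the study's evaluation code) that also returns the substitution, deletion and insertion counts underlying the error-breakdown analysis:

```python
# Word error rate via dynamic-programming edit distance.
# Returns WER plus substitution/deletion/insertion counts,
# mirroring the error breakdown reported in the study.
def word_error_rate(reference: list[str], hypothesis: list[str]):
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = (edits, subs, dels, ins) to align reference[:i] with hypothesis[:j]
    cost = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)          # delete all reference words
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)          # insert all hypothesis words
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if reference[i - 1] == hypothesis[j - 1]:
                cost[i][j] = cost[i - 1][j - 1]
                continue
            sub, dele, ins = cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            best = min(sub, dele, ins, key=lambda t: t[0])
            if best is sub:
                cost[i][j] = (best[0] + 1, best[1] + 1, best[2], best[3])
            elif best is dele:
                cost[i][j] = (best[0] + 1, best[1], best[2] + 1, best[3])
            else:
                cost[i][j] = (best[0] + 1, best[1], best[2], best[3] + 1)
    edits, subs, dels, ins = cost[n][m]
    return edits / n, subs, dels, ins

wer, s, d, i = word_error_rate("the cat sat".split(), "the mat sat down".split())
print(f"WER={wer:.2f}  S={s} D={d} I={i}")   # WER=0.67  S=1 D=0 I=1
```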

3.
4.
Among topics related to the evolution of language, the evolution of speech is particularly fascinating. Early theorists believed that it was the ability to produce articulate speech that set the stage for the evolution of the "special" speech processing abilities that exist in modern-day humans. Prior to the evolution of speech production, speech processing abilities were presumed not to exist. The data reviewed here support a different view. Two lines of evidence, one from young human infants and the other from infrahuman species, neither of whom can produce articulate speech, show that in the absence of speech production capabilities, the perception of speech sounds is robust and sophisticated. Human infants and non-human animals evidence auditory perceptual categories that conform to those defined by the phonetic categories of language. These findings suggest the possibility that in evolutionary history the ability to perceive rudimentary speech categories preceded the ability to produce articulate speech. This in turn suggests that it may be audition that structured, at least initially, the formation of phonetic categories.

5.
Cortical oscillations are likely candidates for the segmentation and coding of continuous speech. Here, we monitored continuous speech processing with magnetoencephalography (MEG) to unravel the principles of speech segmentation and coding. We demonstrate that speech entrains the phase of low-frequency (delta, theta) and the amplitude of high-frequency (gamma) oscillations in the auditory cortex. Phase entrainment is stronger in the right and amplitude entrainment is stronger in the left auditory cortex. Furthermore, edges in the speech envelope phase-reset auditory cortex oscillations, thereby enhancing their entrainment to speech. This mechanism adapts to the changing physical features of the speech envelope and enables efficient, stimulus-specific speech sampling. Finally, we show that within the auditory cortex, coupling between delta, theta, and gamma oscillations increases following speech edges. Importantly, all couplings (both brain-speech coupling and coupling within the cortex) attenuate for backward-presented speech, suggesting top-down control. We conclude that segmentation and coding of speech rely on a nested hierarchy of entrained cortical oscillations.
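The delta/theta-gamma coupling reported above is commonly quantified as phase-amplitude coupling. A minimal sketch of the mean-vector-length index (Canolty et al., 2006) applied to synthetic data; this illustrates the generic technique, not the study's MEG pipeline:

```python
# Phase-amplitude coupling (PAC) between theta phase and gamma amplitude,
# computed as the mean-vector-length index of Canolty et al. (2006).
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def bandpass(x, lo, hi, fs, order=4):
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

def pac_mvl(x, fs, phase_band=(4, 8), amp_band=(30, 80)):
    phase = np.angle(hilbert(bandpass(x, *phase_band, fs)))   # theta phase
    amp = np.abs(hilbert(bandpass(x, *amp_band, fs)))         # gamma envelope
    return np.abs(np.mean(amp * np.exp(1j * phase)))          # mean vector length

fs = 1000
t = np.arange(0, 10, 1 / fs)
theta = np.sin(2 * np.pi * 6 * t)
# Gamma bursts locked to theta peaks -> nonzero coupling.
coupled = (1 + theta) * np.sin(2 * np.pi * 50 * t)
signal = theta + 0.5 * coupled + 0.1 * np.random.randn(t.size)
print(pac_mvl(signal, fs))   # larger than for a phase-shuffled control
```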

6.
The purpose of this study was to compare the duration and variability of speech segments of children who stutter with those of children who do not stutter, and to identify changes in the duration and variability of speech segments due to utterance length. Eighteen children participated (ranging from 6.3 to 7.9 years of age). The experimental task required the children to repeat a single word in isolation and the same word embedded in a sentence. Durations of speech segments and coefficients of variation (CV) were used to assess the temporal parameters of speech. Significant differences were found in the variability of speech segments at the sentence level, but not in duration. The findings support the assumption that linguistic factors place direct demands on the speech motor system and that the extra duration of speech segments observed in the speech of stuttering adults may be a kind of compensation strategy.
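The coefficient of variation used here is the standard deviation of segment durations normalised by their mean, which puts segments of different average lengths on a common variability scale. A minimal sketch with hypothetical duration values:

```python
# Coefficient of variation (CV) of speech-segment durations:
# standard deviation normalised by the mean, so that short and long
# segments can be compared on the same variability scale.
import numpy as np

def coefficient_of_variation(durations_ms):
    d = np.asarray(durations_ms, dtype=float)
    return d.std(ddof=1) / d.mean()

# Hypothetical vowel durations (ms) across repetitions of the same word,
# in isolation vs. embedded in a sentence.
isolated = [112, 108, 115, 110, 113]
in_sentence = [118, 95, 131, 102, 124]
print(coefficient_of_variation(isolated))     # low variability
print(coefficient_of_variation(in_sentence))  # higher variability
```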

7.
Various aspects of the mechanisms of speech have been studied. One line of research has concentrated on the perception of speech: speech sounds and word recognition. Various models of speech recognition have been created. Another line has focused on articulation, the pronunciation of words and speech sounds. This area, too, has been explored in considerable detail.

8.

Background

Hearing ability is essential for normal speech development; however, the precise mechanisms linking auditory input to the improvement of speaking ability remain poorly understood. Auditory feedback during speech production is believed to play a critical role by providing the nervous system with information about speech outcomes that is used to learn and subsequently fine-tune speech motor output. Surprisingly, few studies have directly investigated such auditory-motor learning in the speech production of typically developing children.

Methodology/Principal Findings

In the present study, we manipulated auditory feedback during speech production in a group of 9- to 11-year-old children, as well as in adults. Following a period of speech practice under conditions of altered auditory feedback, compensatory changes in speech production and perception were examined. Consistent with prior studies, the adults exhibited compensatory changes in both their speech motor output and their perceptual representations of speech sound categories. The children exhibited compensatory changes in the motor domain, with a change in speech output similar in magnitude to that of the adults; however, the children showed no reliable compensatory effect on their perceptual representations.

Conclusions

The results indicate that 9- to 11-year-old children, whose speech motor and perceptual abilities are still not fully developed, are nonetheless capable of auditory-feedback-based sensorimotor adaptation, supporting a role for such learning processes in speech motor development. Auditory feedback may play a more limited role, however, in the fine-tuning of children's perceptual representations of speech sound categories.

9.
The potential role of a size-scaling principle in orofacial movements for speech was examined using between-group (adults vs. 5-year-old children) as well as within-group correlational analyses. Movements of the lower lip and jaw were recorded during speech production, and anthropometric measures of orofacial structures were made. Adult women produced speech movements equal in amplitude and velocity to those of adult men. The children produced speech movement amplitudes equal to those of adults, but they had significantly lower peak velocities of orofacial movement. Thus we found no evidence supporting a size-scaling principle for orofacial speech movements. Young children use a relatively large-amplitude, low-velocity movement strategy for speech production compared with young adults. This strategy may reflect the need for more time to plan speech movement sequences and an increased reliance on sensory feedback as young children develop speech motor control processes.

10.
One of the central problems of psychology is the relationship between thought and speech, especially the significance of internal speech. An analysis of the unity of thought and speech is intimately linked with the analysis of internal speech, in which this unity finds its most distinct expression.

11.
It is well known that simultaneous presentation of incongruent audio and visual stimuli can lead to illusory percepts. Recent data suggest that distinct processes underlie intersensory perception of speech as opposed to non-speech. However, the development of both speech and non-speech intersensory perception across childhood and adolescence remains poorly defined. Thirty-eight observers aged 5 to 19 were tested on the McGurk effect (an audio-visual illusion involving speech) and on the Illusory Flash and Fusion effects (two audio-visual illusions not involving speech) to investigate the development of audio-visual interactions and to contrast speech and non-speech developmental patterns. Whereas the strength of audio-visual speech illusions varied as a direct function of maturational level, performance on the non-speech illusory tasks was homogeneous across all ages. These data support the existence of independent maturational processes underlying speech and non-speech audio-visual illusory effects.

12.
Luo H, Poeppel D. Neuron. 2007;54(6):1001-1010.
How natural speech is represented in the auditory cortex constitutes a major challenge for cognitive neuroscience. Although many single-unit and neuroimaging studies have yielded valuable insights about the processing of speech and matched complex sounds, the mechanisms underlying the analysis of speech dynamics in human auditory cortex remain largely unknown. Here, we show that the phase pattern of theta-band (4-8 Hz) responses recorded from human auditory cortex with magnetoencephalography (MEG) reliably tracks and discriminates spoken sentences, and that this discrimination ability is correlated with speech intelligibility. The findings suggest that an approximately 200 ms temporal window (the period of a theta oscillation) segments the incoming speech signal, resetting and sliding to track speech dynamics. This hypothesized mechanism for cortical speech analysis is based on stimulus-induced modulation of inherent cortical rhythms and provides further evidence implicating the syllable as a computational primitive for the representation of spoken language.
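The phase-pattern analysis can be approximated offline: extract the theta-band phase time series of each trial, then test whether trials of the same sentence have more similar phase patterns than trials of different sentences. A schematic sketch, assuming `trials` is an array of shape (n_trials, n_samples) with known sentence `labels` (hypothetical stand-ins for the MEG data):

```python
# Nearest-neighbor sentence discrimination from theta-band phase patterns.
# Distance between two trials = 1 - mean resultant length of their phase
# difference (0 = identical patterns). Schematic stand-in for the MEG analysis.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def theta_phase(trial, fs, band=(4.0, 8.0)):
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    return np.angle(hilbert(filtfilt(b, a, trial)))

def circular_distance(p1, p2):
    return 1.0 - np.abs(np.mean(np.exp(1j * (p1 - p2))))

def classify(trials, labels, fs):
    phases = np.array([theta_phase(tr, fs) for tr in trials])
    correct = 0
    for i in range(len(trials)):           # leave-one-out nearest neighbor
        dists = [circular_distance(phases[i], phases[j])
                 for j in range(len(trials)) if j != i]
        others = [labels[j] for j in range(len(trials)) if j != i]
        correct += labels[i] == others[int(np.argmin(dists))]
    return correct / len(trials)           # above chance -> phase discriminates
```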

13.
14.
Eight patients with Down syndrome, aged 9 years 10 months to 25 years 4 months, underwent partial glossectomy. Preoperative and postoperative videotaped samples of spoken words and connected speech were randomized and rated by two groups of listeners, only one of which knew of the surgery. Aesthetic appearance of speech (the visual acceptability of the patient while speaking) was judged from visual information only. Judgments of speech intelligibility were made from the auditory portion of the videotapes. Acceptability and intelligibility were also judged together during audiovisual presentation. Statistical analysis revealed that speech was significantly more acceptable aesthetically after surgery. No significant difference was found between preoperative and postoperative speech intelligibility. Ratings did not differ significantly depending on whether the rater knew of the surgery. Analysis of results obtained in the various presentation modes revealed that the aesthetics of speech did not significantly affect judgments of intelligibility. Conversely, speech acceptability was greater in the presence of higher levels of intelligibility.

15.
The speech code is a vehicle of language: it defines a set of forms used by a community to carry information. Such a code is necessary to support the linguistic interactions that allow humans to communicate. How, then, may a speech code be formed prior to the existence of linguistic interactions? Moreover, the human speech code is discrete and compositional, shared by all the individuals of a community but different across communities, and phoneme inventories are characterized by statistical regularities. How can a speech code with these properties form? We approach these questions using the "methodology of the artificial": we build a society of artificial agents and detail a mechanism that shows the formation of a discrete speech code without presupposing the existence of linguistic capacities or of coordinated interactions. The mechanism is based on a low-level model of sensory-motor interactions. We show that the integration of certain very simple and non-language-specific neural devices leads to the formation of a speech code with properties similar to those of the human speech code. This result relies on the self-organizing properties of a generic coupling between perception and production within agents, and on the interactions between agents. The artificial system helps us develop better intuitions about how speech might have appeared, by showing how self-organization might have helped natural selection to find speech.
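As a toy illustration of the self-organizing dynamic (far simpler than the paper's sensory-motor neural model), consider agents whose production targets drift toward the utterances they hear. Starting from random targets in a continuous one-dimensional acoustic space, purely local imitation drives the population toward a small shared set of discrete categories:

```python
# Toy self-organization of a shared, discrete "speech code".
# Each agent holds a few production targets in a 1-D acoustic space.
# On each interaction a speaker utters a noisy target and every listener
# nudges its nearest target toward what it heard. No global coordination,
# yet targets cluster: the population converges on shared categories.
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_targets = 10, 5
agents = rng.uniform(0.0, 1.0, size=(n_agents, n_targets))

for _ in range(20000):
    speaker = rng.integers(n_agents)
    utterance = rng.choice(agents[speaker]) + rng.normal(0, 0.01)
    for listener in range(n_agents):
        if listener == speaker:
            continue
        k = np.argmin(np.abs(agents[listener] - utterance))
        agents[listener, k] += 0.05 * (utterance - agents[listener, k])

# After many interactions the agents' targets sit in a few tight,
# population-shared clusters: a discrete code formed by self-organization.
print(np.sort(np.round(agents, 2), axis=1))
```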

16.
Numerous speech processing techniques have been applied to assist hearing-impaired subjects with extreme high-frequency hearing losses, who can be helped only to a limited degree by conventional hearing aids. The results of providing this class of deaf subjects with a speech-encoding hearing aid able to reproduce intelligible speech for their particular needs have generally been disappointing. There are at least four problems related to bandwidth compression applied to the voiced portion of speech: (1) pitch extraction in real time; (2) pitch extraction under realistic listening conditions, i.e. when competing speech and noise sources are present; (3) an insufficient data base for successful compression of voiced speech; and (4) the introduction of undesirable spectral energies into the bandwidth-compressed signal by the compression process itself. Experiments indicate that voiced speech segments bandlimited to f = 1000 Hz, even at the cost of losing higher formant frequencies, are in most instances more intelligible than bandwidth-compressed voiced speech segments of the same bandwidth, even if pitch can be extracted without error. Given the added complexity of real-time pitch extraction that must function in actual listening conditions, it is doubtful that a speech-encoding hearing aid based on bandwidth compression of the voiced portion of speech could be successfully implemented. However, if bandwidth compression is applied to the unvoiced portions of speech only, the above limitations can be overcome (1).

17.
Hearing one's own voice is critical for fluent speech production, as it allows for the detection and correction of vocalization errors in real time. This behavior, known as the auditory feedback control of speech, is impaired in various neurological disorders ranging from stuttering to aphasia; however, the underlying neural mechanisms are still poorly understood. Computational models of speech motor control suggest that, during speech production, the brain uses an efference copy of the motor command to generate an internal estimate of the speech output. When actual feedback differs from this internal estimate, an error signal is generated to correct the internal estimate and update the motor commands needed to produce the intended speech. We localized this auditory error signal using electrocorticographic recordings from neurosurgical participants during a delayed auditory feedback (DAF) paradigm. In this task, participants hear their own voice with a time delay as they produce words and sentences (similar to an echo on a conference call), a manipulation well known to disrupt fluency by causing slow and stutter-like speech. We observed a significant response enhancement in auditory cortex that scaled with the duration of the feedback delay, indicating an auditory speech error signal. Immediately following auditory cortex, the dorsal precentral gyrus (dPreCG), a region not previously implicated in auditory feedback processing, exhibited a markedly similar response enhancement, suggesting tight coupling between the two regions. Critically, response enhancement in dPreCG occurred only during articulation of long utterances, due to a continuous mismatch between produced speech and reafferent feedback. These results suggest that dPreCG plays an essential role in processing auditory error signals during speech production to maintain fluency.

Hearing one's own voice is critical for fluent speech production, allowing detection and correction of vocalization errors in real time. This study shows that the dorsal precentral gyrus is a critical component of a cortical network that monitors auditory feedback to produce fluent speech; this region is engaged specifically when speech production is effortful during articulation of long utterances.
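A delayed-auditory-feedback loop of the kind used in this paradigm can be sketched as a ring buffer that echoes the microphone to the headphones after a fixed lag. A minimal sketch using the `sounddevice` library (an assumed stand-in; the abstract does not specify the experimental software):

```python
# Delayed auditory feedback (DAF): play the microphone signal back
# to the headphones after a fixed delay, via a ring buffer.
import numpy as np
import sounddevice as sd

FS = 16000            # sample rate (Hz)
DELAY_MS = 200        # feedback delay; ~200 ms strongly disrupts fluency
delay_samples = FS * DELAY_MS // 1000

buffer = np.zeros(delay_samples, dtype=np.float32)
write_pos = 0

def callback(indata, outdata, frames, time, status):
    global write_pos
    if status:
        print(status)
    for i in range(frames):
        outdata[i, 0] = buffer[write_pos]   # emit the sample from DELAY_MS ago
        buffer[write_pos] = indata[i, 0]    # store the current sample
        write_pos = (write_pos + 1) % delay_samples

with sd.Stream(samplerate=FS, channels=1, dtype="float32", callback=callback):
    sd.sleep(10000)    # run the feedback loop for 10 seconds
```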

18.
This paper reviews the basic aspects of auditory processing that play a role in the perception of speech. The frequency selectivity of the auditory system, as measured using masking experiments, is described and used to derive the internal representation of the spectrum (the excitation pattern) of speech sounds. The perception of timbre and distinctions in quality between vowels are related to both static and dynamic aspects of the spectra of sounds. The perception of pitch and its role in speech perception are described. Measures of the temporal resolution of the auditory system are described and a model of temporal resolution based on a sliding temporal integrator is outlined. The combined effects of frequency and temporal resolution can be modelled by calculation of the spectro-temporal excitation pattern, which gives good insight into the internal representation of speech sounds. For speech presented in quiet, the resolution of the auditory system in frequency and time usually markedly exceeds the resolution necessary for the identification or discrimination of speech sounds, which partly accounts for the robust nature of speech perception. However, for people with impaired hearing, speech perception is often much less robust.
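The excitation pattern referred to here can be approximated by weighting the physical spectrum with a bank of rounded-exponential (roex) auditory filters whose bandwidths follow the ERB scale of Glasberg and Moore. A simplified, level-independent sketch:

```python
# Excitation-pattern sketch: smooth a power spectrum with rounded-exponential
# (roex) auditory filters centred on an ERB-spaced frequency axis.
import numpy as np

def erb(f):
    """Equivalent rectangular bandwidth (Hz) at centre frequency f (Hz)."""
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def excitation_pattern(freqs, power, centers):
    """Excitation at each auditory-filter centre frequency.

    freqs, power : spectrum of the input sound (Hz, linear power)
    centers      : auditory-filter centre frequencies (Hz)
    """
    exc = np.zeros(len(centers))
    for i, fc in enumerate(centers):
        g = np.abs(freqs - fc) / fc          # normalized frequency deviation
        p = 4.0 * fc / erb(fc)               # filter slope parameter
        w = (1.0 + p * g) * np.exp(-p * g)   # roex(p) filter weighting
        exc[i] = np.sum(w * power)
    return exc

# Example: excitation pattern of a vowel-like spectrum with two formant peaks.
freqs = np.linspace(100, 4000, 512)
power = np.exp(-((freqs - 500) / 80) ** 2) + 0.5 * np.exp(-((freqs - 1500) / 120) ** 2)
centers = np.geomspace(100, 4000, 100)
pattern_db = 10 * np.log10(excitation_pattern(freqs, power, centers) + 1e-12)
```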

19.
A mathematical model is offered to describe an algorithm for the functioning of speech rhythm. The duration of a speech signal is divided into a numbered sequence of durations of voiced and voiceless segments, and all elements of this sequence are normalized by the maximum element. We define this normalized sequence as the speech rhythm. 1) The model describes speech rhythm through recurrence relations between the elements of the rhythm. 2) The model permits the use of the concept of information entropy. 3) The model explains experimental findings obtained by our research group in a comparative investigation of rhythm in normal speech and stuttering. In particular, it explains the existence of two classes of stutterers with different speech rhythms.
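Under this definition, the rhythm sequence and the information entropy the model invokes are straightforward to compute from a voiced/voiceless segmentation. A minimal sketch (the model's actual recurrence relations are not given in the abstract, so only the normalization and entropy steps are shown):

```python
# Speech rhythm as defined above: the sequence of voiced/voiceless segment
# durations normalized by the maximum element, plus its Shannon entropy
# after binning the normalized values.
import numpy as np

def speech_rhythm(durations_ms):
    d = np.asarray(durations_ms, dtype=float)
    return d / d.max()                      # normalize by the maximum element

def rhythm_entropy(rhythm, n_bins=10):
    hist, _ = np.histogram(rhythm, bins=n_bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))          # Shannon entropy in bits

# Hypothetical alternating voiced/voiceless segment durations (ms).
durations = [180, 60, 240, 45, 150, 90, 210, 70]
r = speech_rhythm(durations)
print(r, rhythm_entropy(r))
```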

20.
Objective: To investigate the clinical efficacy of external counterpulsation combined with speech training in children with cerebral palsy and delayed language development. Methods: Fifty-two children with cerebral palsy and delayed language development, diagnosed at the general outpatient clinic of the Rehabilitation Department of Shanghai Children's Hospital between December 2015 and December 2017, were randomly divided into a treatment group and a control group using a random number table, with 26 children in each group. The control group received speech training alone; the treatment group received external counterpulsation combined with speech training. One course of treatment lasted 4 weeks, and both groups were treated for 3 courses. Before and after treatment, changes in the children's speech developmental quotient and cognitive developmental quotient were assessed and compared using the Chinese version of the S-S language development assessment of the China Rehabilitation Research Center and the Gesell developmental scale. Results: After treatment, the speech and cognitive developmental quotients of both groups were significantly higher than before treatment, and those of the treatment group were significantly higher than those of the control group (all P < 0.01). Conclusion: External counterpulsation combined with speech training improves speech and cognitive development in children with cerebral palsy and delayed language development more effectively than speech training alone.
