Similar documents
20 similar documents found (search time: 31 ms)
1.
The performance of objective speech and audio quality measures for predicting the perceived quality of frequency-compressed speech in hearing aids is investigated in this paper. A number of existing quality measures were applied to speech signals processed by a hearing aid that compresses speech spectra along frequency in order to make information contained in higher frequencies audible for listeners with severe high-frequency hearing loss. Quality measures were compared with subjective ratings obtained from normal-hearing and hearing-impaired children and adults in an earlier study. High correlations were achieved with quality measures computed by quality models based on the auditory model of Dau et al., namely, the measure PSM, computed by the quality model PEMO-Q; the measure qc, computed by the quality model proposed by Hansen and Kollmeier; and the linear subcomponent of the HASQI. For predicting quality ratings by hearing-impaired listeners, extensions of some models incorporating hearing loss were implemented and shown to achieve improved prediction accuracy. Results indicate that these objective quality measures can potentially serve as tools for assisting in the initial setting of frequency-compression parameters.
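The validation step described above — correlating an objective quality score with mean subjective ratings across processing conditions — can be sketched as follows. The scores are invented for illustration (not data from the study), and `pearson_r` is a plain-Python helper, not any of the cited quality models:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-condition scores: an objective measure (e.g. a PSM-like
# score in [0, 1]) and mean subjective quality ratings for the same conditions.
objective = [0.95, 0.88, 0.71, 0.60, 0.42]
subjective = [4.6, 4.1, 3.3, 2.9, 1.8]

r = pearson_r(objective, subjective)
print(round(r, 3))
```

A high `r` across conditions is what qualifies a measure as a candidate fitting tool; studies like the one above typically also check correlations separately per listener group.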

2.
It was found that, with 50-Hz-wide passbands, 100% speech intelligibility is retained in naive subjects when, on average, 950 Hz is removed from each successive 1000-Hz band. Speech is thus 95% redundant with respect to its spectral content. The comb-filter parameters were chosen from speech-intelligibility measurements in experienced subjects, such that no subject with normal hearing taking part in the experiment for the first time achieved 100% intelligibility. Two methods of learning to perceive spectrally deprived speech signals are compared: (1) auditory only and (2) with visual enhancement. In the latter case, speech intelligibility is significantly higher. The possibility of using a spectrally deprived speech signal to develop and assess the efficiency of auditory rehabilitation of implanted patients is discussed.

3.
Speech perception is thought to be linked to speech motor production. This linkage is considered to mediate multimodal aspects of speech perception, such as audio-visual and audio-tactile integration. However, direct coupling between articulatory movement and auditory perception has been little studied. The present study reveals a clear dissociation between the effects of a listener’s own speech action and the effects of viewing another’s speech movements on the perception of auditory phonemes. We assessed the intelligibility of the syllables [pa], [ta], and [ka] when listeners silently and simultaneously articulated syllables that were congruent/incongruent with the syllables they heard. The intelligibility was compared with a condition where the listeners simultaneously watched another’s mouth producing congruent/incongruent syllables but did not articulate. The intelligibility of [ta] and [ka] was degraded by articulating [ka] and [ta], respectively, which are associated with the same primary articulator (the tongue) as the heard syllables, but was not affected by articulating [pa], which is associated with a different primary articulator (the lips). In contrast, the intelligibility of [ta] and [ka] was degraded by watching the production of [pa]. These results indicate that the articulatory-induced distortion of speech perception occurs in an articulator-specific manner, while the visually induced distortion does not. The articulator-specific nature of the auditory-motor interaction in speech perception suggests that speech motor processing directly contributes to our ability to hear speech.

4.
Eight patients with Down syndrome, aged 9 years and 10 months to 25 years and 4 months, underwent partial glossectomy. Preoperative and postoperative videotaped samples of spoken words and connected speech were randomized and rated by two groups of listeners, only one of which knew of the surgery. Aesthetic appearance of speech or visual acceptability of the patient while speaking was judged from visual information only. Judgments of speech intelligibility were made from the auditory portion of the videotapes. Acceptability and intelligibility also were judged together during audiovisual presentation. Statistical analysis revealed that speech was significantly more acceptable aesthetically after surgery. No significant difference was found in speech intelligibility preoperatively and postoperatively. Ratings did not differ significantly depending on whether the rater knew of the surgery. Analysis of results obtained in various presentation modes revealed that the aesthetics of speech did not significantly affect judgment of intelligibility. Conversely, speech acceptability was greater in the presence of higher levels of intelligibility.

5.
Background
Auditory neuropathy (AN) is a recently recognized hearing disorder characterized by intact outer hair cell function, disrupted auditory nerve synchronization, and poor speech perception and recognition. Cochlear implants (CIs) are currently the most promising intervention for improving hearing and speech in individuals with AN. Although previous studies have shown optimistic results, there was large variability in the benefits of CIs among individuals with AN. The data indicate that different criteria are needed to evaluate the benefit of CIs in these children compared to those with sensorineural hearing loss. We hypothesized that a hierarchic assessment would be more appropriate for evaluating the benefits of cochlear implantation in AN individuals.
Methods
Eight prelingual children with AN who received unilateral CIs were included in this study. Hearing sensitivity and speech recognition were evaluated pre- and postoperatively within each subject. The efficacy of cochlear implantation was assessed using a stepwise hierarchic evaluation for achieving: (1) effective audibility, (2) improved speech recognition, (3) effective speech, and (4) effective communication.
Results
The postoperative hearing and speech performance varied among the subjects. According to the hierarchic assessment, all eight subjects reached the primary level of effective audibility, with an average implanted hearing threshold of 43.8 ± 10.2 dB HL. Five subjects (62.5%) attained the level of improved speech recognition, one (12.5%) reached the level of effective speech, and none (0.0%) achieved effective communication.
Conclusion
CIs benefit prelingual children with AN to varying extents. A hierarchic evaluation provides a more suitable method for determining the benefits that AN individuals are likely to receive from cochlear implantation.

6.
Reverberation is known to reduce the temporal envelope modulations present in the signal and to alter the shape of the modulation spectrum. A non-intrusive intelligibility measure for reverberant speech is proposed, motivated by the fact that the area of the modulation spectrum decreases with increasing reverberation. The proposed measure is based on the average modulation area computed across four acoustic frequency bands spanning the signal bandwidth. High correlations (r = 0.98) were observed with sentence intelligibility scores obtained from cochlear implant listeners. The proposed measure outperformed other measures, including an intrusive speech-transmission-index-based measure.

7.
Anecdotally, middle-aged listeners report difficulty conversing in social settings, even when they have normal audiometric thresholds [1-3]. Moreover, young adult listeners with "normal" hearing vary in their ability to selectively attend to speech amid similar streams of speech. Ignoring age, these individual differences correlate with physiological differences in the precision of temporal coding in the auditory brainstem, suggesting that the fidelity of encoding of suprathreshold sound helps explain individual differences [4]. Here, we revisit the conundrum of whether early aging influences an individual's ability to communicate in everyday settings. Although absolute selective attention ability is not predicted by age, reverberant energy interferes more with selective attention as age increases. Breaking the brainstem response down into components corresponding to the coding of stimulus fine structure and envelope, we find that age alters which brainstem component predicts performance. Specifically, middle-aged listeners appear to rely heavily on temporal fine structure, which is more disrupted by reverberant energy than temporal envelope structure is. In contrast, the fidelity of envelope cues predicts performance in younger adults. These results hint that temporal envelope cues influence spatial hearing in reverberant settings more than is commonly appreciated and help explain why middle-aged listeners have particular difficulty communicating in daily life.

8.
Schaette R, Turtle C, Munro KJ. PLoS ONE. 2012;7(6):e35238.
Tinnitus, a phantom auditory sensation, is associated with hearing loss in most cases, but it is unclear whether hearing loss causes tinnitus. Phantom auditory sensations can be induced in normal-hearing listeners when they experience severe auditory deprivation, such as confinement in an anechoic chamber, which can be regarded as somewhat analogous to a profound bilateral hearing loss. As this condition is relatively uncommon among tinnitus patients, induction of phantom sounds by a lesser degree of auditory deprivation could advance our understanding of the mechanisms of tinnitus. In this study, we therefore investigated the reporting of phantom sounds after continuous use of an earplug. Eighteen healthy volunteers with normal hearing wore a silicone earplug continuously in one ear for 7 days. The attenuation provided by the earplugs simulated a mild high-frequency hearing loss; mean attenuation increased from <10 dB at 0.25 kHz to >30 dB at 3 and 4 kHz. Fourteen of the 18 participants reported phantom sounds during earplug use. Eleven participants presented with stable phantom sounds on day 7 and underwent tinnitus spectrum characterization with the earplug still in place. The spectra showed that the phantom sounds were perceived predominantly as high-pitched, corresponding to the frequency range most affected by the earplug. In all cases, the auditory phantom disappeared when the earplug was removed, indicating a causal relation between auditory deprivation and phantom sounds. This relation matches the predictions of our computational model of tinnitus development, which proposes a possible mechanism by which stabilization of neuronal activity through homeostatic plasticity in the central auditory system could lead to the development of a neuronal correlate of tinnitus when auditory nerve activity is reduced by the earplug.

9.
The intelligibility of periodically interrupted speech improves once the silent gaps are filled with noise bursts. This improvement has been attributed to phonemic restoration, a top-down repair mechanism that helps the intelligibility of degraded speech in daily life. Two hypotheses were investigated using perceptual learning of interrupted speech. If different cognitive processes played a role in restoring interrupted speech with and without filler noise, the two forms of speech would be learned at different rates and with different perceived mental effort. If the restoration benefit were an artificial outcome of using the ecologically invalid stimulus of speech with silent gaps, this benefit would diminish with training. Two groups of normal-hearing listeners were trained, one with interrupted sentences with filler noise and the other without. Feedback was provided with auditory playback of the unprocessed and processed sentences, as well as a visual display of the sentence text. Training increased overall performance significantly; however, the restoration benefit did not diminish. The increase in intelligibility and the decrease in perceived mental effort were relatively similar between the groups, implying similar cognitive mechanisms for the restoration of the two types of interruptions. Training effects were generalizable, as both groups also improved their performance with the form of speech they were not trained on, and retainable. Given the null results and the relatively small number of participants (10 per group), further research is needed to draw conclusions more confidently. Nevertheless, training with interrupted speech seems to be effective, stimulating participants to use top-down restoration more actively and efficiently. This finding further implies the potential of this training approach as a rehabilitative tool for hearing-impaired and elderly populations.

10.
One of the most common examples of audiovisual speech integration is the McGurk effect. For example, an auditory syllable /ba/ recorded over incongruent lip movements that produce “ga” typically causes listeners to hear “da”. This report hypothesizes reasons why certain clinical populations and listeners who are hard of hearing might be more susceptible to visual influence. Conversely, we also examine why other listeners appear less susceptible to the McGurk effect (i.e., they report hearing just the auditory stimulus without being influenced by the visual). These explanations are accompanied by mechanistic accounts of integration phenomena, including visual inhibition of auditory information and a slower rate of accumulation of inputs. First, simulations of a linear dynamic parallel interactive model were instantiated using inhibition and facilitation to examine potential mechanisms underlying integration. In a second set of simulations, we systematically manipulated the inhibition parameter values to model data obtained from listeners with autism spectrum disorder. In summary, we argue that cross-modal inhibition parameter values explain individual variability in McGurk perceptibility. Nonetheless, different mechanisms should continue to be explored in an effort to better understand current data patterns in the audiovisual integration literature.

11.
Spatial release from masking refers to a benefit for speech understanding. It occurs when a target talker and a masker talker are spatially separated. In those cases, speech intelligibility for target speech is typically higher than when both talkers are at the same location. In cochlear implant listeners, spatial release from masking is much reduced or absent compared with normal hearing listeners. Perhaps this reduced spatial release occurs because cochlear implant listeners cannot effectively attend to spatial cues. Three experiments examined factors that may interfere with deploying spatial attention to a target talker masked by another talker. To simulate cochlear implant listening, stimuli were vocoded with two unique features. First, we used 50-Hz low-pass filtered speech envelopes and noise carriers, strongly reducing the possibility of temporal pitch cues; second, co-modulation was imposed on target and masker utterances to enhance perceptual fusion between the two sources. Stimuli were presented over headphones. Experiments 1 and 2 presented high-fidelity spatial cues with unprocessed and vocoded speech. Experiment 3 maintained faithful long-term average interaural level differences but presented scrambled interaural time differences with vocoded speech. Results show a robust spatial release from masking in Experiments 1 and 2, and a greatly reduced spatial release in Experiment 3. Faithful long-term average interaural level differences were insufficient for producing spatial release from masking. This suggests that appropriate interaural time differences are necessary for restoring spatial release from masking, at least for a situation where there are few viable alternative segregation cues.
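A sketch of the first ingredient of the vocoding described above: band envelopes low-pass filtered at 50 Hz (strongly reducing temporal pitch cues) reimposed on band-limited noise carriers. The co-modulation step is omitted, and the band count and log-spaced layout (100 Hz to 0.45·fs) are assumptions, not the study's actual parameters:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_bands=8, env_cutoff=50.0, seed=0):
    """Noise-vocoder sketch: band-limit the input, extract each band's
    Hilbert envelope, low-pass it at env_cutoff, and reimpose it on
    band-limited noise. Simplified illustration only."""
    rng = np.random.default_rng(seed)
    edges = np.geomspace(100.0, 0.45 * fs, n_bands + 1)
    lp = butter(4, env_cutoff, btype="low", fs=fs, output="sos")
    out = np.zeros(x.size)
    for lo, hi in zip(edges[:-1], edges[1:]):
        bp = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(bp, x)
        env = sosfiltfilt(lp, np.abs(hilbert(band)))  # smoothed envelope
        env = np.clip(env, 0.0, None)                 # filtering may undershoot zero
        out += env * sosfiltfilt(bp, rng.standard_normal(x.size))
    return out

fs = 16000
t = np.arange(fs) / fs
# Crude speech-like test signal: a 150-Hz tone with 3-Hz amplitude modulation.
sig = np.sin(2 * np.pi * 150 * t) * (1.0 + 0.8 * np.sin(2 * np.pi * 3 * t))
voc = noise_vocode(sig, fs)
print(voc.shape == sig.shape)
```

Because the envelope is limited to 50 Hz, periodicity cues near the voice pitch are discarded while slow amplitude fluctuations survive, which is exactly the degradation the abstract uses to probe reliance on interaural timing cues.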

12.
More than 500 pinpoint physiological experiments were performed to compare the state and development of visual and auditory disturbances during rotation on an 8-m-arm centrifuge at 14 g. High-level instrumentation provided a broad variety of measurements within very short intervals, including visual acuity and angle of vision, absolute light sensitivity, critical flicker frequency, tone-hearing thresholds, and speech intelligibility. The state of eyeground blood vessels under acceleration was assessed both remotely and by an ophthalmologist sitting beside the subject during rotation. Experimental results showed impairment of visual function as acceleration increased and an almost complete loss of vision at 12–14 g. As for the hearing function, acoustic energy incurred some loss on the way to Corti’s organ, yet hearing remained good enough to support the operator’s orientation over the whole range of accelerations not resulting in loss of consciousness. The significance of hearing retention is confirmed by the successful use of audio information in a study of the effectiveness of manual spacecraft operation at 18 g at the same laboratory. Comparative analysis of these and other laboratory and literature data makes it possible to assess the functional state of the different components of light- and acoustic-energy transport and conversion on the way to the central representations, and to reveal the weak points of both analyzers. The data favor the view of the retina as a blocker of light-energy transport. Development of this phenomenon is associated primarily with impaired blood circulation in the a. centralis retinae.

13.
Twenty-three children with Down's syndrome, aged between 3.7 and 17.5 years, underwent partial glossectomy for improvement of cosmetic appearance. Improved speech was also expected. Preoperative and postoperative audiotaped samples of spoken words and connected speech on a standardized articulation test were rated by three lay and three expert listeners on a five-point intelligibility scale. Five subjects were eliminated from both tasks and another four from connected-speech testing because of inability to complete the experimental tasks. Statistical analyses of ratings for words in 18 subjects and connected speech in 14 of them revealed no significant difference in acoustic speech intelligibility preoperatively and postoperatively. The findings suggest that a wedge-excision partial glossectomy in children with Down's syndrome does not result in significant improvement in acoustic speech intelligibility; in some patients, however, there may be an aesthetic improvement during speech.

14.
Objective: To investigate the outcomes of unilateral cochlear implantation (CI) for auditory and speech rehabilitation in preschool deaf children, and the factors that influence them. Methods: Seventy-two preschool children who underwent CI at our hospital between January 2017 and December 2017 were enrolled. Clinical data were collected by questionnaire. Factors potentially affecting auditory and speech rehabilitation were assessed against Categories of Auditory Performance (CAP) and Speech Intelligibility Rating (SIR) outcomes by univariate analysis of dichotomous variables, followed by multivariate logistic regression to evaluate treatment outcomes and the factors affecting rehabilitation. Results: Age at implantation, preoperative mean residual hearing, duration of preoperative hearing-aid use, duration of CI use, and duration of postoperative speech training correlated significantly with the fold increase in CAP scores (P < 0.05); these factors, plus duration of preoperative speech training, also correlated with the fold increase in SIR scores after treatment (P < 0.05). Age at implantation, preoperative mean residual hearing, and duration of preoperative hearing-aid use influenced postoperative CAP recovery (P < 0.05); age at implantation, duration of preoperative hearing-aid use, and duration of preoperative speech training influenced SIR recovery (P < 0.05). Conclusion: Age at implantation, preoperative mean residual hearing, duration of preoperative hearing-aid use, and duration of preoperative speech training are the main factors affecting postoperative auditory and speech recovery in preschool deaf children.

15.

Objectives

(1) To report the speech perception and intelligibility results of Mandarin-speaking patients with large vestibular aqueduct syndrome (LVAS) after cochlear implantation (CI); (2) to compare their performance with a group of CI users without LVAS; (3) to understand the effects of age at implantation and duration of implant use on the CI outcomes. The obtained data may be used to guide decisions about CI candidacy and surgical timing.

Methods

Forty-two patients with LVAS participating in this study were divided into two groups: the early group received CI before 5 years of age and the late group after 5. Open-set speech perception tests (on Mandarin tones, words and sentences) were administered one year after implantation and at the most recent follow-up visit. Categories of auditory perception (CAP) and Speech Intelligibility Rating (SIR) scale scores were also obtained.

Results

The patients with LVAS with more than 5 years of implant use (18 cases) achieved a mean score higher than 80% on the most recent speech perception tests and reached the highest level on the CAP/SIR scales. The early group developed speech perception and intelligibility steadily over time, while the late group had a rapid improvement during the first year after implantation. The two groups, regardless of their age at implantation, reached a similar performance level at the most recent follow-up visit.

Conclusion

High levels of speech performance are reached after 5 years of implant use in patients with LVAS. These patients do not necessarily need to wait until their hearing thresholds exceed 90 dB HL or their PB word score falls below 40% to receive CI. They can receive it “earlier,” when their speech perception and/or speech intelligibility do not reach the performance levels suggested in this study.

16.
We studied auditory short-latency brainstem and long-latency cortical evoked potentials (EPs) in 62 healthy children and 126 children with spastic forms of cerebral palsy (CP): spastic tetraparesis, spastic diplegia, and left- and right-side hemiplegias. An increase in audibility thresholds (independent of the CP form) was the most typical disturbance of hearing function revealed by analysis of the EPs recorded in children suffering from CP. Disturbances in the transmission of afferent impulses in the brainstem structures of the auditory system and disorders in the perception of different tones within the speech frequency range were also rather frequent. Modifications of the brainstem and cortical auditory EPs typical of different CP forms, in particular hemiplegias, are described. It is demonstrated that recording and analysis of EPs allow one to diagnose, in children with CP, hearing disorders that in many cases are subclinical in nature. This technique allows clinicians to examine the youngest children (when verbal contact with the child is difficult or impossible), to study brainstem EPs, and to obtain more objective data; these are significant advantages compared with subjective audiometry. Neirofiziologiya/Neurophysiology, Vol. 36, No. 4, pp. 306–312, July-August, 2004.

17.
The audibility of a target tone in a multitone background masker is enhanced by the presentation of a precursor sound consisting of the masker alone. There is evidence that precursor-induced neural adaptation plays a role in this perceptual enhancement. However, the precursor may also be strategically used by listeners as a spectral template of the following masker to better segregate it from the target. In the present study, we tested this hypothesis by measuring the audibility of a target tone in a multitone masker after the presentation of precursors which, in some conditions, were made dissimilar to the masker by gating their components asynchronously. The precursor and the following sound were presented either to the same ear or to opposite ears. In either case, we found no significant difference in the amount of enhancement produced by synchronous and asynchronous precursors. In a second experiment, listeners had to judge whether a synchronous multitone complex contained exactly the same tones as a preceding precursor complex or had one tone less. In this experiment, listeners performed significantly better with synchronous than with asynchronous precursors, showing that asynchronous precursors were poorer perceptual templates of the synchronous multitone complexes. Overall, our findings indicate that precursor-induced auditory enhancement cannot be fully explained by the strategic use of the precursor as a template of the following masker. Our results are consistent with an explanation of enhancement based on selective neural adaptation taking place at a central locus of the auditory system.

18.
How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex.

19.
The most common complaint of older hearing-impaired (OHI) listeners is difficulty understanding speech in the presence of noise. However, tests of consonant identification and sentence reception threshold (SeRT) provide different perspectives on the magnitude of impairment. Here we quantified speech perception difficulties in 24 OHI listeners in unaided and aided conditions by analyzing (1) consonant-identification thresholds and consonant confusions for 20 onset and 20 coda consonants in consonant-vowel-consonant (CVC) syllables presented at consonant-specific signal-to-noise ratios (SNRs), and (2) SeRTs obtained with the Quick Speech in Noise Test (QSIN) and the Hearing in Noise Test (HINT). Compared to older normal-hearing (ONH) listeners, nearly all unaided OHI listeners showed abnormal consonant-identification thresholds, abnormal consonant confusions, and reduced psychometric function slopes. Average elevations in consonant-identification thresholds exceeded 35 dB, correlated strongly with impairments in mid-frequency hearing, and were greater for hard-to-identify consonants. Advanced digital hearing aids (HAs) improved average consonant-identification thresholds by more than 17 dB, with significant HA benefit seen in 83% of OHI listeners. HAs partially normalized consonant-identification thresholds, reduced abnormal consonant confusions, and increased the slope of psychometric functions. Unaided OHI listeners showed much smaller elevations in SeRTs (mean 6.9 dB) than in consonant-identification thresholds, and SeRTs in unaided listening conditions correlated strongly (r = 0.91) with identification thresholds of easily identified consonants. HAs produced minimal SeRT benefit (2.0 dB), with only 38% of OHI listeners showing significant improvement. HA benefit on SeRTs was accurately predicted (r = 0.86) by HA benefit on easily identified consonants. Consonant-identification tests can accurately predict sentence-processing deficits and HA benefit in OHI listeners.
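The threshold elevations and psychometric-function slopes discussed above can be illustrated with a standard logistic psychometric function. The slope, guess rate, lapse rate, and threshold values below are illustrative assumptions; only the ~17 dB aided threshold shift is taken from the abstract:

```python
import math

def psychometric(snr_db, threshold_db, slope, guess=0.05, lapse=0.02):
    """Logistic psychometric function: proportion correct as a function of
    SNR, bounded below by a guess rate and above by 1 - lapse rate."""
    core = 1.0 / (1.0 + math.exp(-slope * (snr_db - threshold_db)))
    return guess + (1.0 - guess - lapse) * core

# A hearing-aid benefit like the ~17 dB threshold improvement reported above
# can be modeled as a leftward shift of the whole function (12 dB -> -5 dB).
snrs = range(-10, 21, 5)
unaided = [psychometric(s, threshold_db=12.0, slope=0.5) for s in snrs]
aided = [psychometric(s, threshold_db=-5.0, slope=0.5) for s in snrs]
print([round(a - u, 2) for a, u in zip(aided, unaided)])
```

Fitting such a function per consonant is what yields the consonant-specific thresholds and slopes compared between OHI and ONH listeners in the abstract; shallower fitted slopes correspond to less reliable identification around threshold.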

20.
The human auditory system is adept at detecting sound sources of interest from a complex mixture of several other simultaneous sounds. The ability to selectively attend to the speech of one speaker whilst ignoring other speakers and background noise is of vital biological significance—the capacity to make sense of complex ‘auditory scenes’ is significantly impaired in aging populations as well as those with hearing loss. We investigated this problem by designing a synthetic signal, termed the ‘stochastic figure-ground’ stimulus, that captures essential aspects of complex sounds in the natural environment. Previously, we showed that under controlled laboratory conditions, young listeners sampled from the university subject pool (n = 10) performed very well in detecting targets embedded in the stochastic figure-ground signal. Here, we presented a modified version of this cocktail party paradigm as a ‘game’ featured in a smartphone app (The Great Brain Experiment) and obtained data from a large population with diverse demographical patterns (n = 5148). Despite differences in paradigms and experimental settings, the observed target-detection performance by users of the app was robust and consistent with our previous results from the psychophysical study. Our results highlight the potential use of smartphone apps in capturing robust large-scale auditory behavioral data from normal healthy volunteers, which can also be extended to study auditory deficits in clinical populations with hearing impairments and central auditory disorders.


Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号