Similar literature
20 similar records found (search time: 15 ms)
1.
Reaction time and recognition accuracy of emotional speech intonations in short meaningless words that differed in only one phoneme, with and without background noise, were studied in 49 adults aged 20-79 years. The results were compared with the same parameters for emotional intonations in meaningful speech utterances under similar conditions. Perception of emotional intonations at different linguistic levels (phonological and lexico-semantic) was found to have both common features and certain peculiarities. Recognition characteristics of emotional intonations depending on the gender and age of listeners appeared to be invariant with regard to the linguistic level of the speech stimuli. The phonemic composition of pseudowords was found to influence emotional perception, especially against background noise. Under both experimental conditions, i.e. with and without background noise, the acoustic characteristic of the stimuli most responsible for the perception of emotional speech prosody in short meaningless words was the variation of the fundamental frequency.

2.
Robust speech signal representation based on the integration of temporal and place mechanisms
The traditional extraction of spectral features from speech signals relies on FFT-based energy-spectrum analysis which, in noisy environments, treats the spectral components of noise and of speech on an equal footing. That is, the noise components are given the same weight as the speech components, so in noise this processing lets the noise mask the speech. In the auditory system, this style of coding corresponds to the frequency-analysis function of the cochlear filters, i.e., the place mechanism. In reality, however, the auditory system does not treat noise and periodic signals equally: it is sensitive to periodic signals and insensitive to noise, and auditory nerve fibers encode a stimulus through the intervals between their discharged spikes, which corresponds to the temporal coding mechanism. Building on these two processing mechanisms, this paper proposes a method that integrates the place and temporal mechanisms, which is exactly how the auditory system processes stimuli. The integrated method combines the advantages of both mechanisms and can effectively detect speech signals in noisy environments.
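The integration described above can be sketched minimally: a frame's FFT energy spectrum stands in for the place mechanism, while autocorrelation periodicity stands in for temporal (inter-spike-interval) coding; weighting the one by the other makes aperiodic noise count for less than periodic speech. This is an illustrative sketch only, not the paper's actual algorithm; all names and parameters are invented.

```python
import numpy as np

def place_temporal_voicing(frame, sr, f0_range=(60, 400)):
    """Toy combination of place (spectral) and temporal (periodicity)
    cues for flagging speech-like frames in noise. Hedged sketch, not
    the paper's algorithm."""
    # Place mechanism: FFT energy spectrum (cochlear-filter-like analysis).
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    spectral_energy = spectrum.sum()

    # Temporal mechanism: the autocorrelation peaks at the pitch period,
    # mimicking inter-spike-interval coding of periodic signals.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    ac = ac / (ac[0] + 1e-12)                       # normalize to zero lag
    lo, hi = int(sr / f0_range[1]), int(sr / f0_range[0])
    periodicity = ac[lo:hi].max()                   # high for voiced speech

    # Integration: weight spectral energy by periodicity, so aperiodic
    # noise of comparable energy is suppressed rather than averaged in.
    return spectral_energy * periodicity, periodicity

sr = 16000
t = np.arange(1024) / sr
voiced = np.sin(2 * np.pi * 150 * t)                # periodic, speech-like
noise = np.random.default_rng(0).standard_normal(1024)

_, p_voiced = place_temporal_voicing(voiced, sr)
_, p_noise = place_temporal_voicing(noise, sr)
assert p_voiced > p_noise                           # periodicity separates them
```

The periodicity score is close to 1 for the tone and near 0 for white noise, which is the asymmetry the abstract attributes to the auditory nerve's temporal coding.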

3.
This study tested the hypothesis that the previously reported advantage of musicians over non-musicians in understanding speech in noise arises from more efficient or robust coding of periodic voiced speech, particularly in fluctuating backgrounds. Speech intelligibility was measured in listeners with extensive musical training, and in those with very little musical training or experience, using normal (voiced) or whispered (unvoiced) grammatically correct nonsense sentences in noise that was spectrally shaped to match the long-term spectrum of the speech, and was either continuous or gated with a 16-Hz square wave. Performance was also measured in clinical speech-in-noise tests and in pitch discrimination. Musicians exhibited enhanced pitch discrimination, as expected. However, no systematic or statistically significant advantage for musicians over non-musicians was found in understanding either voiced or whispered sentences in either continuous or gated noise. Musicians also showed no statistically significant advantage in the clinical speech-in-noise tests. Overall, the results provide no evidence for a significant difference between young adult musicians and non-musicians in their ability to understand speech in noise.

4.

Background

Improvement of the cochlear implant (CI) front-end signal acquisition is needed to increase speech recognition in noisy environments. To suppress the directional noise, we introduce a speech-enhancement algorithm based on microphone array beamforming and spectral estimation. The experimental results indicate that this method is robust to directional mobile noise and strongly enhances the desired speech, thereby improving the performance of CI devices in a noisy environment.

Methods

The spectrum-estimation and array-beamforming methods were combined to suppress the ambient noise. The directivity coefficient was estimated in the noise-only intervals and updated to track the moving noise.
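A delay-and-sum beamformer is the simplest member of the family of array methods the abstract describes. The sketch below steers an array toward a given direction by applying fractional delays as frequency-domain phase shifts (the role the paper assigns to the Maxflat filter); the function names, geometry, and signals are illustrative assumptions, not the authors' code.

```python
import numpy as np

def delay_and_sum(mics, sr, mic_positions, steer_dir, c=343.0):
    """Align the steering direction across microphones and average.
    Hedged sketch; names and geometry are assumptions."""
    delays = mic_positions @ steer_dir / c            # seconds, per mic
    n = mics.shape[1]
    freqs = np.fft.rfftfreq(n, 1 / sr)
    out = np.zeros(n // 2 + 1, dtype=complex)
    for sig, d in zip(mics, delays):
        # Fractional delays become phase shifts in the frequency domain
        # (the paper uses a Maxflat filter for fractional sampling points).
        out += np.fft.rfft(sig) * np.exp(2j * np.pi * freqs * d)
    return np.fft.irfft(out / len(mics), n)

# Two mics, broadside target: the coherent speech sums while the
# independent noise averages down, raising the output SNR.
rng = np.random.default_rng(1)
sr, n = 16000, 2048
t = np.arange(n) / sr
target = np.sin(2 * np.pi * 300 * t)
mics = np.stack([target + 0.5 * rng.standard_normal(n),
                 target + 0.5 * rng.standard_normal(n)])
positions = np.zeros((2, 3))                          # co-located: zero delays
y = delay_and_sum(mics, sr, positions, np.array([1.0, 0.0, 0.0]))
assert np.var(y - target) < np.var(mics[0] - target)  # residual noise reduced
```

With uncorrelated noise across the two channels, averaging halves the residual noise power, which is the baseline gain the paper's adaptive directivity estimation builds on.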

Results

The proposed algorithm was implemented in the CI speech strategy. In the implementation, a Maxflat filter was used to obtain fractional sampling points, and a cepstrum-based method to distinguish desired-speech frames from noise frames. Broadband adjustment coefficients were added to compensate for the energy loss in the low-frequency band.

Discussion

The approximation of the directivity coefficient is tested and the errors are discussed. We also analyze the algorithm's constraints on noise estimation and distortion in CI processing. The performance of the proposed algorithm is analyzed and compared with other prevalent methods.

Conclusions

The hardware platform was constructed for the experiments. The speech-enhancement results showed that the algorithm suppresses non-stationary noise with a high SNR gain. Excellent performance was obtained in the speech-enhancement experiments and in mobile testing, and the signal-distortion results indicate that the algorithm is robust, combining high SNR improvement with low speech distortion.

5.
A significant fraction of newly implanted cochlear implant recipients use a hearing aid in their non-implanted ear. SCORE bimodal is a sound processing strategy developed for this configuration, aimed at normalising loudness perception and improving binaural loudness balance. Speech perception performance in quiet and noise and sound localisation ability of six bimodal listeners were measured with and without application of SCORE. Speech perception in quiet was measured either with only acoustic, only electric, or bimodal stimulation, at soft and normal conversational levels. For speech in quiet there was a significant improvement with application of SCORE. Speech perception in noise was measured for either steady-state noise, fluctuating noise, or a competing talker, at conversational levels with bimodal stimulation. For speech in noise there was no significant effect of application of SCORE. Modelling of interaural loudness differences in a long-term-average-speech-spectrum-weighted click train indicated that left-right discrimination of sound sources can improve with application of SCORE. As SCORE was found to leave speech perception unaffected or to improve it, it seems suitable for implementation in clinical devices.

6.
The purpose of the present study was to determine whether different cues to increase loudness in speech result in different internal targets (or goals) for respiratory movement and whether the neural control of the respiratory system is sensitive to changes in the speaker's internal loudness target. This study examined respiratory mechanisms during speech in 30 young adults at a comfortable level and at increased loudness levels. Increased loudness was elicited using three methods: asking subjects to target a specific sound pressure level, asking subjects to speak twice as loud as comfortable, and asking subjects to speak in noise. All three loud conditions resulted in similar increases in sound pressure level. However, the respiratory mechanisms used to support the increase in loudness differed significantly depending on how the louder speech was elicited. When asked to target a particular sound pressure level, subjects used a mechanism of increasing the lung volume at which speech was initiated to take advantage of higher recoil pressures. When asked to speak twice as loud as comfortable, subjects increased expiratory muscle tension, for the most part, to increase the pressure for speech. However, in the most natural of the elicitation methods, speaking in noise, subjects used a combined respiratory approach, using both increased recoil pressures and increased expiratory muscle tension. In noise, an additional target, possibly improving the intelligibility of speech, was reflected in the slowing of speech rate and in larger volume excursions even though the speakers were producing the same number of syllables.

7.
The study was devoted to the perception of the emotional component of speech by stuttering children under noise. The method of study was the evaluation of the probability of an accurate identification of various emotions. Stuttering children were found to be less efficient in identifying all emotions. This fact permitted the assumption that the mechanisms ensuring the identification of emotions against a background of noise form in ontogeny later in stuttering children than in normally speaking children. The noise immunity of the perception of emotions depends on the emotional coloration of speech. The interhemispheric relations observed during the perception of emotions are unstable and, when speech is masked by noise, shift toward the pattern characteristic of normally speaking children. Thus, the detected ontogenetic features suggest that the establishment of the pattern of interhemispheric relations characteristic of normally speaking children is among the reasons for the weakening of stuttering against a background of noise.

8.
The intelligibility of periodically interrupted speech improves once the silent gaps are filled with noise bursts. This improvement has been attributed to phonemic restoration, a top-down repair mechanism that helps intelligibility of degraded speech in daily life. Two hypotheses were investigated using perceptual learning of interrupted speech. If different cognitive processes played a role in restoring interrupted speech with and without filler noise, the two forms of speech would be learned at different rates and with different perceived mental effort. If the restoration benefit were an artificial outcome of using the ecologically invalid stimulus of speech with silent gaps, this benefit would diminish with training. Two groups of normal-hearing listeners were trained, one with interrupted sentences with the filler noise and the other without. Feedback was provided with the auditory playback of the unprocessed and processed sentences, as well as the visual display of the sentence text. Training increased the overall performance significantly; however, the restoration benefit did not diminish. The increase in intelligibility and the decrease in perceived mental effort were relatively similar between the groups, implying similar cognitive mechanisms for the restoration of the two types of interruptions. Training effects were both generalizable, as each group also improved on the form of speech it was not trained with, and retainable. Given the null results and the relatively small number of participants (10 per group), further research is needed to draw conclusions more confidently. Nevertheless, training with interrupted speech seems to be effective, stimulating participants to use top-down restoration more actively and efficiently. This finding further implies the potential of this training approach as a rehabilitative tool for hearing-impaired and elderly populations.

9.
Invariant and noise-proof speech understanding is an important human ability, ensured by several mechanisms of the audioverbal system, which develops in parallel with the mastering of linguistic rules. It is a fundamental problem of speech studies to clarify the mechanisms of this understanding, especially their role in speech development. The article deals with the regularities of auditory word recognition in noise by preschool children (healthy and with speech development disorders) and by patients with cochlear implants. The authors studied the recognition of words using pictures (by children) and verbal monitoring, with the subjects presented with isolated words having one or all syllables in noise. The study showed that children's ability to perceive distorted words develops in ontogeny and is closely related to the development of mental processes and the mastering of linguistic rules. The data on patients with cochlear implants also confirmed the key role of central factors in understanding distorted speech.

10.
Much of our daily communication occurs in the presence of background noise, compromising our ability to hear. While understanding speech in noise is a challenge for everyone, it becomes increasingly difficult as we age. Although aging is generally accompanied by hearing loss, this perceptual decline cannot fully account for the difficulties experienced by older adults for hearing in noise. Decreased cognitive skills concurrent with reduced perceptual acuity are thought to contribute to the difficulty older adults experience understanding speech in noise. Given that musical experience positively impacts speech perception in noise in young adults (ages 18-30), we asked whether musical experience benefits an older cohort of musicians (ages 45-65), potentially offsetting the age-related decline in speech-in-noise perceptual abilities and associated cognitive function (i.e., working memory). Consistent with performance in young adults, older musicians demonstrated enhanced speech-in-noise perception relative to nonmusicians along with greater auditory, but not visual, working memory capacity. By demonstrating that speech-in-noise perception and related cognitive function are enhanced in older musicians, our results imply that musical training may reduce the impact of age-related auditory decline.

11.
This study investigated how speech recognition in noise is affected by language proficiency for individual non-native speakers. The recognition of English and Chinese sentences was measured as a function of the signal-to-noise ratio (SNR) in sixty native Chinese speakers who had never lived in an English-speaking environment. The recognition score for speech in quiet (which varied from 15% to 92%) was found to be uncorrelated with the speech recognition threshold (SRT_Q/2), i.e. the SNR at which the recognition score drops to 50% of the recognition score in quiet. This result demonstrates separable contributions of language proficiency and auditory processing to speech recognition in noise.
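The quantity SRT_Q/2 defined here, the SNR at which the recognition score falls to half of the score in quiet, can be read off a measured psychometric function by interpolation. A minimal sketch with invented data points (the SNRs and scores below are illustrative, not the study's data):

```python
import numpy as np

def srt_q2(snrs_db, scores, quiet_score):
    """Estimate SRT_Q/2 by linear interpolation on the psychometric
    function. snrs_db ascending; scores = proportion correct at each SNR."""
    target = quiet_score / 2.0
    # np.interp needs increasing x-values; scores rise with SNR, so they
    # can serve directly as the interpolation axis.
    return np.interp(target, scores, snrs_db)

snrs = np.array([-12.0, -8.0, -4.0, 0.0, 4.0])       # dB, illustrative
scores = np.array([0.05, 0.20, 0.45, 0.70, 0.82])    # proportion correct
quiet = 0.84                                          # score in quiet
print(round(srt_q2(snrs, scores, quiet), 2))          # → -4.48
```

Because SRT_Q/2 is defined relative to each listener's own quiet score, it isolates the noise-specific (auditory) component from overall proficiency, which is what lets the study show the two are uncorrelated.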

12.
The listener-distinctive features of recognition of different emotional intonations (positive, negative and neutral) of male and female speakers in the presence or absence of background noise were studied in 49 adults aged 20-79 years. In all the listeners noise produced the most pronounced decrease in recognition accuracy for positive emotional intonation ("joy") as compared to other intonations, whereas it did not influence the recognition accuracy of "anger" in 65-79-year-old listeners. The higher emotion recognition rates of a noisy signal were observed for speech emotional intonations expressed by female speakers. Acoustic characteristics of noisy and clear speech signals underlying perception of speech emotional prosody were found for adult listeners of different age and gender.

13.
Nonnative speech poses a challenge to speech perception, especially in challenging listening environments. Audiovisual (AV) cues are known to improve native speech perception in noise. The extent to which AV cues benefit nonnative speech perception in noise, however, is much less well understood. Here, we examined native American English-speaking and native Korean-speaking listeners' perception of English sentences produced by a native American English speaker and a native Korean speaker across a range of signal-to-noise ratios (SNRs; −4 to −20 dB) in audio-only and audiovisual conditions. We employed psychometric function analyses to characterize the pattern of AV benefit across SNRs. For native English speech, the largest AV benefit occurred at an intermediate SNR (−12 dB); but for nonnative English speech, the largest AV benefit occurred at a higher SNR (−4 dB). The psychometric function analyses demonstrated that the AV benefit patterns were different between native and nonnative English speech. The nativeness of the listener exerted negligible effects on the AV benefit across SNRs. However, the nonnative listeners' ability to gain AV benefit in native English speech was related to their proficiency in English. These findings suggest that the native language background of both the speaker and the listener clearly modulates the optimal use of AV cues in speech recognition.

14.
Our auditory system has to organize and to pick up a target sound with many components, sometimes rejecting irrelevant sound components, but sometimes forming multiple streams including the target stream. This situation is well described with the concept of auditory scene analysis. Research on speech perception in noise is closely related to auditory scene analysis. This paper briefly reviews the concept of auditory scene analysis and previous and ongoing research on speech perception in noise, and discusses the future direction of research. Further experimental investigations are needed to understand our perceptual mechanisms better.

15.
We describe two design strategies that could substantially improve the performance of speech enhancement systems. Results from a preliminary study of pulse recovery are presented to illustrate the potential benefits of such strategies. The first strategy is a direct application of a non-linear, adaptive signal processing approach for recovery of speech in noise. The second strategy optimizes performance by maximizing the enhancement system's ability to evoke target speech percepts. This approach may lead to better performance because the design is optimized on a measure directly related to the ultimate goal of speech enhancement: accurate communication of the speech percept. In both systems, recently developed ‘neural network’ learning algorithms can be used to determine appropriate parameters for enhancement processing.

16.
Our multidisciplinary team obtained noise data in 27 San Francisco Bay Area restaurants. These data included typical minimum, peak, and average sound pressure levels; digital tape recordings; subjective noise ratings; and on-site unaided and aided speech discrimination tests. We report the details and implications of these noise measurements and provide basic information on selecting hearing aids and suggestions for coping with restaurant noise.

17.
The essential role of premotor cortex in speech perception
Besides the involvement of superior temporal regions in processing complex speech sounds, evidence suggests that the motor system might also play a role [1-4]. This suggests that the hearer might perceive speech by simulating the articulatory gestures of the speaker [5, 6]. It is still an open question whether this simulation process is necessary for speech perception. We applied repetitive transcranial magnetic stimulation to the premotor cortex to disrupt subjects' ability to perform a phonetic discrimination task. Subjects were impaired in discriminating stop consonants in noise but were unaffected in a control task that was matched in difficulty, task structure, and response characteristics. These results show that the disruption of human premotor cortex impairs speech perception, thus demonstrating an essential role of premotor cortices in perceptual processes.

18.
In the real world, human speech recognition nearly always involves listening in background noise. The impact of such noise on speech signals and on intelligibility performance increases with the separation of the listener from the speaker. The present behavioral experiment provides an overview of the effects of such acoustic disturbances on speech perception in conditions approaching ecologically valid contexts. We analysed the intelligibility loss in spoken word lists with increasing listener-to-speaker distance in a typical low-level natural background noise. The noise was combined with the simple spherical amplitude attenuation due to distance, basically changing the signal-to-noise ratio (SNR). Therefore, our study draws attention to some of the most basic environmental constraints that have pervaded spoken communication throughout human history. We evaluated the ability of native French participants to recognize French monosyllabic words (spoken at 65.3 dB(A), reference at 1 meter) at distances between 11 and 33 meters, which corresponded to the SNRs most revealing of the progressive effect of the selected natural noise (−8.8 dB to −18.4 dB). Our results showed that in such conditions vowel identity is largely preserved, with strikingly few vowel confusions. The results also confirmed the functional role of consonants during lexical identification. The extensive analysis of recognition scores, confusion patterns and associated acoustic cues revealed that sonorant, sibilant and burst properties were the most important parameters influencing phoneme recognition. Altogether these analyses allowed us to extract a resistance scale from consonant recognition scores. We also identified specific perceptual consonant confusion groups depending on their position in the word (onset vs. coda). Finally, our data suggested that listeners may access some acoustic cues of the CV transition, opening interesting perspectives for future studies.
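The distance-to-SNR mapping in this design follows directly from spherical spreading: each doubling of distance costs 6 dB of speech level against a steady noise floor. A small sketch, assuming a noise level of about 53.3 dB(A) back-computed from the reported SNRs (the abstract does not state this value):

```python
import math

SPEECH_AT_1M = 65.3   # dB(A), the reported 1 m reference level
NOISE_LEVEL = 53.3    # dB(A), inferred assumption, not from the paper

def snr_at_distance(d_m):
    """SNR at distance d_m under spherical (inverse-square) spreading."""
    speech_level = SPEECH_AT_1M - 20 * math.log10(d_m)  # -6 dB per doubling
    return speech_level - NOISE_LEVEL

print(round(snr_at_distance(11), 1))   # ≈ -8.8 dB, matching the abstract
print(round(snr_at_distance(33), 1))   # ≈ -18.4 dB
```

That the 11-33 m range reproduces the reported −8.8 to −18.4 dB SNRs under this single assumed noise level suggests the study's SNR manipulation was indeed the plain geometric attenuation described.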

19.
The activation of the listener's motor system during speech processing was first demonstrated by the enhancement of electromyographic tongue potentials as evoked by single-pulse transcranial magnetic stimulation (TMS) over the tongue motor cortex. This technique is, however, technically challenging and enables only a rather coarse measurement of this motor mirroring. Here, we applied TMS to listeners' tongue motor area in association with ultrasound tissue Doppler imaging to describe fine-grained tongue kinematic synergies evoked by passive listening to speech. Subjects listened to syllables requiring different patterns of dorso-ventral and antero-posterior movements (/ki/, /ko/, /ti/, /to/). Results show that passive listening to speech sounds evokes a pattern of motor synergies mirroring those occurring during speech production. Moreover, mirror motor synergies were more evident in subjects showing good performance in discriminating speech in noise, demonstrating a role of the speech-related mirror system in the feed-forward processing of the speaker's ongoing motor plan.

20.

The risk–disturbance hypothesis states that animals react to human stressors in the same way as they do to natural predators. Given increasing human–wildlife contact, understanding whether animals perceive anthropogenic sounds as a threat is important for assessing the long-term sustainability of wildlife tourism and proposing appropriate mitigation strategies. A previous study of pygmy marmoset (Cebuella niveiventris) responses found that marmosets fled, decreased feeding and resting, and increased alert behaviors in response to human speech. Following this study, we investigated pygmy marmoset reactions to playbacks of different acoustic stimuli: controls (no playback, white noise and cicadas), anthropogenic noise (human speech and motorboats), and avian predators. For each playback condition, we recorded the behavior of a marmoset and looked at how the behaviors changed during and after the playback relative to behaviors before. We repeated this on ten different marmoset groups, playing each condition once to each group. The results did not replicate the previous study on the same species, at the same site, demonstrating the importance of replication in primate research, particularly when results are used to inform conservation policy. The results showed increased scanning during playbacks of the cicadas and predators compared with before the playback, and an increase in resting after playbacks of avian predators, but no evidence of behavior change in response to playbacks of human speech. There was no effect of ambient sound levels or distance between the playback source and focal animals on their behavior in any playback condition. Although we find that noise can change the behavior of pygmy marmosets, we did not find evidence to support the risk–disturbance hypothesis.


