Similar Articles

20 similar articles found.
1.
The purpose of the present study was to determine whether different cues to increase loudness in speech result in different internal targets (or goals) for respiratory movement and whether the neural control of the respiratory system is sensitive to changes in the speaker's internal loudness target. This study examined respiratory mechanisms during speech in 30 young adults at comfortable and increased loudness levels. Increased loudness was elicited using three methods: asking subjects to target a specific sound pressure level, asking subjects to speak twice as loud as comfortable, and asking subjects to speak in noise. All three loud conditions resulted in similar increases in sound pressure level. However, the respiratory mechanisms used to support the increase in loudness differed significantly depending on how the louder speech was elicited. When asked to target a particular sound pressure level, subjects increased the lung volume at which speech was initiated to take advantage of higher recoil pressures. When asked to speak twice as loud as comfortable, subjects for the most part increased expiratory muscle tension to increase the pressure for speech. However, in the most natural of the elicitation methods, speaking in noise, subjects used a combined respiratory approach, with both increased recoil pressures and increased expiratory muscle tension. In noise, an additional target, possibly improving the intelligibility of speech, was reflected in a slower speech rate and larger volume excursions even though the speakers produced the same number of syllables.

2.
It was found that, with test bands 50 Hz wide, 100% speech intelligibility is retained in naive subjects when, on average, 950 Hz is removed from each successive 1000-Hz band. Thus, speech is 95% redundant with respect to its spectral content. The parameters of the comb filter were chosen from speech intelligibility measurements in experienced subjects, such that no normal-hearing subject taking part in the experiment for the first time exhibited 100% intelligibility. Two methods of learning to perceive spectrally deprived speech signals are compared: (1) aurally only and (2) with visual enhancement. In the latter case, speech intelligibility is significantly higher. The possibility of using a spectrally deprived speech signal to develop and assess the efficiency of auditory rehabilitation of implanted patients is discussed.
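The spectral deprivation described above — keeping only a narrow slice of each 1000-Hz band — can be sketched with a simple FFT mask. The 50-Hz kept slice and the masking approach below are illustrative assumptions, not the authors' exact filter design.

```python
import numpy as np

def comb_filter(x, fs, band_step=1000.0, keep_bw=50.0):
    """Zero the spectrum except for a keep_bw-wide slice at the bottom of
    every band_step interval (illustrative parameters, not the study's)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    mask = (freqs % band_step) < keep_bw        # True only inside kept slices
    return np.fft.irfft(spectrum * mask, n=len(x))

fs = 16000
t = np.arange(fs) / fs                          # 1 s of signal
kept = comb_filter(np.sin(2 * np.pi * 1025 * t), fs)  # 1025 Hz: inside a kept slice
cut = comb_filter(np.sin(2 * np.pi * 440 * t), fs)    # 440 Hz: removed
```

With these parameters, roughly 95% of the spectrum is discarded, mirroring the redundancy figure quoted in the abstract.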

3.
Nucleus cochlear implant systems incorporate a fast-acting front-end automatic gain control (AGC), sometimes called a compression limiter. The objective of the present study was to determine the effect of replacing the front-end compression limiter with a newly proposed envelope profile limiter. A secondary objective was to investigate the effect of AGC speed on cochlear implant speech intelligibility. The envelope profile limiter was located after the filter bank and reduced the gain when the largest of the filter bank envelopes exceeded the compression threshold. The compression threshold was set equal to the saturation level of the loudness growth function (i.e., the envelope level that mapped to the maximum comfortable current level), ensuring that no envelope clipping occurred. To preserve the spectral profile, the same gain was applied to all channels. Experiment 1 compared sentence recognition with the front-end limiter and with the envelope profile limiter, each with two release times (75 and 625 ms). Six implant recipients were tested in quiet and in four-talker babble noise, at a high presentation level of 89 dB SPL. Overall, release time had a larger effect than the AGC type. With both AGC types, speech intelligibility was lower for the 75 ms release time than for the 625 ms release time. With the shorter release time, the envelope profile limiter provided higher group mean scores than the front-end limiter in quiet, but there was no significant difference in noise. Experiment 2 measured sentence recognition in noise as a function of presentation level, from 55 to 89 dB SPL. The envelope profile limiter with 625 ms release time yielded better scores than the front-end limiter with 75 ms release time. A take-home study showed no clear pattern of preferences. It is concluded that the envelope profile limiter is a feasible alternative to a front-end compression limiter.
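The core of the envelope profile limiter — one common gain across all channels, driven by the largest filter-bank envelope — can be sketched in a few lines. This is an instantaneous version without the attack/release smoothing of a real AGC, and the function and variable names are hypothetical.

```python
import numpy as np

def envelope_profile_limit(envelopes, threshold):
    """Scale all channels by one common gain per frame so that the largest
    filter-bank envelope never exceeds `threshold`. Applying the same gain
    to every channel preserves the spectral profile (no per-channel
    clipping). Instantaneous sketch: no attack/release smoothing."""
    peak = envelopes.max(axis=0)                          # largest envelope per frame
    gain = np.minimum(1.0, threshold / np.maximum(peak, 1e-12))
    return envelopes * gain                               # same gain on every channel

# Two channels, three frames; only the middle frame exceeds the threshold
env = np.array([[0.2, 1.5, 0.4],
                [0.1, 0.5, 0.9]])
out = envelope_profile_limit(env, threshold=1.0)
```

Because the gain is shared, the cross-channel ratio within each frame is unchanged, which is the property that distinguishes this limiter from per-channel envelope clipping.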

4.
Real-world sounds like speech or traffic noise typically exhibit spectro-temporal variability because the energy in different spectral regions evolves differently as a sound unfolds in time. However, it is currently not well understood how the energy in different spectral and temporal portions contributes to loudness. This study investigated how listeners weight different temporal and spectral components of a sound when judging its overall loudness. Spectral weights were measured for the combination of three loudness-matched narrowband noises with different center frequencies. To measure temporal weights, 1,020-ms stimuli were presented, which randomly changed in level every 100 ms. Temporal weights were measured for each narrowband noise separately, and for a broadband noise containing the combination of the three noise bands. Finally, spectro-temporal weights were measured with stimuli where the level of the three narrowband noises randomly and independently changed every 100 ms. The data consistently showed that (i) the first 300 ms of the sounds had a greater influence on overall loudness perception than later temporal portions (primacy effect), and (ii) the lowest noise band contributed significantly more to overall loudness than the higher bands. The temporal weights did not differ between the three frequency bands. Notably, the spectral weights and temporal weights estimated from the conditions with only spectral or only temporal variability were very similar to the corresponding weights estimated in the spectro-temporal condition. The results indicate that the temporal and the spectral weighting of the loudness of a time-varying sound are independent processes. The spectral weights remain constant across time, and the temporal weights do not change across frequency. The results are discussed in the context of current loudness models.
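Perceptual weights of this kind are conventionally estimated by multiple linear regression of trial-by-trial loudness judgments on the per-segment levels. The following simulation (hypothetical numbers, not the study's data) builds in a primacy effect and recovers it from the regression coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_segments = 2000, 10          # ten 100-ms segments, ~1,020-ms stimuli
levels = rng.normal(60.0, 2.5, (n_trials, n_segments))   # dB level per segment

# Simulated listener with a primacy effect: first 300 ms weighted double
true_w = np.array([2, 2, 2, 1, 1, 1, 1, 1, 1, 1], dtype=float)
judgment = levels @ true_w + rng.normal(0.0, 1.0, n_trials)  # internal noise

# Perceptual weights = regression coefficients of judgment on segment levels
X = np.column_stack([levels, np.ones(n_trials)])             # add intercept
w_hat = np.linalg.lstsq(X, judgment, rcond=None)[0][:n_segments]
```

The estimated weights reproduce the built-in pattern: larger coefficients for the early segments, roughly flat weights thereafter.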

5.
Introduction

Dynamic MRI analysis of phonation has gathered interest in voice and speech physiology. However, there are limited data on the extent to which articulation depends on loudness.

Results

The data show articulatory differences with changes in both pitch and loudness: lip opening and pharynx width increased. The vertical larynx position rose with pitch but was lower at greater loudness. In particular, lip opening and pharynx width correlated more strongly with sound pressure level than with pitch.

Conclusion

For the vowel /a/, loudness affects articulation during singing, which should be considered when articulatory vocal tract data are interpreted.

6.
Reverberation is known to reduce the temporal envelope modulations present in the signal and to affect the shape of the modulation spectrum. A non-intrusive intelligibility measure for reverberant speech is proposed, motivated by the fact that the area of the modulation spectrum decreases with increasing reverberation. The proposed measure is based on the average modulation area computed across four acoustic frequency bands spanning the signal bandwidth. High correlations (r = 0.98) were observed with sentence intelligibility scores obtained by cochlear implant listeners. The proposed measure outperformed other measures, including an intrusive speech-transmission-index-based measure.
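A minimal sketch of the proposed idea — the area under the envelope modulation spectrum, averaged over a few acoustic bands, shrinks as the envelope flattens — might look like this. The band edges, modulation cutoff, and normalization are assumptions, not the paper's exact recipe.

```python
import numpy as np

def hilbert_envelope(x):
    """Analytic-signal envelope via FFT (assumes even-length input)."""
    X = np.fft.fft(x)
    h = np.zeros(len(x))
    h[0] = h[len(x) // 2] = 1.0
    h[1:len(x) // 2] = 2.0
    return np.abs(np.fft.ifft(X * h))

def avg_modulation_area(x, fs, n_bands=4, mod_cutoff=20.0):
    """Average area under the low-frequency envelope modulation spectrum
    across n_bands acoustic bands spanning the signal bandwidth."""
    edges = np.linspace(0.0, fs / 2, n_bands + 1)
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), 1.0 / fs)
    areas = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = np.fft.irfft(X * ((f >= lo) & (f < hi)), n=len(x))
        env = hilbert_envelope(band)
        M = np.abs(np.fft.rfft(env - env.mean()))     # envelope modulation spectrum
        mf = np.fft.rfftfreq(len(env), 1.0 / fs)
        areas.append(M[mf <= mod_cutoff].sum())
    return float(np.mean(areas))

fs = 8000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
noise = rng.normal(size=fs)
modulated = noise * (1.0 + 0.9 * np.sin(2 * np.pi * 4 * t))  # strong 4-Hz envelope
area_mod = avg_modulation_area(modulated, fs)
area_flat = avg_modulation_area(noise, fs)
```

A speech-like signal with deep slow envelope modulations yields a larger area than a flat-envelope noise, which is the direction of the effect the measure exploits: reverberation smears the envelope toward the flat case.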

7.
We systematically determined which spectrotemporal modulations in speech are necessary for comprehension by human listeners. Speech comprehension has been shown to be robust to spectral and temporal degradations, but the specific relevance of particular degradations is arguable due to the complexity of the joint spectral and temporal information in the speech signal. We applied a novel modulation filtering technique to recorded sentences to restrict acoustic information quantitatively and to obtain a joint spectrotemporal modulation transfer function for speech comprehension, the speech MTF. For American English, the speech MTF showed the criticality of low modulation frequencies in both time and frequency. Comprehension was significantly impaired when temporal modulations <12 Hz or spectral modulations <4 cycles/kHz were removed. More specifically, the MTF was bandpass in temporal modulations and low-pass in spectral modulations: temporal modulations from 1 to 7 Hz and spectral modulations <1 cycle/kHz were the most important. We evaluated the importance of spectrotemporal modulations for vocal gender identification and found a different region of interest: removing spectral modulations between 3 and 7 cycles/kHz significantly increased gender misidentification of female speakers. The determination of the speech MTF furnishes an additional method for producing speech signals with reduced bandwidth but high intelligibility. Such compression could be used for audio applications such as file compression or noise removal and for clinical applications such as signal processing for cochlear implants.
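The modulation filtering itself reduces to masking the 2D Fourier transform of a spectrogram along its temporal-modulation (Hz) and spectral-modulation (cycles/kHz) axes. Below is a simplified low-pass sketch with assumed parameter values, not the authors' analysis/synthesis code.

```python
import numpy as np

def modulation_lowpass(spec, t_step, f_step, wt_max, wf_max):
    """Remove temporal modulations above wt_max (Hz) and spectral
    modulations above wf_max (cycles/kHz) by masking the 2D FFT of a
    spectrogram with frame hop t_step (s) and channel spacing f_step (kHz)."""
    S = np.fft.fft2(spec)
    wf = np.fft.fftfreq(spec.shape[0], d=f_step)   # spectral modulations
    wt = np.fft.fftfreq(spec.shape[1], d=t_step)   # temporal modulations
    mask = (np.abs(wf)[:, None] <= wf_max) & (np.abs(wt)[None, :] <= wt_max)
    return np.real(np.fft.ifft2(S * mask))

# Toy spectrogram: 16 channels, 100 frames at a 10-ms hop, carrying a slow
# (2 Hz) and a fast (20 Hz) temporal modulation in every channel
n_f, n_t = 16, 100
t = np.arange(n_t) * 0.01
slow = np.cos(2 * np.pi * 2 * t)
fast = np.cos(2 * np.pi * 20 * t)
spec = np.tile(slow + fast, (n_f, 1))
smoothed = modulation_lowpass(spec, t_step=0.01, f_step=0.1,
                              wt_max=12.0, wf_max=4.0)
```

With the 12-Hz temporal cutoff quoted in the abstract, the 20-Hz component is removed and the 2-Hz component survives unchanged.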

8.
Twenty-three children with Down's syndrome, aged between 3.7 and 17.5 years, underwent partial glossectomy for improvement of cosmetic appearance. Improved speech was also expected. Preoperative and postoperative audiotaped samples of spoken words and connected speech on a standardized articulation test were rated by three lay and three expert listeners on a five-point intelligibility scale. Five subjects were eliminated from both tasks and another four from connected-speech testing because of inability to complete the experimental tasks. Statistical analyses of ratings for words in 18 subjects and connected speech in 14 of them revealed no significant difference in acoustic speech intelligibility preoperatively and postoperatively. The findings suggest that a wedge-excision partial glossectomy in children with Down's syndrome does not result in significant improvement in acoustic speech intelligibility; in some patients, however, there may be an aesthetic improvement during speech.

9.
Luo H, Poeppel D. Neuron. 2007;54(6):1001-1010.
How natural speech is represented in the auditory cortex constitutes a major challenge for cognitive neuroscience. Although many single-unit and neuroimaging studies have yielded valuable insights about the processing of speech and matched complex sounds, the mechanisms underlying the analysis of speech dynamics in human auditory cortex remain largely unknown. Here, we show that the phase pattern of theta band (4-8 Hz) responses recorded from human auditory cortex with magnetoencephalography (MEG) reliably tracks and discriminates spoken sentences and that this discrimination ability is correlated with speech intelligibility. The findings suggest that an approximately 200 ms temporal window (period of theta oscillation) segments the incoming speech signal, resetting and sliding to track speech dynamics. This hypothesized mechanism for cortical speech analysis is based on the stimulus-induced modulation of inherent cortical rhythms and provides further evidence implicating the syllable as a computational primitive for the representation of spoken language.
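The phase-tracking result can be illustrated with a cross-trial phase-locking computation on simulated theta-band phases: repeats of the same sentence share a phase trajectory, unrelated trials do not. The jitter level and trial count below are arbitrary, and this is a proxy for, not a reproduction of, the paper's phase-dissimilarity analysis.

```python
import numpy as np

rng = np.random.default_rng(2)
fs, n_trials = 200, 20
n_samp = 2 * fs                                   # 2 s at 200 Hz

# A theta-range (4-8 Hz) phase trajectory shared across repeats of one sentence
step = rng.uniform(2 * np.pi * 4 / fs, 2 * np.pi * 8 / fs, n_samp)
track = np.cumsum(step)
same = track + 0.3 * rng.normal(size=(n_trials, n_samp))   # jittered repeats
diff = rng.uniform(0, 2 * np.pi, size=(n_trials, n_samp))  # unrelated trials

def plv(phases):
    """Phase-locking value across trials, averaged over time:
    1 = identical phase patterns, ~0 = unrelated phases."""
    return float(np.abs(np.exp(1j * phases).mean(axis=0)).mean())

plv_same, plv_diff = plv(same), plv(diff)
```

High cross-trial phase consistency for repeats of the same sentence, and near-chance consistency for unrelated trials, is the statistical signature behind discriminating sentences from theta phase patterns.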

10.
Elucidating the structure and function of joint vocal displays (e.g. duet, chorus) recorded with a conventional microphone has proved difficult in some animals owing to the complex acoustic properties of the combined signal, a problem reminiscent of multi-speaker conversations in humans. Towards this goal, we set out to simultaneously compare air-transmitted (AT) with radio-transmitted (RT) vocalizations in one pair of humans and one pair of captive Bolivian grey titi monkeys (Plecturocebus donacophilus) all equipped with an accelerometer – or vibration transducer – closely apposed to the larynx. First, we observed no crosstalk between the two radio transmitters when subjects produced vocalizations at the same time close to each other. Second, compared with AT acoustic recordings, sound segmentation and pitch tracking of the RT signal was more accurate, particularly in a noisy and reverberating environment. Third, RT signals were less noisy than AT signals and displayed more stable amplitude regardless of distance, orientation and environment of the animal. The microphone outperformed the accelerometer with respect to sound spectral bandwidth and speech intelligibility: the sounds of RT speech were more attenuated and dampened as compared to AT speech. Importantly, we show that vocal telemetry allows reliable separation of the subjects’ voices during production of joint vocalizations, which has great potential for future applications of this technique with free-ranging animals.

11.

Background

The well-established left hemisphere specialisation for language processing has long been claimed to be based on a low-level auditory specialization for specific acoustic features in speech, particularly regarding ‘rapid temporal processing’.

Methodology

A novel analysis/synthesis technique was used to construct a variety of sounds based on simple sentences, which could be manipulated in spectro-temporal complexity and in whether they were intelligible. All sounds consisted of two noise-excited spectral prominences (based on the lower two formants in the original speech) which could be static or could vary in frequency and/or amplitude independently. Dynamically varying both acoustic features based on the same sentence led to intelligible speech, but when either or both acoustic features were static, the stimuli were not intelligible. Using the frequency dynamics from one sentence with the amplitude dynamics of another led to unintelligible sounds of comparable spectro-temporal complexity to the intelligible ones. Positron emission tomography (PET) was used to compare which brain regions were active when participants listened to the different sounds.

Conclusions

Neural activity to spectral and amplitude modulations sufficient to support speech intelligibility (without actually being intelligible) was seen bilaterally, with a right temporal lobe dominance. A left dominant response was seen only to intelligible sounds. It thus appears that the left hemisphere specialisation for speech is based on the linguistic properties of utterances, not on particular acoustic features.

12.
Numerous speech processing techniques have been applied to assist hearing-impaired subjects with extreme high-frequency hearing losses who can be helped only to a limited degree with conventional hearing aids. The results of providing this class of deaf subjects with a speech encoding hearing aid, which is able to reproduce intelligible speech for their particular needs, have generally been disappointing. There are at least four problems related to bandwidth compression applied to the voiced portion of speech: (1) the problem of pitch extraction in real time; (2) pitch extraction under realistic listening conditions, i.e. when competing speech and noise sources are present; (3) an insufficient data base for successful compression of voiced speech; and (4) the introduction of undesirable spectral energies in the bandwidth-compressed signal, due to the compression process itself. Experiments seem to indicate that voiced speech segments bandwidth limited to f = 1000 Hz, even at a loss of higher formant frequencies, is in most instances superior in intelligibility compared to bandwidth-compressed voiced speech segments of the same bandwidth, even if pitch can be extracted with no error. With the added complexity of real-time pitch extraction which has to function in actual listening conditions, it is doubtful that a speech encoding hearing aid, based on bandwidth compression on the voiced portion of speech, could be successfully implemented. However, if bandwidth compression is applied to the unvoiced portions of speech only, the above limitations can be overcome (1).

13.
How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex.
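The linear stimulus-reconstruction step amounts to ridge regression from electrode activity back onto the stimulus spectrogram. A toy simulation with synthetic data and assumed dimensions (not the study's recordings) looks like this:

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_elec, n_freq = 500, 30, 8
stim = rng.normal(size=(T, n_freq))                 # spectrogram frames to decode
W_enc = rng.normal(size=(n_freq, n_elec))           # hypothetical linear encoding
neural = stim @ W_enc + 0.1 * rng.normal(size=(T, n_elec))  # noisy "recordings"

# Decoder: ridge regression mapping neural activity back to the spectrogram
lam = 1.0                                           # ridge penalty (assumed)
W_dec = np.linalg.solve(neural.T @ neural + lam * np.eye(n_elec),
                        neural.T @ stim)
recon = neural @ W_dec
r = np.corrcoef(recon.ravel(), stim.ravel())[0, 1]  # reconstruction accuracy
```

In practice the decoder uses multiple time lags of the neural signal and cross-validated regularization; the single-lag version above only shows the shape of the linear readout.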

14.
Quantification of mucosal oxygenation levels during endoscopic imaging provides useful physiological and diagnostic information. In this work, a method for non-contact quantification of the oxygen saturation index during endoscopic imaging using three discrete spectral bands in the blue, green, and red parts of the spectrum (RGB bands) was investigated. The oxygen saturation index (TOI_rgb) was calculated from the three discrete RGB spectral bands using diffusion-approximation modeling and least-squares analysis. A parametric study was performed to identify the optimum bandwidth for each of the three spectral bands. The quantification algorithm was applied to in vivo images of the endobronchial mucosa to calculate TOI_rgb from selected areas within the image view. The results were compared with those obtained from full visible-spectrum (470–700 nm, 10-nm steps) measurements. The analysis showed that a bandwidth of at least 20 nm in the blue and green is required to obtain the best results, and that the method estimates oxygenation levels with about 90% accuracy relative to the full-spectrum result. The results suggest the potential of quantifying oxygen saturation levels from the three narrow RGB spectral bands/images.

15.
Neuhofer D, Ronacher B. PLoS ONE. 2012;7(3):e34384.

Background

Animals that communicate by sound face the problem that the signals arriving at the receiver often are degraded and masked by noise. Frequency filters in the receiver's auditory system may improve the signal-to-noise ratio (SNR) by excluding parts of the spectrum which are not occupied by the species-specific signals. This solution, however, is hardly amenable to species that produce broad band signals or have ears with broad frequency tuning. In mammals auditory filters exist that work in the temporal domain of amplitude modulations (AM). Do insects also use this type of filtering?

Principal Findings

Combining behavioural and neurophysiological experiments, we investigated whether AM filters may improve the recognition of masked communication signals in grasshoppers. The AM pattern of the sound, its envelope, is crucial for signal recognition in these animals. We degraded the species-specific song by adding random fluctuations to its envelope. Six noise bands were used that differed in their overlap with the spectral content of the song envelope. If AM filters contribute to reduced masking, signal recognition should depend on the degree of overlap between the song envelope spectrum and the noise spectra. Contrary to this prediction, the resistance against signal degradation was the same for five of six masker bands. Most remarkably, the band with the strongest frequency overlap with the natural song envelope (0–100 Hz) impaired acceptance of degraded signals the least. To assess the noise-filtering capacities of single auditory neurons, we quantified how spike trains changed as a function of masking level. Increasing levels of signal degradation in different frequency bands led to similar changes in the spike trains of most neurons.

Conclusions

There is no indication that auditory neurons of grasshoppers are specialized to improve the SNR with respect to the pattern of amplitude modulations.

16.
The effects of neuraminidase treatment on the electrophoretic pattern of alkaline phosphatase (AP) isozymes and on AP activity were investigated in chicken plasma. AP comprised three isozymes. The zymogram of an individual chicken plasma showed two bands: either the faster- (F) or the slower-moving (S) band, depending on isozyme type, and the B band irrespective of isozyme type. The mobility of the S band and AP activity in chicken plasma were not affected by neuraminidase treatment. The treatment reduced the migration rate of the F band to that of the S band and shifted the B band of both types closer to the origin. The genetic control of these bands is discussed.

18.
In many animal species, male acoustic signals serve to attract a mate and therefore often play a major role for male mating success. Male body condition is likely to be correlated with male acoustic signal traits, which signal male quality and provide choosy females indirect benefits. Environmental factors such as food quantity or quality can influence male body condition and therefore possibly lead to condition-dependent changes in the attractiveness of acoustic signals. Here, we test whether stressing food plants influences acoustic signal traits of males via condition-dependent expression of these traits. We examined four male song characteristics, which are vital for mate choice in females of the grasshopper Chorthippus biguttulus. Only one of the examined acoustic traits, loudness, was significantly altered by changing body condition because of drought- and moisture-related stress of food plants. No condition dependence could be observed for syllable to pause ratio, gap duration within syllables, and onset accentuation. We suggest that food plant stress and therefore food plant quality led to shifts in loudness of male grasshopper songs via body condition changes. The other three examined acoustic traits of males do not reflect male body condition induced by food plant quality.

19.
The rapid growth of shipping on the Yangtze River has sharply increased the number of vessels on the river, and the resulting underwater noise pollution may negatively affect the Yangtze finless porpoise (Neophocaena asiaeorientalis asiaeorientalis) inhabiting the same waters. In this study, broadband recording equipment was used to record the navigation noise of common types of large vessels (length > 15 m and beam > 5 m) in the non-officially-navigable northern branch of the Hechangzhou (和畅洲) section of the Yangtze, and the peak-to-peak sound pressure level (SPLp-p) and power spectral density (PSD) were analyzed. The results show that the noise energy of large vessels spans a wide frequency range (> 100 kHz) but is concentrated at low and middle frequencies (< 10 kHz); at each frequency from 20 Hz to 144 kHz, the root-mean-square sound pressure level (SPLrms) exceeded the background noise at that frequency by 3.7–66.5 dB. The received one-third-octave band levels (TOL) were above 70 dB at all frequencies and exceeded the hearing threshold of the Yangtze finless porpoise in the 8–140 kHz band. This suggests that the navigation noise of large vessels may adversely affect acoustic communication between Yangtze finless porpoises and their hearing, for example through auditory masking.
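The one-third-octave band levels (TOL) reported above can be computed from a calibrated pressure recording roughly as follows. The band definition (fc · 2^(±1/6)) and the 1 µPa reference pressure are standard underwater-acoustics conventions assumed here, not the study's documented processing chain.

```python
import numpy as np

def third_octave_levels(x, fs, centers, p_ref=1e-6):
    """1/3-octave band levels (dB re p_ref) from the one-sided power
    spectrum of a pressure time series x (Pa). p_ref = 1e-6 Pa (1 uPa)
    is the underwater convention."""
    P = np.abs(np.fft.rfft(x)) ** 2 / len(x) ** 2   # per-bin power (Pa^2)
    P[1:-1] *= 2.0                                  # fold in negative frequencies
    f = np.fft.rfftfreq(len(x), 1.0 / fs)
    levels = []
    for fc in centers:
        in_band = (f >= fc / 2 ** (1 / 6)) & (f < fc * 2 ** (1 / 6))
        band_ms = P[in_band].sum()                  # mean-square pressure in band
        levels.append(10 * np.log10(band_ms / p_ref ** 2))
    return np.array(levels)

# 1-s, 1-kHz tone of amplitude 1 Pa: mean-square 0.5 Pa^2 -> ~116.99 dB re 1 uPa
fs = 48000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t)
levels_db = third_octave_levels(x, fs, centers=[1000.0])
```

Comparing such band levels against an audiogram, frequency by frequency, is how one judges whether the received noise exceeds the porpoise's hearing threshold in a given band.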

20.

Objectives

(1) To report the speech perception and intelligibility results of Mandarin-speaking patients with large vestibular aqueduct syndrome (LVAS) after cochlear implantation (CI); (2) to compare their performance with a group of CI users without LVAS; (3) to understand the effects of age at implantation and duration of implant use on the CI outcomes. The obtained data may be used to guide decisions about CI candidacy and surgical timing.

Methods

Forty-two patients with LVAS participating in this study were divided into two groups: the early group received CI before 5 years of age and the late group after 5. Open-set speech perception tests (on Mandarin tones, words and sentences) were administered one year after implantation and at the most recent follow-up visit. Categories of auditory perception (CAP) and Speech Intelligibility Rating (SIR) scale scores were also obtained.

Results

The patients with LVAS with more than 5 years of implant use (18 cases) achieved a mean score higher than 80% on the most recent speech perception tests and reached the highest level on the CAP/SIR scales. The early group developed speech perception and intelligibility steadily over time, while the late group had a rapid improvement during the first year after implantation. The two groups, regardless of their age at implantation, reached a similar performance level at the most recent follow-up visit.

Conclusion

High levels of speech performance are reached after 5 years of implant use in patients with LVAS. These patients do not necessarily need to wait until their hearing thresholds exceed 90 dB HL or their PB word scores fall below 40% to receive CI; they can receive an implant earlier, when their speech perception and/or speech intelligibility do not reach the performance levels suggested in this study.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号