Similar Documents
20 similar documents retrieved.
1.
Nonnative speech poses a challenge to speech perception, especially in challenging listening environments. Audiovisual (AV) cues are known to improve native speech perception in noise. The extent to which AV cues benefit nonnative speech perception in noise, however, is much less well understood. Here, we examined native American English-speaking and native Korean-speaking listeners' perception of English sentences produced by a native American English speaker and a native Korean speaker across a range of signal-to-noise ratios (SNRs; −4 to −20 dB) in audio-only and audiovisual conditions. We employed psychometric function analyses to characterize the pattern of AV benefit across SNRs. For native English speech, the largest AV benefit occurred at an intermediate SNR (−12 dB), but for nonnative English speech, the largest AV benefit occurred at a higher SNR (−4 dB). The psychometric function analyses demonstrated that the AV benefit patterns differed between native and nonnative English speech. The nativeness of the listener exerted negligible effects on the AV benefit across SNRs. However, the nonnative listeners' ability to gain AV benefit from native English speech was related to their proficiency in English. These findings suggest that the native language backgrounds of both the speaker and the listener clearly modulate the optimal use of AV cues in speech recognition.
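To make the psychometric-function analysis concrete, the sketch below fits logistic psychometric functions to percent-correct scores across SNRs and reads off the AV benefit as the gap between the audio-only and audiovisual curves. The data points and starting parameters are hypothetical placeholders; the abstract does not specify the authors' exact fitting procedure.

```python
# Sketch: fit logistic psychometric functions to keyword-recognition scores
# across SNRs and quantify AV benefit. Data below are made up for illustration.
import numpy as np
from scipy.optimize import curve_fit

def logistic(snr, midpoint, slope):
    """Psychometric function: proportion correct as a function of SNR (dB)."""
    return 1.0 / (1.0 + np.exp(-slope * (snr - midpoint)))

snrs = np.array([-20, -16, -12, -8, -4], dtype=float)
p_audio = np.array([0.05, 0.15, 0.40, 0.75, 0.92])  # hypothetical audio-only scores
p_av    = np.array([0.10, 0.35, 0.70, 0.90, 0.97])  # hypothetical audiovisual scores

(mid_a, slope_a), _ = curve_fit(logistic, snrs, p_audio, p0=[-10, 0.5])
(mid_av, slope_av), _ = curve_fit(logistic, snrs, p_av, p0=[-12, 0.5])

# AV benefit at each SNR = vertical gap between the two fitted curves.
grid = np.linspace(-20, -4, 100)
benefit = logistic(grid, mid_av, slope_av) - logistic(grid, mid_a, slope_a)
print(f"Largest AV benefit: {benefit.max():.2f} at {grid[np.argmax(benefit)]:.1f} dB SNR")
```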

2.
For deaf individuals with residual low-frequency acoustic hearing, combined use of a cochlear implant (CI) and hearing aid (HA) typically provides better speech understanding than with either device alone. Because of coarse spectral resolution, CIs do not provide fundamental frequency (F0) information that contributes to understanding of tonal languages such as Mandarin Chinese. The HA can provide good representation of F0 and, depending on the range of aided acoustic hearing, first and second formant (F1 and F2) information. In this study, Mandarin tone, vowel, and consonant recognition in quiet and noise was measured in 12 adult Mandarin-speaking bimodal listeners with the CI-only and with the CI+HA. Tone recognition was significantly better with the CI+HA in noise, but not in quiet. Vowel recognition was significantly better with the CI+HA in quiet, but not in noise. There was no significant difference in consonant recognition between the CI-only and the CI+HA in quiet or in noise. There was a wide range in bimodal benefit, with improvements often greater than 20 percentage points in some tests and conditions. The bimodal benefit was compared to CI subjects’ HA-aided pure-tone average (PTA) thresholds between 250 and 2000 Hz; subjects were divided into two groups: “better” PTA (<50 dB HL) or “poorer” PTA (>50 dB HL). The bimodal benefit differed significantly between groups only for consonant recognition. The bimodal benefit for tone recognition in quiet was significantly correlated with CI experience, suggesting that bimodal CI users learn to better combine low-frequency spectro-temporal information from acoustic hearing with temporal envelope information from electric hearing. Given the small number of subjects in this study (n = 12), further research with Chinese bimodal listeners may provide more information regarding the contribution of acoustic and electric hearing to tonal language perception.

3.

Objective

To investigate a set of acoustic features and classification methods for the classification of three groups of fricative consonants differing in place of articulation.

Method

A support vector machine (SVM) algorithm was used to classify fricatives extracted from the TIMIT database in quiet and in speech-babble noise at various signal-to-noise ratios (SNRs). Spectral features, including the four spectral moments, spectral peak and slope, Mel-frequency cepstral coefficients (MFCCs), Gammatone filter outputs, and magnitudes of the fast Fourier transform (FFT) spectrum, were used for classification. The analysis frame was restricted to only 8 ms. In addition, commonly used linear and nonlinear principal component analysis (PCA) dimensionality-reduction techniques, which project a high-dimensional feature vector onto a lower-dimensional space, were examined.

Results

With 13 MFCC coefficients and 14 or 24 Gammatone filter outputs, classification performance was at or above 85% in quiet and at +10 dB SNR. Using the 14 Gammatone filter outputs above 1 kHz, classification accuracy remained high (greater than 80%) across a wide range of SNRs, from +20 down to +5 dB.

Conclusions

High levels of classification accuracy for fricative consonants in quiet and in noise could be achieved using only spectral features extracted from a short time window. Results of this work have a direct impact on the development of speech enhancement algorithms for hearing devices.
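As an illustration of the classification pipeline described above, the sketch below extracts 13 MFCCs from a short analysis frame and trains an SVM, with optional PCA reduction. The feature settings and the synthetic tokens are placeholders, not the study's exact configuration (which used TIMIT fricatives and also Gammatone and FFT features).

```python
# Sketch of the feature-extraction + SVM pipeline described in the abstract.
# Synthetic noise bursts stand in for TIMIT fricative tokens.
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

SR = 16000
FRAME = int(0.008 * SR)  # 8 ms analysis frame, as in the study

def mfcc_features(frame, sr=SR, n_mfcc=13):
    """13 MFCCs from a single short frame (averaged over its STFT columns)."""
    m = librosa.feature.mfcc(y=frame, sr=sr, n_mfcc=n_mfcc,
                             n_fft=FRAME, hop_length=FRAME, n_mels=40)
    return m.mean(axis=1)

# Placeholder corpus: random bursts labeled with three "places of articulation".
rng = np.random.default_rng(0)
X = np.array([mfcc_features(rng.standard_normal(FRAME).astype(np.float32))
              for _ in range(300)])
y = rng.integers(0, 3, size=300)  # 3 fricative classes, placeholder labels

# Optional dimensionality reduction, as examined in the study.
X_red = PCA(n_components=8).fit_transform(X)

X_tr, X_te, y_tr, y_te = train_test_split(X_red, y, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print(f"Accuracy on held-out frames: {clf.score(X_te, y_te):.2f}")
```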

4.
In utero RNAi of the dyslexia-associated gene Kiaa0319 in rats (KIA-) degrades cortical responses to speech sounds and increases trial-by-trial variability in onset latency. We tested the hypothesis that KIA- rats would be impaired at speech sound discrimination. KIA- rats needed twice as much training in quiet conditions to perform at control levels and remained impaired at several speech tasks. Focused training using truncated speech sounds was able to normalize speech discrimination in quiet and background noise conditions. Training also normalized trial-by-trial neural variability and temporal phase locking. Cortical activity from speech-trained KIA- rats was sufficient to accurately discriminate between similar consonant sounds. These results provide the first direct evidence that the presumed reduction in expression of the dyslexia-associated gene KIAA0319 can cause phoneme processing impairments similar to those seen in dyslexia, and that intensive behavioral therapy can eliminate these impairments.

5.
The effect of internal noise in a delayed circadian oscillator is studied by using both chemical Langevin equations and stochastic normal form theory. It is found that internal noise can induce circadian oscillation even if the delay time τ is below the deterministic Hopf bifurcation value τh. We use the signal-to-noise ratio (SNR) to quantitatively characterize the performance of such noise-induced oscillations, and a threshold value of SNR is introduced to define the so-called effective oscillation. Interestingly, the τ-range for effective stochastic oscillation, denoted ΔτEO, shows a bell-shaped dependence on the intensity of internal noise, which is inversely proportional to the system size. We have also investigated how the rates of synthesis and degradation of the clock protein influence the SNR and thus ΔτEO. The decay rate Kd can significantly affect ΔτEO, while varying the gene expression rate Ke has no obvious effect if Ke is not too small. Stochastic normal form analysis and numerical simulations are in good agreement. This work provides a comprehensive understanding of how internal noise and time delay work cooperatively to influence the dynamics of circadian oscillations.
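The interplay of delay, noise, and the spectral SNR measure can be illustrated with a minimal sketch, assuming a generic delayed negative-feedback Langevin model and the common definition of SNR as periodogram peak power over background. The paper's actual reaction scheme, parameter values, and stochastic normal form analysis are not reproduced here.

```python
# Sketch: Euler-Maruyama simulation of a delayed negative-feedback Langevin
# model, with SNR measured from the periodogram peak. Generic stand-in for
# the chemical Langevin equations described in the abstract.
import numpy as np
from scipy.signal import periodogram

def simulate(tau=15.0, Ke=1.0, Kd=0.1, noise=0.05, dt=0.01, T=2000.0, seed=1):
    rng = np.random.default_rng(seed)
    n, lag = int(T / dt), int(tau / dt)
    x = np.full(n, 1.0)
    for i in range(1, n):
        x_del = x[i - 1 - lag] if i - 1 >= lag else x[0]
        drift = Ke / (1.0 + x_del**4) - Kd * x[i - 1]   # delayed Hill repression
        x[i] = x[i - 1] + drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
    return x[n // 2:]  # discard the transient

def snr(x, dt=0.01):
    """Peak power over median background power of the periodogram."""
    f, p = periodogram(x - x.mean(), fs=1.0 / dt)
    return p.max() / np.median(p[1:])

x = simulate()
print(f"SNR of (possibly noise-induced) oscillation: {snr(x):.1f}")
```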

6.

Background

Improvement of the cochlear implant (CI) front-end signal acquisition is needed to increase speech recognition in noisy environments. To suppress the directional noise, we introduce a speech-enhancement algorithm based on microphone array beamforming and spectral estimation. The experimental results indicate that this method is robust to directional mobile noise and strongly enhances the desired speech, thereby improving the performance of CI devices in a noisy environment.

Methods

The spectrum estimation and array beamforming methods were combined to suppress the ambient noise. The directivity coefficient was estimated in the noise-only intervals and updated to track the moving noise source.

Results

The proposed algorithm was implemented in the CI speech strategy. In the actual parameterization, a maxflat filter is used to obtain fractional sampling points, and a cepstrum-based method is used to distinguish desired-speech frames from noise frames. Broadband adjustment coefficients were added to compensate for the energy loss in the low-frequency band.

Discussions

The approximation of the directivity coefficient is tested and the errors are discussed. We also analyze the algorithm's constraints on noise estimation and distortion in CI processing. The performance of the proposed algorithm is analyzed and compared with other prevalent methods.

Conclusions

A hardware platform was constructed for the experiments. The speech-enhancement results showed that the algorithm suppresses non-stationary noise with a high SNR gain. The proposed algorithm performed well in the speech-enhancement experiments and in mobile testing, and the signal-distortion results indicate that it is robust, combining high SNR improvement with low speech distortion.
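As a concrete, simplified illustration of suppressing a directional noise source with a two-microphone array, the sketch below steers a null toward the noise. It substitutes a frequency-domain phase shift for the paper's maxflat fractional-delay filter, and all geometry and signal parameters are illustrative assumptions, not the authors' implementation.

```python
# Sketch: two-microphone null-steering beamformer that cancels a directional
# noise source. Fractional delays are applied as frequency-domain phase
# shifts (the paper itself uses a maxflat fractional-delay filter).
import numpy as np

SR = 16000       # sample rate (Hz), illustrative
SPACING = 0.05   # microphone spacing (m), illustrative
C = 343.0        # speed of sound (m/s)

def fractional_delay(x, delay_samples):
    """Delay a signal by a possibly fractional number of samples via the FFT."""
    n = len(x)
    freqs = np.fft.rfftfreq(n)
    return np.fft.irfft(np.fft.rfft(x) * np.exp(-2j * np.pi * freqs * delay_samples), n)

t = np.arange(SR) / SR
target = np.sin(2 * np.pi * 500 * t)          # desired-speech stand-in, broadside
noise = 0.8 * np.sin(2 * np.pi * 1800 * t)    # directional noise from 60 degrees

tdoa = SPACING * np.sin(np.radians(60)) / C   # inter-mic delay of the noise (s)
mic1 = target + noise
mic2 = target + fractional_delay(noise, tdoa * SR)

# Align the noise components across channels, then subtract: the noise
# cancels, while the broadside target survives with some attenuation.
out = fractional_delay(mic1, tdoa * SR) - mic2

def tone_power(x, f0):
    return np.abs(np.fft.rfft(x)[int(round(f0 * len(x) / SR))])

print("noise power  in:", tone_power(mic1, 1800), " out:", tone_power(out, 1800))
print("target power in:", tone_power(mic1, 500),  " out:", tone_power(out, 500))
```

Note that the subtraction also attenuates the broadside target at low frequencies, which is the same kind of low-frequency energy loss that the broadband adjustment coefficients mentioned above are meant to compensate for.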

7.
8.
The aim of the investigation was to study whether dysfunctions associated with the cochlea or its regulatory system can be found, and possibly explain hearing problems, in subjects with normal or near-normal audiograms. The design was a prospective study of subjects recruited from the general population. The included subjects were persons with auditory problems who had normal, or near-normal, pure-tone hearing thresholds, assigned to one of three subgroups: teachers (Education); people working with music (Music); and people with moderate or negligible noise exposure (Other). A fourth group included people with poorer pure-tone hearing thresholds and a history of severe occupational noise (Industry). In total, N = 193. The following hearing tests were used: pure-tone audiometry with Békésy technique; transient evoked otoacoustic emissions and distortion product otoacoustic emissions, without and with contralateral noise; psychoacoustical modulation transfer function; forward masking; speech recognition in noise; and tinnitus matching. A questionnaire about occupations, noise exposure, stress/anxiety, muscular problems, medication, and heredity was given to the participants. Forward masking results were significantly worse for Education and Industry than for the other groups, possibly associated with the inner hair cell area. Forward masking results were significantly correlated with louder matched tinnitus. For many subjects, speech recognition in noise in the left ear did not improve in the normal way when the listening level was increased. Subjects hypersensitive to loud sound had significantly better speech recognition in noise at the lower test level than subjects who were not hypersensitive. Self-reported stress/anxiety was similar across groups. In conclusion, hearing dysfunctions were found in subjects with tinnitus and other auditory problems combined with normal or near-normal pure-tone thresholds. The teachers, usually regarded as a group exposed to noise below risk levels, had dysfunctions almost identical to those of the more heavily exposed Industry group.

9.
Listener-specific features of the recognition of different emotional intonations (positive, negative, and neutral) of male and female speakers, in the presence or absence of background noise, were studied in 49 adults aged 20-79 years. In all listeners, noise produced the most pronounced decrease in recognition accuracy for the positive emotional intonation ("joy") compared with the other intonations, whereas it did not influence the recognition accuracy of "anger" in the 65-79-year-old listeners. Higher recognition rates for noisy signals were observed for speech emotional intonations expressed by female speakers. Acoustic characteristics of noisy and clean speech signals underlying the perception of speech emotional prosody were identified for adult listeners of different ages and genders.

10.
A learner’s linguistic input is more variable if it comes from a greater number of speakers. Higher speaker input variability has been shown to facilitate the acquisition of phonemic boundaries, since data drawn from multiple speakers provides more information about the distribution of phonemes in a speech community. It has also been proposed that speaker input variability may have a systematic influence on individual-level learning of morphology, which can in turn influence the group-level characteristics of a language. Languages spoken by larger groups of people have less complex morphology than those spoken in smaller communities. While a mechanism by which the number of speakers could have such an effect is yet to be convincingly identified, differences in speaker input variability, which is thought to be larger in larger groups, may provide an explanation. By hindering the acquisition, and hence faithful cross-generational transfer, of complex morphology, higher speaker input variability may result in structural simplification. We assess this claim in two experiments which investigate the effect of such variability on language learning, considering its influence on a learner’s ability to segment a continuous speech stream and acquire a morphologically complex miniature language. We ultimately find no evidence to support the proposal that speaker input variability influences language learning and so cannot support the hypothesis that it explains how population size determines the structural properties of language.

11.
Extensive research shows that inter-talker variability (i.e., changing the talker) affects recognition memory for speech signals. However, relatively little is known about the consequences of intra-talker variability (i.e. changes in speaking style within a talker) on the encoding of speech signals in memory. It is well established that speakers can modulate the characteristics of their own speech and produce a listener-oriented, intelligibility-enhancing speaking style in response to communication demands (e.g., when speaking to listeners with hearing impairment or non-native speakers of the language). Here we conducted two experiments to examine the role of speaking style variation in spoken language processing. First, we examined the extent to which clear speech provided benefits in challenging listening environments (i.e. speech-in-noise). Second, we compared recognition memory for sentences produced in conversational and clear speaking styles. In both experiments, semantically normal and anomalous sentences were included to investigate the role of higher-level linguistic information in the processing of speaking style variability. The results show that acoustic-phonetic modifications implemented in listener-oriented speech lead to improved speech recognition in challenging listening conditions and, crucially, to a substantial enhancement in recognition memory for sentences.

12.
This study aimed to characterize the linguistic interference that occurs during speech-in-speech comprehension by combining offline and online measures, which included an intelligibility task (at a −5 dB Signal-to-Noise Ratio) and 2 lexical decision tasks (at a −5 dB and 0 dB SNR) that were performed with French spoken target words. In these 3 experiments we always compared the masking effects of speech backgrounds (i.e., 4-talker babble) that were produced in the same language as the target language (i.e., French) or in unknown foreign languages (i.e., Irish and Italian) to the masking effects of corresponding non-speech backgrounds (i.e., speech-derived fluctuating noise). The fluctuating noise contained similar spectro-temporal information as babble but lacked linguistic information. At −5 dB SNR, both tasks revealed significantly divergent results between the unknown languages (i.e., Irish and Italian) with Italian and French hindering French target word identification to a similar extent, whereas Irish led to significantly better performances on these tasks. By comparing the performances obtained with speech and fluctuating noise backgrounds, we were able to evaluate the effect of each language. The intelligibility task showed a significant difference between babble and fluctuating noise for French, Irish and Italian, suggesting acoustic and linguistic effects for each language. However, the lexical decision task, which reduces the effect of post-lexical interference, appeared to be more accurate, as it only revealed a linguistic effect for French. Thus, although French and Italian had equivalent masking effects on French word identification, the nature of their interference was different. This finding suggests that the differences observed between the masking effects of Italian and Irish can be explained at an acoustic level but not at a linguistic level.

13.

Introduction

Our objective was to determine rheumatoid arthritis (RA) patients’ understanding of methotrexate and assess whether knowledge varies by age, education, English language proficiency, or other disease-related factors.

Methods

Adults with RA (n = 135) who were enrollees of an observational cohort completed a structured telephone interview in their preferred language between August 2007 and July 2009. All subjects who reported taking methotrexate were asked 11 questions about the medication in addition to demographics, education level, and language proficiency. Primary outcome was a total score below the 50th percentile (considered inadequate methotrexate knowledge). Bivariable and multivariable logistic regressions were performed. Covariates included demographics, language proficiency, education, and disease characteristics.

Results

Of 135 subjects, 83% were female, with a mean age of 55 ± 14 years. The majority spoke English (64%), followed by 22% Spanish and 14% Cantonese or Mandarin. Limited English language proficiency (LEP) was reported in 42%. Mean methotrexate knowledge score was 5.4 ± 2.6 (range, 0 to 10); 73 (54%) had a score lower than 5 (of 10). Age older than 55, less than high school education, LEP, better function, and biologic use were independently associated with poor knowledge.

Conclusions

In a diverse RA cohort, overall methotrexate knowledge was poor. Older age and limited proficiency in English were significant correlates of poor knowledge. Identification of language barriers and improved clinician-patient communication around methotrexate dosing and side effects may lead to improved safety and enhanced benefits of this commonly used RA medication.

14.
Beat gestures—spontaneously produced biphasic movements of the hand—are among the most frequently encountered co-speech gestures in human communication. They are closely temporally aligned to the prosodic characteristics of the speech signal, typically occurring on lexically stressed syllables. Despite their prevalence across speakers of the world's languages, how beat gestures impact spoken word recognition is unclear. Can these simple 'flicks of the hand' influence speech perception? Across a range of experiments, we demonstrate that beat gestures influence the explicit and implicit perception of lexical stress (e.g. distinguishing OBject from obJECT), and in turn can influence what vowels listeners hear. Thus, we provide converging evidence for a manual McGurk effect: relatively simple and widely occurring hand movements influence which speech sounds we hear.

15.
Background
A number of prior studies have demonstrated that research participants with limited English proficiency in the United States are routinely excluded from clinical trial participation. Systematic exclusion through study eligibility criteria that require trial participants to be able to speak, read, and/or understand English affects access to clinical trials and scientific generalizability. We sought to establish the frequency with which English language proficiency is required and, conversely, when non-English languages are affirmatively accommodated in US interventional clinical trials for adult populations.

Methods and findings
We used the advanced search function on ClinicalTrials.gov specifying interventional studies for adults with at least 1 site in the US. In addition, we used these search criteria to find studies with an available posted protocol. A computer program was written to search for evidence of English or Spanish language requirements, or the posted protocol, when available, was manually read for these language requirements. Of the 14,367 clinical trials registered on ClinicalTrials.gov between 1 January 2019 and 1 December 2020 that met baseline search criteria, 18.98% (95% CI 18.34%–19.62%; n = 2,727) required the ability to read, speak, and/or understand English, and 2.71% (95% CI 2.45%–2.98%; n = 390) specifically mentioned accommodation of translation to another language. The remaining trials in this analysis and the following sub-analyses did not mention English language requirements or accommodation of languages other than English. Of 2,585 federally funded clinical trials, 28.86% (95% CI 27.11%–30.61%; n = 746) required English language proficiency and 4.68% (95% CI 3.87%–5.50%; n = 121) specified accommodation of other languages; of the 5,286 industry-funded trials, 5.30% (95% CI 4.69%–5.90%; n = 280) required English and 0.49% (95% CI 0.30%–0.69%; n = 26) accommodated other languages. Trials related to infectious disease were less likely to specify an English requirement than all registered trials (10.07% versus 18.98%; relative risk [RR] = 0.53; 95% CI 0.44–0.64; p < 0.001). Trials related to COVID-19 were also less likely to specify an English requirement than all registered trials (8.18% versus 18.98%; RR = 0.43; 95% CI 0.33–0.56; p < 0.001). Trials with a posted protocol (n = 366) were more likely than all registered clinical trials to specify an English requirement (36.89% versus 18.98%; RR = 1.94, 95% CI 1.69–2.23; p < 0.001). A separate analysis of studies with posted protocols in 4 therapeutic areas (depression, diabetes, breast cancer, and prostate cancer) demonstrated that clinical trials related to depression were the most likely to require English (52.24%; 95% CI 40.28%–64.20%). One limitation of this study is that the computer program only searched for the terms “English” and “Spanish” and may have missed evidence of other language accommodations. Another limitation is that we did not differentiate between requirements to read English, speak English, understand English, and be a native English speaker; we grouped these requirements together in the category of English language requirements.

Conclusions
A meaningful percentage of US interventional clinical trials for adults exclude individuals who cannot read, speak, and/or understand English, or are not native English speakers. To advance more inclusive and generalizable research, funders, sponsors, institutions, investigators, institutional review boards, and others should prioritize translating study materials and eliminate language requirements unless justified either scientifically or ethically.

Akila Muthukumar and coauthors systematically analyze ClinicalTrials.gov to evaluate the frequency of English language requirements in clinical trial eligibility criteria.
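The "computer program" described in the Methods, which searched for the terms "English" and "Spanish", might look roughly like the sketch below. The record format, field names, and regular expressions are hypothetical stand-ins, not the authors' code or the ClinicalTrials.gov schema.

```python
# Sketch of a scan for English-language requirements in eligibility criteria.
# Example records and field names are hypothetical placeholders.
import re

ENGLISH_REQ = re.compile(
    r"\b(speak|read|understand|fluent in|proficien\w* in|native)\b[^.]*\benglish\b",
    re.IGNORECASE,
)
SPANISH_ACCOM = re.compile(r"\bspanish\b", re.IGNORECASE)

trials = [  # hypothetical eligibility-criteria snippets
    {"id": "NCT00000001", "criteria": "Must be able to read and understand English."},
    {"id": "NCT00000002", "criteria": "Consent forms available in English and Spanish."},
    {"id": "NCT00000003", "criteria": "Adults aged 18-65 with type 2 diabetes."},
]

requires_english = [t["id"] for t in trials if ENGLISH_REQ.search(t["criteria"])]
mentions_spanish = [t["id"] for t in trials if SPANISH_ACCOM.search(t["criteria"])]

n = len(trials)
print(f"English required:  {len(requires_english)}/{n} ({100 * len(requires_english) / n:.1f}%)")
print(f"Spanish mentioned: {len(mentions_spanish)}/{n} ({100 * len(mentions_spanish) / n:.1f}%)")
```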

16.
Established linguistic theoretical frameworks propose that speakers of alphabetic languages use phonemes as phonological encoding units during speech production, whereas Mandarin Chinese speakers use syllables. This framework was challenged by recent neural evidence of facilitation induced by overlapping initial phonemes, raising the possibility that phonemes also contribute to the phonological encoding process in Chinese. However, there is no evidence of non-initial phoneme involvement in Chinese phonological encoding among representative Chinese speakers, rendering the functional role of phonemes in spoken Chinese controversial. Here, we addressed this issue by systematically investigating the effect of word-initial and non-initial phoneme repetition on the electrophysiological signal, using a picture-naming priming task in which native Chinese speakers produced disyllabic word pairs. We found that overlapping phonemes in both initial and non-initial positions evoked more positive ERPs in the 180- to 300-ms interval, indicating a position-invariant repetition facilitation effect during phonological encoding. Our findings thus reveal the fundamental role of phonemes as independent phonological encoding units in Mandarin Chinese.

17.
Objective: To investigate the clinical efficacy of external counterpulsation combined with speech training in children with cerebral palsy and delayed language development. Methods: Fifty-two children with cerebral palsy and delayed language development, diagnosed at the general rehabilitation outpatient clinic of Shanghai Children's Hospital between December 2015 and December 2017, were randomly divided by random number table into a treatment group and a control group of 26 each. The control group received speech training alone; the treatment group received external counterpulsation combined with speech training. One course of treatment lasted 4 weeks, and both groups were treated for 3 courses. Before and after treatment, the children's language developmental quotient and cognitive developmental quotient were assessed and compared using the China Rehabilitation Research Center Chinese-version s-s language development assessment and the Gesell developmental scale. Results: After treatment, the language and cognitive developmental quotients of both groups were significantly higher than before treatment, and both quotients were significantly higher in the treatment group than in the control group; the differences were statistically significant (all P < 0.01). Conclusion: External counterpulsation combined with speech training improves language and cognitive development in children with cerebral palsy and delayed language development more effectively than speech training alone.

18.
How do bilingual interlocutors inhibit interference from the non-target language to achieve brain-to-brain information exchange? In the current study, two electroencephalogram devices were used to record pairs of participants in a joint language-switching task that simulated bilingual speaker–listener interaction. Twenty-eight (14 pairs) unbalanced Chinese–English bilinguals (L1 Chinese) were instructed to name pictures in the appropriate language according to a cue. Phase-amplitude coupling analysis was employed to reveal the large-scale brain network responsible for joint language control between interlocutors. We found that (1) speakers and listeners coordinately suppressed cross-language interference through cross-frequency coupling, as shown by increased delta/theta and delta/alpha phase-amplitude coupling when switching to L2 compared with switching to L1; and (2) speakers and listeners were both able to simultaneously inhibit cross-person item-level interference, as demonstrated by stronger cross-frequency coupling in the cross-person condition than in the within-person condition. These results indicate that current bilingual models (e.g., the inhibitory control model) should incorporate mechanisms that address the synchronous inhibition of interference arising from both language and person (i.e., cross-language and cross-person item-level interference) through joint language control in dynamic cross-language communication.
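Phase-amplitude coupling of the kind analyzed here is often quantified with a mean-vector-length index (Canolty et al., 2006); the sketch below computes it for delta-phase/alpha-amplitude coupling on synthetic data. The band edges, sampling rate, and signals are illustrative assumptions rather than the study's actual pipeline.

```python
# Sketch: delta-phase / alpha-amplitude coupling via the mean-vector-length
# index. Synthetic data; band edges and parameters are illustrative.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

FS = 500  # sampling rate (Hz)

def bandpass(x, lo, hi, fs=FS, order=3):
    sos = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band", output="sos")
    return sosfiltfilt(sos, x)

def pac_mvl(x, phase_band=(1, 4), amp_band=(8, 12)):
    """Mean vector length: |mean(A(t) * exp(i*phi(t)))|, normalized by mean A."""
    phi = np.angle(hilbert(bandpass(x, *phase_band)))   # delta phase
    amp = np.abs(hilbert(bandpass(x, *amp_band)))       # alpha amplitude
    return np.abs(np.mean(amp * np.exp(1j * phi))) / amp.mean()

# Synthetic EEG: alpha bursts locked to the delta trough (coupled) vs. not.
rng = np.random.default_rng(0)
t = np.arange(0, 60, 1 / FS)
delta = np.sin(2 * np.pi * 2 * t)
alpha_env = 0.5 * (1 - delta)   # alpha amplitude follows delta phase
coupled = delta + alpha_env * np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(t.size)
uncoupled = delta + 0.5 * np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(t.size)

print(f"MVL coupled:   {pac_mvl(coupled):.3f}")
print(f"MVL uncoupled: {pac_mvl(uncoupled):.3f}")
```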

19.
Objectives
Previous studies investigating speech perception in noise have typically been conducted with static masker positions. The aim of this study was to investigate the effect of spatial separation of source and masker (spatial release from masking, SRM) in a moving-masker setup and to evaluate the impact of adaptive beamforming in comparison with fixed directional microphones in cochlear implant (CI) users.

Design
Speech reception thresholds (SRTs) were measured in S0N0 and in a moving-masker setup (S0Nmove) in 12 normal-hearing participants and 14 CI users (7 subjects bilateral, 7 bimodal with a hearing aid in the contralateral ear). Speech processor settings were a moderately directional microphone, a fixed beamformer, or an adaptive beamformer. The moving noise source was generated by means of wave field synthesis and was moved smoothly in the shape of a half-circle from one ear to the contralateral ear. Noise was presented in either of two conditions: continuous or modulated.

Results
SRTs in the S0Nmove setup were significantly improved compared to the S0N0 setup for both the normal-hearing control group and the bilateral group in continuous noise, and for the control group in modulated noise. There was no effect of subject group. A significant effect of directional sensitivity was found in the S0Nmove setup. In the bilateral group, the adaptive beamformer achieved lower SRTs than the fixed beamformer setting. Adaptive beamforming substantially improved SRTs in both CI user groups, by about 3 dB (bimodal group) and 8 dB (bilateral group) depending on masker type.

Conclusions
CI users showed SRM comparable to that of normal-hearing subjects. In everyday listening situations with spatial separation of source and masker, directional microphones significantly improved speech perception, with individual improvements of up to 15 dB SNR. Users of bilateral speech processors with both directional microphones obtained the highest benefit.
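The "adaptive beamformer" condition adapts its spatial filtering to a moving masker. A generic stand-in for that idea is an adaptive interference canceller; the sketch below uses an LMS filter on synthetic primary/reference channels, since the actual processor algorithm is not described in the abstract.

```python
# Sketch: adaptive interference cancellation with an LMS filter, the core
# idea behind adaptive beamforming (generic stand-in on synthetic data).
import numpy as np

rng = np.random.default_rng(0)
n = 20000
target = np.sin(2 * np.pi * 0.01 * np.arange(n))   # desired-signal stand-in
noise_src = rng.standard_normal(n)                  # moving-masker stand-in

# Primary channel: target + filtered noise. Reference channel: noise only
# (e.g., a rear-facing microphone that picks up little target energy).
h = np.array([0.6, -0.3, 0.15])                     # unknown acoustic path
primary = target + np.convolve(noise_src, h)[:n]
reference = noise_src

def lms_cancel(primary, reference, taps=8, mu=0.01):
    """Adapt FIR weights so the filtered reference matches the noise in the
    primary channel; the residual e is the enhanced output."""
    w = np.zeros(taps)
    out = np.zeros(len(primary))
    for i in range(taps, len(primary)):
        x = reference[i - taps:i][::-1]
        e = primary[i] - w @ x
        w += 2 * mu * e * x
        out[i] = e
    return out

enhanced = lms_cancel(primary, reference)
print("noise power before:", np.var(primary - target))
print("noise power after :", np.var(enhanced[1000:] - target[1000:]))
```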

20.
Binaural hearing involves using information relating to the differences between the signals that arrive at the two ears, and it can make it easier to detect and recognize signals in a noisy environment. This phenomenon is quantified in laboratory studies as the binaural masking-level difference (BMLD) for detection and the binaural intelligibility-level difference (BILD) for recognition. Mandarin is one of the most widely spoken languages, but no values of the BMLD or BILD based on Mandarin tones have been published. Therefore, this study investigated the BMLD and BILD of Mandarin tones. The BMLDs of Mandarin tone detection were measured as the differences in detection thresholds for the four tones of the voiced vowels /i/ (i.e., /i1/, /i2/, /i3/, and /i4/) and /u/ (i.e., /u1/, /u2/, /u3/, and /u4/) in the presence of speech-spectrum noise when presented interaurally in phase (S0N0) and interaurally in antiphase (SπN0). The BILDs of Mandarin tone recognition in speech-spectrum noise were determined as the differences in the target-to-masker ratio (TMR) required for 50% correct tone recognition between the S0N0 and SπN0 conditions. The detection thresholds for the four tones of /i/ and /u/ differed significantly (p<0.001) between the S0N0 and SπN0 conditions. The average detection thresholds of Mandarin tones were all lower in the SπN0 condition than in the S0N0 condition, and the BMLDs ranged from 7.3 to 11.5 dB. The TMR for 50% correct Mandarin tone recognition differed significantly (p<0.001) between the S0N0 and SπN0 conditions, at −13.4 and −18.0 dB, respectively, with a mean BILD of 4.6 dB. The study showed that the thresholds of Mandarin tone detection and recognition in the presence of speech-spectrum noise improve when phase inversion is applied to the target speech. The average BILDs of Mandarin tones are smaller than the average BMLDs of Mandarin tones.
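To make the S0N0 and SπN0 conditions concrete, the sketch below constructs diotic and target-antiphasic two-ear stimuli and computes a BMLD as a threshold difference. The waveforms and threshold values are illustrative placeholders, not the study's stimuli or data.

```python
# Sketch: constructing S0N0 (target and noise in phase at both ears) and
# SpiN0 (target phase-inverted at one ear) stimuli, and computing the BMLD
# as a threshold difference. All values are illustrative placeholders.
import numpy as np

FS = 16000
t = np.arange(0, 0.5, 1 / FS)
rng = np.random.default_rng(0)

target = np.sin(2 * np.pi * 200 * t)    # stand-in for a Mandarin tone stimulus
noise = rng.standard_normal(t.size)     # stand-in for speech-spectrum noise

def make_stimulus(target_level_db, antiphasic=False):
    """Two-ear stimulus: identical noise, target in phase or phase-inverted."""
    gain = 10 ** (target_level_db / 20)
    left = gain * target + noise
    right = (-1 if antiphasic else 1) * gain * target + noise
    return np.stack([left, right])

for name, stim in [("S0N0", make_stimulus(-10)),
                   ("SpiN0", make_stimulus(-10, antiphasic=True))]:
    r = np.corrcoef(stim[0], stim[1])[0, 1]
    print(f"{name}: interaural correlation = {r:.2f}")

# BMLD = detection threshold (dB) in S0N0 minus threshold in SpiN0.
threshold_s0n0, threshold_spin0 = -12.0, -21.5   # hypothetical measured values
print(f"BMLD = {threshold_s0n0 - threshold_spin0:.1f} dB")
```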
