Similar Literature
20 similar documents retrieved.
1.
In the real world, human speech recognition nearly always involves listening in background noise. The impact of such noise on speech signals and on intelligibility performance increases with the separation of the listener from the speaker. The present behavioral experiment provides an overview of the effects of such acoustic disturbances on speech perception in conditions approaching ecologically valid contexts. We analysed the intelligibility loss in spoken word lists with increasing listener-to-speaker distance in a typical low-level natural background noise. The noise was combined with the simple spherical amplitude attenuation due to distance, essentially changing the signal-to-noise ratio (SNR). Our study therefore draws attention to some of the most basic environmental constraints that have pervaded spoken communication throughout human history. We evaluated the ability of native French participants to recognize French monosyllabic words (spoken at 65.3 dB(A), reference at 1 meter) at distances between 11 and 33 meters, which corresponded to the SNRs most revealing of the progressive effect of the selected natural noise (−8.8 dB to −18.4 dB). Our results showed that in such conditions vowel identity is largely preserved, with strikingly few vowel confusions. The results also confirmed the functional role of consonants during lexical identification. The extensive analysis of recognition scores, confusion patterns and associated acoustic cues revealed that sonorant, sibilant and burst properties were the most important parameters influencing phoneme recognition. Altogether, these analyses allowed us to extract a resistance scale from consonant recognition scores. We also identified specific perceptual consonant confusion groups depending on their position in the word (onset vs. coda). Finally, our data suggested that listeners may access some acoustic cues of the CV transition, opening interesting perspectives for future studies.
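The reported SNRs follow directly from spherical spreading, which attenuates the speech level by 20·log10(d) dB relative to the 1 m reference while the ambient noise level stays constant. A minimal sketch of that arithmetic; the constant noise floor of roughly 53.3 dB(A) is inferred from the figures above, not stated in the abstract:

```python
import math

SPEECH_LEVEL_AT_1M = 65.3   # dB(A), reference level reported in the abstract
NOISE_LEVEL = 53.3          # dB(A), hypothetical floor inferred from the reported SNRs

def snr_at_distance(d_m: float) -> float:
    """SNR in dB at distance d_m, assuming simple spherical amplitude attenuation."""
    speech_level = SPEECH_LEVEL_AT_1M - 20 * math.log10(d_m)  # -6 dB per doubling of distance
    return speech_level - NOISE_LEVEL

for d in (11, 22, 33):
    print(f"{d:2d} m -> SNR ≈ {snr_at_distance(d):5.1f} dB")
# 11 m gives about -8.8 dB and 33 m about -18.4 dB, matching the range quoted in the abstract
```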

2.
Speech perception is critical to everyday life. Noise often degrades the speech signal; however, because of the cues available to the listener, such as visual and semantic cues, noise rarely prevents conversations from continuing. The interaction of visual and semantic cues in aiding speech perception has been studied in young adults, but the extent to which these two cues interact for older adults has not been studied. To investigate the effect of visual and semantic cues on speech perception in older and younger adults, we recruited forty-five young adults (ages 18–35) and thirty-three older adults (ages 60–90) to participate in a speech perception task. Participants were presented with semantically meaningful and anomalous sentences in audio-only and audio-visual conditions. We hypothesized that young adults would outperform older adults across SNRs, modalities, and semantic contexts. In addition, we hypothesized that both young and older adults would receive a greater benefit from a semantically meaningful context in the audio-visual relative to the audio-only modality. We predicted that young adults would receive greater visual benefit in semantically meaningful contexts relative to anomalous contexts, whereas older adults could receive a greater visual benefit in either semantically meaningful or anomalous contexts. Results suggested that in the most supportive context, that is, semantically meaningful sentences presented in the audio-visual modality, older adults performed similarly to young adults. In addition, both groups received the same amount of visual and semantic benefit. Lastly, across groups, a semantically meaningful context provided more benefit in the audio-visual modality relative to the audio-only modality, and the presence of visual cues provided more benefit in semantically meaningful contexts relative to anomalous contexts. These results suggest that older adults can perceive speech as well as younger adults when both semantic and visual cues are available to the listener.
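The visual and semantic benefits discussed above are difference scores between conditions; the abstract does not give the exact metric, so the following minimal sketch simply uses per-participant accuracy differences (all values are illustrative, not the study's data):

```python
import numpy as np

# Illustrative per-participant proportions correct at one SNR (not the study's data)
acc = {
    ("audio_only", "anomalous"):    np.array([0.40, 0.48, 0.36]),
    ("audio_only", "meaningful"):   np.array([0.52, 0.60, 0.47]),
    ("audio_visual", "anomalous"):  np.array([0.58, 0.65, 0.55]),
    ("audio_visual", "meaningful"): np.array([0.80, 0.84, 0.76]),
}

# Visual benefit: audio-visual minus audio-only, within the meaningful context
visual_benefit = acc[("audio_visual", "meaningful")] - acc[("audio_only", "meaningful")]
# Semantic benefit: meaningful minus anomalous, within the audio-visual modality
semantic_benefit = acc[("audio_visual", "meaningful")] - acc[("audio_visual", "anomalous")]

print(visual_benefit.mean(), semantic_benefit.mean())
```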

3.
The presence of cross-modal stochastic resonance in different noise environments has been demonstrated in previous behavioral and event-related potential studies, but it remained unclear whether gamma-band oscillations provide further evidence of cross-modal stochastic resonance. The multisensory gain of gamma-band activity between the audiovisual (AV) and auditory-only conditions in different noise environments was analyzed. Videos of facial motion articulating words, combined with different levels of pink noise, were used as stimuli. Signal-to-noise ratios (SNRs) of 0, −4, −8, −12 and −16 dB were selected to measure speech recognition accuracy and EEG activity in 20 healthy subjects. The power and phase of EEG gamma-band oscillations increased in a time window of 50–90 ms. The multisensory gains of evoked and total activity, as well as of the phase-locking factor, were greatest at the −12 dB SNR, consistent with the behavioral result. The multisensory gain of gamma-band activity showed an inverted U-shaped curve as a function of SNR. This finding confirms the presence of cross-modal stochastic resonance. In addition, there was a significant correlation between evoked activity and the phase-locking factor of the gamma band at the five SNRs. Gamma-band oscillations appear to play a role in the rapid processing and the strengthening of information linkage across AV modalities in the early stages of cognitive processing.
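Here, multisensory gain is the difference between the audiovisual and auditory-only responses at each SNR, and plotting that gain against SNR exposes the inverted-U shape characteristic of stochastic resonance. A minimal sketch of the computation, with placeholder gamma-band values (illustrative only, not the study's data):

```python
import numpy as np

snrs = np.array([0, -4, -8, -12, -16])                  # dB, the SNRs used in the study
# Placeholder gamma-band measures (arbitrary units), NOT the recorded data:
av_response = np.array([1.10, 1.25, 1.45, 1.80, 1.30])
a_response  = np.array([1.00, 1.05, 1.10, 1.15, 1.05])

gain = av_response - a_response                          # multisensory gain at each SNR
peak_snr = snrs[np.argmax(gain)]
print(dict(zip(snrs.tolist(), np.round(gain, 2))))
print(f"gain peaks at {peak_snr} dB SNR")                # an inverted U peaks at an intermediate SNR
```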

4.
Speakers modulate their voice when talking to infants, but we know little about subtle variation in acoustic parameters during speech in adult social interactions. Because tests of perception of such variation are hampered by listeners' understanding of semantic content, studies often confine speech to enunciation of standard sentences, restricting ecological validity. Furthermore, apparent paralinguistic modulation in one language may be underpinned by specific parameters of that language. Here we circumvent these problems by recording speech directed to attractive or unattractive potential partners or competitors, and testing responses to these recordings by naive listeners, across both a Germanic (English) and a Slavic (Czech) language. Analysis of acoustic parameters indicates that men's voices varied most in F0 when speaking to attractive versus unattractive potential mates, while modulation of women's F0 variability was more sensitive to competitors, with higher variability when those competitors were relatively attractive. There was striking similarity in patterns of social context-dependent F0 variation across the two model languages, with both men's and women's voices varying most when responding to attractive individuals. Men's minimum pitch was lower when responding to attractive than to unattractive women. For vocal modulation to be effective, however, it must be sufficiently detectable to promote proceptivity towards the speaker. We showed that speech directed towards attractive individuals was preferred by naive listeners of either language over speech by the same speaker directed to unattractive individuals, even when voices were stripped of several acoustic properties by low-pass filtering, which renders speech unintelligible. Our results suggest that modulating F0 may be a critical parameter in human courtship, independently of semantic content.
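The acoustic parameters analysed here (F0 variability, minimum pitch) can be summarised directly from an extracted F0 contour. A minimal sketch assuming the contour is already available as an array of Hz values with unvoiced frames marked as NaN (pitch extraction itself is not shown):

```python
import numpy as np

def f0_summary(f0_hz):
    """Summarise an F0 contour; unvoiced frames are expected to be NaN."""
    voiced = np.asarray(f0_hz, dtype=float)
    voiced = voiced[~np.isnan(voiced)]
    return {
        "mean_f0_hz": float(voiced.mean()),
        "f0_sd_hz": float(voiced.std(ddof=1)),   # a simple index of F0 variability
        "min_f0_hz": float(voiced.min()),        # minimum pitch
    }

print(f0_summary([np.nan, 112.0, 118.5, 102.3, 95.8, np.nan, 124.0]))
```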

5.
Anthrozoös 2013, 26(3): 373–380

Vowel triangle area is a phonetic measure of the clarity of vowel articulation. Compared with speech to adults, people hyperarticulate vowels in speech to infants and foreigners but not to pets, despite other similarities between infant- and pet-directed speech. This suggests that vowel hyperarticulation has a didactic function positively related to the actual, or even the expected, degree of linguistic competence of the audience. Parrots have some degree of linguistic competence, yet no studies have examined vowel hyperarticulation in speech to parrots. Here, we compared the speech of 11 adults to another adult, a dog, a parrot, and an infant. A significant linear increase in vowel triangle area was found across the four conditions, showing that the degree of vowel hyperarticulation increased from adult- and dog-directed speech to parrot-directed speech, and then to infant-directed speech. This suggests that the degree of vowel hyperarticulation is related to the audience's actual or expected linguistic competence. The results are discussed in terms of the relative roles of speakers' expectations versus listeners' feedback in the production of vowel hyperarticulation, and suggestions for further studies manipulating speaker expectation and listener feedback are provided.
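Vowel triangle area is conventionally computed from the first two formants (F1, F2) of the corner vowels /i/, /a/ and /u/ using the shoelace formula; the sketch below uses illustrative formant values, not measurements from the study:

```python
def vowel_triangle_area(i, a, u):
    """Shoelace area of the /i/-/a/-/u/ triangle; each vowel is an (F1, F2) pair in Hz."""
    (x1, y1), (x2, y2), (x3, y3) = i, a, u
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2

# Illustrative formant values (Hz), not taken from the study:
adult_directed  = vowel_triangle_area(i=(300, 2300), a=(750, 1300), u=(350, 900))
infant_directed = vowel_triangle_area(i=(280, 2600), a=(850, 1250), u=(320, 800))
print(adult_directed, infant_directed)   # hyperarticulation appears as a larger area
```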

6.
The motor theory of speech perception holds that we perceive the speech of another in terms of a motor representation of that speech. However, once we have learned to recognize a foreign accent, it seems plausible that recognition of a word rarely involves reconstructing the speech gestures of the speaker rather than those of the listener. To better assess the motor theory and this observation, we proceed in three stages. Part 1 places the motor theory of speech perception in a larger framework based on our earlier models of the adaptive formation of mirror neurons for grasping, and on viewing extensions of that mirror system as part of a larger system for neuro-linguistic processing, augmented by the present consideration of recognizing speech in a novel accent. Part 2 then offers a novel computational model of how a listener comes to understand the speech of someone speaking the listener's native language with a foreign accent. The core tenet of the model is that the listener uses hypotheses about the word the speaker is currently uttering to update probabilities linking the sounds produced by the speaker to phonemes in the native-language repertoire of the listener. This, on average, improves the recognition of later words. The model is neutral regarding the nature of the representations it uses (motor vs. auditory). It serves as a reference point for the discussion in Part 3, which proposes a dual-stream neuro-linguistic architecture to revisit claims for and against the motor theory of speech perception and the relevance of mirror neurons, and to extract some implications for reframing the motor theory.
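The model's core tenet, using the hypothesized word to update the probabilities linking the speaker's sounds to native phonemes, can be sketched as a simple count-based estimator. This is only an illustration of the idea, not the paper's actual model; the sound and phoneme labels below are hypothetical.

```python
from collections import defaultdict

# counts[heard_sound][native_phoneme]: evidence that the accented sound maps to that phoneme
counts = defaultdict(lambda: defaultdict(float))

def update_from_word_hypothesis(heard_sounds, hypothesized_phonemes, weight=1.0):
    """Strengthen the sound-to-phoneme links implied by the currently hypothesized word."""
    for sound, phoneme in zip(heard_sounds, hypothesized_phonemes):
        counts[sound][phoneme] += weight

def p_phoneme_given_sound(sound, phoneme):
    total = sum(counts[sound].values())
    return counts[sound][phoneme] / total if total else 0.0

# The listener hypothesizes the word "ship" while hearing the accented sounds ("sh", "ee", "p"):
update_from_word_hypothesis(["sh", "ee", "p"], ["sh", "ih", "p"])
print(p_phoneme_given_sound("ee", "ih"))   # later tokens of "ee" are now more likely heard as "ih"
```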

7.

Objectives

(1) To evaluate the recognition of words, phonemes and lexical tones in audiovisual (AV) and auditory-only (AO) modes in Mandarin-speaking adults with cochlear implants (CIs); (2) to assess the effect of presentation level on AV speech perception; (3) to examine the effect of hearing experience on AV speech perception.

Methods

Thirteen deaf adults (age = 29.1±13.5 years; 8 male, 5 female) who had used CIs for >6 months and 10 normal-hearing (NH) adults participated in this study. Seven of the CI users were prelingually deaf and 6 postlingually deaf. The Mandarin Monosyllabic Word Recognition Test was used to assess recognition of words, phonemes and lexical tones in AV and AO conditions at 3 presentation levels: speech detection threshold (SDT), speech recognition threshold (SRT) and 10 dB SL (re: SRT).

Results

The prelingual group had better phoneme recognition in the AV mode than in the AO mode at SDT and SRT (both p = 0.016), as did the NH group at SDT (p = 0.004). No mode difference was noted in the postlingual group. None of the groups showed significantly different tone recognition between the 2 modes. The prelingual and postlingual groups had significantly better phoneme and tone recognition than the NH group at SDT in the AO mode (p = 0.016 and p = 0.002 for phonemes; p = 0.001 and p<0.001 for tones) but were outperformed by the NH group at 10 dB SL (re: SRT) in both modes (both p<0.001 for phonemes; p<0.001 and p = 0.002 for tones). Recognition scores were significantly correlated with group after controlling for age and sex (p<0.001).

Conclusions

Visual input may help prelingually deaf implantees to recognize phonemes but may not augment Mandarin tone recognition. The effect of presentation level on CI users' AV perception seems minimal. This suggests that special considerations are needed in developing audiological assessment protocols and rehabilitation strategies for implantees who speak tonal languages.

8.
Although infant speech perception is often studied in isolated modalities, infants' experience with speech is largely multimodal (i.e., the speech sounds they hear are accompanied by articulating faces). Across two experiments, we tested infants' sensitivity to the relationship between the auditory and visual components of audiovisual speech in their native (English) and non-native (Spanish) language. In Experiment 1, infants' looking times were measured during a preferential looking task in which they saw two simultaneous visual speech streams articulating a story, one in English and the other in Spanish, while they heard either the English or the Spanish version of the story. In Experiment 2, looking times from another group of infants were measured as they watched single displays of congruent and incongruent combinations of English and Spanish audio and visual speech streams. Findings demonstrated an age-related increase in looking towards the native relative to the non-native visual speech stream when accompanied by the corresponding (native) auditory speech. This increase in native-language preference did not appear to be driven by a difference in preference for native vs. non-native audiovisual congruence, as we observed no difference in looking times at the audiovisual streams in Experiment 2.

9.
Schmidt AK, Römer H. PLoS ONE 2011, 6(12): e28593

Background

Insects often communicate by sound in mixed-species choruses; like humans and many vertebrates in crowded social environments, they thus have to solve cocktail-party-like problems in order to ensure successful communication with conspecifics. This is even more of a problem in species-rich environments like tropical rainforests, where background noise levels of up to 60 dB SPL have been measured.

Principal Findings

Using neurophysiological methods we investigated the effect of natural background noise (masker) on signal detection thresholds in two tropical cricket species, Paroecanthus podagrosus and Diatrypa sp., both in the laboratory and outdoors. We identified three ‘bottom-up’ mechanisms which contribute to an excellent neuronal representation of conspecific signals despite the masking background. First, the sharply tuned frequency selectivity of the receiver reduces the amount of masking energy around the species-specific calling song frequency. Laboratory experiments yielded an average signal-to-noise ratio (SNR) of −8 dB when masker and signal were broadcast from the same side. Secondly, displacing the masker by 180° from the signal improved SNRs by a further 6 to 9 dB, a phenomenon known as spatial release from masking. Surprisingly, experiments carried out directly in the nocturnal rainforest yielded SNRs of about −23 dB, compared with only −14.5 and −16 dB in the two species in the laboratory with the same masker. Finally, a neuronal gain control mechanism enhances the contrast between the responses to signals and to the masker, by inhibition of neuronal activity in interstimulus intervals.

Conclusions

Thus, conventional speaker playbacks in the lab apparently do not properly reconstruct the masking noise situation in a spatially realistic manner, since under real-world conditions multiple sound sources are distributed in space. Our results also indicate that without knowledge of the receiver properties and the spatial release mechanisms, the detrimental effect of noise may be strongly overestimated.

10.

Objective

To analyze speech reading through Internet video calls by profoundly hearing-impaired individuals and cochlear implant (CI) users.

Methods

Speech reading skills of 14 deaf adults and 21 CI users were assessed using the Hochmair-Schulz-Moser (HSM) sentence test. We presented video simulations using different video resolutions (1280×720, 640×480, 320×240, 160×120 px), frame rates (30, 20, 10, 7, 5 frames per second (fps)), speech velocities (three different speakers), webcameras (Logitech Pro9000, C600 and C500) and image/sound delays (0–500 ms). All video simulations were presented with and without sound and in two screen sizes. Additionally, scores for a live Skype™ video connection and live face-to-face communication were assessed.

Results

A higher frame rate (>7 fps), higher camera resolution (>640×480 px) and shorter picture/sound delay (<100 ms) were associated with increased speech perception scores. Scores were strongly dependent on the speaker but were not influenced by the physical properties of the camera optics or by full-screen mode. There was a significant median gain of +8.5 percentage points (p = 0.009) in speech perception for all 21 CI users when visual cues were additionally shown. CI users with poor open-set speech perception scores (n = 11) showed the greatest benefit under combined audio-visual presentation (median speech perception gain +11.8 percentage points, p = 0.032).

Conclusion

Webcameras have the potential to improve telecommunication for hearing-impaired individuals.

11.
This study aimed to characterize the linguistic interference that occurs during speech-in-speech comprehension by combining offline and online measures: an intelligibility task (at a −5 dB signal-to-noise ratio, SNR) and 2 lexical decision tasks (at −5 dB and 0 dB SNR) performed with spoken French target words. In these 3 experiments we always compared the masking effects of speech backgrounds (i.e., 4-talker babble) produced in the same language as the target language (i.e., French) or in unknown foreign languages (i.e., Irish and Italian) with the masking effects of corresponding non-speech backgrounds (i.e., speech-derived fluctuating noise). The fluctuating noise contained similar spectro-temporal information to the babble but lacked linguistic information. At −5 dB SNR, both tasks revealed significantly divergent results between the unknown languages (i.e., Irish and Italian): Italian and French hindered French target-word identification to a similar extent, whereas Irish led to significantly better performance on these tasks. By comparing the performance obtained with speech and fluctuating-noise backgrounds, we were able to evaluate the effect of each language. The intelligibility task showed a significant difference between babble and fluctuating noise for French, Irish and Italian, suggesting acoustic and linguistic effects for each language. However, the lexical decision task, which reduces the effect of post-lexical interference, appeared to be more accurate, as it revealed a linguistic effect only for French. Thus, although French and Italian had equivalent masking effects on French word identification, the nature of their interference was different. This finding suggests that the differences observed between the masking effects of Italian and Irish can be explained at an acoustic level but not at a linguistic level.
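The speech-derived fluctuating noise used as a non-speech control is commonly built by imposing the slow amplitude envelope of the babble on a noise carrier, preserving temporal fluctuations while removing linguistic content. A minimal sketch of that general idea (parameters are illustrative, and matching the long-term spectrum of the babble would require an additional filtering step not shown here):

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def speech_modulated_noise(babble, fs, env_cutoff_hz=30.0):
    """White noise modulated by the low-pass-filtered amplitude envelope of the babble."""
    envelope = np.abs(hilbert(babble))                      # instantaneous amplitude
    b, a = butter(2, env_cutoff_hz / (fs / 2), btype="low")
    envelope = filtfilt(b, a, envelope)                     # keep only the slow fluctuations
    noise = np.random.randn(len(babble)) * envelope
    return noise * np.sqrt(np.mean(babble ** 2) / np.mean(noise ** 2))  # match the babble RMS

fs = 16000
t = np.arange(2 * fs) / fs
babble = np.random.randn(2 * fs) * (1.0 + 0.5 * np.sin(2 * np.pi * 4 * t))  # stand-in "babble"
masker = speech_modulated_noise(babble, fs)
```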

12.
Acoustically communicating animals often have to cope with ambient noise that has the potential to interfere with the perception of conspecific signals. Here we use the synchronous display of mating signals in males of the tropical katydid Mecopoda elongata in order to assess the influence of nocturnal rainforest noise on signal perception. Loud background noise may disturb chorus synchrony either by masking the signals of males or through interaction of noisy events with the song oscillator. Phase-locked synchrony of males was studied under various signal-to-noise ratios (SNRs) using either native noise or only the audio component of the noise (<9 kHz). Synchronous entrainment was lost at an SNR of -3 dB when native noise was used, whereas with the audio component 50% of chirp periods still matched the pacer period at an SNR of -7 dB. Since the chirp period of solo singing males remained almost unaffected by noise, our results suggest that masking interference limits chorus synchrony by rendering conspecific signals ambiguous. Further, entrainment with periodic artificial signals indicates that synchrony is achieved by ignoring heterospecific signals and attending to a conspecific signal period. Additionally, the encoding of conspecific chirps was studied in an auditory neuron under the same background noise regimes.

13.
The most common complaint of older hearing-impaired (OHI) listeners is difficulty understanding speech in the presence of noise. However, tests of consonant identification and sentence reception threshold (SeRT) provide different perspectives on the magnitude of impairment. Here we quantified speech perception difficulties in 24 OHI listeners in unaided and aided conditions by analyzing (1) consonant-identification thresholds and consonant confusions for 20 onset and 20 coda consonants in consonant-vowel-consonant (CVC) syllables presented at consonant-specific signal-to-noise ratio (SNR) levels, and (2) SeRTs obtained with the Quick Speech in Noise Test (QSIN) and the Hearing in Noise Test (HINT). Compared to older normal-hearing (ONH) listeners, nearly all unaided OHI listeners showed abnormal consonant-identification thresholds, abnormal consonant confusions, and reduced psychometric function slopes. Average elevations in consonant-identification thresholds exceeded 35 dB, correlated strongly with impairments in mid-frequency hearing, and were greater for hard-to-identify consonants. Advanced digital hearing aids (HAs) improved average consonant-identification thresholds by more than 17 dB, with significant HA benefit seen in 83% of OHI listeners. HAs partially normalized consonant-identification thresholds, reduced abnormal consonant confusions, and increased the slope of psychometric functions. Unaided OHI listeners showed much smaller elevations in SeRTs (mean 6.9 dB) than in consonant-identification thresholds, and SeRTs in unaided listening conditions correlated strongly (r = 0.91) with the identification thresholds of easily identified consonants. HAs produced minimal SeRT benefit (2.0 dB), with only 38% of OHI listeners showing significant improvement. HA benefit on SeRTs was accurately predicted (r = 0.86) by HA benefit on easily identified consonants. Consonant-identification tests can therefore accurately predict sentence-processing deficits and HA benefit in OHI listeners.
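Consonant-identification thresholds and psychometric-function slopes of the kind reported here are typically obtained by fitting a sigmoid of proportion correct against SNR; a minimal sketch using a logistic fit (illustrative data, and the study's exact fitting procedure may differ):

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(snr, threshold, slope, chance=0.05, lapse=0.02):
    """Proportion correct vs. SNR; threshold is the SNR at the curve's midpoint."""
    return chance + (1 - chance - lapse) / (1 + np.exp(-slope * (snr - threshold)))

snr = np.array([-12.0, -6.0, 0.0, 6.0, 12.0, 18.0])   # dB, illustrative
pc  = np.array([0.06, 0.15, 0.42, 0.71, 0.90, 0.95])  # proportion correct, illustrative
(threshold, slope), _ = curve_fit(logistic, snr, pc, p0=[0.0, 0.3])
print(f"threshold ≈ {threshold:.1f} dB SNR, slope ≈ {slope:.2f} per dB")
```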

14.
Human-machine interface (HMI) designs offer the possibility of improving quality of life for patient populations as well as augmenting normal user function. Despite these pragmatic benefits, auditory feedback remains underutilized for HMI control, in part due to observed limitations in effectiveness. The goal of this study was to determine the extent to which categorical speech perception could be used to improve an auditory HMI. Using surface electromyography, 24 healthy speakers of American English participated in 4 sessions to learn to control an HMI using auditory feedback (provided via vowel synthesis). Participants trained on 3 targets in sessions 1–3 and were tested on 3 novel targets in session 4. An “established categories with text cues” group of eight participants were trained and tested on auditory targets corresponding to standard American English vowels using auditory and text target cues. An “established categories without text cues” group of eight participants were trained and tested on the same targets using only auditory cuing of target vowel identity. A “new categories” group of eight participants were trained and tested on targets that corresponded to vowel-like sounds not part of American English. Analyses of user performance revealed significant effects of session and group (established categories groups vs. the new categories group), and a trend toward an interaction between session and group. Results suggest that auditory feedback can be effectively used for HMI operation when paired with established categorical (native vowel) targets and an unambiguous cue.

15.

Background

Constructivist theories propose that articulatory hypotheses about incoming phonetic targets may function to enhance perception by limiting the possibilities for sensory analysis. To provide evidence for this proposal, it is necessary to map ongoing, high-temporal-resolution changes in sensorimotor activity (i.e., the sensorimotor μ rhythm) to accurate speech and non-speech discrimination performance (i.e., correct trials).

Methods

Sixteen participants (15 female and 1 male) were asked to passively listen to or actively identify speech and tone-sweeps in a two-alternative forced-choice discrimination task while the electroencephalograph (EEG) was recorded from 32 channels. The stimuli were presented at signal-to-noise ratios (SNRs) at which discrimination accuracy was high (i.e., 80–100%) and at low SNRs producing discrimination performance at chance. EEG data were decomposed using independent component analysis and clustered across participants using principal component methods in EEGLAB.

Results

ICA revealed left and right sensorimotor µ components for 14/16 and 13/16 participants, respectively, which were identified on the basis of scalp topography, spectral peaks, and localization to the precentral and postcentral gyri. Time-frequency analysis of the left and right lateralized µ component clusters revealed significant (pFDR<.05) suppression in the traditional beta frequency range (13–30 Hz) prior to, during, and following syllable discrimination trials. No significant differences from baseline were found for passive tasks. Tone conditions produced right µ beta suppression following stimulus onset only. For the left µ, significant differences in the magnitude of beta suppression were found for correct speech discrimination trials relative to chance trials following stimulus offset.
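Beta-range µ suppression of this kind is usually expressed as power change relative to a pre-stimulus baseline, with negative values indicating suppression. A minimal sketch of that computation on a time-frequency power matrix; the array shapes and the baseline window are assumptions, not the study's parameters:

```python
import numpy as np

def beta_suppression_db(power_tf, freqs, times, band=(13.0, 30.0), baseline=(-0.5, -0.2)):
    """Beta-band power over time relative to a pre-stimulus baseline, in dB (negative = suppression)."""
    f_idx = (freqs >= band[0]) & (freqs <= band[1])
    b_idx = (times >= baseline[0]) & (times <= baseline[1])
    beta = power_tf[f_idx, :].mean(axis=0)            # average power across beta frequencies
    return 10 * np.log10(beta / beta[b_idx].mean())   # change relative to the baseline mean

freqs = np.linspace(1, 40, 40)                        # Hz
times = np.linspace(-0.5, 1.0, 151)                   # s, relative to stimulus onset
power = 1.0 + np.random.rand(freqs.size, times.size)  # illustrative (freq x time) power matrix
print(beta_suppression_db(power, freqs, times)[:5])
```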

Conclusions

Findings are consistent with constructivist, internal model theories proposing that early forward motor models generate predictions about likely phonemic units that are then synthesized with incoming sensory cues during active as opposed to passive processing. Future directions and possible translational value for clinical populations in which sensorimotor integration may play a functional role are discussed.

16.
Nitric oxide synthase-3 (NOS3) has recently been shown to promote endothelial-to-mesenchymal transition (EndMT) in the developing atrioventricular (AV) canal. The present study aimed to investigate the role of NOS3 in the embryonic development of AV valves. We hypothesized that NOS3 promotes embryonic development of AV valves via EndMT. To test this hypothesis, morphological and functional analyses of AV valves were performed in wild-type (WT) and NOS3−/− mice at postnatal day 0. Our data show that the overall size and length of the mitral and tricuspid valves were decreased in NOS3−/− compared with WT mice. Echocardiographic assessment showed significant regurgitation of the mitral and tricuspid valves during systole in NOS3−/− mice. These phenotypes were all rescued by cardiac-specific NOS3 overexpression. To assess EndMT, immunostaining of Snail1 was performed in the embryonic heart. Both total mesenchymal and Snail1+ cells in the AV cushion were decreased in NOS3−/− compared with WT mice at E10.5 and E12.5, which was completely restored by cardiac-specific NOS3 overexpression. In cultured embryonic hearts, NOS3 promoted transforming growth factor β (TGFβ), bone morphogenetic protein 2 (BMP2) and Snail1 expression through cGMP. Furthermore, mesenchymal cell formation and migration from cultured AV cushion explants were decreased in NOS3−/− compared with WT mice. We conclude that NOS3 promotes AV valve formation during embryonic heart development and that deficiency in NOS3 results in AV valve insufficiency.

17.
This study investigated how speech recognition in noise is affected by language proficiency for individual non-native speakers. The recognition of English and Chinese sentences was measured as a function of the signal-to-noise ratio (SNR) in sixty native Chinese speakers who had never lived in an English-speaking environment. The recognition score for speech in quiet (which varied from 15% to 92%) was found to be uncorrelated with the speech recognition threshold (SRTQ/2), i.e. the SNR at which the recognition score drops to 50% of the recognition score in quiet. This result demonstrates separable contributions of language proficiency and auditory processing to speech recognition in noise.
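SRTQ/2 as defined here is the SNR at which the recognition score falls to half of the listener's score in quiet; a minimal sketch that finds it by linear interpolation over a measured score-versus-SNR function (values are illustrative):

```python
import numpy as np

def srt_q2(snrs_db, scores_pct, quiet_score_pct):
    """SNR at which the score drops to 50% of the score in quiet (linear interpolation)."""
    target = 0.5 * quiet_score_pct
    scores = np.asarray(scores_pct, dtype=float)
    order = np.argsort(scores)                        # np.interp requires increasing x
    return float(np.interp(target, scores[order], np.asarray(snrs_db, dtype=float)[order]))

# Illustrative listener: 80% correct in quiet, score falling as the SNR decreases
snrs   = [-12, -8, -4, 0, 4, 8]
scores = [ 10, 22, 41, 58, 70, 78]
print(srt_q2(snrs, scores, quiet_score_pct=80))       # SNR where the score crosses 40%
```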

18.
Beat gestures—spontaneously produced biphasic movements of the hand—are among the most frequently encountered co-speech gestures in human communication. They are closely temporally aligned to the prosodic characteristics of the speech signal, typically occurring on lexically stressed syllables. Despite their prevalence across speakers of the world's languages, how beat gestures impact spoken word recognition is unclear. Can these simple ‘flicks of the hand’ influence speech perception? Across a range of experiments, we demonstrate that beat gestures influence the explicit and implicit perception of lexical stress (e.g. distinguishing OBject from obJECT), and in turn can influence what vowels listeners hear. Thus, we provide converging evidence for a manual McGurk effect: relatively simple and widely occurring hand movements influence which speech sounds we hear.

19.
Much recent research has shown that the capacity for mental time travel and temporal reasoning emerges during the preschool years. Nothing is known so far, however, about young children's grasp of the normative dimension of future-directed thought and speech. The present study is the first to show that children from age 4 understand the normative outreach of such future-directed speech acts: subjects at time 1 witnessed a speaker make a future-directed speech act about/towards an actor A, either in imperative mode (“A, do X!”) or as a prediction (“the actor A will do X”). When at time 2 the actor A performed an action that did not match the content of the speech act from time 1, children identified the speaker as the source of the mistake in the prediction case and the actor as the source of the mistake in the imperative case, and leveled criticism accordingly. These findings add to our knowledge about the emergence and development of temporal cognition in revealing an early sensitivity to the normative aspects of future-orientation.

20.
Speech and emotion perception are dynamic processes in which it may be optimal to integrate synchronous signals emitted from different sources. Studies of audio-visual (AV) perception of neutrally expressed speech demonstrate supra-additive (i.e., where AV > [unimodal auditory + unimodal visual]) responses in left STS to crossmodal speech stimuli. However, emotions are often conveyed simultaneously with speech: through the voice in the form of speech prosody and through the face in the form of facial expression. Previous studies of AV nonverbal emotion integration showed a role for right (rather than left) STS. The current study therefore examined whether the integration of facial and prosodic signals of emotional speech is associated with supra-additive responses in left STS (cf. results for speech integration) or right STS (due to emotional content). As emotional displays are sometimes difficult to interpret, we also examined whether supra-additive responses were affected by emotional incongruence (i.e., ambiguity). Using magnetoencephalography, we continuously recorded eighteen participants as they viewed and heard AV congruent emotional and AV incongruent emotional speech stimuli. Significant supra-additive responses were observed in right STS within the first 250 ms for both emotionally incongruent and emotionally congruent AV speech stimuli, which further underscores the role of right STS in processing crossmodal emotive signals.
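Supra-additivity as defined above simply means the crossmodal response exceeds the sum of the unimodal responses; the sketch below tests that criterion on per-participant response amplitudes (illustrative values, not the study's data):

```python
import numpy as np
from scipy.stats import ttest_1samp

# Illustrative per-participant response amplitudes (arbitrary units), not the study's data:
av = np.array([5.1, 4.8, 6.0, 5.5, 4.9, 5.7])   # audio-visual
a  = np.array([2.0, 2.2, 2.5, 2.1, 1.9, 2.4])   # unimodal auditory
v  = np.array([1.8, 1.7, 2.1, 2.0, 1.6, 2.2])   # unimodal visual

supra = av - (a + v)                             # positive values indicate AV > A + V
t, p = ttest_1samp(supra, 0.0)
print(f"mean supra-additivity = {supra.mean():.2f}, t = {t:.2f}, p = {p:.4f}")
```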
