Similar literature
20 similar articles found
1.
2.
Source–filter theory assumes that calls are generated by a vocal source and are subsequently filtered by the vocal tract. The air in the vocal tract vibrates preferentially at certain resonant frequencies, called formants. Formant frequencies can be a good indicator of the caller's characteristics, such as sex, age, body size or individual identity. Although source–filter theory was originally proposed for mammals, formants are also observed in birds, and some bird species have been shown to perceive formants. In this study, we evaluated the hypotheses that formant frequencies (1) are an indicator of body size and (2) can be used for individual discrimination by a nocturnal bird species, the corncrake (Crex crex). We analysed calls of 104 males from Poland and the Czech Republic. Linear regression models showed that males with a longer head (including the bill length) had a significantly lower formant dispersion and lower fourth and fifth formant frequencies. However, we found no significant relationships between body weight and any filter-related acoustic measurement. The formant frequencies had smaller within- than between-individual coefficients of variation. This characteristic of the formant frequencies implies a high potential for individual coding. A discriminant function analysis correctly assigned 94.8% of the calls to the caller based on the second to fifth formants. Our results indicated that the formant frequencies are a weak indicator of the body size of the sender in the corncrake. However, even a weak dependence between body size and the acoustic properties of a signal may be important in the natural selection process. Alternatively, such a weak dependence may be observed because receivers ignore the acoustic, formant-based cues of body size. At the same time, the formants might provide acoustic cues for individual discrimination and could be used for census and monitoring tasks.
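
The comparison of within- and between-individual coefficients of variation mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the coefficients were computed, so the convention used here (between-individual CV taken over per-bird means, within-individual CV averaged over birds) is one common choice, and the function names and F2 values are hypothetical.

```python
from statistics import mean, stdev


def cv(values):
    """Coefficient of variation in percent: 100 * SD / mean."""
    return 100.0 * stdev(values) / mean(values)


def individuality_ratio(calls_by_bird):
    """Ratio of between- to within-individual CV for one acoustic measure.

    calls_by_bird maps a bird ID to the measurements (for example F2 in Hz)
    taken from that bird's calls. A ratio above 1 means the measure varies
    more between birds than within a single bird.
    """
    # Within-individual CV: the average of each bird's own CV.
    cv_within = mean(cv(v) for v in calls_by_bird.values())
    # Between-individual CV: the CV of the per-bird means (assumed convention).
    cv_between = cv([mean(v) for v in calls_by_bird.values()])
    return cv_between / cv_within


# Hypothetical F2 values (Hz) for three males, three calls each.
f2 = {
    "male_a": [1510.0, 1525.0, 1498.0],
    "male_b": [1710.0, 1695.0, 1722.0],
    "male_c": [1605.0, 1590.0, 1612.0],
}
print(f"CV_between / CV_within = {individuality_ratio(f2):.2f}")
```

A ratio well above 1, as in this made-up data, is the pattern the abstract describes as a high potential for individual coding.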

3.
Four male Long-Evans rats were trained to discriminate between synthetic vowel sounds using a GO/NOGO response choice task. The vowels were characterized by an increase in fundamental frequency correlated with an upward shift in formant frequencies. In an initial phase we trained the subjects to discriminate between two vowel categories using two exemplars from each category. In a subsequent phase the ability of the rats to generalize the discrimination between the two categories was tested. To test whether rats might exploit the fact that attributes of training stimuli covaried, we used non-standard stimuli with a reversed relation between fundamental frequency and formants. The overall results demonstrate that rats are able to generalize the discrimination to new instances of the same vowels. We present evidence that the performance of the subjects depended on the relation between fundamental and formant frequencies that they had previously been exposed to. Simple simulation results with artificial neural networks could reproduce most of the behavioral results and support the hypothesis that equivalence classes for vowels are associated with an experience-driven process based on general properties of peripheral auditory coding mixed with elementary learning mechanisms. These results suggest that rats use spectral and temporal cues similarly to humans despite differences in basic auditory capabilities.

4.
Hu C, Wang Q, Short LA, Fu G. PLoS ONE, 2012, 7(3): e33906
The current study explored the correlation between speakers' Eysenck personality traits and speech spectrum parameters. Forty-six subjects completed the Eysenck Personality Questionnaire. They were instructed to verbally answer the questions shown on a computer screen and their responses were recorded by the computer. Spectrum parameters of /sh/ and /i/ were analyzed with the Praat voice software. Formant frequencies of the consonant /sh/ in lying responses were significantly lower than those in truthful responses, whereas no difference existed in the speech spectrum of the vowel /i/. The second formant bandwidth of the consonant /sh/ was significantly correlated with the personality traits of Psychoticism, Extraversion, and Neuroticism, and the correlation differed between truthful and lying responses, whereas the first formant frequency of the vowel /i/ was negatively correlated with Neuroticism in both response types. The results suggest that personality characteristics may be conveyed through the human voice, although the extent to which these effects are due to physiological differences in the organs associated with speech or to a general Pygmalion effect is as yet unknown.

5.
We propose a new model for speaker-independent vowel recognition which uses the flexibility of the dynamic linking that results from the synchronization of oscillating neural units. The system consists of an input layer and three neural layers, which are referred to as the A-, B- and C-centers. The input signals are a time series of linear prediction (LPC) spectrum envelopes of auditory signals. At each time-window within the series, the A-center receives input signals and extracts local peaks of the spectrum envelope, i.e., formants, and encodes them into local groups of independent oscillations. Speaker-independent vowel characteristics are embedded as a connection matrix in the B-center according to statistical data of Japanese vowels. The associative interaction in the B-center and reciprocal interaction between the A- and B-centers selectively activate a vowel as a global synchronized pattern over two centers. The C-center evaluates the synchronized activities among the three formant regions to give the selective output of the category among the five Japanese vowels. Thus, a flexible ability of dynamical linking among features is achieved over the three centers. The capability of the present system was investigated for speaker-independent recognition of Japanese vowels. The system demonstrated a remarkable ability for the recognition of vowels very similar to that of human listeners, including misleading vowels. In addition, it showed stable recognition for unsteady input signals and robustness against background noise. The optimum condition of the frequency of oscillation is discussed in comparison with stimulus-dependent synchronizations observed in neurophysiological experiments of the cortex. Received: 20 July 1993/Accepted in revised form: 22 December 1993

6.
Noise-vocoded (NV) speech is often regarded as conveying phonetic information primarily through temporal-envelope cues rather than spectral cues. However, listeners may infer the formant frequencies in the vocal-tract output, a key source of phonetic detail, from across-band differences in amplitude when speech is processed through a small number of channels. The potential utility of this spectral information was assessed for NV speech created by filtering sentences into six frequency bands, and using the amplitude envelope of each band (≤30 Hz) to modulate a matched noise-band carrier (N). Bands were paired, corresponding to F1 (≈N1 + N2), F2 (≈N3 + N4) and the higher formants (F3' ≈ N5 + N6), such that the frequency contour of each formant was implied by variations in relative amplitude between bands within the corresponding pair. Three-formant analogues (F0 = 150 Hz) of the NV stimuli were synthesized using frame-by-frame reconstruction of the frequency and amplitude of each formant. These analogues were less intelligible than the NV stimuli or analogues created using contours extracted from spectrograms of the original sentences, but more intelligible than when the frequency contours were replaced with constant (mean) values. Across-band comparisons of amplitude envelopes in NV speech can provide phonetically important information about the frequency contours of the underlying formants.

7.
The perception of vowels was studied in chimpanzees and humans, using a reaction time task in which reaction times for discrimination of vowels were taken as an index of similarity between vowels. The vowels used were five synthetic and natural Japanese vowels and eight natural French vowels. The chimpanzees required long reaction times for discrimination of synthetic [i] from [u] and [e] from [o]; that is, they needed long latencies for discrimination between vowels based on differences in the frequency of the second formant. A similar tendency was observed for discrimination of natural [i] from [u]. The human subject required long reaction times for discrimination between vowels along the first formant axis. These differences can be explained by differences in auditory sensitivity between the two species and the motor theory of speech perception. A vowel pronounced by different speakers has different acoustic properties. However, humans can perceive these speech sounds as the same vowel. The phenomenon of perceptual constancy in speech perception was studied in chimpanzees using natural vowels and a synthetic [o]-[a] continuum. The chimpanzees ignored the difference in the sex of the speakers and showed a capacity for vocal tract normalization.

8.
A permanently descended larynx is found in humans and several other species of mammals. In addition to this, the larynx of species such as fallow deer is mobile and in males it can be retracted during vocalization. The most likely explanation for the lowered retractable larynx in mammals is that it serves to exaggerate perceived body size (size exaggeration hypothesis) by decreasing the formant frequencies of calls. In this study, we quantified for the first time the elongation of the vocal tract in fallow bucks during vocalization. We also measured the effect of this vocal tract length (VTL) increase on formant frequencies (vocal tract resonances) and formant dispersion (spacing of formants). Our results show that fallow bucks increase their VTL on average by 52% during vocalization. This elongation resulted in strongly lowered formant frequencies and decreased formant dispersion. There were minimal changes to formants 1 and 2 (−0.91 and +1.9%, respectively) during vocal tract elongation, whereas formants 3, 4 and 5 decreased substantially: 18.9, 10.3 and 13.6%, respectively. Formant dispersion decreased by 12.4%. Formants are prominent in deer vocalizations and are used by males to gain information on the competitive abilities of signallers. It remains to be seen whether females also use the information that formants contain for assessing male quality before mating.
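
The link between vocal tract length and formant dispersion that this abstract relies on can be illustrated with a minimal sketch. It assumes the textbook idealization of the vocal tract as a uniform tube closed at one end, in which formant i lies at (2i − 1)·c/(4·L) and adjacent formants are therefore c/(2·L) apart. The abstract does not state which formula the authors used; the function names, the speed-of-sound value and the example formants below are assumptions made for illustration only.

```python
SPEED_OF_SOUND_CM_S = 35000.0  # assumed value for warm, humid air in the vocal tract


def formant_dispersion(formants_hz):
    """Average spacing between adjacent formants: (F_N - F_1) / (N - 1)."""
    if len(formants_hz) < 2:
        raise ValueError("need at least two formants")
    ordered = sorted(formants_hz)
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def apparent_vtl_cm(dispersion_hz, c=SPEED_OF_SOUND_CM_S):
    """Vocal tract length implied by the uniform-tube model: L = c / (2 * dF)."""
    return c / (2.0 * dispersion_hz)


# Hypothetical formants (Hz) of an idealized 35 cm tube: 250, 750, 1250, ...
formants = [250.0, 750.0, 1250.0, 1750.0, 2250.0]
df = formant_dispersion(formants)
print(f"formant dispersion = {df:.0f} Hz, apparent VTL = {apparent_vtl_cm(df):.1f} cm")
```

Real vocal tracts are not uniform tubes, so measured changes in individual formants and in dispersion, such as those reported above, need not match what this idealized model would predict.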

9.
Vocal-tract resonances (or formants) are acoustic signatures in the voice and are related to the shape and length of the vocal tract. Formants play an important role in human communication, helping us not only to distinguish several different speech sounds [1], but also to extract important information related to the physical characteristics of the speaker, so-called indexical cues. How did formants come to play such an important role in human vocal communication? One hypothesis suggests that the ancestral role of formant perception (a role that might be present in extant nonhuman primates) was to provide indexical cues [2-5]. Although formants are present in the acoustic structure of vowel-like calls of monkeys [3-8] and implicated in the discrimination of call types [8-10], it is not known whether they use this feature to extract indexical cues. Here, we investigate whether rhesus monkeys can use the formant structure in their "coo" calls to assess the age-related body size of conspecifics. Using a preferential-looking paradigm [11, 12] and synthetic coo calls in which formant structure simulated an adult/large- or juvenile/small-sounding individual, we demonstrate that untrained monkeys attend to formant cues and link large-sounding coos to large faces and small-sounding coos to small faces; in essence, they can, like humans [13], use formants as indicators of age-related body size.

10.
11.
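Objective: To evaluate the use of mouth-temperature wax for taking functional impressions of the defect cavity in the prosthetic repair of soft palate defects. Methods: In 11 patients with hard and soft palate defects, impressions of the defect cavity were taken with both mouth-temperature wax and alginate, and an obturator was fabricated from each impression. After each of the two obturators had been worn for 1 month, the patients' subjective satisfaction regarding oronasal leakage, speech intelligibility (SI) and spectral analysis values of single vowels were compared across three conditions: wearing the obturator made from the mouth-temperature wax impression (group A), wearing the obturator made from the alginate impression (group B), and wearing no obturator (group C). Results: In group A, satisfaction regarding oronasal leakage, speech intelligibility, and the frequencies of F1 and F2 of the vowel [i], F2 of [u] and F2 of [ü] were all significantly higher than in the other two groups (P < 0.05). Conclusion: Obturators made from mouth-temperature wax impressions can markedly improve the oronasal leakage, speech impairment and related problems caused by velopharyngeal insufficiency.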

12.
The social vocalizations of the oilbird (Steatornis caripensis) frequently have their acoustic energy concentrated into 3 prominent formants which appear to arise from the filter properties of their asymmetrical vocal tract with its bronchial syrinx. The frequencies of the second and third formants approximate the predicted fundamental resonances of the unequal left and right cranial portions of each primary bronchus, respectively. Reversibly plugging either bronchus eliminates the corresponding formant. The first formant may arise in the trachea. The degree of vocal tract asymmetry varies between individuals, endowing them with different formant frequencies and providing potential acoustic cues by which individuals of this nocturnal, cave-dwelling species may recognize each other in their dark, crowded colonies.

13.
Latinus M, Belin P. PLoS ONE, 2012, 7(7): e41384
Humans can identify individuals from their voice, suggesting the existence of a perceptual representation of voice identity. We used perceptual aftereffects - shifts in perceived stimulus quality after brief exposure to a repeated adaptor stimulus - to further investigate the representation of voice identity in two experiments. Healthy adult listeners were familiarized with several voices until they reached a recognition criterion. They were then tested on identification tasks that used vowel stimuli generated by morphing between the different identities, presented either in isolation (baseline) or following short exposure to different types of voice adaptors (adaptation). Experiment 1 showed that adaptation to a given voice induced categorization shifts away from that adaptor's identity even when the adaptors consisted of vowels different from the probe stimuli. Moreover, original voices and caricatures resulted in comparable aftereffects, ruling out an explanation of identity aftereffects in terms of adaptation to low-level features. In Experiment 2, we show that adaptors with a disrupted configuration, i.e., altered fundamental frequency or formant frequencies, failed to produce perceptual aftereffects, showing the importance of the preserved configuration of these acoustical cues in the representation of voices. These two experiments indicate a high-level, dynamic representation of voice identity based on the combination of several lower-level acoustical features into a specific voice configuration.

14.
The purpose of this study was: (i) to provide additional evidence regarding the existence of human voice parameters, which could be reliable indicators of a speaker's physical characteristics and (ii) to examine the ability of listeners to judge voice pleasantness and a speaker's characteristics from speech samples. We recorded 26 men enunciating five vowels. Voices were played to 102 female judges who were asked to assess vocal attractiveness and speakers' age, height and weight. Statistical analyses were used to determine: (i) which physical component predicted which vocal component and (ii) which vocal component predicted which judgment. We found that men with low-frequency formants and small formant dispersion tended to be older, taller and tended to have a high level of testosterone. Female listeners were consistent in their pleasantness judgment and in their height, weight and age estimates. Pleasantness judgments were based mainly on intonation. Female listeners were able to correctly estimate age by using formant components. They were able to estimate weight but we could not explain which acoustic parameters they used. However, female listeners were not able to estimate height, possibly because they used intonation incorrectly. Our study confirms that in all mammal species examined thus far, including humans, formant components can provide a relatively accurate indication of a vocalizing individual's characteristics. Human listeners have the necessary information at their disposal; however, they do not necessarily use it.

15.
Voice quality was assessed in 55 patients with laryngeal carcinoma. Voice quality was examined in 18 patients before and after chordectomy and in 37 patients before and after supraglottic surgery. Subjective methods and objective spectrography were applied to evaluate dysphonia. The larynx was examined by indirect laryngoscopy and videolaryngostroboscopy (VLSS). Significant voice pathology was found in patients before surgery when compared with the normal group. A change of voice colour was found, which was manifested in spectrography by decreases in formant levels, especially F3 and F4, in patients after supraglottic surgery. Dysphagia and prolonged tracheostomy were temporary complications after the surgery and resulted in further phoniatric rehabilitation. Early phoniatric rehabilitation after chordectomy helped to achieve subjective and objective improvement of voice quality in patients after surgery. Good voice quality in patients after chordectomy is due to preserved structure and increased levels of the F1, F2, F3 and F4 formants in spectrography.

16.

Background

It is usually possible to identify the sex of a pre-pubertal child from their voice, despite the absence of sex differences in fundamental frequency at these ages. While it has been suggested that the overall spacing between formants (formant frequency spacing - ΔF) is a key component of the expression and perception of sex in children's voices, the effect of its continuous variation on sex and gender attribution has not yet been investigated.

Methodology/Principal findings

In the present study we manipulated voice ΔF of eight-year-olds (two boys and two girls) along continua covering the observed variation of this parameter in pre-pubertal voices, and assessed the effect of this variation on adult ratings of speakers' sex and gender in two separate experiments. In the first experiment (sex identification) adults were asked to categorise the voice as either male or female. The resulting identification function exhibited a gradual slope from male to female voice categories. In the second experiment (gender rating), adults rated the voices on a continuum from “masculine boy” to “feminine girl”, gradually decreasing their masculinity ratings as ΔF increased.

Conclusions/Significance

These results indicate that the role of ΔF in voice gender perception, which has been reported in adult voices, extends to pre-pubertal children's voices: variation in ΔF not only affects the perceived sex, but also the perceived masculinity or femininity of the speaker. We discuss the implications of these observations for the expression and perception of gender in children's voices given the absence of anatomical dimorphism in overall vocal tract length before puberty.

17.
18.
While vocal tract resonances or formants are key acoustic parameters that define differences between phonemes in human speech, little is known about their function in animal communication. Here, we used playback experiments to present red deer stags with re-synthesized vocalizations in which formant frequencies were systematically altered to simulate callers of different body sizes. In response to stimuli where lower formants indicated callers with longer vocal tracts, stags were more attentive, replied with more roars and extended their vocal tracts further in these replies. Our results indicate that mammals other than humans use formants in vital vocal exchanges and can adjust their own formant frequencies in relation to those that they hear.

19.
Abstract

Introduction: Deep brain stimulation (DBS) is a standard surgical treatment that is generally applied to the subthalamic nucleus in patients with Parkinson's disease when medical treatment is insufficient to treat the motor symptoms. Subthalamic nucleus deep brain stimulation (STN-DBS) is known to treat many motor symptoms. However, the results of studies on speech and voice vary. The aim of this study was to analyse the effect of STN-DBS on the characteristics of the voice.

Materials/methods: A total of 12 patients (8 male, 4 female; mean age 58.8 ± 9.6) who had undergone DBS surgery of the STN were included in the study. Voice recordings were made prior to surgery and 6 months after surgery. Voice was evaluated with an instrumental method. Recordings of the vowels /a, e, i/ were made for each patient. The recordings were evaluated with the Praat programme, and the effects on jitter, shimmer, fundamental frequency (F0) and noise-to-harmonics ratio (NHR) were analysed.

Results: F0 values decreased postoperatively for all vowels in all female participants. In females, the jitter and fraction parameters were found to be significantly different (0.056 and 0.017, respectively) for the vowel /e/. In addition, the p values for shimmer in the vowels /e, i/ were considered clinically significant (.087, .079 and .076, respectively). All these changes in the second measurements were found to indicate worsening vocal quality after DBS in females. In males, no significant difference was observed between the two measurements in any parameter for any vowel.

Conclusions: Acoustic voice quality deteriorated after STN-DBS, predominantly in females; however, this deterioration was not prominent audio-perceptually. This finding was interpreted as a result of the fact that the voice quality deviance of the participants was not severe.
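
The jitter and shimmer measures named in this abstract can be illustrated with a minimal sketch of the usual "local" definitions: the mean absolute difference between consecutive glottal periods (jitter) or consecutive peak amplitudes (shimmer), divided by the mean period or amplitude. The abstract does not say which Praat variants the authors reported, so this choice, the function names and the sample values are assumptions, and the sketch does not reproduce Praat's period detection.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the
    mean value. Returned as a fraction (multiply by 100 for percent)."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def jitter_local(periods_s):
    """Cycle-to-cycle variability of the glottal period durations."""
    return local_perturbation(periods_s)


def shimmer_local(peak_amplitudes):
    """Cycle-to-cycle variability of the per-cycle peak amplitudes."""
    return local_perturbation(peak_amplitudes)


# Hypothetical measurements from five consecutive glottal cycles.
periods = [0.00500, 0.00504, 0.00498, 0.00503, 0.00499]  # seconds
amplitudes = [0.80, 0.78, 0.81, 0.79, 0.80]              # arbitrary units
print(f"jitter (local)  = {100 * jitter_local(periods):.2f} %")
print(f"shimmer (local) = {100 * shimmer_local(amplitudes):.2f} %")
```

A perfectly periodic voice would give zero for both measures; larger values indicate a less regular, typically rougher-sounding voice.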

20.
Although formants (vocal tract resonances) can often be observed in avian vocalizations, and several bird species have been shown to perceive formants in human speech sounds, no studies have examined formant perception in birds' own species-specific calls. We used playbacks of computer-synthesized crane calls in a modified habituation-dishabituation paradigm to test for formant perception in whooping cranes (Grus americana). After habituating birds to recordings of natural contact calls, we played a synthesized replica of one of the habituating stimuli as a control to ensure that the synthesizer worked adequately; birds dishabituated in only one of 13 cases. Then, we played the same call with its formant frequencies shifted. The birds dishabituated to the formant-shifted calls in 10 out of 12 playbacks. These data suggest that cranes perceive and attend to changes in formant frequencies in their own species-specific vocalizations, and are consistent with the hypothesis that formants can provide acoustic cues to individuality and body size.
