Similar Articles
 20 similar articles found
1.
2.

Background

The well-established left hemisphere specialisation for language processing has long been claimed to rest on a low-level auditory specialisation for specific acoustic features in speech, particularly 'rapid temporal processing'.

Methodology

A novel analysis/synthesis technique was used to construct a variety of sounds, based on simple sentences, whose spectro-temporal complexity and intelligibility could be manipulated. All sounds consisted of two noise-excited spectral prominences (based on the lower two formants in the original speech) which could be static or could vary independently in frequency and/or amplitude. Dynamically varying both acoustic features according to the same sentence yielded intelligible speech, but when either or both features were static the stimuli were unintelligible. Combining the frequency dynamics of one sentence with the amplitude dynamics of another yielded unintelligible sounds of spectro-temporal complexity comparable to the intelligible ones. Positron emission tomography (PET) was used to compare which brain regions were active while participants listened to the different sounds.
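As a rough illustration of this style of stimulus construction, the sketch below shapes white noise with time-varying spectral prominences whose frequency and amplitude tracks can each be held static or varied. It is a minimal sketch rather than the study's actual synthesis pipeline; the sample rate, bandwidth, frame sizes, and frequency/amplitude tracks are all assumed values.

```python
import numpy as np
from scipy.signal import butter, lfilter, get_window

FS = 16000              # sample rate (Hz); an assumption, not from the paper
FRAME, HOP = 512, 256   # analysis frame and hop (50% overlap)

def prominence(freq_track_hz, amp_track, n_samples, bw_hz=200.0):
    """White noise shaped by one time-varying spectral prominence."""
    rng = np.random.default_rng(0)
    out = np.zeros(n_samples)
    win = get_window("hann", FRAME)
    n_frames = (n_samples - FRAME) // HOP
    for i in range(n_frames):
        pos = i / max(n_frames - 1, 1)   # 0..1 position in the utterance
        f = np.interp(pos, np.linspace(0, 1, len(freq_track_hz)), freq_track_hz)
        amp = np.interp(pos, np.linspace(0, 1, len(amp_track)), amp_track)
        lo = max(f - bw_hz / 2, 50.0)
        hi = min(f + bw_hz / 2, FS / 2 - 50.0)
        b, a = butter(2, [lo, hi], btype="band", fs=FS)
        out[i * HOP:i * HOP + FRAME] += lfilter(b, a, rng.standard_normal(FRAME)) * win * amp
    return out

# One static-frequency, dynamic-amplitude prominence is unintelligible per the
# study; summing two prominences whose frequency AND amplitude tracks both come
# from the same sentence would mimic the intelligible condition.
n = 2 * FS
stimulus = (prominence([500.0], [1.0, 0.2, 1.0, 0.5], n)
            + prominence([1500.0], [0.5, 1.0, 0.3, 0.8], n))
```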

Conclusions

Neural activity evoked by spectral and amplitude modulations sufficient to support speech intelligibility (in stimuli that were not themselves intelligible) was seen bilaterally, with a right temporal lobe dominance. A left-dominant response was seen only to intelligible sounds. It thus appears that the left hemisphere specialisation for speech is based on the linguistic properties of utterances, not on particular acoustic features.

3.
Vocal-tract resonances (or formants) are acoustic signatures in the voice and are related to the shape and length of the vocal tract. Formants play an important role in human communication, helping us not only to distinguish several different speech sounds [1], but also to extract important information related to the physical characteristics of the speaker, so-called indexical cues. How did formants come to play such an important role in human vocal communication? One hypothesis suggests that the ancestral role of formant perception, a role that might be present in extant nonhuman primates, was to provide indexical cues [2-5]. Although formants are present in the acoustic structure of the vowel-like calls of monkeys [3-8] and are implicated in the discrimination of call types [8-10], it is not known whether monkeys use this feature to extract indexical cues. Here, we investigate whether rhesus monkeys can use the formant structure of their "coo" calls to assess the age-related body size of conspecifics. Using a preferential-looking paradigm [11, 12] and synthetic coo calls in which formant structure simulated an adult/large- or juvenile/small-sounding individual, we demonstrate that untrained monkeys attend to formant cues and link large-sounding coos to large faces and small-sounding coos to small faces; in essence, like humans [13], they can use formants as indicators of age-related body size.

4.
We carried out a comparative study of the spectral-prosodic characteristics of bird vocalization and human speech, comparing the relative characteristics of the fundamental frequency and the spectral maxima. Criteria were formulated for the comparison of birds' signals and human speech, and a certain correspondence was found between the vocal structures of birds and humans. It is proposed that, in the course of evolution, humans adopted the main structural principles of their acoustic signalling from birds.

5.
The heterochronic formation of basic and language-specific speech sounds in the first year of life was studied in infants from different ethnic groups (Chechens, Russians, and Mongols). Spectral analysis of the frequency, amplitude, and formant characteristics of speech sounds showed a universal pattern of organization of the basic sound repertoire and "language-specific" sounds in the babbling and prattle of infants of the different ethnic groups. Possible mechanisms of the formation of specific speech sounds in early ontogeny are discussed.

6.
A dental prosthesis is a foreign body in the oral cavity and thus necessarily interferes with speech articulation. The purpose of this study was to examine the influence of partial dentures on speech quality and to reveal possible differences in the pronunciation of the dental sounds c [ts], z [z], and s [s] and the postalveolar sounds č [tʃ], ž [ʒ], and š [ʃ]. We examined differences in pronunciation between subjects with removable partial dentures, the same group without their dentures, and a control group. The study was performed on 30 subjects with removable partial dentures and 30 subjects with complete dental arches. All subjects were recorded while reading six Croatian words containing the examined sounds. Recordings were analyzed with the Multispeech program (Kay Elemetrics Inc.). Acoustic analysis by LPC (linear predictive coding) provided the formant peaks (Hz), intensity (dB), and formant bandwidths (Hz) for each examined sound. Results showed that subjects wearing their partial dentures had 50% fewer distorted variables, but that the prostheses did not completely restore articulation of the postalveolar sounds. The groups with and without prostheses had lower formant peak intensities and wider formant bandwidths than the control group. Partial dentures did not significantly interfere with resonance frequencies, and pronunciation of the examined sounds was significantly improved; however, the precision of the articulation movements deteriorated.
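For orientation, a minimal sketch of LPC-based formant estimation of the kind described follows. The study used the commercial Multispeech package, so numpy/librosa stand in here; the file name, LPC order, and the 90 Hz floor are illustrative assumptions.

```python
import numpy as np
import librosa

def lpc_formants(y, sr, order=12):
    """Estimate formant frequencies and bandwidths (Hz) from LPC roots."""
    a = librosa.lpc(y, order=order)       # all-pole model coefficients
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]     # keep one root per conjugate pair
    freqs = np.angle(roots) * sr / (2 * np.pi)
    bws = -np.log(np.abs(roots)) * sr / np.pi
    idx = np.argsort(freqs)
    return freqs[idx], bws[idx]

y, sr = librosa.load("syllable.wav", sr=None)   # hypothetical recording
for f, bw in zip(*lpc_formants(y, sr)):
    if f > 90:                                  # drop near-DC roots
        print(f"formant at {f:.0f} Hz, bandwidth {bw:.0f} Hz")
```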

7.
Distributional learning of speech sounds (i.e., learning from simple exposure to frequency distributions of speech sounds in the environment) has been observed in the lab repeatedly in both infants and adults. The current study is the first attempt to examine whether the capacity for using this mechanism differs between adults and infants. To this end, a previous event-related potential study that had shown distributional learning of the English vowel contrast /æ/∼/ε/ in 2-to-3-month-old Dutch infants was repeated with Dutch adults. Specifically, the adults were exposed either to a bimodal distribution that suggested the existence of two vowels (as appropriate in English) or to a unimodal distribution that did not (as appropriate in Dutch). After exposure, the participants were tested on their discrimination of a representative [æ] and a representative [ε] in an oddball paradigm for measuring mismatch responses (MMRs). Bimodally trained adults did not have a significantly larger MMR amplitude, and hence did not show significantly better neural discrimination of the test vowels, than unimodally trained adults. A direct comparison of the normalized MMR amplitudes of the adults with those of the previously tested infants showed that, within a reasonable range of normalization parameters, the bimodal advantage is reliably smaller in adults than in infants, indicating that distributional learning is a weaker mechanism for learning speech sounds in adults (if it exists in that group at all) than in infants.
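To make the two exposure regimes concrete, the sketch below draws training tokens from bimodal versus unimodal presentation-frequency distributions over a vowel continuum. The continuum values and weights are invented for illustration, not the study's stimuli.

```python
import numpy as np

rng = np.random.default_rng(1)
continuum = np.linspace(700, 880, 8)   # hypothetical F1 (Hz), steps 1..8

def sample_tokens(n, weights):
    """Draw n training tokens with the given per-step presentation weights."""
    w = np.asarray(weights, dtype=float)
    return rng.choice(continuum, size=n, p=w / w.sum())

# Bimodal: frequent tokens near steps 2 and 7, implying two vowel categories
# (the English-like regime); unimodal: one central peak (the Dutch-like regime).
bimodal_exposure = sample_tokens(400, [1, 4, 3, 1, 1, 3, 4, 1])
unimodal_exposure = sample_tokens(400, [1, 2, 3, 4, 4, 3, 2, 1])
```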

8.
Previous research suggests that nonhuman primates have limited flexibility in the frequency content of their vocalizations, particularly when compared to human speech. Consistent with this notion, several nonhuman primate species have shown noise-induced changes in call amplitude and duration, with no evidence of changes to spectral content. This experiment used broad- and narrow-band noise playbacks to investigate vocal control of two call types produced by cotton-top tamarins (Saguinus oedipus). In 'combination long calls' (CLCs), the peak fundamental frequency and the distribution of energy between low- and high-frequency harmonics (spectral tilt) changed in response to increased noise amplitude and bandwidth. In chirps, the peak and maximum components of the fundamental frequency increased with increasing noise level, with no changes to spectral tilt. Other modifications included the Lombard effect and increases in chirp duration. These results provide the first evidence for noise-induced frequency changes in nonhuman primate vocalizations and suggest that future investigations of vocal plasticity in primates should include spectral parameters.
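Spectral tilt, one of the measures mentioned above, can be quantified in several ways. A minimal sketch of one common formulation (the dB ratio of energy below versus above a cutoff) follows; the 1.5 kHz split and Welch settings are arbitrary choices, not the study's analysis parameters.

```python
import numpy as np
from scipy.signal import welch

def spectral_tilt_db(y, sr, split_hz=1500.0):
    """dB ratio of energy below vs above split_hz (positive = tilted low)."""
    f, pxx = welch(y, fs=sr, nperseg=1024)
    low = pxx[f < split_hz].sum()
    high = pxx[f >= split_hz].sum()
    return 10 * np.log10(low / high)
```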

9.
The sounds of human speech make human language a rapid medium of communication through a process of speech "encoding." The presence of sounds like the vowels [a], [i], and [u] makes this process possible. The supralaryngeal vocal tracts of newborn Homo sapiens and of the chimpanzee are similar and resemble the reconstructed vocal tract of the fossil La Chapelle-aux-Saints Neanderthal man. Vocal tract area functions directed toward making the best possible approximations to the human vowels [a], [i], and [u], as well as certain consonantal configurations, were modeled by means of a computer program. The lack of these vowels in the phonetic repertories of these creatures, which lack a supralaryngeal pharyngeal region like that of adult Homo sapiens, may be concomitant with the absence of speech encoding and a consequent linguistic ability inferior to that of modern man.
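Such modeling rests on standard acoustic tube theory: a uniform tube closed at the glottis and open at the lips resonates at odd quarter-wavelength frequencies, F_n = (2n - 1)c / 4L. The sketch below computes these resonances; it is the textbook simplification, not the paper's full area-function model.

```python
def tube_formants(length_m, n_formants=3, c=350.0):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other."""
    return [(2 * k - 1) * c / (4 * length_m) for k in range(1, n_formants + 1)]

# A ~17 cm adult male vocal tract gives roughly 515, 1544, 2574 Hz,
# close to the formants of a neutral schwa.
print(tube_formants(0.17))
```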

10.
In this study, it is shown that the males of several picture-winged Drosophila subgroup species produce high-frequency clicking sounds when courting females. At the beginning of courtship, the males may semaphore or vibrate their wings with a large amplitude, producing no audible sounds. After these 'preliminary' wing vibrations, the males set their wings backwards in the normal resting position and vibrate them with a small amplitude, producing loud clicking sounds (up to 15,000 cps, i.e., 15 kHz), which differ from all Drosophila sounds described so far in both their spectral and their temporal structure. When producing these sounds, the males always touch the abdomen of the female with their front legs, which might help the females receive the sounds as vibrational signals.

11.
Pulse-resonance sounds play an important role in animal communication and auditory object recognition, yet very little is known about the cortical representation of this class of sounds. In this study we shed light on one simple aspect: how well the firing rate of cortical neurons resolves the resonant ("formant") frequencies of vowel-like pulse-resonance sounds. We recorded neural responses in the primary auditory cortex (A1) of anesthetized rats to two-formant pulse-resonance sounds and estimated their formant-resolving power using a statistical kernel-smoothing method that takes into account the natural variability of cortical responses. While formant-tuning functions were diverse in structure across different penetrations, most were sensitive to changes in formant frequency, with a frequency resolution comparable to that reported for rat cochlear filters.
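As a sketch of the kernel-smoothing idea (not the study's exact estimator), the code below fits a formant-tuning function by Nadaraya-Watson regression of per-trial firing rates on formant frequency; the bandwidth and the synthetic data are illustrative.

```python
import numpy as np

def nw_smooth(x_eval, x, y, bandwidth):
    """Nadaraya-Watson (Gaussian kernel) regression of y on x at x_eval."""
    w = np.exp(-0.5 * ((x_eval[:, None] - x[None, :]) / bandwidth) ** 2)
    return (w * y[None, :]).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(2)
f2_per_trial = rng.uniform(1000, 3000, 200)   # formant frequency per trial
spike_rate = (20 * np.exp(-((f2_per_trial - 1800) / 400) ** 2)
              + rng.poisson(5, 200))          # noisy synthetic tuning
grid = np.linspace(1000, 3000, 100)
tuning_curve = nw_smooth(grid, f2_per_trial, spike_rate, bandwidth=150.0)
```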

12.
Bottlenose dolphins (Tursiops truncatus) use the frequency contour of whistles produced by conspecifics for individual recognition. Here we tested a bottlenose dolphin's ability to recognize frequency-modulated whistle-like sounds using a three-alternative matching-to-sample paradigm. The dolphin was first trained to select a specific object (object A) in response to a specific sound (sound A), for a total of three object-sound associations. The sounds were then transformed in amplitude, duration, or frequency transposition while still preserving the frequency contour of each sound. For comparison, 30 human participants completed an identical task with the same sounds, objects, and training procedure. The dolphin's ability to correctly match objects to sounds was robust to changes in amplitude, with only a minor decrement in performance at short durations. The dolphin failed to recognize sounds that were frequency transposed by ±½ octave. Human participants demonstrated robust recognition under all acoustic transformations. The results indicate that this dolphin's acoustic recognition of whistle-like sounds was constrained by absolute pitch. Unlike human speech, which varies considerably in average frequency, signature whistles are relatively stable in frequency, which may explain the absence of a whistle recognition system invariant to frequency transposition.
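The key manipulation, shifting absolute frequency while preserving contour shape, can be sketched as follows. The sample rate and contour values are invented, and real signature-whistle resynthesis would be more involved.

```python
import numpy as np

FS = 48000   # sample rate (Hz); an assumption

def synth_from_contour(contour_hz, dur_s, octaves=0.0):
    """Sine synthesis from a frequency contour, transposed by 2**octaves."""
    n = int(FS * dur_s)
    t = np.linspace(0, 1, n)
    f_inst = np.interp(t, np.linspace(0, 1, len(contour_hz)), contour_hz)
    f_inst = f_inst * 2.0 ** octaves      # shift pitch, keep contour shape
    phase = 2 * np.pi * np.cumsum(f_inst) / FS
    return np.sin(phase)

contour = [8000, 12000, 9000, 13000]      # made-up whistle contour (Hz)
original = synth_from_contour(contour, 1.0)
up_half_octave = synth_from_contour(contour, 1.0, octaves=0.5)
down_half_octave = synth_from_contour(contour, 1.0, octaves=-0.5)
```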

13.
Speech processing inherently relies on the perception of specific, rapidly changing spectral and temporal acoustic features. Advanced acoustic perception is also integral to musical expertise, and accordingly several studies have demonstrated a significant relationship between musical training and superior processing of various aspects of speech. Speech and music appear to overlap in spectral and temporal features; however, it remains unclear which of the acoustic features crucial for speech processing are most closely associated with musical training. The present study examined the perceptual acuity of musicians to the acoustic components of speech necessary for intra-phonemic discrimination of synthetic syllables. We compared musicians and non-musicians on discrimination thresholds for three synthetic speech syllable continua that varied in their spectral and temporal discrimination demands, specifically voice onset time (VOT) and amplitude envelope cues in the temporal domain. Musicians demonstrated superior discrimination only for syllables that required resolution of temporal cues. Furthermore, performance on the temporal syllable continua correlated positively with the length and intensity of musical training. These findings support one potential mechanism by which musical training may selectively enhance speech perception, namely by reinforcing temporal acuity and/or the perception of amplitude rise time, and they carry implications for the translation of musical training into long-term linguistic abilities.

14.
Soundscape ecology evaluates biodiversity and environmental disturbances by investigating the interaction among soundscape components (biological, geophysical, and human-produced sounds) using data collected with autonomous recording units. Current analyses consider the acoustic properties of frequency and amplitude, resulting in varied metrics, but rarely focus on discriminating among soundscape components. Computational musicologists analyze similar data but consider a third acoustic property: timbre.

Here, we investigated the effectiveness of spectral timbral analysis for distinguishing among dominant soundscape components. This process included manually labeling each recording and extracting its spectral timbral features. We then tested classification accuracy with linear and quadratic discriminant analyses on combinations of spectral timbral features.

Different spectral timbral feature groups distinguished between biological, geophysical, and man-made sounds in a single field recording. Furthermore, as we tested different combinations of spectral timbral features, which produced both high and very low accuracy results, we found that the combinations could be ordered to "sift" field recordings by individual dominant soundscape component.

By using timbre as a new acoustic property in soundscape analyses, we could classify dominant soundscape components effectively. We propose further investigation into a sifting scheme that may allow researchers to focus on more specific research questions, such as understanding changes in biodiversity, discriminating by taxonomic class, or inspecting weather-related events.
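A minimal sketch of such a pipeline follows, pairing a few standard spectral timbral descriptors with LDA/QDA. The feature set, file names, and labels are illustrative assumptions, not the study's exact features or data.

```python
import numpy as np
import librosa
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis)

def timbral_features(path):
    """Mean spectral timbral descriptors for one field recording."""
    y, sr = librosa.load(path, sr=None)
    S = np.abs(librosa.stft(y))
    return [
        librosa.feature.spectral_centroid(S=S, sr=sr).mean(),
        librosa.feature.spectral_bandwidth(S=S, sr=sr).mean(),
        librosa.feature.spectral_flatness(S=S).mean(),
        librosa.feature.spectral_rolloff(S=S, sr=sr).mean(),
    ]

# Hypothetical hand-labeled recordings (truncated; in practice many
# recordings per dominant-component class are needed).
paths = ["rec_001.wav", "rec_002.wav"]
labels = ["biophony", "anthrophony"]
X = np.array([timbral_features(p) for p in paths])
clf = LinearDiscriminantAnalysis().fit(X, labels)  # or QuadraticDiscriminantAnalysis()
```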

15.
We measured the time and frequency domain characteristics of breath sounds in seven asthmatic and three nonasthmatic wheezing patients. The power spectra of the wheezes were evaluated for the frequency, amplitude, and timing of power peaks and for the presence of an exponential decay of power with increasing frequency; such decay is typical of normal vesicular breath sounds. The two patients with the most severe asthma had no exponential decay pattern in their spectra. Other asthmatic patients had exponential patterns in some of their analyzed sound segments, with slopes of the log power vs. log frequency curves ranging from 5.7 to 17.3 dB/oct (normal range, 9.8-15.7 dB/oct). The nonasthmatic wheezing patients had normal exponential patterns in most of their analyzed sound segments. All patients had sharp peaks of power in many of the spectra of their expiratory and inspiratory lung sounds. The frequency range of the spectral peaks was 80-1,600 Hz, with some showing constant-frequency peaks throughout numerous inspiratory or expiratory sound segments recorded from one or more pickup locations. We compared the spectral shape, mode of appearance, and frequency range of the wheezes with specific predictions of five theories of wheeze production: 1) turbulence-induced wall resonator, 2) turbulence-induced Helmholtz resonator, 3) acoustically stimulated vortex sound (whistle), 4) vortex-induced wall resonator, and 5) fluid dynamic flutter. We conclude that the predictions of theories 4 and 5 match the experimental observations better than the previously suggested mechanisms. Alterations in the exponential pattern are discussed in view of the mechanisms proposed to underlie the generation and transmission of normal lung sounds. The observed changes may reflect modified sound production in the airways or alterations in attenuation during transmission to the chest wall through the hyperinflated lung.
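The exponential-decay measure, the slope of log power against log frequency in dB per octave, can be sketched as below. The Welch parameters and the 100-600 Hz fitting band are assumptions for illustration, not the study's analysis settings.

```python
import numpy as np
from scipy.signal import welch

def db_per_octave(y, sr, f_lo=100.0, f_hi=600.0):
    """Slope of log power vs log2 frequency (dB/oct) over [f_lo, f_hi]."""
    f, pxx = welch(y, fs=sr, nperseg=2048)
    sel = (f >= f_lo) & (f <= f_hi)
    power_db = 10 * np.log10(pxx[sel])
    slope, _intercept = np.polyfit(np.log2(f[sel]), power_db, 1)
    return -slope   # positive value = dB of decay per octave
```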

16.
Inferences on the evolution of human speech based on anatomical data must take into account its physiology, acoustics and perception. Human speech is generated by the supralaryngeal vocal tract (SVT) acting as an acoustic filter on noise sources generated by turbulent airflow and quasi-periodic phonation generated by the activity of the larynx. The formant frequencies, which are major determinants of phonetic quality, are the frequencies at which relative energy maxima will pass through the SVT filter. Neither the articulatory gestures of the tongue nor their acoustic consequences can be fractionated into oral and pharyngeal cavity components. Moreover, the acoustic cues that specify individual consonants and vowels are "encoded", i.e., melded together. Formant frequency encoding makes human speech a vehicle for rapid vocal communication. Non-human primates lack the anatomy that enables modern humans to produce sounds that enhance this process, as well as the neural mechanisms necessary for the voluntary control of speech articulation. The specific claims of Duchin (1990) are discussed.

17.
Among teleosts, only representatives of several tropical catfish families have evolved two sonic organs: pectoral spines for stridulation and swimbladder drumming muscles. Pectoral mechanisms differ in relative size between pimelodids, mochokids, and doradids, whereas swimbladder mechanisms exhibit differences in the origin and insertion of the extrinsic muscles. Differences in vocalization among families were investigated by comparing distress calls in air and underwater. High-frequency broad-band pulsed sounds of similar duration were emitted during abduction of the pectoral spines in all three families. Adduction sounds were similar to abduction signals in doradids, shorter and of lower sound pressure in mochokids, and totally lacking in pimelodids. Simultaneously or successively with the pectoral sounds, low-frequency harmonic drumming sounds were produced by representatives of two families. Drumming sounds were of similar intensity to stridulatory sounds in pimelodids, fainter in doradids, and not present in mochokids. Swimbladder sounds were frequency modulated, and the fundamental frequency was similar in pimelodids and doradids. The ratio of stridulatory to drumming sound amplitude was higher in air than underwater in both doradids and one of the pimelodids. Also, the overall duration of pectoral sounds relative to swimbladder sounds was longer in air than underwater in one doradid and one pimelodid species. This first comparison of vocalization within one major teleost order demonstrates wide variation in the occurrence, duration, intensity, and spectral content of sounds and indicates family- and species-specific as well as context- (receiver-) dependent patterns of vocalization.

18.
Spatial frequency is a fundamental visual feature coded in primary visual cortex, relevant for perceiving textures, objects, hierarchical structures, and scenes, as well as for directing attention and eye movements. Temporal amplitude-modulation (AM) rate is a fundamental auditory feature coded in primary auditory cortex, relevant for perceiving auditory objects, scenes, and speech. Spatial frequency and temporal AM rate are thus fundamental building blocks of visual and auditory perception. Recent results suggest that crossmodal interactions are commonplace across the primary sensory cortices and that some of the underlying neural associations develop through consistent multisensory experience such as audio-visually perceiving speech, gender, and objects. We demonstrate that people consistently and absolutely (rather than relatively) match specific auditory AM rates to specific visual spatial frequencies. We further demonstrate that this crossmodal mapping allows amplitude-modulated sounds to guide attention to and modulate awareness of specific visual spatial frequencies. Additional results show that the crossmodal association is approximately linear, based on physical spatial frequency, and generalizes to tactile pulses, suggesting that the association develops through multisensory experience during manual exploration of surfaces.
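Generating an amplitude-modulated sound at a chosen AM rate, the auditory feature paired here with visual spatial frequency, is straightforward; the parameter values in this sketch are arbitrary examples, not the study's stimuli.

```python
import numpy as np

def am_noise(am_rate_hz, dur_s=1.0, sr=44100, depth=1.0):
    """White-noise carrier with sinusoidal amplitude modulation."""
    t = np.arange(int(sr * dur_s)) / sr
    carrier = np.random.default_rng(3).standard_normal(t.size)
    envelope = 1.0 + depth * np.sin(2 * np.pi * am_rate_hz * t)
    return carrier * envelope

slow = am_noise(2.0)    # low AM rate, matched to low spatial frequencies
fast = am_noise(30.0)   # high AM rate, matched to high spatial frequencies
```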

19.
The characteristics of the acoustic waves accompanying the flight of noctuid moths (Noctuidae) were measured. The low-frequency part of the spectrum consists of a series of up to 17 harmonics of the wingbeat frequency (30–50 Hz), with spectral density generally decreasing as frequency increases. The root-mean-square sound pressure level from the flapping wings was 70–78 dB SPL. Besides the low-frequency components, the flight of moths was accompanied by short ultrasonic pulses, which appeared with every wingbeat. Most of the spectral energy was concentrated within a range of 7–150 kHz, with the main peaks at 60–110 kHz. The short-term pulses were divided into two or more subpulses with different spectra. The high-frequency pulses were produced at two phases of the wingbeat cycle: during the pronation of the wings at the highest point and at the beginning of their upward movement from the lowest point. In most of the specimens tested, the peak amplitude of the sounds varied from 55 to 65 dB SPL at a distance of 6 cm from the insect body. However, in nine noctuid species, no high-frequency acoustic components were recorded; in these experiments, the acoustic flow from the flying moth within a frequency range of 2 to 20 kHz did not exceed the self-noise level of the microphone amplifier (RMS 18 dB SPL). Probable mechanisms of the high-frequency acoustic emission during flight, the effect of these sounds on the auditory sensitivity of moths, and the possibility that these sounds reveal the moths to insectivorous bats are discussed. In addition, the spectral characteristics of moth echolocation clicks were determined more precisely within the higher frequency range (>100 kHz).

20.
Selective attention is the mechanism that allows one to focus on a particular stimulus while filtering out a range of other stimuli, for instance, on a single conversation in a noisy room. Attending to one sound source rather than another changes activity in the human auditory cortex, but it is unclear whether attention to different acoustic features, such as voice pitch and speaker location, modulates subcortical activity. Studies using a dichotic listening paradigm have indicated that auditory brainstem processing may be modulated by the direction of attention. We investigated whether endogenous selective attention to one of two speech signals affects the amplitude and phase locking of auditory brainstem responses when the signals were discriminable either by frequency content alone or by frequency content and spatial location. Frequency-following responses to the speech sounds were significantly modulated in both conditions. The modulation was specific to the task-relevant frequency band, and the effect was stronger when both frequency and spatial information were available. Patterns of response varied between participants and were correlated with the psychophysical discriminability of the stimuli, suggesting that the modulation was biologically relevant. Our results demonstrate that auditory brainstem responses are susceptible to efferent modulation related to behavioral goals. Furthermore, they suggest that mechanisms of selective attention actively shape activity at early subcortical processing stages according to task relevance and based on frequency and spatial cues.
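One way to quantify a frequency-following response in a task-relevant band is sketched below. The band edges, sampling rate, and variable names are assumptions for illustration, not the study's recording parameters.

```python
import numpy as np

def ffr_band_amplitude(trials, sr, f_lo, f_hi):
    """Spectral magnitude of the trial-averaged response in [f_lo, f_hi]."""
    avg = np.asarray(trials).mean(axis=0)     # average over trials
    spec = np.abs(np.fft.rfft(avg)) / avg.size
    freqs = np.fft.rfftfreq(avg.size, d=1.0 / sr)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return spec[band].mean()

# e.g., compare attended vs ignored conditions in a 100-300 Hz pitch band:
# ffr_band_amplitude(attended_trials, 16384, 100, 300) vs
# ffr_band_amplitude(ignored_trials, 16384, 100, 300)
```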
