Similar Documents
Found 20 similar documents (search time: 15 ms)
1.
We address the hypothesis that postures adopted during grammatical pauses in speech production are more “mechanically advantageous” than absolute rest positions for facilitating efficient postural motor control of the vocal tract articulators. We quantify vocal tract posture corresponding to inter-speech pauses, absolute rest intervals, and vowel and consonant intervals using automated analysis of video captured with real-time magnetic resonance imaging during production of read and spontaneous speech by five healthy speakers of American English. We then use locally-weighted linear regression to estimate the articulatory forward map from low-level articulator variables to high-level task/goal variables for these postures, and we quantify the overall magnitude of the first derivative of the forward map as a measure of mechanical advantage. We find that postures assumed during grammatical pauses in speech, as well as speech-ready postures, are significantly more mechanically advantageous than postures assumed during absolute rest. Further, these postures represent empirical extremes of mechanical advantage, between which lie the postures assumed during the various vowels and consonants. The relative mechanical advantage of different postures might be an important physical constraint influencing the planning and control of speech production.
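For readers unfamiliar with the measure, a minimal sketch of how such a mechanical-advantage score could be computed, assuming posture matrices X (articulator variables, one row per posture sample) and Y (task variables) are available; the Gaussian kernel, the bandwidth, and the use of the Frobenius norm are illustrative assumptions, not the authors' exact implementation:

```python
import numpy as np

def local_jacobian(X, Y, x0, bandwidth=1.0):
    """Estimate the forward-map Jacobian dY/dX at posture x0 by
    locally-weighted linear regression with a Gaussian kernel.
    X: (n, p) articulator variables; Y: (n, m) task variables."""
    w = np.exp(-0.5 * (np.linalg.norm(X - x0, axis=1) / bandwidth) ** 2)
    A = np.hstack([np.ones((len(X), 1)), X - x0])   # intercept + centered inputs
    sw = np.sqrt(w)[:, None]
    beta, *_ = np.linalg.lstsq(sw * A, sw * Y, rcond=None)  # weighted least squares
    return beta[1:].T        # rows: task variables, columns: articulator variables

def mechanical_advantage(X, Y, x0, bandwidth=1.0):
    """Overall magnitude of the forward map's first derivative at x0,
    here taken as the Frobenius norm of the local Jacobian."""
    return np.linalg.norm(local_jacobian(X, Y, x0, bandwidth), ord='fro')
```

Under this reading, comparing the score at pause, rest, and segment postures directly yields the kind of contrast the abstract reports.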

2.
The notion that linguistic forms and meanings are related only by convention, and not by any direct relationship between sounds and semantic concepts, is a foundational principle of modern linguistics. Though the principle generally holds across the lexicon, systematic exceptions have been identified. Such “sound symbolic” forms have been documented in lexical items and linguistic processes in many individual languages. This paper examines sound symbolism in the languages of Australia. We conduct a statistical investigation of the evidence for several common patterns of sound symbolism, using data from a sample of 120 languages. The patterns examined include the association of meanings denoting “smallness” or “nearness” with front vowels or palatal consonants, and the association of meanings denoting “largeness” or “distance” with back vowels or velar consonants. Our results provide evidence for the expected associations of vowels and consonants with meanings of “smallness” and “proximity” in Australian languages. However, the patterns uncovered in this region are more complicated than predicted: several sound-meaning relationships are significant only for segments in prominent positions in the word, and the prevailing mapping between vowel quality and magnitude meaning cannot be characterized by a simple link between gradients of magnitude and vowel F2, contrary to the claims of previous studies.
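As a rough illustration of the kind of association test involved, a sketch under an assumed data layout: each record pairs a word's meaning class ('small' vs. 'large') with whether the word carries a front vowel or palatal consonant in the relevant position. The 2x2 Fisher exact test is one plausible choice, not necessarily the statistics used in the paper:

```python
import numpy as np
from scipy.stats import fisher_exact

def sound_symbolism_test(records):
    """2x2 association between 'small' meanings and front/palatal segments.
    records: iterable of (meaning, has_front_segment) pairs, one per word.
    Returns the odds ratio and two-sided p-value."""
    table = np.zeros((2, 2), dtype=int)
    for meaning, front in records:
        table[0 if meaning == 'small' else 1, 0 if front else 1] += 1
    return fisher_exact(table)

# Hypothetical usage over a pooled word sample from the 120 languages:
# odds_ratio, p = sound_symbolism_test(word_records)
```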

3.
A key feature of speech is its stereotypical 5 Hz rhythm. One theory posits that this rhythm evolved through the modification of rhythmic facial movements in ancestral primates. If this hypothesis is valid, a comparative approach may shed light on it. We tested the idea by using cineradiography (X-ray movies) to characterize and quantify the internal dynamics of the macaque monkey vocal tract during lip-smacking (a rhythmic facial expression) versus chewing. Previous human studies showed that speech movements are faster than chewing movements and that the functional coordination between vocal tract structures differs between the two behaviors. If rhythmic speech evolved through a rhythmic ancestral facial movement, then monkey lip-smacking versus chewing should exhibit analogous differences. We found that the lips, tongue, and hyoid move with a speech-like 5 Hz rhythm during lip-smacking, but not during chewing. Most importantly, the functional coordination between these structures was distinct for each behavior. These data provide empirical support for the idea that the human speech rhythm evolved from the rhythmic facial expressions of ancestral primates.
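A minimal sketch of how a dominant movement rhythm like the 5 Hz figure above can be estimated from a digitized trajectory (e.g., lip aperture over X-ray frames); the FFT-peak approach and the variable names are assumptions, not the article's exact analysis:

```python
import numpy as np

def dominant_rhythm(trajectory, fs):
    """Dominant movement frequency (Hz) of an articulator trajectory
    sampled at fs Hz; a simple FFT-peak estimate."""
    x = trajectory - np.mean(trajectory)      # remove the DC component
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs[np.argmax(spec[1:]) + 1]     # skip the zero-frequency bin
```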

4.
Inferences about the evolution of human speech based on anatomical data must take into account its physiology, acoustics, and perception. Human speech is generated by the supralaryngeal vocal tract (SVT) acting as an acoustic filter on noise sources generated by turbulent airflow and on the quasi-periodic phonation generated by the activity of the larynx. The formant frequencies, which are major determinants of phonetic quality, are the frequencies at which relative energy maxima pass through the SVT filter. Neither the articulatory gestures of the tongue nor their acoustic consequences can be fractionated into oral and pharyngeal cavity components. Moreover, the acoustic cues that specify individual consonants and vowels are “encoded,” i.e., melded together. Formant frequency encoding makes human speech a vehicle for rapid vocal communication. Non-human primates lack the anatomy that enables modern humans to produce sounds that enhance this process, as well as the neural mechanisms necessary for the voluntary control of speech articulation. The specific claims of Duchin (1990) are discussed.

5.
Rhythmic sensory or electrical stimulation will produce rhythmic brain responses. These rhythmic responses are often interpreted as endogenous neural oscillations aligned (or “entrained”) to the stimulus rhythm. However, stimulus-aligned brain responses can also be explained as a sequence of evoked responses, which only appear regular due to the rhythmicity of the stimulus, without necessarily involving underlying neural oscillations. To distinguish evoked responses from true oscillatory activity, we tested whether rhythmic stimulation produces oscillatory responses which continue after the end of the stimulus. Such sustained effects provide evidence for true involvement of neural oscillations. In Experiment 1, we found that rhythmic intelligible, but not unintelligible speech produces oscillatory responses in magnetoencephalography (MEG) which outlast the stimulus at parietal sensors. In Experiment 2, we found that transcranial alternating current stimulation (tACS) leads to rhythmic fluctuations in speech perception outcomes after the end of electrical stimulation. We further report that the phase relation between electroencephalography (EEG) responses and rhythmic intelligible speech can predict the tACS phase that leads to most accurate speech perception. Together, we provide fundamental results for several lines of research—including neural entrainment and tACS—and reveal endogenous neural oscillations as a key underlying principle for speech perception.

Just as a child on a swing continues to move after the pushing stops, this study reveals similar entrained rhythmic echoes in brain activity after heard speech and after electrical brain stimulation; perturbation with tACS shows that these brain oscillations help listeners understand speech.

6.
A central challenge for articulatory speech synthesis is the simulation of realistic articulatory movements, which is critical for the generation of highly natural and intelligible speech. This includes modeling coarticulation, i.e., the context-dependent variation of the articulatory and acoustic realization of phonemes, especially of consonants. Here we propose a method to simulate the context-sensitive articulation of consonants in consonant-vowel syllables. The vocal tract target shape of a consonant in the context of a given vowel is derived as the weighted average of three measured and acoustically optimized reference vocal tract shapes for that consonant in the context of the corner vowels /a/, /i/, and /u/. The weights are determined by mapping the target shape of the given context vowel into the vowel subspace spanned by the corner vowels. The model was applied to the synthesis of consonant-vowel syllables with the consonants /b/, /d/, /g/, /l/, /r/, /m/, and /n/ in all combinations with the eight long German vowels. In a perception test, the mean recognition rate for the consonants in the isolated syllables was 82.4%, demonstrating the potential of the approach for highly intelligible articulatory speech synthesis.
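A compact sketch of the interpolation scheme just described, assuming vocal tract shapes are represented as parameter vectors; solving for barycentric weights by least squares is one plausible reading of "mapping the target shape into the vowel subspace," not necessarily the authors' exact procedure:

```python
import numpy as np

def corner_weights(v, a, i, u):
    """Weights (w_a, w_i, w_u), summing to 1, that best express vowel
    shape v as w_a*a + w_i*i + w_u*u (least squares in the subspace
    spanned by the corner vowel shapes)."""
    A = np.column_stack([a - u, i - u])   # eliminate w_u via the sum constraint
    w_ai, *_ = np.linalg.lstsq(A, v - u, rcond=None)
    return np.array([w_ai[0], w_ai[1], 1.0 - w_ai.sum()])

def consonant_target(vowel_shape, consonant_refs, corner_vowels):
    """Context-sensitive consonant target as the weighted average of the
    consonant's reference shapes measured in /a/, /i/, /u/ contexts."""
    w = corner_weights(vowel_shape, *corner_vowels)   # corner_vowels = (a, i, u)
    Ca, Ci, Cu = consonant_refs                       # shapes in /a/, /i/, /u/ context
    return w[0] * Ca + w[1] * Ci + w[2] * Cu
```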

7.
Songbirds are one of the few groups of animals that learn the sounds used for vocal communication during development. Like humans, songbirds memorize vocal sounds based on auditory experience with the vocalizations of adult “tutors,” and then use auditory feedback of self-produced vocalizations to gradually match their motor output to the memory of tutor sounds. In humans, investigations of early vocal learning have focused mainly on the perceptual skills of infants, whereas studies of songbirds have focused on measures of vocal production. To fully exploit songbirds as a model for human speech, understand the neural basis of learned vocal behavior, and investigate links between vocal perception and production, studies of songbirds must examine both behavioral measures of perception and neural measures of discrimination during development. Here we used behavioral and electrophysiological assays of the ability of songbirds to distinguish vocal calls of varying frequencies at different stages of vocal learning. The results show that neural tuning in auditory cortex mirrors behavioral improvements in the ability to make perceptual distinctions among vocal calls as birds are engaged in vocal learning. Thus, separate measures of neural discrimination and behavioral perception yielded highly similar trends during the course of vocal development. The timing of this improvement in the ability to distinguish vocal sounds coincides with the substantial refinement of axonal connectivity in cortico-basal ganglia pathways necessary for vocal learning shown in our previous work.

8.
Recent decades have provided evidence of auditory laterality in vertebrates, offering important new insights into the origin of human language. Factors such as the social value (e.g., specificity, familiarity) and emotional value of sounds have been shown to influence hemispheric specialization. However, little is known about the crossed effect of these two factors in animals, and human-animal comparative studies using the same methodology are rare. In our study, we adapted the head-turn paradigm, a widely used non-invasive method, for 8–9-year-old schoolgirls and adult female Campbell's monkeys, focusing on head and/or eye orientations in response to sound playbacks. We broadcast communicative signals (monkeys: calls; humans: speech) emitted by familiar individuals presenting distinct degrees of social value (female monkeys: conspecific group members vs. heterospecific neighbours; girls: from the same vs. a different classroom) and emotional value (monkeys: contact vs. threat calls; humans: friendly vs. aggressive intonation). We found a crossed-categorical effect of social and emotional values in both species, since only “negative” voices from same-class/group members elicited significant auditory laterality (Wilcoxon tests: monkeys, T = 0, p = 0.03; girls, T = 4.5, p = 0.03). Moreover, we found differences between species: a left-hemisphere preference in humans and a right-hemisphere preference in monkeys. Furthermore, while monkeys almost exclusively responded by turning their head, girls sometimes moved only their eyes. This study supports theories proposing differential roles for the two hemispheres in primate auditory laterality and shows that more systematic species comparisons are needed before proposing evolutionary scenarios. The choice of sound stimuli and behavioural measures in such studies therefore deserves careful attention.

9.
The capacity of nonhuman primates to actively modify the acoustic structure of existing sounds or vocalizations in their repertoire appears limited. Several studies have reported population or community differences in the acoustic structure of nonhuman primate long-distance calls and have suggested vocal learning as a mechanism explaining such variation. In addition, recent studies on great apes have indicated repertoire differences between populations: some populations have sounds in their repertoire that others lack. These differences have also been suggested to be the result of vocal learning. On yet another level, great apes can, after extensive human training, also learn some species-atypical vocalizations. Here we show a new aspect of great ape vocal learning by providing data that an orangutan has spontaneously (without any training) acquired a human whistle and can modulate the duration and number of whistles to copy a human model. This may indicate that the learning capacities of great apes in the auditory domain are more flexible than hitherto assumed.

10.
The evolutionary origins of the use of speech signals to refer to events or objects in the world have remained obscure. Although functionally referential calls have been described in some monkey species, studies with our closest living relatives, the great apes, have not generated comparable findings. These negative results have been taken to suggest that ape vocalizations are not the product of their otherwise sophisticated mentality and that ape gestural communication is more informative for theories of language evolution. We tested whether chimpanzee rough grunts, which are produced during feeding contexts, functioned as referential signals. Individuals produced acoustically distinct types of "rough grunts" when encountering different foods. In a naturalistic playback experiment, a focal subject was able to use the information conveyed by these calls produced by several group mates to guide his search for food, demonstrating that the different grunt types were meaningful to him. This study provides experimental evidence that our closest living relatives can produce and understand functionally referential calls as part of their natural communication. We suggest that these findings give support to the vocal rather than gestural theories of language evolution.

11.
It was recently shown that rhythmic entrainment, long considered a human-specific mechanism, can be demonstrated in a select group of bird species but, somewhat surprisingly, not in more closely related species such as nonhuman primates. This observation supports the vocal learning hypothesis, which suggests that rhythmic entrainment is a by-product of the vocal learning mechanisms shared by several bird and mammal species, including humans, but only weakly developed, or missing entirely, in nonhuman primates. To test this hypothesis we measured auditory event-related potentials (ERPs) in two rhesus monkeys (Macaca mulatta), probing a component well documented in humans, the mismatch negativity (MMN), to study rhythmic expectation. We demonstrate for the first time in rhesus monkeys that, in response to infrequent pitch deviants presented in a continuous sound stream using an oddball paradigm, a comparable ERP component can be detected, with negative deflections at early latencies (Experiment 1). We then tested whether rhesus monkeys can detect gaps (omissions at random positions in the sound stream; Experiment 2) and, using more complex stimuli, the beat (omissions at the first position of a musical unit, i.e., the ‘downbeat’; Experiment 3). In contrast to what has been shown in human adults and newborns (using identical stimuli and an identical experimental paradigm), the results suggest that rhesus monkeys cannot detect the beat in music. These findings support the hypothesis that beat induction (the cognitive mechanism that supports the perception of a regular pulse from a varying rhythm) is species-specific and absent in nonhuman primates. They also support the auditory timing dissociation hypothesis, with rhesus monkeys being sensitive to rhythmic grouping (detecting the start of a rhythmic group) but not to the induced beat (detecting a regularity from a varying rhythm).
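For concreteness, a sketch of the three stimulus manipulations described above (pitch oddball, random omissions, downbeat omissions); the event labels, probabilities, and bar patterns are illustrative, not the published stimulus parameters:

```python
import random

def oddball_sequence(n_events, p_deviant=0.1, seed=0):
    """Stream of 'standard'/'deviant' tones (pitch oddball, Experiment 1)."""
    rng = random.Random(seed)
    return ['deviant' if rng.random() < p_deviant else 'standard'
            for _ in range(n_events)]

def omission_sequence(pattern, n_bars, p_omit=0.1, omit_downbeat=False, seed=0):
    """Rhythmic stream built from a repeating bar pattern, with omissions
    either at random positions (Exp. 2) or on the downbeat (Exp. 3)."""
    rng = random.Random(seed)
    stream = []
    for _ in range(n_bars):
        bar = list(pattern)
        if rng.random() < p_omit:
            idx = 0 if omit_downbeat else rng.randrange(len(bar))
            bar[idx] = 'omitted'
        stream.extend(bar)
    return stream

# e.g. omission_sequence(['tone'] * 4, n_bars=50, omit_downbeat=True)
```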

12.
Pigs serve as models in human phoniatry. However, features of maturation and ageing have not been considered with regard to the so-called body-cover model in this species. Therefore, the glottis of “young” (2–3 months; n = 6) and “old” (4–7 years; n = 6) minipigs was investigated. Their cranial (CraF) and caudal (CauF) vocal folds were analysed histomorphometrically and stratigraphically, with emphasis on their amounts of collagen structures and elastic fibres. A dense subepithelial layer (SEL) was a distinct feature of the CraF and CauF of both age groups; it was spread over the underlying loose, flexible “cover” like a fibro-elastic membrane. The “cover” was characterised by the so-called superficial layer (SL), which was distinctly loose in the “young” minipigs but had a much denser texture in the “old” minipigs, where it was dominated by elastic fibres in the CraF but was of mixed composition (collagenous and elastic) in the CauF. The structural requirements for the SL's function as a loose “cover” were thus met only in the “young” animals. A clearly demarcated intermediate layer (IL), characterised by high amounts of elastic fibres (as in humans), was found only in the CraF of the “young” animals; in the “old” animals it had lost its demarcation. In the depth of the CraF of the “old” animals, many thick collagen fibre bundles were detected in a location equivalent to that of the vocal muscle in the CauF. The development of their large diameters was interpreted as part of the maturation process, supporting the hypothesis of their functional importance as a component of the “body.” In the CauF, the amounts of collagen structures increased throughout the entire lamina propria, resulting in a loss of demarcated stratigraphical subdivisions in the “old” minipigs. This situation resembles that described in the vocal folds of geriatric humans.

13.
Human beings are thought to be unique amongst the primates in their capacity to produce rapid changes in the shape of their vocal tracts during speech production. Acoustically, vocal tracts act as resonance chambers, whose geometry determines the position and bandwidth of the formants. Formants provide the acoustic basis for vowels, which enable speakers to refer to external events and to produce other kinds of meaningful communication. Formant-based referential communication is also present in non-human primates, most prominently in Diana monkey alarm calls. Previous work has suggested that the acoustic structure of these calls is the product of a non-uniform vocal tract capable of some degree of articulation. In this study we test this hypothesis by providing morphological measurements of the vocal tract of three adult Diana monkeys, using both radiography and dissection. We use these data to generate a vocal tract computational model capable of simulating the formant structures produced by wild individuals. The model performed best when it combined a non-uniform vocal tract consisting of three different tubes with a number of articulatory manoeuvres. We discuss the implications of these findings for evolutionary theories of human and non-human vocal production.
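As acoustic background for the tube modelling mentioned above, a sketch using the textbook uniform-tube approximation (closed at the glottis, open at the lips); the 10 cm length is an assumed, monkey-scale value, and the article's actual model uses three concatenated tubes rather than one:

```python
import numpy as np

C = 35000.0  # speed of sound in warm, moist air, cm/s

def uniform_tube_formants(length_cm, n_formants=4):
    """Formants of a uniform tube closed at one end, open at the other:
    quarter-wavelength resonances F_n = (2n - 1) * c / (4 * L)."""
    n = np.arange(1, n_formants + 1)
    return (2 * n - 1) * C / (4.0 * length_cm)

# A 10 cm uniform tract gives formants near 875, 2625, 4375, 6125 Hz;
# a non-uniform, three-tube geometry shifts these resonances, which is
# the kind of deviation the article's model exploits.
print(uniform_tube_formants(10.0))
```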

14.
Virtually every human faculty engages with imitation. One of the most natural yet unexplored objects for the study of mimetic elements in language is onomatopoeia, as it implies an imitation-driven transformation of a natural sound into a word. Notably, simple sounds are transformed into complex strings of vowels and consonants, making it difficult to identify what is acoustically preserved in the operation. In this work we propose a definition of vocal imitation by which sounds are transformed into the speech elements that minimize their spectral difference within the constraints of the vocal system. To test this definition, we use a computational model that allows anatomical features of the vocal system to be recovered from experimental sound data. We explore the vocal configurations that best reproduce non-speech sounds, such as blows struck on a door or the sharp sounds generated by pressing light switches or computer mouse buttons. From the anatomical point of view, the configurations obtained are readily associated with co-articulated consonants, and we show perceptual evidence that these consonants are positively associated with the original sounds. Moreover, the vowel-consonant pairs that compose these co-articulations correspond to the most stable syllables found in knock and click onomatopoeias across languages, suggesting a mechanism by which vocal imitation naturally embeds single sounds into more complex speech structures. Other mimetic forces, such as cross-modal associations between speech and visual categories, have received extensive attention from the scientific community. The present approach helps build a global view of the mimetic forces acting on language and opens a new avenue for the quantitative study of word formation in terms of vocal imitation.
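A schematic rendering of the definition proposed above, where `synthesize` stands in for an articulatory synthesizer mapping a vocal-tract configuration to a magnitude spectrum; the log-spectral Euclidean distance and the exhaustive search over candidate configurations are assumptions for illustration:

```python
import numpy as np

def spectral_distance(target_spec, candidate_spec):
    """Euclidean distance between log-magnitude spectra of equal length
    (an assumed metric; the article does not specify its exact form)."""
    return np.linalg.norm(np.log(target_spec + 1e-12)
                          - np.log(candidate_spec + 1e-12))

def best_imitation(target_spec, configurations, synthesize):
    """Vocal-tract configuration whose synthesized spectrum is closest
    to the target sound, i.e. the speech element that minimizes the
    spectral difference within the constraints of the vocal system."""
    return min(configurations,
               key=lambda cfg: spectral_distance(target_spec, synthesize(cfg)))
```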

15.
Compared to humans, non-human primates have very little control over their vocal production. Nonetheless, some primates produce various call combinations, which may partially offset their lack of acoustic flexibility. A relevant example is male Campbell's monkeys (Cercopithecus campbelli), which give one call type (‘Krak’) to leopards, while the suffixed version of the same call stem (‘Krak-oo’) is given to unspecific danger. To test whether recipients attend to this suffixation pattern, we carried out a playback experiment in which we broadcast naturally and artificially modified suffixed and unsuffixed ‘Krak’ calls of male Campbell's monkeys to 42 wild groups of Diana monkeys (Cercopithecus diana diana). The two species form mixed-species groups and respond to each other's vocalizations. We analysed the vocal responses of male and female Diana monkeys and overall found significantly stronger vocal responses to unsuffixed (leopard) than to suffixed (unspecific danger) calls. Although the acoustic structure of the ‘Krak’ stem of the calls has some additional effects, subject responses were mainly determined by the presence or absence of the suffix. This study indicates that suffixation is an evolved function in primate communication in contexts where adaptive responses are particularly important.

16.
Many nonhuman primates produce food-associated vocalizations upon encountering or ingesting particular foods. Among the great apes, only the food-associated vocalizations of chimpanzees (Pan troglodytes) and bonobos (Pan paniscus) have been studied in detail, providing evidence that these vocalizations can be produced flexibly in relation to a variety of factors, such as the quantity and quality of food and/or the type of audience. Only anecdotal evidence exists of eastern (Gorilla beringei) and western gorillas (Gorilla gorilla) producing food-associated vocalizations, termed singing or humming. To enable a better understanding of the context in which these calls are produced, we investigated and compared the vocal behavior of two free-ranging groups of western lowland gorillas (Gorilla g. gorilla) at Mondika, Republic of Congo. Our results show that (a) food-associated call production occurs only during feeding and not in other contexts; (b) calling is not uniformly distributed across age and sex classes; (c) calls are produced only while feeding on specific foods; and (d) normally just one individual calls during group feeding sessions, although certain food types elicit simultaneous calling by two or more individuals. Our findings provide new insight into the vocal abilities of gorillas and carry larger implications for questions concerning vocal variability among the great apes. Food-associated calls of nonhuman primates have been shown to be flexible in terms of when they are used and at whom they are directed, making them interesting vocalizations from the viewpoint of language evolution. Food-associated vocalizations in great apes offer new opportunities to investigate the phylogenetic development of vocal communication within the primate lineage and may contribute novel insights into the origins of human language.

17.
Theories of music evolution agree that human music has an affective influence on listeners. Tests of non-humans have provided little evidence of preferences for human music. However, prosodic features of speech (‘motherese’) influence the affective behaviour of non-verbal infants as well as domestic animals, suggesting that features of music can influence the behaviour of non-human species. We incorporated acoustic characteristics of tamarin affiliation vocalizations and tamarin threat vocalizations into corresponding pieces of music, and compared music composed for tamarins with music composed for humans. Tamarins were generally indifferent to playbacks of human music, but responded with increased arousal to music based on tamarin threat vocalizations, and with decreased activity and increased calm behaviour to music based on tamarin affiliation vocalizations. Affective components in human music may have evolutionary origins in the structure of the calls of non-human animals. In addition, animal signals may have evolved to manage the behaviour of listeners by influencing their affective state.

18.
Phylogenomic analysis of the occurrence and abundance of protein domains in proteomes has recently shown that the α/β architecture is probably the oldest fold design, a finding with important implications for the origins of biochemistry. Here we explore structure-function relationships addressing the use of chemical mechanisms by ancestral enzymes, testing the hypothesis that the oldest folds used the most mechanisms. We start by tracing biocatalytic mechanisms operating in metabolic enzymes along a phylogenetic timeline of the first appearance of homologous superfamilies of protein domain structures from CATH. A total of 335 enzyme reactions were retrieved from MACiE and mapped over fold age. We define a mechanistic step type as one of the 51 mechanistic annotations given in MACiE, and each step of each of the 335 mechanisms was described using one or more of these annotations. We find that the first two folds, the P-loop-containing nucleotide triphosphate hydrolase and the NAD(P)-binding Rossmann-like homologous superfamilies, were α/β architectures responsible for introducing 35% (18/51) of the known mechanistic step types. These two oldest structures in the phylogenomic analysis of protein domains introduced many mechanistic step types that were later spread combinatorially through catalytic history. The most common mechanistic step types included fundamental building blocks of enzyme chemistry: “Proton transfer,” “Bimolecular nucleophilic addition,” “Bimolecular nucleophilic substitution,” and “Unimolecular elimination by the conjugate base.” They were associated with the most ancestral fold structure, typical of P-loop-containing nucleotide triphosphate hydrolases. Over half of the mechanistic step types were introduced in the evolutionary timeline before the appearance of structures specific to diversified organisms, during a period of architectural diversification. The other half unfolded gradually after organismal diversification, during a period spanning ∼2 billion years of evolutionary history.
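A sketch of the fold-age mapping described above, under a hypothetical data layout (MACiE mechanisms keyed by enzyme, CATH fold ages given as ranks with 0 = oldest); the real analysis rests on curated phylogenomic data rather than this toy structure:

```python
from collections import defaultdict

def first_appearance(mechanisms, fold_age):
    """Earliest fold-age rank (0 = oldest) at which each mechanistic
    step type appears. mechanisms: {enzyme: (fold, [step_types])};
    fold_age: {fold: age_rank}."""
    earliest = defaultdict(lambda: float('inf'))
    for fold, steps in mechanisms.values():
        for step in steps:
            earliest[step] = min(earliest[step], fold_age[fold])
    return dict(earliest)
```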

19.
Previous studies have shown that bimanual coordination learning is more resistant to the removal of augmented feedback when acquired through the auditory rather than the visual channel. However, it is unclear whether this differential “guidance effect” between feedback modalities is due to enhanced sensorimotor integration via the non-dominant auditory channel or to a strengthened linkage to kinesthetic information under rhythmic input. The current study examined how the modality (visual vs. auditory) and information type (continuous visuospatial vs. discrete rhythmic) of concurrent augmented feedback influence bimanual coordination learning. Participants learned a 90°-out-of-phase pattern over three consecutive days, either with Lissajous feedback indicating the integrated position of both arms or with visual or auditory rhythmic feedback reflecting the relative timing of the movement. After practice, performance change upon feedback removal differed between the Lissajous group and the two rhythmic groups, indicating that the guidance effect may be modulated by the type of information provided during practice. Moreover, significant performance improvement in the dual-task condition, in which an irregular rhythm-counting task was applied as a secondary task, suggested that lower involvement of conscious control may result in better bimanual coordination performance.

20.
This paper reviews progress in understanding the psychology of lipreading and audio-visual speech perception. It considers four questions. What distinguishes better from poorer lipreaders? What are the effects of introducing a delay between the acoustical and optical speech signals? What have attempts to produce computer animations of talking faces contributed to our understanding of the visual cues that distinguish consonants and vowels? Finally, how should the process of audio-visual integration in speech perception be described; that is, how are the sights and sounds of talking faces represented at their conflux?
