Similar articles
20 similar articles found (search time: 15 ms)
1.
The processing of audio-visual speech: empirical and neural bases
In this selective review, I outline a number of ways in which seeing the talker affects auditory perception of speech, including, but not confined to, the McGurk effect. To date, studies suggest that all linguistic levels are susceptible to visual influence, and that two main modes of processing can be described: a complementary mode, whereby vision provides information more efficiently than hearing for some under-specified parts of the speech stream, and a correlated mode, whereby vision partially duplicates information about dynamic articulatory patterning. Cortical correlates of seen speech suggest that at the neurological as well as the perceptual level, auditory processing of speech is affected by vision, so that 'auditory speech regions' are activated by seen speech. The processing of natural speech, whether it is heard, seen or heard and seen, activates the perisylvian language regions (left>right). It is highly probable that activation occurs in a specific order. First, superior temporal, then inferior parietal and finally inferior frontal regions (left>right) are activated. There is some differentiation of the visual input stream to the core perisylvian language system, suggesting that complementary seen speech information makes special use of the visual ventral processing stream, while for correlated visual speech, the dorsal processing stream, which is sensitive to visual movement, may be relatively more involved.

2.
Most infants who are later diagnosed with autism show delayed speech and language and/or an atypical language profile. There is a large body of research on abnormal speech and language in children with autism. However, auditory development has been relatively under-investigated in autism research, despite its inextricable relationship with language development and despite researchers' ability to detect abnormalities in brain development and behavior in early infancy. In this review, we synthesize research on auditory processing in the prenatal period through infancy and childhood in typically developing children, children at high risk for autism, and children diagnosed with autism. We conclude that there are clear neurobiological and behavioral links between abnormal auditory development and the deficits in social communication seen in autism. We then offer perspectives on the need for a systematic characterization of early auditory development in autism, and identify questions to be addressed in future research on the development of autism.

3.
Among topics related to the evolution of language, the evolution of speech is particularly fascinating. Early theorists believed that it was the ability to produce articulate speech that set the stage for the evolution of the 'special' speech processing abilities that exist in modern-day humans. Prior to the evolution of speech production, speech processing abilities were presumed not to exist. The data reviewed here support a different view. Two lines of evidence, one from young human infants and the other from infrahuman species, neither of whom can produce articulate speech, show that in the absence of speech production capabilities, the perception of speech sounds is robust and sophisticated. Human infants and non-human animals evidence auditory perceptual categories that conform to those defined by the phonetic categories of language. These findings suggest the possibility that in evolutionary history the ability to perceive rudimentary speech categories preceded the ability to produce articulate speech. This in turn suggests that it may be audition that structured, at least initially, the formation of phonetic categories.

4.
This paper reviews the basic aspects of auditory processing that play a role in the perception of speech. The frequency selectivity of the auditory system, as measured using masking experiments, is described and used to derive the internal representation of the spectrum (the excitation pattern) of speech sounds. The perception of timbre and distinctions in quality between vowels are related to both static and dynamic aspects of the spectra of sounds. The perception of pitch and its role in speech perception are described. Measures of the temporal resolution of the auditory system are described and a model of temporal resolution based on a sliding temporal integrator is outlined. The combined effects of frequency and temporal resolution can be modelled by calculation of the spectro-temporal excitation pattern, which gives good insight into the internal representation of speech sounds. For speech presented in quiet, the resolution of the auditory system in frequency and time usually markedly exceeds the resolution necessary for the identification or discrimination of speech sounds, which partly accounts for the robust nature of speech perception. However, for people with impaired hearing, speech perception is often much less robust.
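The sliding temporal integrator mentioned above can be caricatured as a first-order leaky integrator applied to the signal's short-term intensity. The sketch below is an illustration only: the 8 ms time constant, sampling rate, and gap duration are assumptions chosen for demonstration, not values from the paper.

```python
import numpy as np

def sliding_temporal_integrator(intensity, sr, tau=0.008):
    """First-order leaky integrator as a stand-in for the sliding temporal
    window. `tau` (seconds) sets the effective integration time; the 8 ms
    default is an illustrative assumption."""
    out = np.empty(len(intensity))
    alpha = 1.0 / (tau * sr)          # per-sample update weight
    acc = 0.0
    for i, x in enumerate(intensity):
        acc += alpha * (x - acc)      # smooth toward the current intensity
        out[i] = acc
    return out

# A 1 ms silent gap in a steady sound is largely "filled in" by the
# integrator, illustrating limited temporal resolution.
sr = 16000
intensity = np.ones(sr // 10)         # 100 ms of steady intensity
intensity[800:816] = 0.0              # 1 ms gap
smoothed = sliding_temporal_integrator(intensity, sr)
print(smoothed[808] > 0.5)            # gap barely dents the smoothed output
```

Shortening `tau` makes the gap reappear in the output, mirroring the trade-off between temporal resolution and smoothing that such models capture.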

5.
Fitch WT. Current Biology: CB 2011, 21(14): R543-R546
A language-trained chimpanzee is able to interpret synthetic 'auditory caricatures' as speech. Important components of human speech perception thus rely upon general auditory mechanisms that predated the evolution of spoken language.

6.
BACKGROUND: Difficulty understanding speech in the presence of background noise or competing auditory signals is typical of central auditory processing disorders. These disorders may be diagnosed in Alzheimer's disease as a result of degeneration in the central auditory system; perception and processing of speech may be affected as well. MATERIAL AND METHODS: A MEDLINE search was conducted to answer two questions: whether a central auditory processing disorder is involved in Alzheimer's disease, and what connection, if any, exists between central auditory processing disorders and speech deterioration. The search for articles relating Alzheimer's disease to central auditory processing disorders returned 34 results; twelve papers that tested for CAPD through psychoacoustic investigation were studied. An additional search using the keywords 'speech production' and 'AD' produced 33 articles, of which 14 are discussed thoroughly in this review because they reference CAPD; the rest contain no relevant information on the central auditory system. RESULTS: Psychoacoustic tests reveal significantly lower scores in patients with Alzheimer's disease than in normal subjects. Tests of sound localization, tone perception, phoneme discrimination and tonal memory reveal deficits in Alzheimer's disease. Central auditory processing disorders may exist several years before the clinical diagnosis of Alzheimer's disease. Segmental characteristics of speech are normal; deficits concern the supra-segmental components of speech. CONCLUSIONS: Central auditory processing disorders have been found in many cases when patients with Alzheimer's disease are tested. They may present as an early manifestation of Alzheimer's disease, preceding the diagnosis by a minimum of 5 and a maximum of 10 years. During these years, changes in the central auditory system, beginning in the temporal lobe, may produce deficits in speech processing and production, as hearing and speech are closely connected human functions. An alternative theory is that spreading degeneration of the central nervous system causes the speech deterioration. Further research, including central auditory processing disorder testing in the elderly population, is needed to validate one theory over the other.

7.
The multiple-channel cochlear implant is the first sensori-neural prosthesis to effectively and safely bring electronic technology into a direct physiological relation with the central nervous system and human consciousness, and to give speech perception to severely-profoundly deaf people and spoken language to children. Research showed that the place and temporal coding of sound frequencies could be partly replicated by multiple-channel stimulation of the auditory nerve. This required safety studies on how to prevent damage to the cochlea from trauma, electrical stimuli, biomaterials and middle-ear infection. The mechanical properties of an array and mode of stimulation for the place coding of speech frequencies were determined. A fully implantable receiver-stimulator was developed, as well as the procedures for the clinical assessment of deaf people, and the surgical placement of the device. The perception of electrically coded sounds was determined, and a speech processing strategy discovered that enabled late-deafened adults to comprehend running speech. The brain processing systems for patterns of electrical stimuli reproducing speech were elucidated. The research was developed industrially, and improvements in speech processing made through presenting additional speech frequencies by place coding. Finally, the importance of the multiple-channel cochlear implant for early deafened children was established.

8.
Rhythm is important in the production of motor sequences such as speech and song. Deficits in rhythm processing have been implicated in human disorders that affect speech and language processing, including stuttering, autism, and dyslexia. Songbirds provide a tractable model for studying the neural underpinnings of rhythm processing due to parallels with humans in neural structures and vocal learning patterns. In this study, adult zebra finches were exposed to naturally rhythmic conspecific song or arrhythmic song. Immunohistochemistry for the immediate early gene ZENK was used to detect neural activation in response to these two types of stimuli. ZENK was increased in response to arrhythmic song in the auditory association cortex homologs, caudomedial nidopallium (NCM) and caudomedial mesopallium (CMM), and the avian amygdala, nucleus taeniae (Tn). CMM also had greater ZENK labeling in females than males. The increased neural activity in NCM and CMM during perception of arrhythmic stimuli parallels increased activity in the human auditory cortex following exposure to unexpected, or perturbed, auditory stimuli. These auditory areas may be detecting errors in arrhythmic song when comparing it to a stored template of how conspecific song is expected to sound. CMM may also be important for females in evaluating songs of potential mates. In the context of other research in songbirds, we suggest that the increased activity in Tn may be related to the value of song for assessing mate choice and bonding or it may be related to perception of arrhythmic song as aversive.

9.
Pitch perception is important for understanding speech prosody, music perception, recognizing tones in tonal languages, and perceiving speech in noisy environments. The two principal pitch perception theories consider the place of maximum neural excitation along the auditory nerve and the temporal pattern of the auditory neurons' action potentials (spikes) as pitch cues. This paper describes a biophysical mechanism by which fine-structure temporal information can be extracted from the spikes generated at the auditory periphery. Deriving meaningful pitch-related information from spike times requires neural structures specialized in capturing synchronous or correlated activity from amongst neural events. The emergence of such pitch-processing neural mechanisms is described through a computational model of auditory processing. Simulation results show that a correlation-based, unsupervised, spike-based form of Hebbian learning can explain the development of neural structures required for recognizing the pitch of simple and complex tones, with or without the fundamental frequency. The temporal code is robust to variations in the spectral shape of the signal and thus can explain the phenomenon of pitch constancy.
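As a toy illustration of the correlation-based, spike-based Hebbian idea, the sketch below strengthens the "synapse" of whichever delay line repeatedly detects coincidences between a delayed spike and a later incoming spike, so the winning delay comes to encode the period (and hence the pitch) of a phase-locked spike train. All names, parameter values, and the learning rule itself are illustrative assumptions, not the model from the paper.

```python
import numpy as np

def hebbian_delay_learning(spike_times, delays, lr=0.1, window=0.0005):
    """Grow the weight of a delay line whenever a spike delayed by `d`
    coincides (within `window` seconds) with another incoming spike --
    a minimal spike-based Hebbian coincidence rule."""
    spikes = np.asarray(spike_times)
    weights = np.zeros(len(delays))
    for i, d in enumerate(delays):
        for t in spikes:
            if np.any(np.abs(spikes - (t + d)) < window):
                weights[i] += lr * (1.0 - weights[i])  # bounded growth
    return weights

# Spike train phase-locked to a 200 Hz tone (5 ms inter-spike interval).
spike_times = np.arange(0.0, 0.1, 0.005)
delays = np.array([0.002, 0.004, 0.005, 0.008, 0.010])
w = hebbian_delay_learning(spike_times, delays)
best = delays[np.argmax(w)]
print(round(1.0 / best))  # the winning delay matches the 5 ms period
```

The 5 ms delay line wins because it is validated by nearly every spike pair, which is the sense in which unsupervised coincidence learning can pick out period (pitch) structure from spike timing alone.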

10.
This paper presents a new and effective pitch extraction method based on an auditory model. It simulates the pitch-perception function of the human auditory system: building on cross-channel summary autocorrelation processing, it adds a simulation of the temporal continuity of neural perception. Because information distributed across space and time is integrated and accumulated, the proposed method can not only extract the pitch of speech signals buried in various kinds of noise, but also determine whether the processed signal consists of overlapping speech signals and, if so, further extract the independent pitch of each overlapping component. Preliminary experimental results demonstrate the effectiveness of the proposed model.
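A bare-bones sketch of the cross-channel summary-autocorrelation step (without the temporal-continuity extension or the multi-speaker separation described above) might look like the following. The FFT-mask "filterbank", band edges, and pitch search range are all simplifying assumptions standing in for a proper cochlear model.

```python
import numpy as np

def summary_autocorrelation_pitch(signal, sr, n_channels=8,
                                  fmin=120.0, fmax=400.0):
    """Estimate pitch by summing per-channel autocorrelations across a crude
    FFT-mask 'filterbank' and picking the best common lag in the assumed
    fmin..fmax pitch range."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1.0 / sr)
    edges = np.logspace(np.log10(80.0), np.log10(sr / 2.0), n_channels + 1)
    summary = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = np.where((freqs >= lo) & (freqs < hi), spectrum, 0)
        x = np.fft.irfft(band, n=len(signal))
        # Autocorrelation via the Wiener-Khinchin theorem.
        summary += np.fft.irfft(np.abs(np.fft.rfft(x)) ** 2)
    lo_lag, hi_lag = int(sr / fmax), int(sr / fmin)
    best_lag = lo_lag + np.argmax(summary[lo_lag:hi_lag])
    return sr / best_lag

# A 200 Hz complex tone (fundamental plus one harmonic).
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)
f0 = summary_autocorrelation_pitch(tone, sr)
print(f0)
```

Because the per-channel autocorrelations all peak at the common fundamental period, summing them across channels reinforces that lag while channel-specific structure averages out, which is the core idea the abstract's accumulation step exploits.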

11.
In this paper, we describe domain-general auditory processes that we believe are prerequisite to the linguistic analysis of speech. We discuss biological evidence for these processes and how they might relate to processes that are specific to human speech and language. We begin with a brief review of (i) the anatomy of the auditory system and (ii) the essential properties of speech sounds. Section 4 describes the general auditory mechanisms that we believe are applied to all communication sounds, and how functional neuroimaging is being used to map the brain networks associated with domain-general auditory processing. Section 5 discusses recent neuroimaging studies that explore where such general processes give way to those that are specific to human speech and language.

12.
There have been recent developments in our understanding of the auditory neuroscience of non-human primates that, to a certain extent, can be integrated with findings from human functional neuroimaging studies. This framework can be used to consider the cortical basis of complex sound processing in humans, including implications for speech perception, spatial auditory processing and auditory scene segregation.

13.
Zhang L, Xi J, Xu G, Shu H, Wang X, Li P. PLoS ONE 2011, 6(6): e20963
In speech perception, a functional hierarchy has been proposed by recent functional neuroimaging studies: core auditory areas on the dorsal plane of superior temporal gyrus (STG) are sensitive to basic acoustic characteristics, whereas downstream regions, specifically the left superior temporal sulcus (STS) and middle temporal gyrus (MTG) ventral to Heschl's gyrus (HG), are responsive to abstract phonological features. What is unclear so far is the relationship between the dorsal and ventral processes, especially with regard to whether low-level acoustic processing is modulated by high-level phonological processing. To address the issue, we assessed sensitivity of core auditory and downstream regions to acoustic and phonological variations by using within- and across-category lexical tonal continua with equal physical intervals. We found that relative to within-category variation, across-category variation elicited stronger activation in the left middle MTG (mMTG), apparently reflecting the abstract phonological representations. At the same time, activation in the core auditory region decreased, resulting from the top-down influences of phonological processing. These results support a hierarchical organization of the ventral acoustic-phonological processing stream, which originates in the right HG/STG and projects to the left mMTG. Furthermore, our study provides direct evidence that low-level acoustic analysis is modulated by high-level phonological representations, revealing the cortical dynamics of acoustic and phonological processing in speech perception. Our findings confirm the existence of reciprocal projections in the auditory pathways and the roles of both feed-forward and feedback mechanisms in speech perception.

14.
The motor theory of speech perception holds that we perceive the speech of another in terms of a motor representation of that speech. However, when we have learned to recognize a foreign accent, it seems plausible that recognition of a word rarely involves reconstruction of the speech gestures of the speaker rather than the listener. To better assess the motor theory and this observation, we proceed in three stages. Part 1 places the motor theory of speech perception in a larger framework based on our earlier models of the adaptive formation of mirror neurons for grasping, and for viewing extensions of that mirror system as part of a larger system for neuro-linguistic processing, augmented by the present consideration of recognizing speech in a novel accent. Part 2 then offers a novel computational model of how a listener comes to understand the speech of someone speaking the listener's native language with a foreign accent. The core tenet of the model is that the listener uses hypotheses about the word the speaker is currently uttering to update probabilities linking the sound produced by the speaker to phonemes in the native language repertoire of the listener. This, on average, improves the recognition of later words. This model is neutral regarding the nature of the representations it uses (motor vs. auditory). It serves as a reference point for the discussion in Part 3, which proposes a dual-stream neuro-linguistic architecture, revisits claims for and against the motor theory of speech perception and the relevance of mirror neurons, and extracts some implications for the reframing of the motor theory.
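The core update rule described above can be sketched as a simple count-based conditional probability table: whenever a word hypothesis is accepted, the counts linking the speaker's accented acoustic tokens to the native phonemes of that word are incremented. The class name, one-to-one token/phoneme alignment, and toy "accent" are all hypothetical simplifications, not the paper's actual model.

```python
from collections import defaultdict

class AccentAdapter:
    """Toy version of the update rule: counts linking an accented acoustic
    token to native phonemes are bumped whenever a word hypothesis is
    accepted, so P(phoneme | token) sharpens with exposure."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, tokens, hypothesized_phonemes):
        # Align token i with phoneme i (a gross simplification of
        # the alignment a real model would have to perform).
        for tok, ph in zip(tokens, hypothesized_phonemes):
            self.counts[tok][ph] += 1

    def prob(self, token, phoneme):
        total = sum(self.counts[token].values())
        return self.counts[token][phoneme] / total if total else 0.0

adapter = AccentAdapter()
# A speaker whose accented token "z" usually realises the native /s/:
for _ in range(8):
    adapter.observe(["z", "i"], ["s", "i"])   # e.g. hypothesised word "see"
adapter.observe(["z", "u"], ["z", "u"])       # an occasional genuine /z/
print(adapter.prob("z", "s"))                 # high after adaptation
```

After a few accepted hypotheses, the table maps the accented "z" mostly to /s/, which is the sense in which word-level hypotheses improve later sound-level recognition on average.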

15.
Prelingually deafened children with cochlear implants stand a good chance of developing satisfactory speech performance. Nevertheless, their eventual language performance is highly variable and not fully explainable by the duration of deafness and hearing experience. In this study, two groups of cochlear implant users (CI groups) with very good basic hearing abilities but non-overlapping speech performance (very good or very bad speech performance) were matched according to hearing age and age at implantation. We assessed whether these CI groups differed with regard to their phoneme discrimination ability and auditory sensory memory capacity, as suggested by earlier studies. These functions were measured behaviorally and with the Mismatch Negativity (MMN). Phoneme discrimination ability was comparable in the CI group of good performers and matched healthy controls, which were both better than the bad performers. Source analyses revealed larger MMN activity (155–225 ms) in good than in bad performers, which was generated in the frontal cortex and positively correlated with measures of working memory. For the bad performers, this was followed by an increased activation of left temporal regions from 225 to 250 ms with a focus on the auditory cortex. These results indicate that the two CI groups developed different auditory speech processing strategies and stress the role of phonological functions of auditory sensory memory and the prefrontal cortex in positively developing speech perception and production.

16.
Research on the neural basis of speech-reading implicates a network of auditory language regions involving inferior frontal cortex, premotor cortex and sites along superior temporal cortex. In audiovisual speech studies, neural activity is consistently reported in posterior superior temporal sulcus (pSTS) and this site has been implicated in multimodal integration. Traditionally, multisensory interactions are considered high-level processing that engages heteromodal association cortices (such as STS). Recent work, however, challenges this notion and suggests that multisensory interactions may occur in low-level unimodal sensory cortices. While previous audiovisual speech studies demonstrate that high-level multisensory interactions occur in pSTS, what remains unclear is how early in the processing hierarchy these multisensory interactions may occur. The goal of the present fMRI experiment is to investigate how visual speech can influence activity in auditory cortex above and beyond its response to auditory speech. In an audiovisual speech experiment, subjects were presented with auditory speech with and without congruent visual input. Holding the auditory stimulus constant across the experiment, we investigated how the addition of visual speech influences activity in auditory cortex. We demonstrate that congruent visual speech increases the activity in auditory cortex.

17.
Liu F, Jiang C, Thompson WF, Xu Y, Yang Y, Stewart L. PLoS ONE 2012, 7(2): e30374
Congenital amusia is a neuro-developmental disorder of pitch perception that causes severe problems with music processing but only subtle difficulties in speech processing. This study investigated speech processing in a group of Mandarin speakers with congenital amusia. Thirteen Mandarin amusics and thirteen matched controls participated in a set of tone and intonation perception tasks and two pitch threshold tasks. Compared with controls, amusics showed impaired performance on word discrimination in natural speech and their gliding tone analogs. They also performed worse than controls on discriminating gliding tone sequences derived from statements and questions, and showed elevated thresholds for pitch change detection and pitch direction discrimination. However, they performed as well as controls on word identification, and on statement-question identification and discrimination in natural speech. Overall, tasks that involved multiple acoustic cues to communicative meaning were not impacted by amusia. Only when the tasks relied mainly on pitch sensitivity did amusics show impaired performance compared to controls. These findings help explain why amusia only affects speech processing in subtle ways. Further studies on a larger sample of Mandarin amusics and on amusics of other language backgrounds are needed to consolidate these results.

18.
Auditory information is processed in a fine-to-crude hierarchical scheme, from low-level acoustic information to high-level abstract representations, such as phonological labels. We now ask whether fine acoustic information, which is not retained at high levels, can still be used to extract speech from noise. Previous theories suggested either full availability of low-level information or availability that is limited by task difficulty. We propose a third alternative, based on the Reverse Hierarchy Theory (RHT), originally derived to describe the relations between the processing hierarchy and visual perception. RHT asserts that only the higher levels of the hierarchy are immediately available for perception. Direct access to low-level information requires specific conditions, and can be achieved only at the cost of concurrent comprehension. We tested the predictions of these three views in a series of experiments in which we measured the benefits from utilizing low-level binaural information for speech perception, and compared it to that predicted from a model of the early auditory system. Only auditory RHT could account for the full pattern of the results, suggesting that similar defaults and tradeoffs underlie the relations between hierarchical processing and perception in the visual and auditory modalities.

20.
Infants' speech perception skills show a dual change towards the end of the first year of life. Not only does non-native speech perception decline, as often shown, but native language speech perception skills show improvement, reflecting a facilitative effect of experience with native language. The mechanism underlying change at this point in development, and the relationship between the change in native and non-native speech perception, is of theoretical interest. As shown in new data presented here, at the cusp of this developmental change, infants' native and non-native phonetic perception skills predict later language ability, but in opposite directions. Better native language skill at 7.5 months of age predicts faster language advancement, whereas better non-native language skill predicts slower advancement. We suggest that native language phonetic performance is indicative of neural commitment to the native language, while non-native phonetic performance reveals uncommitted neural circuitry. This paper has three goals: (i) to review existing models of phonetic perception development, (ii) to present new event-related potential data showing that native and non-native phonetic perception at 7.5 months of age predicts language growth over the next 2 years, and (iii) to describe a revised version of our previous model, the native language magnet model, expanded (NLM-e). NLM-e incorporates five new principles. Specific testable predictions for future research programmes are described.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号