Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Zhang L, Xi J, Xu G, Shu H, Wang X, Li P. PLoS ONE 2011, 6(6): e20963
In speech perception, a functional hierarchy has been proposed by recent functional neuroimaging studies: core auditory areas on the dorsal plane of superior temporal gyrus (STG) are sensitive to basic acoustic characteristics, whereas downstream regions, specifically the left superior temporal sulcus (STS) and middle temporal gyrus (MTG) ventral to Heschl's gyrus (HG), are responsive to abstract phonological features. What remains unclear is the relationship between the dorsal and ventral processes, especially whether low-level acoustic processing is modulated by high-level phonological processing. To address the issue, we assessed the sensitivity of core auditory and downstream regions to acoustic and phonological variations by using within- and across-category lexical tonal continua with equal physical intervals. We found that relative to within-category variation, across-category variation elicited stronger activation in the left middle MTG (mMTG), apparently reflecting the abstract phonological representations. At the same time, activation in the core auditory region decreased, resulting from the top-down influence of phonological processing. These results support a hierarchical organization of the ventral acoustic-phonological processing stream, which originates in the right HG/STG and projects to the left mMTG. Furthermore, our study provides direct evidence that low-level acoustic analysis is modulated by high-level phonological representations, revealing the cortical dynamics of acoustic and phonological processing in speech perception. Our findings confirm the existence of reciprocal projections in the auditory pathways and the roles of both feed-forward and feedback mechanisms in speech perception.
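Tonal continua of the kind described above are typically built by interpolating F0 contours in equal physical steps between two lexical-tone endpoints. The sketch below illustrates the idea; the endpoint contours (a level and a rising Mandarin-like tone) and the step count are illustrative assumptions, not the study's stimulus parameters.

```python
# Sketch: constructing a lexical-tone continuum with equal physical intervals
# by linearly interpolating between two F0 contours (e.g., a level Tone 1 and
# a rising Tone 2). Endpoint contours and step count are assumptions.
import numpy as np

n_points, n_steps = 100, 7                 # samples per contour; continuum size
time = np.linspace(0, 1, n_points)
tone1 = np.full(n_points, 220.0)           # level F0 contour (Hz)
tone2 = 180.0 + 60.0 * time                # rising F0 contour (Hz)

continuum = [tone1 + k / (n_steps - 1) * (tone2 - tone1) for k in range(n_steps)]
# Adjacent continuum members differ by the same F0 increment everywhere, so
# physical spacing is equal while the category boundary falls mid-continuum.
for k, contour in enumerate(continuum):
    print(f"step {k}: onset {contour[0]:.1f} Hz, offset {contour[-1]:.1f} Hz")
```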

2.
Speech perception at the interface of neurobiology and linguistics
Speech perception consists of a set of computations that take continuously varying acoustic waveforms as input and generate discrete representations that make contact with the lexical representations stored in long-term memory as output. Because the perceptual objects recognized by speech perception enter into subsequent linguistic computation, the format used for lexical representation and processing fundamentally constrains the speech perceptual processes. Consequently, theories of speech perception must, at some level, be tightly linked to theories of lexical representation. Minimally, speech perception must yield representations that smoothly and rapidly interface with stored lexical items. Adopting the perspective of Marr, we argue and provide neurobiological and psychophysical evidence for the following research programme. First, at the implementational level, speech perception is a multi-time-resolution process, with perceptual analyses occurring concurrently on at least two time scales (approx. 20-80 ms and approx. 150-300 ms), commensurate with (sub)segmental and syllabic analyses, respectively. Second, at the algorithmic level, we suggest that perception proceeds on the basis of internal forward models, or uses an 'analysis-by-synthesis' approach. Third, at the computational level (in the sense of Marr), the theory of lexical representation that we adopt is principally informed by phonological research and assumes that words are represented in the mental lexicon in terms of sequences of discrete segments composed of distinctive features. One important goal of the research programme is to develop linking hypotheses between putative neurobiological primitives (e.g. temporal primitives) and those primitives derived from linguistic inquiry, to arrive ultimately at a biologically sensible and theoretically satisfying model of representation and computation in speech.
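A minimal sketch of the multi-time-resolution idea above: the same waveform is analyzed concurrently with a short (~25 ms, segmental-scale) and a long (~200 ms, syllabic-scale) window. The window lengths, hop sizes, and the RMS-energy feature are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of multi-time-resolution analysis (not the authors'
# model): the same signal is windowed concurrently at a ~25 ms (segmental)
# and a ~200 ms (syllabic) scale, computing a simple feature on each scale.
import numpy as np

def windowed_rms(signal, sr, win_ms, hop_ms):
    """Frame the signal and return per-window RMS energy."""
    win = int(sr * win_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    frames = [signal[i:i + win] for i in range(0, len(signal) - win + 1, hop)]
    return np.array([np.sqrt(np.mean(f ** 2)) for f in frames])

sr = 16000
t = np.arange(sr)  # 1 s of toy "speech": an amplitude-modulated tone
signal = np.sin(2 * np.pi * 200 * t / sr) * (1 + np.sin(2 * np.pi * 4 * t / sr))

segmental = windowed_rms(signal, sr, win_ms=25, hop_ms=10)   # ~20-80 ms scale
syllabic = windowed_rms(signal, sr, win_ms=200, hop_ms=50)   # ~150-300 ms scale
print(len(segmental), "segmental-scale frames;", len(syllabic), "syllabic-scale frames")
```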

3.
Established linguistic theoretical frameworks propose that alphabetic language speakers use phonemes as phonological encoding units during speech production, whereas Mandarin Chinese speakers use syllables. This framework was challenged by recent neural evidence of facilitation induced by overlapping initial phonemes, raising the possibility that phonemes also contribute to the phonological encoding process in Chinese. However, there is no evidence of non-initial phoneme involvement in Chinese phonological encoding among representative Chinese speakers, rendering the functional role of phonemes in spoken Chinese controversial. Here, we addressed this issue by systematically investigating the word-initial and non-initial phoneme repetition effect on the electrophysiological signal using a picture-naming priming task in which native Chinese speakers produced disyllabic word pairs. We found that overlapping phonemes in both the initial and non-initial positions evoked more positive ERPs in the 180- to 300-ms interval, indicating a position-invariant repetition facilitation effect during phonological encoding. Our findings thus revealed the fundamental role of phonemes as independent phonological encoding units in Mandarin Chinese.

4.
Reaction time and recognition accuracy for speech emotional intonations in short meaningless words that differed in only one phoneme were studied, with and without background noise, in 49 adults aged 20-79 years. The results were compared with the same parameters for emotional intonations in meaningful speech utterances under similar conditions. Perception of emotional intonations at different linguistic levels (phonological and lexico-semantic) was found to have both common features and certain peculiarities. Recognition characteristics of emotional intonations depending on the gender and age of listeners appeared to be invariant with regard to the linguistic level of the speech stimuli. The phonemic composition of the pseudowords was found to influence emotional perception, especially against background noise. The acoustic characteristic most responsible for the perception of speech emotional prosody in short meaningless words under the two experimental conditions, i.e., with and without background noise, was variation in fundamental frequency.

5.
Is speech rhythmic? In the absence of evidence for a traditional view that languages strive to coordinate either syllables or stress-feet with regular time intervals, we consider the alternative that languages exhibit contrastive rhythm subsisting merely in the alternation of stronger and weaker elements. This is initially plausible, particularly for languages with a steep ‘prominence gradient’, i.e. a large disparity between stronger and weaker elements; but we point out that alternation is poorly achieved even by a ‘stress-timed’ language such as English, and, historically, languages have conspicuously failed to adopt simple phonological remedies that would ensure alternation. Languages seem more concerned to allow ‘syntagmatic contrast’ between successive units and to use durational effects to support linguistic functions than to facilitate rhythm. Furthermore, some languages (e.g. Tamil, Korean) lack the lexical prominence which would most straightforwardly underpin prominence of alternation. We conclude that speech is not incontestably rhythmic, and may even be antirhythmic. However, its linguistic structure and patterning allow the metaphorical extension of rhythm in varying degrees and in different ways depending on the language, and it is this analogical process which allows speech to be matched to external rhythms.
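The debate above is usually operationalized with durational rhythm metrics. One standard measure in this literature (offered here as a hedged illustration, not as this paper's method) is the normalized Pairwise Variability Index, which quantifies how strongly successive units alternate in duration:

```python
# Sketch: normalized Pairwise Variability Index (nPVI), a common durational
# rhythm metric. Higher nPVI = more alternation between successive durations,
# as in stereotypically "stress-timed" languages. Durations below are made up.
def npvi(durations):
    """nPVI over a sequence of interval durations (e.g., vowel durations in ms)."""
    pairs = zip(durations[:-1], durations[1:])
    terms = [abs(a - b) / ((a + b) / 2) for a, b in pairs]
    return 100 * sum(terms) / len(terms)

alternating = [120, 60, 130, 55, 125, 65]   # strong long-short alternation
even = [90, 95, 92, 94, 91, 93]             # near-isochronous sequence
print(f"alternating: nPVI = {npvi(alternating):.1f}")  # high
print(f"even:        nPVI = {npvi(even):.1f}")         # low
```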

6.
The present article outlines the contribution of the mismatch negativity (MMN), and its magnetic equivalent MMNm, to our understanding of the perception of speech sounds in the human brain. MMN data indicate that each sound, whether speech or non-speech, develops a neural representation corresponding to the percept of that sound in the neurophysiological substrate of auditory sensory memory. The accuracy of this representation, which determines the accuracy of discrimination between different sounds, can be probed with MMN separately for any auditory feature or stimulus type, such as phonemes. Furthermore, MMN data show that the perception of phonemes, and probably also of larger linguistic units (syllables and words), is based on language-specific phonetic traces developed in the posterior part of the left-hemisphere auditory cortex. These traces serve as recognition models for the corresponding speech sounds in listening to speech.

7.
Speech-sound disorder (SSD) is a complex behavioral disorder characterized by speech-sound production errors associated with deficits in articulation, phonological processes, and cognitive linguistic processes. SSD is prevalent in childhood and is comorbid with disorders of language, spelling, and reading disability, or dyslexia. Previous research suggests that developmental problems in domains associated with speech and language acquisition place a child at risk for dyslexia. Recent genetic studies have identified several candidate regions for dyslexia, including one on chromosome 3 segregating in a large Finnish pedigree. To explore common genetic influences on SSD and reading, we examined linkage for several quantitative traits to markers in the pericentromeric region of chromosome 3 in 77 families ascertained through a child with SSD. The quantitative scores measured several processes underlying speech-sound production, including phonological memory, phonological representation, articulation, receptive and expressive vocabulary, and reading decoding and comprehension skills. Model-free linkage analysis was followed by identification of sib pairs with linkage and construction of core shared haplotypes. In our multipoint analyses, measures of phonological memory demonstrated the strongest linkage (marker D3S2465, P = 5.6 × 10⁻⁵, and marker D3S3716, P = 6.8 × 10⁻⁴). Tests for single-word decoding also demonstrated linkage (real-word reading: marker D3S2465, P = .004; nonsense-word reading: marker D3S1595, P = .005). The minimum shared haplotype in sib pairs with similar trait values spans 4.9 cM and is bounded by markers D3S3049 and D3S3045. Our results suggest that domains common to SSD and dyslexia are pleiotropically influenced by a putative quantitative trait locus on chromosome 3.

8.
We tested the hypothesis that the categorical perception deficit of speech sounds in developmental dyslexia is related to phoneme awareness skills, whereas a visual attention (VA) span deficit constitutes an independent deficit. Phoneme awareness tasks, VA span tasks and categorical perception tasks of phoneme identification and discrimination using a d/t voicing continuum were administered to 63 dyslexic children and 63 control children matched on chronological age. Results showed significant differences in categorical perception between the dyslexic and control children. Significant correlations were found between categorical perception skills, phoneme awareness and reading. Although VA span correlated with reading, no significant correlations were found between either categorical perception or phoneme awareness and VA span. Mediation analyses performed on the whole dyslexic sample suggested that the effect of categorical perception on reading might be mediated by phoneme awareness. This relationship was independent of the participants’ VA span abilities. Two groups of dyslexic children, one with a single phoneme awareness deficit and one with a single VA span deficit, were then identified. The phonologically impaired group showed lower categorical perception skills than the control group, but categorical perception was similar in the VA span impaired dyslexic and control children. The overall findings suggest that the link between categorical perception, phoneme awareness and reading is independent of VA span skills. These findings provide new insights on the heterogeneity of developmental dyslexia. They suggest that phonological processes and VA span independently affect reading acquisition.
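The mediation logic above (categorical perception → phoneme awareness → reading) can be made concrete with a bootstrapped indirect-effect test. The following is a generic sketch of such an analysis on simulated data, not the authors' actual procedure or data:

```python
# Sketch of a simple mediation analysis: does phoneme awareness (M) mediate
# the effect of categorical perception (X) on reading (Y)? The indirect
# effect a*b is estimated by OLS and bootstrapped. Data are simulated.
import numpy as np

rng = np.random.default_rng(0)
n = 126  # 63 dyslexic + 63 control, matching the study's sample sizes
X = rng.normal(size=n)                      # categorical perception skill
M = 0.5 * X + rng.normal(size=n)            # phoneme awareness (mediator)
Y = 0.4 * M + 0.1 * X + rng.normal(size=n)  # reading score

def indirect_effect(X, M, Y):
    # a: X -> M path (slope of M ~ X with intercept)
    A1 = np.column_stack([np.ones(len(X)), X])
    a = np.linalg.lstsq(A1, M, rcond=None)[0][1]
    # b: effect of M on Y controlling for X
    A2 = np.column_stack([np.ones(len(X)), X, M])
    b = np.linalg.lstsq(A2, Y, rcond=None)[0][2]
    return a * b

boots = []
for _ in range(2000):
    idx = rng.integers(0, n, n)  # resample participants with replacement
    boots.append(indirect_effect(X[idx], M[idx], Y[idx]))
lo, hi = np.percentile(boots, [2.5, 97.5])
print(f"indirect effect a*b, 95% bootstrap CI: [{lo:.3f}, {hi:.3f}]")
```

A confidence interval excluding zero is the usual criterion for concluding that the mediator carries part of the effect.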

9.
The present study investigated the effects of sequence complexity, defined in terms of phonemic similarity and phonotactic probability, on the timing and accuracy of serial ordering for speech production in healthy speakers and speakers with either hypokinetic or ataxic dysarthria. Sequences were composed of strings of consonant-vowel (CV) syllables, with each syllable containing the same vowel, /a/, paired with a different consonant. High-complexity sequences contained phonemically similar consonants, and sounds and syllables that had low phonotactic probabilities; low-complexity sequences contained phonemically dissimilar consonants and high-probability sounds and syllables. Sequence complexity effects were evaluated by analyzing speech error rates and within-syllable vowel and pause durations. This analysis revealed that speech error rates were significantly higher and speech duration measures were significantly longer during production of high-complexity sequences than during production of low-complexity sequences. Although speakers with dysarthria produced longer overall speech durations than healthy speakers, the effects of sequence complexity on error rates and speech durations were comparable across all groups. These findings indicate that the duration and accuracy of the processes for selecting items in a speech sequence are influenced by their phonemic similarity and/or phonotactic probability. Moreover, this robust complexity effect is present even in speakers with damage to subcortical circuits involved in serial control for speech.
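Phonotactic probability, as used above, is typically estimated from positional segment frequencies in a corpus. A hedged sketch with a toy frequency table (the counts are invented; real studies use large lexical databases):

```python
# Sketch: estimating the phonotactic probability of a CV-syllable string as
# the product of positional segment probabilities. Toy counts, invented for
# illustration; the vowel is fixed at /a/, so onsets drive the estimate.
from math import prod

# How often each consonant begins a syllable in a hypothetical corpus:
onset_counts = {"b": 900, "d": 700, "g": 400, "v": 50, "z": 30, "th": 20}
total_onsets = sum(onset_counts.values())

def syllable_prob(consonant):
    """P(onset) for a C + /a/ syllable."""
    return onset_counts[consonant] / total_onsets

def sequence_prob(consonants):
    """Probability of a CV-CV-CV string under independence (a simplification)."""
    return prod(syllable_prob(c) for c in consonants)

low_complexity = ["b", "d", "g"]    # frequent, phonemically dissimilar onsets
high_complexity = ["v", "z", "th"]  # rare onsets (and similar frication)
print(f"low-complexity sequence:  {sequence_prob(low_complexity):.2e}")
print(f"high-complexity sequence: {sequence_prob(high_complexity):.2e}")
```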

10.
Azadpour M, Balaban E. PLoS ONE 2008, 3(4): e1966
Neuroimaging studies of speech processing increasingly rely on artificial speech-like sounds whose perceptual status as speech or non-speech is assigned by simple subjective judgments; brain activation patterns are interpreted according to these status assignments. The naïve perceptual status of one such stimulus, spectrally rotated speech (not consciously perceived as speech by naïve subjects), was evaluated in discrimination and forced-identification experiments. Discrimination of variation in spectrally rotated syllables in one group of naïve subjects was strongly related to the pattern of similarities in the phonological identification of the same stimuli provided by a second, independent group of naïve subjects, suggesting either (1) that naïve rotated-syllable perception involves phonetic-like processing, or (2) that perception is based solely on physical acoustic similarity, with similar sounds being assigned similar phonetic identities. Analysis of acoustic similarities (Euclidean distances between the center frequencies of formants) and phonetic similarities in the perception of the vowel portions of the rotated syllables revealed that discrimination was significantly and independently influenced by both acoustic and phonological information. We conclude that simple subjective assessments of artificial speech-like sounds can be misleading, as perception of such sounds may initially and unconsciously utilize speech-like, phonological processing.
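Spectrally rotated speech of the kind used above is classically produced by band-limiting the signal and inverting its spectrum about a center frequency via amplitude modulation (after Blesser, 1972). A minimal sketch follows; the 4 kHz band edge and filter settings are common choices assumed here, not the paper's stimulus parameters.

```python
# Sketch of spectral rotation: band-limit the signal to 0-4 kHz, multiply by
# a 4 kHz carrier, and low-pass again. A component at frequency f ends up at
# 4000 - f, inverting the spectrum about 2 kHz. Band edge and filter order
# are assumptions, not the paper's settings.
import numpy as np
from scipy.signal import butter, filtfilt

def spectrally_rotate(signal, sr, band_hz=4000.0):
    b, a = butter(8, band_hz / (sr / 2), btype="low")
    band_limited = filtfilt(b, a, signal)
    t = np.arange(len(signal)) / sr
    modulated = band_limited * np.cos(2 * np.pi * band_hz * t)
    return filtfilt(b, a, modulated)  # keep only the inverted (difference) band

sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 500 * t)   # 500 Hz test tone
rotated = spectrally_rotate(tone, sr)
# Peak of the rotated spectrum should sit near 4000 - 500 = 3500 Hz.
spectrum = np.abs(np.fft.rfft(rotated))
print("peak at ~%.0f Hz" % (np.argmax(spectrum) * sr / len(rotated)))
```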

11.
Understanding foreign speech is difficult, in part because of unusual mappings between sounds and words. It is known that listeners in their native language can use lexical knowledge (about how words ought to sound) to learn how to interpret unusual speech-sounds. We therefore investigated whether subtitles, which provide lexical information, support perceptual learning about foreign speech. Dutch participants, unfamiliar with Scottish and Australian regional accents of English, watched Scottish or Australian English videos with Dutch, English or no subtitles, and then repeated audio fragments of both accents. Repetition of novel fragments was worse after Dutch-subtitle exposure but better after English-subtitle exposure. Native-language subtitles appear to create lexical interference, but foreign-language subtitles assist speech learning by indicating which words (and hence sounds) are being spoken.

12.
In this paper, we describe domain-general auditory processes that we believe are prerequisite to the linguistic analysis of speech. We discuss biological evidence for these processes and how they might relate to processes that are specific to human speech and language. We begin with a brief review of (i) the anatomy of the auditory system and (ii) the essential properties of speech sounds. Section 4 describes the general auditory mechanisms that we believe are applied to all communication sounds, and how functional neuroimaging is being used to map the brain networks associated with domain-general auditory processing. Section 5 discusses recent neuroimaging studies that explore where such general processes give way to those that are specific to human speech and language.

13.
Extensive research shows that inter-talker variability (i.e., changing the talker) affects recognition memory for speech signals. However, relatively little is known about the consequences of intra-talker variability (i.e., changes in speaking style within a talker) on the encoding of speech signals in memory. It is well established that speakers can modulate the characteristics of their own speech and produce a listener-oriented, intelligibility-enhancing speaking style in response to communication demands (e.g., when speaking to listeners with hearing impairment or non-native speakers of the language). Here we conducted two experiments to examine the role of speaking style variation in spoken language processing. First, we examined the extent to which clear speech provided benefits in challenging listening environments (i.e., speech-in-noise). Second, we compared recognition memory for sentences produced in conversational and clear speaking styles. In both experiments, semantically normal and anomalous sentences were included to investigate the role of higher-level linguistic information in the processing of speaking style variability. The results show that acoustic-phonetic modifications implemented in listener-oriented speech lead to improved speech recognition in challenging listening conditions and, crucially, to a substantial enhancement in recognition memory for sentences.

14.
Humans can recognize spoken words with unmatched speed and accuracy. Hearing the initial portion of a word such as "formu…" is sufficient for the brain to identify "formula" from the thousands of other words that partially match. Two alternative computational accounts propose that partially matching words (1) inhibit each other until a single word is selected ("formula" inhibits "formal" by lexical competition) or (2) are used to predict upcoming speech sounds more accurately (segment prediction error is minimal after sequences like "formu…"). To distinguish these theories we taught participants novel words (e.g., "formubo") that sound like existing words ("formula") on two successive days. Computational simulations show that knowing "formubo" increases lexical competition when hearing "formu…", but reduces segment prediction error. Conversely, when the sounds in "formula" and "formubo" diverge, the reverse is observed. The time course of magnetoencephalographic brain responses in the superior temporal gyrus (STG) is uniquely consistent with a segment prediction account. We propose a predictive coding model of spoken word recognition in which STG neurons represent the difference between predicted and heard speech sounds. This prediction error signal explains the efficiency of human word recognition and simulates neural responses in auditory regions.
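The segment-prediction account above can be made concrete: given a lexicon, prediction error at each position is the surprisal of the heard segment under the distribution of continuations consistent with the prefix. A hedged sketch follows; the toy lexicon and uniform word priors are assumptions for illustration, not the paper's simulations.

```python
# Sketch of segment prediction error in a cohort of partially matching words:
# P(next segment | prefix) is derived from words consistent with the prefix,
# and prediction error is the surprisal -log2 P of the segment actually heard.
import math
from collections import Counter

lexicon = ["formula", "formulate", "formubo", "formal", "forest"]

def next_segment_probs(prefix):
    cohort = [w for w in lexicon if w.startswith(prefix) and len(w) > len(prefix)]
    counts = Counter(w[len(prefix)] for w in cohort)
    total = sum(counts.values())
    return {seg: c / total for seg, c in counts.items()}

def surprisal(prefix, heard):
    probs = next_segment_probs(prefix)
    return -math.log2(probs.get(heard, 1e-9))  # large error for unpredicted segments

# After "formu", the cohort is {formula, formulate, formubo}: "l" is likely,
# so hearing "l" yields low prediction error; hearing "b" yields more.
print(f"P(next | 'formu') = {next_segment_probs('formu')}")
print(f"surprisal of 'l' after 'formu': {surprisal('formu', 'l'):.2f} bits")
print(f"surprisal of 'b' after 'formu': {surprisal('formu', 'b'):.2f} bits")
```

Note how learning "formubo" adds a competitor to the cohort (raising lexical competition) while making the segment "b" predictable (lowering prediction error), exactly the dissociation the study exploits.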

15.
The ability to map speech sounds to corresponding letters is critical for establishing proficient reading. People vary in this phonological processing ability, which has been hypothesized to result from variation in hemispheric asymmetries within brain regions that support language. A cerebral lateralization hypothesis predicts that more asymmetric brain structures facilitate the development of foundational reading skills like phonological processing. That is, structural asymmetries are predicted to linearly increase with ability. In contrast, a canalization hypothesis predicts that asymmetries constrain behavioral performance within a normal range. That is, structural asymmetries are predicted to quadratically relate to phonological processing, with average phonological processing occurring in people with the most asymmetric structures. These predictions were examined in relatively large samples of children (N = 424) and adults (N = 300), using a topological asymmetry analysis of T1-weighted brain images and a decoding measure of phonological processing. There was limited evidence of structural asymmetry and phonological decoding associations in classic language-related brain regions. However, and in modest support of the cerebral lateralization hypothesis, small to medium effect sizes were observed where phonological decoding accuracy increased with the magnitude of the largest structural asymmetry across left hemisphere cortical regions, but not right hemisphere cortical regions, for both the adult and pediatric samples. In support of the canalization hypothesis, small to medium effect sizes were observed where phonological decoding in the normal range was associated with increased asymmetries in specific cortical regions for both the adult and pediatric samples, which included performance monitoring and motor planning brain regions that contribute to oral and written language functions. Thus, the relevance of each hypothesis to phonological decoding may depend on the scale of brain organization.
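The two hypotheses above map onto competing regression models: decoding ~ asymmetry (lateralization, a linear term) versus decoding ~ asymmetry + asymmetry² (canalization, a quadratic relation). A hedged sketch comparing them on simulated data; this mirrors the logic of the analysis, not the study's actual pipeline or data:

```python
# Sketch: comparing the cerebral lateralization hypothesis (linear term only)
# with the canalization hypothesis (quadratic term) via OLS on simulated data.
import numpy as np

rng = np.random.default_rng(1)
n = 424  # pediatric sample size reported above
asym = rng.normal(size=n)  # structural asymmetry index
# Simulate a quadratic (canalization-style) relation plus noise:
decoding = -0.3 * asym**2 + rng.normal(scale=0.8, size=n)

def r_squared(design, y):
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1 - resid.var() / y.var()

ones = np.ones(n)
linear = np.column_stack([ones, asym])
quadratic = np.column_stack([ones, asym, asym**2])
print(f"linear model R^2:    {r_squared(linear, decoding):.3f}")
print(f"quadratic model R^2: {r_squared(quadratic, decoding):.3f}")
# A meaningfully higher quadratic R^2 favors the canalization account.
```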

16.
In utero RNAi of the dyslexia-associated gene Kiaa0319 in rats (KIA-) degrades cortical responses to speech sounds and increases trial-by-trial variability in onset latency. We tested the hypothesis that KIA- rats would be impaired at speech sound discrimination. KIA- rats needed twice as much training in quiet conditions to perform at control levels and remained impaired at several speech tasks. Focused training using truncated speech sounds was able to normalize speech discrimination in quiet and background-noise conditions. Training also normalized trial-by-trial neural variability and temporal phase locking. Cortical activity from speech-trained KIA- rats was sufficient to accurately discriminate between similar consonant sounds. These results provide the first direct evidence that reduced expression of the dyslexia-associated gene KIAA0319 can cause phoneme processing impairments similar to those seen in dyslexia, and that intensive behavioral therapy can eliminate these impairments.
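Trial-by-trial variability and temporal phase locking of the kind quantified above are commonly measured with inter-trial phase coherence (ITPC). The formula is standard; the frequency band, sampling rate, and simulated "trials" below are assumptions for illustration, not the study's recordings.

```python
# Sketch: inter-trial phase coherence, ITPC(t) = |mean over trials of
# exp(i*phase)|. Values near 1 = tightly phase-locked responses across
# trials; near 0 = variable onset phase (as reported for KIA- rats).
import numpy as np
from scipy.signal import hilbert

def itpc(trials):
    """trials: (n_trials, n_samples) array of band-limited responses."""
    phases = np.angle(hilbert(trials, axis=1))
    return np.abs(np.mean(np.exp(1j * phases), axis=0))

rng = np.random.default_rng(2)
sr, n_trials = 1000, 60
t = np.arange(sr) / sr  # 1 s epochs

def simulate(jitter_ms):
    """10 Hz responses whose onset latency jitters across trials."""
    jit = rng.normal(scale=jitter_ms / 1000, size=n_trials)
    return np.array([np.sin(2 * np.pi * 10 * (t - j)) for j in jit])

locked = simulate(jitter_ms=2)     # low latency variability ("control-like")
variable = simulate(jitter_ms=30)  # high latency variability ("KIA-"-like)
print(f"mean ITPC, low jitter:  {itpc(locked).mean():.2f}")   # near 1
print(f"mean ITPC, high jitter: {itpc(variable).mean():.2f}") # much lower
```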

17.
Relations among linguistic auditory processing, nonlinguistic auditory processing, spelling ability, and spelling strategy choice were examined. Sixty-three undergraduate students completed measures of auditory processing (one involving distinguishing similar tones, one involving distinguishing similar phonemes, and one involving selecting appropriate spellings for individual phonemes). Participants also completed a modified version of a standardized spelling test, and a secondary spelling test with retrospective strategy reports. Once testing was completed, participants were divided into phonological versus nonphonological spellers on the basis of the number of words they spelled using phonological strategies only. Results indicated a) moderate to strong positive correlations among the different auditory processing tasks in terms of reaction time, but not accuracy levels, and b) weak to moderate positive correlations between measures of linguistic auditory processing (phoneme distinction and phoneme spelling choice in the presence of foils) and spelling ability for phonological spellers, but not for nonphonological spellers. These results suggest a possible explanation for past contradictory research on auditory processing and spelling, which has been divided in terms of whether or not disabled spellers seemed to have poorer auditory processing than did typically developing spellers, and suggest implications for teaching spelling to children with good versus poor auditory processing abilities.

18.
Sound symbolism is a non-arbitrary relationship between speech sounds and meaning. We review evidence that, contrary to the traditional view in linguistics, sound symbolism is an important design feature of language, which affects online processing of language, and most importantly, language acquisition. We propose the sound symbolism bootstrapping hypothesis, claiming that (i) pre-verbal infants are sensitive to sound symbolism, due to a biologically endowed ability to map and integrate multi-modal input, (ii) sound symbolism helps infants gain referential insight for speech sounds, (iii) sound symbolism helps infants and toddlers associate speech sounds with their referents to establish a lexical representation and (iv) sound symbolism helps toddlers learn words by allowing them to focus on referents embedded in a complex scene, alleviating Quine's problem. We further explore the possibility that sound symbolism is deeply related to language evolution, drawing the parallel between historical development of language across generations and ontogenetic development within individuals. Finally, we suggest that sound symbolism bootstrapping is a part of a more general phenomenon of bootstrapping by means of iconic representations, drawing on similarities and close behavioural links between sound symbolism and speech-accompanying iconic gesture.

19.
In the present study, we used transcranial magnetic stimulation (TMS) to investigate the influence of phonological and lexical properties of verbal items on the excitability of the tongue's cortical motor representation during passive listening. In particular, we aimed to clarify whether the difference in tongue motor excitability found during listening to words and pseudo-words [Fadiga, L., Craighero, L., Buccino, G., Rizzolatti, G., 2002. Speech listening specifically modulates the excitability of tongue muscles: a TMS study. European Journal of Neuroscience 15, 399-402] is due to lexical frequency or to the presence of a meaning per se. To do this, we investigated the time course of tongue motor-evoked potentials (MEPs) during listening to frequent words, rare words, and pseudo-words embedded with a double consonant requiring relevant tongue movements for its pronunciation. Results showed that at the later stimulation intervals (200 and 300 ms from the double consonant), listening to rare words evoked much larger MEPs than listening to frequent words. Moreover, by comparing pseudo-words embedded with a double consonant requiring or not requiring tongue movements, we found that a pure phonological motor resonance was present only 100 ms after the double consonant. Thus, while the phonological motor resonance appears very early, the lexical-dependent motor facilitation takes more time to appear and depends on the frequency of the stimuli. The present results indicate that the motor system responsible for phonoarticulatory movements during speech production is also involved during speech listening in a strictly specific way. This motor facilitation reflects both the difference in phonoarticulatory characteristics and the difference in the frequency of occurrence of the verbal material.

20.
The precise neural mechanisms underlying speech sound representations are still a matter of debate. Proponents of 'sparse representations' assume that on the level of speech sounds, only contrastive or otherwise unpredictable information is stored in long-term memory. Here, in a passive oddball paradigm, we challenge the neural foundations of such a 'sparse' representation; we use words that differ only in their penultimate consonant ("coronal" [t] vs. "dorsal" [k] place of articulation), for example distinguishing between the German nouns Latz ([lats]; bib) and Lachs ([laks]; salmon). Changes from standard [t] to deviant [k] and vice versa elicited a discernible Mismatch Negativity (MMN) response. Crucially, however, the MMN for the deviant [lats] was stronger than the MMN for the deviant [laks]. Source localization showed this difference to be due to enhanced brain activity in right superior temporal cortex. These findings reflect a difference in phonological 'sparsity': coronal [t] segments, but not dorsal [k] segments, are based on sparser representations and elicit less specific neural predictions; sensory deviations from this prediction are more readily 'tolerated' and accordingly trigger weaker MMNs. The results support the neurocomputational reality of 'representationally sparse' models of speech perception, which are compatible with more general predictive mechanisms in auditory perception.
