1.

Functional imaging of the auditory processing applied to speech sounds

Patterson RD Johnsrude IS 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》2008,363(1493):1023-1035

In this paper, we describe domain-general auditory processes that we believe are prerequisite to the linguistic analysis of speech. We discuss biological evidence for these processes and how they might relate to processes that are specific to human speech and language. We begin with a brief review of (i) the anatomy of the auditory system and (ii) the essential properties of speech sounds. Section 4 describes the general auditory mechanisms that we believe are applied to all communication sounds, and how functional neuroimaging is being used to map the brain networks associated with domain-general auditory processing. Section 5 discusses recent neuroimaging studies that explore where such general processes give way to those that are specific to human speech and language.

2.

Dissociable Perceptual Effects of Visual Adaptation

Kai-Markus Müller Frieder Schillinger David H. Do David A. Leopold 《PloS one》2009,4(7)

Neurons in the visual cortex are responsive to the presentation of oriented and curved line segments, which are thought to act as primitives for the visual processing of shapes and objects. Prolonged adaptation to such stimuli gives rise to two related perceptual effects: a slow change in the appearance of the adapting stimulus (perceptual drift), and the distortion of subsequently presented test stimuli (adaptational aftereffects). Here we used a psychophysical nulling technique to dissociate and quantify these two classical observations in order to examine their underlying mechanisms and their relationship to one another. In agreement with previous work, we found that during adaptation horizontal and vertical straight lines serve as attractors for perceived orientation and curvature. However, the rate of perceptual drift for different stimuli was not predictive of the corresponding aftereffect magnitudes, indicating that the two perceptual effects are governed by distinct neural processes. Finally, the rate of perceptual drift for curved line segments did not depend on the spatial scale of the stimulus, suggesting that its mechanisms lie outside strictly retinotopic processing stages. These findings provide new evidence that the visual system relies on statistically salient intrinsic reference stimuli for the processing of visual patterns, and point to perceptual drift as an experimental window for studying the mechanisms of visual perception.

3.

Neural Responses to Multimodal Ostensive Signals in 5-Month-Old Infants

Eugenio Parise Gergely Csibra 《PloS one》2013,8(8)

Infants' sensitivity to ostensive signals, such as direct eye contact and infant-directed speech, is well documented in the literature. We investigated how infants interpret such signals by assessing common processing mechanisms devoted to them and by measuring neural responses to their compounds. In Experiment 1, we found that ostensive signals from different modalities display overlapping electrophysiological activity in 5-month-old infants, suggesting that these signals share neural processing mechanisms independently of their modality. In Experiment 2, we found that the activation to ostensive signals from different modalities is not additive to each other, but rather reflects the presence of ostension in either stimulus stream. These data support the thesis that ostensive signals obligatorily indicate to young infants that communication is directed to them.

4.

Robust sound classification through the representation of similarity using response fields derived from stimuli during early experience

Coath M Denham SL 《Biological cybernetics》2005,93(1):22-30

Models of auditory processing, particularly of speech, face many difficulties. Included in these are variability among speakers, variability in speech rate, and robustness to moderate distortions such as time compression. We constructed a system based on ensembles of feature detectors derived from fragments of an onset-sensitive sound representation. This method is based on the idea of ‘spectro-temporal response fields’ and uses convolution to measure the degree of similarity through time between the feature detectors and the stimulus. The output from the ensemble was used to derive segmentation cues and patterns of response, which were used to train an artificial neural network (ANN) classifier. This allowed us to estimate a lower bound for the mutual information between the class of the input and the class of the output. Our results suggest that there is significant information in the output of our system, and that this is robust with respect to the exact choice of feature set, time compression in the stimulus, and speaker variation. In addition, the robustness to time compression in the stimulus has features in common with human psychophysics. Similar experiments using feature detectors derived from fragments of non-speech sounds performed less well. This result is interesting in the light of results showing aberrant cortical development in animals exposed to impoverished auditory environments during the developmental phase.
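The convolution step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional stimulus and a single hypothetical feature detector, whereas the paper works with ensembles of detectors derived from an onset-sensitive sound representation. The function name `similarity_through_time` is made up for this example.

```python
def similarity_through_time(stimulus, detector):
    """Slide the detector along the stimulus and return the dot product at each lag.

    This is cross-correlation, i.e. convolution with the time-reversed detector.
    """
    width = len(detector)
    return [
        sum(s * d for s, d in zip(stimulus[t:t + width], detector))
        for t in range(len(stimulus) - width + 1)
    ]


# Hypothetical toy signal containing one copy of the detector's shape.
stimulus = [0, 0, 1, 2, 1, 0, 0]
detector = [1, 2, 1]
response = similarity_through_time(stimulus, detector)
best_lag = response.index(max(response))  # lag at which the stimulus best matches
```

The response peaks at the lag where the stimulus fragment most resembles the detector, which is the sense in which the output tracks similarity through time.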

5.

Amusia Results in Abnormal Brain Activity following Inappropriate Intonation during Speech Comprehension

C Jiang JP Hamm VK Lim IJ Kirk X Chen Y Yang 《PloS one》2012,7(7):e41411

Pitch processing is a critical ability on which humans' tonal musical experience depends, and which is also of paramount importance for decoding prosody in speech. Congenital amusia refers to deficits in the ability to properly process musical pitch, and recent evidence has suggested that this musical pitch disorder may impact upon the processing of speech sounds. Here we present the first electrophysiological evidence demonstrating that individuals with amusia who speak Mandarin Chinese are impaired in classifying prosody as appropriate or inappropriate during a speech comprehension task. When presented with inappropriate prosody stimuli, control participants elicited a larger P600 and smaller N100 relative to the appropriate condition. In contrast, amusics did not show significant differences between the appropriate and inappropriate conditions in either the N100 or the P600 component. This provides further evidence that the pitch perception deficits associated with amusia may also affect intonation processing during speech comprehension in those who speak a tonal language such as Mandarin, and suggests music and language share some cognitive and neural resources.

6.

Encoding of Temporal Information by Timing, Rate, and Place in Cat Auditory Cortex

Kazuo Imaizumi Nicholas J. Priebe Tatyana O. Sharpee Steven W. Cheung Christoph E. Schreiner 《PloS one》2010,5(7)

A central goal in auditory neuroscience is to understand the neural coding of species-specific communication and human speech sounds. Low-rate repetitive sounds are elemental features of communication sounds, and core auditory cortical regions have been implicated in processing these information-bearing elements. Repetitive sounds could be encoded by at least three neural response properties: 1) the event-locked spike-timing precision, 2) the mean firing rate, and 3) the interspike interval (ISI). To determine how well these response aspects capture information about the repetition rate stimulus, we measured local group responses of cortical neurons in cat anterior auditory field (AAF) to click trains and calculated their mutual information based on these different codes. ISIs of the multiunit responses carried substantially higher information about low repetition rates than either spike-timing precision or firing rate. Combining firing rate and ISI codes was synergistic and captured modestly more repetition information. Spatial distribution analyses showed distinct local clustering properties for each encoding scheme for repetition information indicative of a place code. Diversity in local processing emphasis and distribution of different repetition rate codes across AAF may give rise to concurrent feed-forward processing streams that contribute differently to higher-order sound analysis.
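To make the comparison of codes by mutual information concrete, here is a minimal sketch of a plug-in estimate of I(S;R) between a discrete stimulus (repetition rate) and a discretized response. The abstract does not state which estimator the authors used, so the estimator, the function name `mutual_information`, and the toy data are all assumptions. Plug-in estimates are biased upward for small samples and would normally need correction.

```python
from collections import Counter
from math import log2


def mutual_information(stimuli, responses):
    """Plug-in estimate of I(S;R) in bits from paired discrete observations."""
    if len(stimuli) != len(responses):
        raise ValueError("stimuli and responses must have the same length")
    n = len(stimuli)
    joint = Counter(zip(stimuli, responses))
    count_s = Counter(stimuli)
    count_r = Counter(responses)
    mi = 0.0
    for (s, r), count in joint.items():
        # p(s, r) * log2( p(s, r) / (p(s) * p(r)) ), written in terms of counts
        mi += (count / n) * log2(count * n / (count_s[s] * count_r[r]))
    return mi


# Hypothetical trials: click repetition rate in Hz, with two made-up response codes.
rates = [2, 2, 4, 4]
isi_code = ["long", "long", "short", "short"]  # tracks the rate perfectly
rate_code = ["low", "high", "low", "high"]     # unrelated to the rate

mi_isi = mutual_information(rates, isi_code)
mi_rate = mutual_information(rates, rate_code)
```

In this toy case the interval code carries one bit about the stimulus and the firing-rate code carries none, mirroring in miniature the kind of comparison the abstract reports.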

7.

Phonological representations are unconsciously used when processing complex, non-speech signals

Azadpour M Balaban E 《PloS one》2008,3(4):e1966

Neuroimaging studies of speech processing increasingly rely on artificial speech-like sounds whose perceptual status as speech or non-speech is assigned by simple subjective judgments; brain activation patterns are interpreted according to these status assignments. The naïve perceptual status of one such stimulus, spectrally-rotated speech (not consciously perceived as speech by naïve subjects), was evaluated in discrimination and forced identification experiments. Discrimination of variation in spectrally-rotated syllables in one group of naïve subjects was strongly related to the pattern of similarities in phonological identification of the same stimuli provided by a second, independent group of naïve subjects, suggesting either (1) that naïve rotated syllable perception involves phonetic-like processing, or (2) that perception is solely based on physical acoustic similarity, and similar sounds are provided with similar phonetic identities. Analysis of acoustic (Euclidean distances of center frequency values of formants) and phonetic similarities in the perception of the vowel portions of the rotated syllables revealed that discrimination was significantly and independently influenced by both acoustic and phonological information. We conclude that simple subjective assessments of artificial speech-like sounds can be misleading, as perception of such sounds may initially and unconsciously utilize speech-like, phonological processing.
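The acoustic similarity measure named in this abstract, the Euclidean distance between formant center frequencies, can be sketched as follows. The formant values below are illustrative textbook-style numbers for unrotated vowels, not data from the study, and the names `vowels` and `formant_distance` are hypothetical.

```python
from math import dist

# Illustrative formant center frequencies (F1, F2, F3) in Hz; not taken from the paper.
vowels = {
    "a": (730.0, 1090.0, 2440.0),
    "i": (270.0, 2290.0, 3010.0),
    "u": (300.0, 870.0, 2240.0),
}


def formant_distance(first, second):
    """Euclidean distance between two vowels in (F1, F2, F3) space, in Hz."""
    return dist(vowels[first], vowels[second])


# Smaller distances mean the two vowels are acoustically more alike.
a_to_i = formant_distance("a", "i")
a_to_u = formant_distance("a", "u")
```

With these example values /a/ lies closer to /u/ than to /i/. A distance in raw Hz weights the higher formants heavily, so a perceptual frequency scale may be preferable in practice; the abstract does not say which scale was used.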

8.

Integration of letters and speech sounds in the human brain

van Atteveldt N Formisano E Goebel R Blomert L 《Neuron》2004,43(2):271-282

9.

Envelope reconstruction of speech and music highlights stronger tracking of speech at low frequencies

Nathaniel J. Zuk Jeremy W. Murphy Richard B. Reilly Edmund C. Lalor 《PLoS computational biology》2021,17(9)

The human brain tracks amplitude fluctuations of both speech and music, which reflects acoustic processing in addition to the encoding of higher-order features and one’s cognitive state. Comparing neural tracking of speech and music envelopes can elucidate stimulus-general mechanisms, but direct comparisons are confounded by differences in their envelope spectra. Here, we use a novel method of frequency-constrained reconstruction of stimulus envelopes using EEG recorded during passive listening. We expected to see music reconstruction match speech in a narrow range of frequencies, but instead we found that speech was reconstructed better than music for all frequencies we examined. Additionally, models trained on all stimulus types performed as well as or better than the stimulus-specific models at higher modulation frequencies, suggesting a common neural mechanism for tracking speech and music. However, speech envelope tracking at low frequencies, below 1 Hz, was associated with increased weighting over parietal channels, which was not present for the other stimuli. Our results highlight the importance of low-frequency speech tracking and suggest an origin from speech-specific processing in the brain.

10.

Selective and Efficient Neural Coding of Communication Signals Depends on Early Acoustic and Social Environment

Noopur Amin Michael Gastpar Frédéric E. Theunissen 《PloS one》2013,8(4)

Previous research has shown that postnatal exposure to simple, synthetic sounds can affect the sound representation in the auditory cortex as reflected by changes in the tonotopic map or other relatively simple tuning properties, such as AM tuning. However, their functional implications for neural processing in the generation of ethologically-based perception remain unexplored. Here we examined the effects of noise-rearing and social isolation on the neural processing of communication sounds, such as species-specific song, in the primary auditory cortex analog of adult zebra finches. Our electrophysiological recordings reveal that neural tuning to simple frequency-based synthetic sounds is initially established in all the laminae independent of patterned acoustic experience; however, we provide the first evidence that early exposure to patterned sound statistics, such as those found in native sounds, is required for the subsequent emergence of neural selectivity for complex vocalizations and for shaping neural spiking precision in superficial and deep cortical laminae, and for creating efficient neural representations of song and a less redundant ensemble code in all the laminae. Our study also provides the first causal evidence for ‘sparse coding’, such that when the statistics of the stimuli were changed during rearing, as in noise-rearing, the sparse or optimal representation for species-specific vocalizations disappeared. Taken together, these results imply that a layer-specific differential development of the auditory cortex requires patterned acoustic input, and a specialized and robust sensory representation of complex communication sounds in the auditory cortex requires a rich acoustic and social environment.

11.

Musical melody and speech intonation: singing a different tune

RJ Zatorre SR Baum 《PLoS biology》2012,10(7):e1001372

Music and speech are often cited as characteristically human forms of communication. Both share the features of hierarchical structure, complex sound systems, and sensorimotor sequencing demands, and both are used to convey and influence emotions, among other functions [1]. Both music and speech also prominently use acoustical frequency modulations, perceived as variations in pitch, as part of their communicative repertoire. Given these similarities, and the fact that pitch perception and production involve the same peripheral transduction system (cochlea) and the same production mechanism (vocal tract), it might be natural to assume that pitch processing in speech and music would also depend on the same underlying cognitive and neural mechanisms. In this essay we argue that the processing of pitch information differs significantly for speech and music; specifically, we suggest that there are two pitch-related processing systems, one for more coarse-grained, approximate analysis and one for more fine-grained accurate representation, and that the latter is unique to music. More broadly, this dissociation offers clues about the interface between sensory and motor systems, and highlights the idea that multiple processing streams are a ubiquitous feature of neuro-cognitive architectures.

12.

Neural overlap in processing music and speech

Isabelle Peretz Dominique Vuvan Marie-Élaine Lagrois Jorge L. Armony 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》2015,370(1664)

Neural overlap in processing music and speech, as measured by the co-activation of brain regions in neuroimaging studies, may suggest that parts of the neural circuitries established for language may have been recycled during evolution for musicality, or vice versa that musicality served as a springboard for language emergence. Such a perspective has important implications for several topics of general interest besides evolutionary origins. For instance, neural overlap is an important premise for the possibility of music training to influence language acquisition and literacy. However, neural overlap in processing music and speech does not entail sharing neural circuitries. Neural separability between music and speech may occur in overlapping brain regions. In this paper, we review the evidence and outline the issues faced in interpreting such neural data, and argue that converging evidence from several methodologies is needed before neural overlap is taken as evidence of sharing.

13.

Crystal T. Engineer Claudia A. Perez Ryan S. Carraway Kevin Q. Chang Jarod L. Roland Andrew M. Sloan Michael P. Kilgard 《PloS one》2013,8(10)

Humans and animals readily generalize previously learned knowledge to new situations. Determining similarity is critical for assigning category membership to a novel stimulus. We tested the hypothesis that category membership is initially encoded by the similarity of the activity pattern evoked by a novel stimulus to the patterns from known categories. We provide behavioral and neurophysiological evidence that activity patterns in primary auditory cortex contain sufficient information to explain behavioral categorization of novel speech sounds by rats. Our results suggest that category membership might be encoded by the similarity of the activity pattern evoked by a novel speech sound to the patterns evoked by known sounds. Categorization based on featureless pattern matching may represent a general neural mechanism for ensuring accurate generalization across sensory and cognitive systems.
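The idea of assigning a novel stimulus to the category whose evoked pattern it most resembles can be sketched as a nearest-template classifier. This is an illustration under stated assumptions, not the authors' analysis: the abstract does not specify the similarity measure, so Euclidean distance to a per-category mean pattern is assumed here, and the function names and spike counts are hypothetical.

```python
from math import dist


def mean_pattern(patterns):
    """Average a list of equal-length activity vectors element by element."""
    return [sum(values) / len(values) for values in zip(*patterns)]


def classify_by_similarity(novel, templates):
    """Return the known category whose template lies closest to the novel pattern."""
    return min(templates, key=lambda category: dist(novel, templates[category]))


# Hypothetical spike counts at three recording sites for sounds from two known categories.
templates = {
    "category_A": mean_pattern([[9, 2, 1], [8, 3, 1]]),
    "category_B": mean_pattern([[1, 2, 9], [2, 3, 8]]),
}

# A novel sound is assigned to whichever known category its pattern resembles most.
label = classify_by_similarity([7, 2, 2], templates)
```

No features are extracted before matching; the whole activity vector is compared directly, which is one way to read the "featureless pattern matching" described above.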

14.

Speech Sound Processing Deficits and Training-Induced Neural Plasticity in Rats with Dyslexia Gene Knockdown

Tracy M. Centanni Fuyi Chen Anne M. Booker Crystal T. Engineer Andrew M. Sloan Robert L. Rennaker Joseph J. LoTurco Michael P. Kilgard 《PloS one》2014,9(5)

In utero RNAi of the dyslexia-associated gene Kiaa0319 in rats (KIA-) degrades cortical responses to speech sounds and increases trial-by-trial variability in onset latency. We tested the hypothesis that KIA- rats would be impaired at speech sound discrimination. KIA- rats needed twice as much training in quiet conditions to perform at control levels and remained impaired at several speech tasks. Focused training using truncated speech sounds was able to normalize speech discrimination in quiet and background noise conditions. Training also normalized trial-by-trial neural variability and temporal phase locking. Cortical activity from speech-trained KIA- rats was sufficient to accurately discriminate between similar consonant sounds. These results provide the first direct evidence that assumed reduced expression of the dyslexia-associated gene KIAA0319 can cause phoneme processing impairments similar to those seen in dyslexia and that intensive behavioral therapy can eliminate these impairments.

15.

Bird speech perception and vocal production: a comparison with humans

Beckers GJ 《Human biology; an international record of research》2011,83(2):191-212

Research into speech perception by nonhuman animals can be crucially informative in assessing whether specific perceptual phenomena in humans have evolved to decode speech, or reflect more general traits. Birds share with humans not only the capacity to use complex vocalizations for communication but also many characteristics of its underlying developmental and mechanistic processes; thus, birds are a particularly interesting group for comparative study. This review first discusses commonalities between birds and humans in perception of speech sounds. Several psychoacoustic studies have shown striking parallels in seemingly speech-specific perceptual phenomena, such as categorical perception of voice-onset-time variation, categorization of consonants that lack phonetic invariance, and compensation for coarticulation. Such findings are often regarded as evidence for the idea that the objects of human speech perception are auditory or acoustic events rather than articulations. Next, I highlight recent research on the production side of avian communication that has revealed the existence of vocal tract filtering and articulation in bird species-specific vocalization, which has traditionally been considered a hallmark of human speech production. Together, findings in birds show that many of the characteristics of human speech perception are not uniquely human, but also that a comparative approach to the question of what the objects of perception are (articulatory or auditory events) requires careful consideration of species-specific vocal production mechanisms.

16.

The mismatch negativity as an index of the perception of speech sounds by the human brain

Näätänen R 《Rossiĭskii fiziologicheskiĭ zhurnal imeni I.M. Sechenova / Rossiĭskaia akademiia nauk》2000,86(11):1481-1501

The present article outlines the contribution of the mismatch negativity (MMN), and its magnetic equivalent MMNm, to our understanding of the perception of speech sounds in the human brain. MMN data indicate that each sound, both speech and non-speech, develops its neural representation corresponding to the percept of this sound in the neurophysiological substrate of auditory sensory memory. The accuracy of this representation, determining the accuracy of the discrimination between different sounds, can be probed with MMN separately for any auditory feature or stimulus type such as phonemes. Furthermore, MMN data show that the perception of phonemes, and probably also of larger linguistic units (syllables and words), is based on language-specific phonetic traces developed in the posterior part of the left-hemisphere auditory cortex. These traces serve as recognition models for the corresponding speech sounds in listening to speech.

17.

Sustained neural rhythms reveal endogenous oscillations supporting speech perception

Sander van Bree Ediz Sohoglu Matthew H. Davis Benedikt Zoefel 《PLoS biology》2021,19(2)

Rhythmic sensory or electrical stimulation will produce rhythmic brain responses. These rhythmic responses are often interpreted as endogenous neural oscillations aligned (or “entrained”) to the stimulus rhythm. However, stimulus-aligned brain responses can also be explained as a sequence of evoked responses, which only appear regular due to the rhythmicity of the stimulus, without necessarily involving underlying neural oscillations. To distinguish evoked responses from true oscillatory activity, we tested whether rhythmic stimulation produces oscillatory responses which continue after the end of the stimulus. Such sustained effects provide evidence for true involvement of neural oscillations. In Experiment 1, we found that rhythmic intelligible, but not unintelligible, speech produces oscillatory responses in magnetoencephalography (MEG) which outlast the stimulus at parietal sensors. In Experiment 2, we found that transcranial alternating current stimulation (tACS) leads to rhythmic fluctuations in speech perception outcomes after the end of electrical stimulation. We further report that the phase relation between electroencephalography (EEG) responses and rhythmic intelligible speech can predict the tACS phase that leads to the most accurate speech perception. Together, we provide fundamental results for several lines of research, including neural entrainment and tACS, and reveal endogenous neural oscillations as a key underlying principle for speech perception.

Just as a child on a swing continues to move after the pushing stops, this study reveals similar entrained rhythmic echoes in brain activity after hearing speech and electrical brain stimulation; perturbation with tACS shows that these brain oscillations help listeners to understand speech.

18.

Effects of culture on musical pitch perception

Wong PC Ciocca V Chan AH Ha LY Tan LH Peretz I 《PloS one》2012,7(4):e33424

The strong association between music and speech has been supported by recent research focusing on musicians' superior abilities in second language learning and neural encoding of foreign speech sounds. However, evidence for a double association (the influence of linguistic background on music pitch processing and disorders) remains elusive. Because languages differ in their usage of elements (e.g., pitch) that are also essential for music, a unique opportunity for examining such language-to-music associations comes from a cross-cultural (linguistic) comparison of congenital amusia, a neurogenetic disorder affecting the music (pitch and rhythm) processing of about 5% of the Western population. In the present study, two populations (Hong Kong and Canada) were compared. One spoke a tone language in which differences in voice pitch correspond to differences in word meaning (in Hong Kong Cantonese, /si/ means 'teacher' and 'to try' when spoken in a high and mid pitch pattern, respectively). Using the On-line Identification Test of Congenital Amusia, we found Cantonese speakers as a group tend to show enhanced pitch perception ability compared to speakers of Canadian French and English (non-tone languages). This enhanced ability occurs in the absence of differences in rhythmic perception and persists even after relevant factors such as musical background and age were controlled. Following a common definition of amusia (5% of the population), we found Hong Kong pitch amusics also show enhanced pitch abilities relative to their Canadian counterparts. These findings not only provide critical evidence for a double association of music and speech, but also argue for the reconceptualization of communicative disorders within a cultural framework. Along with recent studies documenting cultural differences in visual perception, our auditory evidence challenges the common assumption of universality of basic mental processes and speaks to the domain generality of culture-to-perception influences.

19.

Reconstructing speech from human auditory cortex

Pasley BN David SV Mesgarani N Flinker A Shamma SA Crone NE Knight RT Chang EF 《PLoS biology》2012,10(1):e1001251

How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex.

20.

High Gamma Oscillations in Medial Temporal Lobe during Overt Production of Speech and Gestures

Lars Marstaller Hana Burianová Paul F. Sowman 《PloS one》2014,9(10)

The study of the production of co-speech gestures (CSGs), i.e., meaningful hand movements that often accompany speech during everyday discourse, provides an important opportunity to investigate the integration of language, action, and memory because of the semantic overlap between gesture movements and speech content. Behavioral studies of CSGs and speech suggest that they have a common base in memory and predict that overt production of both speech and CSGs would be preceded by neural activity related to memory processes. However, to date the neural correlates and timing of CSG production are still largely unknown. In the current study, we addressed these questions with magnetoencephalography and a semantic association paradigm in which participants overtly produced speech or gesture responses that were either meaningfully related to a stimulus or not. Using spectral and beamforming analyses to investigate the neural activity preceding the responses, we found a desynchronization in the beta band (15–25 Hz), which originated 900 ms prior to the onset of speech and was localized to motor and somatosensory regions in the cortex and cerebellum, as well as right inferior frontal gyrus. Beta desynchronization is often seen as an indicator of motor processing and thus reflects motor activity related to the hand movements that gestures add to speech. Furthermore, our results show oscillations in the high gamma band (50–90 Hz), which originated 400 ms prior to speech onset and were localized to the left medial temporal lobe. High gamma oscillations have previously been found to be involved in memory processes and we thus interpret them to be related to contextual association of semantic information in memory. The results of our study show that high gamma oscillations in medial temporal cortex play an important role in the binding of information in human memory during speech and CSG production.