Similar Literature
20 similar records found.
1.
In mammalian auditory cortex, sound source position is represented by a population of broadly tuned neurons whose firing is modulated by sounds located at all positions surrounding the animal. The peaks of their tuning curves are concentrated at lateral positions, while their slopes are steepest at the interaural midline, allowing for maximum localization accuracy in that area. These experimental observations contradict initial assumptions that auditory space is represented as a topographic cortical map. It has been suggested that a "panoramic" code evolved to match specific demands of the sound localization task. This work provides evidence that the properties of spatial auditory neurons identified experimentally follow from a general design principle: learning a sparse, efficient representation of natural stimuli. Natural binaural sounds were recorded and served as input to a hierarchical sparse-coding model. In the first layer, left- and right-ear sounds were separately encoded by a population of complex-valued basis functions which separated phase and amplitude; both parameters are known to carry information relevant for spatial hearing. Monaural input converged in the second layer, which learned a joint representation of amplitude and interaural phase difference. The spatial selectivity of each second-layer unit was measured by exposing the model to natural sound sources recorded at different positions. The obtained tuning curves match the tuning characteristics of neurons in the mammalian auditory cortex well. This study connects neuronal coding of auditory space with natural stimulus statistics and generates new experimental predictions. Moreover, the results presented here suggest that cortical regions with seemingly different functions may implement the same computational strategy: efficient coding.
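As a rough illustration of the first-layer computation this abstract describes, the sketch below infers sparse complex-valued coefficients for a toy signal; the coefficient magnitude and angle play the roles of amplitude and phase. Everything here (array sizes, the random basis, the ISTA solver, the penalty weight `lam`) is an illustrative assumption, not the authors' implementation.

```python
# A minimal sketch of complex-valued sparse coding (not the paper's code):
# a snippet x is approximated as Re(A @ z), where A holds complex basis
# functions and the complex coefficients z split into amplitude |z| and
# phase angle(z). Inference uses ISTA with a magnitude soft-threshold.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_basis, lam, n_iter = 128, 64, 0.1, 200

A = rng.standard_normal((n_samples, n_basis)) + 1j * rng.standard_normal((n_samples, n_basis))
A /= np.linalg.norm(A, axis=0)          # unit-norm complex basis functions
x = rng.standard_normal(n_samples)      # stand-in for a windowed sound snippet

L = np.linalg.norm(A, 2) ** 2           # Lipschitz upper bound for the gradient step

z = np.zeros(n_basis, dtype=complex)
for _ in range(n_iter):                 # ISTA: gradient step + magnitude shrinkage
    residual = x - (A @ z).real
    z = z + (A.conj().T @ residual) / L
    mag = np.abs(z)
    z = z / np.maximum(mag, 1e-12) * np.maximum(mag - lam / L, 0)

amplitude, phase = np.abs(z), np.angle(z)   # the two parameters the model separates
print(f"active units: {(amplitude > 1e-6).sum()} / {n_basis}")
```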

2.
How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single-trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher-order human auditory cortex.
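The linear reconstruction approach this study describes can be illustrated with a ridge decoder that maps time-lagged population activity onto spectrogram channels. Everything below (array sizes, the synthetic data, the regularizer `alpha`) is a toy stand-in, not the published pipeline.

```python
# A minimal sketch of linear stimulus reconstruction: ridge regression from
# lagged neural responses R to each channel of a "spectrogram" S.
import numpy as np

rng = np.random.default_rng(1)
T, n_elec, n_freq, n_lags, alpha = 2000, 32, 16, 10, 1.0

R = rng.standard_normal((T, n_elec))                       # neural activity (time x electrodes)
W_true = rng.standard_normal((n_elec, n_freq))
S = R @ W_true + 0.5 * rng.standard_normal((T, n_freq))    # toy target spectrogram

# lagged design matrix: row t stacks responses at t, t-1, ..., t-n_lags+1
X = np.hstack([np.roll(R, lag, axis=0) for lag in range(n_lags)])
X[:n_lags] = 0                                             # drop wrapped-around samples

# ridge solution W = (X'X + alpha I)^-1 X'S, one column per frequency channel
W = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ S)
S_hat = X @ W

r = [np.corrcoef(S[:, f], S_hat[:, f])[0, 1] for f in range(n_freq)]
print(f"mean reconstruction accuracy (r): {np.mean(r):.2f}")
```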

3.
We have developed a sparse mathematical representation of speech that minimizes the number of active model neurons needed to represent typical speech sounds. The model learns several well-known acoustic features of speech such as harmonic stacks, formants, onsets and terminations, but we also find more exotic structures in the spectrogram representation of sound such as localized checkerboard patterns and frequency-modulated excitatory subregions flanked by suppressive sidebands. Moreover, several of these novel features resemble neuronal receptive fields reported in the Inferior Colliculus (IC), as well as auditory thalamus and cortex, and our model neurons exhibit the same tradeoff in spectrotemporal resolution as has been observed in IC. To our knowledge, this is the first demonstration that receptive fields of neurons in the ascending mammalian auditory pathway beyond the auditory nerve can be predicted based on coding principles and the statistical properties of recorded sounds.

4.
Sounds in our environment like voices, animal calls or musical instruments are easily recognized by human listeners. Understanding the key features underlying this robust sound recognition is an important question in auditory science. Here, we studied the recognition by human listeners of new classes of sounds: acoustic and auditory sketches, sounds that are severely impoverished but still recognizable. Starting from a time-frequency representation, a sketch is obtained by keeping only sparse elements of the original signal, here, by means of a simple peak-picking algorithm. Two time-frequency representations were compared: a biologically grounded one, the auditory spectrogram, which simulates peripheral auditory filtering, and a simple acoustic spectrogram, based on a Fourier transform. Three degrees of sparsity were also investigated. Listeners were asked to recognize the category to which a sketch sound belongs: singing voices, bird calls, musical instruments, and vehicle engine noises. Results showed that, with the exception of voice sounds, very sparse representations of sounds (10 features, or energy peaks, per second) could be recognized above chance. No clear differences could be observed between the acoustic and the auditory sketches. For the voice sounds, however, a completely different pattern of results emerged, with at-chance or even below-chance recognition performances, suggesting that the important features of the voice, whatever they are, were removed by the sketch process. Overall, these perceptual results were well correlated with a model of auditory distances, based on spectro-temporal excitation patterns (STEPs). This study confirms the potential of these new classes of sounds, acoustic and auditory sketches, to study sound recognition.
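A minimal version of such a peak-picking sketch, assuming SciPy and arbitrary window and neighborhood parameters (the paper's own algorithm and settings may differ): local spectrogram maxima are kept up to a target density of features per second and everything else is zeroed out.

```python
# Sketch of a peak-picking "auditory sketch" on an acoustic spectrogram.
import numpy as np
from scipy.signal import spectrogram
from scipy.ndimage import maximum_filter

def sketch(sound, fs, feats_per_sec=10):
    f, t, S = spectrogram(sound, fs=fs, nperseg=256, noverlap=128)
    # a bin is a candidate peak if it is the maximum of its local neighborhood
    peaks = (S == maximum_filter(S, size=5)) & (S > 0)
    n_keep = int(feats_per_sec * (len(sound) / fs))
    vals = np.where(peaks, S, 0).ravel()
    thresh = np.sort(vals)[-n_keep] if n_keep < vals.size else 0
    return f, t, np.where(peaks & (S >= thresh), S, 0.0)

fs = 16000
tone = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)   # 1 s toy signal
f, t, sk = sketch(tone, fs)
print(f"kept {np.count_nonzero(sk)} energy peaks")
```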

5.
Given the extraordinary ability of humans and animals to recognize communication signals over a background of noise, describing noise-invariant neural responses is critical not only to pinpoint the brain regions that are mediating our robust perceptions but also to understand the neural computations that are performing these tasks and the underlying circuitry. Although invariant neural responses, such as rotation-invariant face cells, are well described in the visual system, high-level auditory neurons that can represent the same behaviorally relevant signal in a range of listening conditions have yet to be discovered. Here we found neurons in a secondary area of the avian auditory cortex that exhibit noise-invariant responses in the sense that they responded with similar spike patterns to song stimuli presented in silence and over a background of naturalistic noise. By characterizing the neurons' tuning in terms of their responses to modulations in the temporal and spectral envelope of the sound, we then show that noise invariance is partly achieved by selectively responding to long sounds with sharp spectral structure. Finally, to demonstrate that such computations could explain noise invariance, we designed a biologically inspired noise-filtering algorithm that can be used to separate song or speech from noise. This novel noise-filtering method performs as well as other state-of-the-art de-noising algorithms and could be used in clinical or consumer-oriented applications. Our biologically inspired model also shows how high-level noise-invariant responses could be created from neural responses typically found in primary auditory cortex.
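The described selectivity for long sounds with sharp spectral structure corresponds, in modulation terms, to keeping slow temporal modulations and fine spectral modulations. The sketch below shows one generic way to impose such a pass-band by masking the 2D Fourier transform of a log-spectrogram; it is not the authors' algorithm, and both cutoffs are arbitrary toy values.

```python
# Generic modulation-domain filtering sketch on a placeholder log-spectrogram.
import numpy as np

rng = np.random.default_rng(2)
n_freq, n_time = 64, 256
log_spec = rng.standard_normal((n_freq, n_time))   # toy noisy log-spectrogram

M = np.fft.fft2(log_spec)
wt = np.fft.fftfreq(n_time)   # temporal modulation axis (cycles/frame)
wf = np.fft.fftfreq(n_freq)   # spectral modulation axis (cycles/channel)

# pass band: slow temporal modulations (long sounds) AND non-flat spectral
# structure (harmonic stacks), attenuating broadband, fast-varying noise
mask = ((np.abs(wt)[None, :] < 0.05) & (np.abs(wf)[:, None] > 0.02)).astype(float)
filtered = np.fft.ifft2(M * mask).real
print(filtered.shape)
```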

6.
Speech is the most interesting and one of the most complex sounds dealt with by the auditory system. The neural representation of speech needs to capture those features of the signal on which the brain depends in language communication. Here we describe the representation of speech in the auditory nerve and in a few sites in the central nervous system from the perspective of the neural coding of important aspects of the signal. The representation is tonotopic, meaning that the speech signal is decomposed by frequency and different frequency components are represented in different populations of neurons. Essential to the representation are the properties of frequency tuning and nonlinear suppression. Tuning creates the decomposition of the signal by frequency, and nonlinear suppression is essential for maintaining the representation across sound levels. The representation changes in central auditory neurons by becoming more robust against changes in stimulus intensity and more transient. However, it is probable that the form of the representation at the auditory cortex is fundamentally different from that at lower levels, in that stimulus features other than the distribution of energy across frequency are analysed.

7.
Songbirds are one of the few groups of animals that learn the sounds used for vocal communication during development. Like humans, songbirds memorize vocal sounds based on auditory experience with vocalizations of adult "tutors", and then use auditory feedback of self-produced vocalizations to gradually match their motor output to the memory of tutor sounds. In humans, investigations of early vocal learning have focused mainly on perceptual skills of infants, whereas studies of songbirds have focused on measures of vocal production. In order to fully exploit songbirds as a model for human speech, understand the neural basis of learned vocal behavior, and investigate links between vocal perception and production, studies of songbirds must examine both behavioral measures of perception and neural measures of discrimination during development. Here we used behavioral and electrophysiological assays of the ability of songbirds to distinguish vocal calls of varying frequencies at different stages of vocal learning. The results show that neural tuning in auditory cortex mirrors behavioral improvements in the ability to make perceptual distinctions of vocal calls as birds are engaged in vocal learning. Thus, separate measures of neural discrimination and behavioral perception yielded highly similar trends during the course of vocal development. The timing of this improvement in the ability to distinguish vocal sounds correlates with our previous work showing substantial refinement of axonal connectivity in cortico-basal ganglia pathways necessary for vocal learning.

8.
The present article outlines the contribution of the mismatch negativity (MMN), and its magnetic equivalent MMNm, to our understanding of the perception of speech sounds in the human brain. MMN data indicate that each sound, both speech and non-speech, develops its neural representation corresponding to the percept of this sound in the neurophysiological substrate of auditory sensory memory. The accuracy of this representation, determining the accuracy of the discrimination between different sounds, can be probed with MMN separately for any auditory feature or stimulus type such as phonemes. Furthermore, MMN data show that the perception of phonemes, and probably also of larger linguistic units (syllables and words), is based on language-specific phonetic traces developed in the posterior part of the left-hemisphere auditory cortex. These traces serve as recognition models for the corresponding speech sounds in listening to speech.

9.
Spectro-temporal properties of auditory cortex neurons have been extensively studied with artificial sounds, but it is still unclear whether they help in understanding neuronal responses to communication sounds. Here, we directly compared spectro-temporal receptive fields (STRFs) obtained from the same neurons using both artificial stimuli (dynamic moving ripples, DMRs) and natural stimuli (conspecific vocalizations) that were matched in terms of spectral content, average power and modulation spectrum. In a population of auditory cortex neurons exhibiting reliable tuning curves when tested with pure tones, significant STRFs were obtained for 62% of the cells with vocalizations and 68% with DMRs. However, for many cells with significant vocalization-derived STRFs (STRFvoc) and DMR-derived STRFs (STRFdmr), the best frequency, latency, bandwidth and global STRF shape differed more than would be predicted from spiking responses simulated by a linear model based on a non-homogeneous Poisson process. Moreover, STRFvoc predicted neural responses to vocalizations more accurately than STRFdmr predicted neural responses to DMRs, despite similar spike-timing reliability for both sets of stimuli. Cortical bursts, which potentially introduce nonlinearities in evoked responses, did not explain the differences between STRFvoc and STRFdmr. Altogether, these results suggest that the nonlinearity of auditory cortical responses makes it difficult to predict responses to communication sounds from STRFs computed from artificial stimuli.
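As a hedged illustration of STRF estimation of the kind discussed here, the sketch below recovers a known filter from spikes generated by a non-homogeneous Poisson process, using ridge-regularized reverse correlation on synthetic data; the study's stimuli, nonlinearity and fitting details differ.

```python
# STRF estimation sketch: ridge regression from a lagged toy spectrogram
# to Poisson spike counts driven by a ground-truth filter.
import numpy as np

rng = np.random.default_rng(3)
T, n_freq, n_lags, alpha = 5000, 16, 12, 10.0

stim = rng.standard_normal((T, n_freq))                   # toy dynamic stimulus
strf_true = 0.1 * rng.standard_normal((n_lags, n_freq))   # ground-truth filter

# lagged design matrix: row t holds the stimulus over the preceding n_lags frames
X = np.hstack([np.roll(stim, lag, axis=0) for lag in range(n_lags)])
X[:n_lags] = 0

# spikes from a non-homogeneous Poisson process with a softplus intensity
rate = np.log1p(np.exp(X @ strf_true.ravel()))
spikes = rng.poisson(rate)

# ridge-regularized reverse correlation recovers the filter (up to the nonlinearity)
w = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ spikes)
strf_hat = w.reshape(n_lags, n_freq)
print(f"filter recovery r = {np.corrcoef(strf_true.ravel(), strf_hat.ravel())[0, 1]:.2f}")
```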

10.
Auditory cortex: comparative aspects of maps and plasticity.
Much recent work in the field of auditory cortex analysis consists of an intensified search for complex sound representation and sound localization mechanisms using tonotopic maps as a frame of reference. Mammalian species rely on parallel processing in multiple tonotopic and non-tonotopic maps but show different degrees of unit complexity and orderly representation of acoustic dimensions in such maps, depending on the predictability of sounds in their environment. Birds appear to rely chiefly on one tonotopic map which harbours multidimensional complex representations. During development and after partial hearing loss, tonotopic organization changes in a predictable manner. Learning also modifies the spatial representation of sounds and even modifies tonotopic organization, but the spatial rules involved in this process have not yet emerged.

11.
Functional neuroimaging research provides detailed observations of the response patterns that natural sounds (e.g. human voices and speech, animal cries, environmental sounds) evoke in the human brain. The computational and representational mechanisms underlying these observations, however, remain largely unknown. Here we combine high spatial resolution (3 and 7 Tesla) functional magnetic resonance imaging (fMRI) with computational modeling to reveal how natural sounds are represented in the human brain. We compare competing models of sound representations and select the model that most accurately predicts fMRI response patterns to natural sounds. Our results show that the cortical encoding of natural sounds entails the formation of multiple representations of sound spectrograms with different degrees of spectral and temporal resolution. The cortex derives these multi-resolution representations through frequency-specific neural processing channels and through the combined analysis of the spectral and temporal modulations in the spectrogram. Furthermore, our findings suggest that a spectral-temporal resolution trade-off may govern the modulation tuning of neuronal populations throughout the auditory cortex. Specifically, our fMRI results suggest that neuronal populations in posterior/dorsal auditory regions preferably encode coarse spectral information with high temporal precision. Vice versa, neuronal populations in anterior/ventral auditory regions preferably encode fine-grained spectral information with low temporal precision. We propose that such a multi-resolution analysis may be crucially relevant for flexible and behaviorally relevant sound processing and may constitute one of the computational underpinnings of functional specialization in auditory cortex.

12.
McDermott JH, Simoncelli EP. Neuron. 2011;71(5):926-940.
Rainstorms, insect swarms, and galloping horses produce "sound textures": the collective result of many similar acoustic events. Sound textures are distinguished by temporal homogeneity, suggesting they could be recognized with time-averaged statistics. To test this hypothesis, we processed real-world textures with an auditory model containing filters tuned for sound frequencies and their modulations, and measured statistics of the resulting decomposition. We then assessed the realism and recognizability of novel sounds synthesized to have matching statistics. Statistics of individual frequency channels, capturing spectral power and sparsity, generally failed to produce compelling synthetic textures; however, combining them with correlations between channels produced identifiable and natural-sounding textures. Synthesis quality declined if statistics were computed from biologically implausible auditory models. The results suggest that sound texture perception is mediated by relatively simple statistics of early auditory representations, presumably computed by downstream neural populations. The synthesis methodology offers a powerful tool for their further investigation.
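The statistics-based account can be illustrated with a toy summary-statistic set: band envelopes from a simple filterbank, per-band moments (variance and skew relate to spectral power and sparsity), and between-band correlations. The published model uses cochlear and modulation filterbanks plus an iterative synthesis procedure, all of which this sketch omits.

```python
# Toy texture statistics: bandpass envelopes, their moments, and correlations.
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def texture_stats(sound, fs, n_bands=8):
    edges = np.geomspace(100, fs / 2 * 0.9, n_bands + 1)
    envs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(2, [lo, hi], btype="band", fs=fs, output="sos")
        envs.append(np.abs(hilbert(sosfilt(sos, sound))))      # band envelope
    E = np.array(envs)
    mean, var = E.mean(axis=1), E.var(axis=1)
    skew = ((E - mean[:, None]) ** 3).mean(axis=1) / var ** 1.5   # sparsity-related
    C = np.corrcoef(E)                                            # cross-band correlations
    return mean, var, skew, C[np.triu_indices(n_bands, k=1)]

fs = 16000
noise = np.random.default_rng(4).standard_normal(fs)   # 1 s toy "texture"
stats = texture_stats(noise, fs)
print([s.shape for s in stats])
```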

13.
Why is spatial tuning in auditory cortex weak, even though location is important to object recognition in natural settings? This question continues to vex neuroscientists focused on linking physiological results to auditory perception. Here we show that the spatial locations of simultaneous, competing sound sources dramatically influence how well neural spike trains recorded from the zebra finch field L (an analog of mammalian primary auditory cortex) encode source identity. We find that the location of a birdsong played in quiet has little effect on the fidelity of the neural encoding of the song. However, when the song is presented along with a masker, spatial effects are pronounced. For each spatial configuration, a subset of neurons encodes song identity more robustly than others. As a result, competing sources from different locations dominate responses of different neural subpopulations, helping to separate neural responses into independent representations. These results help elucidate how cortical processing exploits spatial information to provide a substrate for selective spatial auditory attention.
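One generic way to quantify how well spike trains encode source identity, in the spirit of this analysis (the paper's exact metric may differ): smooth spike trains with an exponential kernel and classify single trials against leave-one-out song templates.

```python
# Template-matching decoder for spike-train identity coding (toy data).
import numpy as np

rng = np.random.default_rng(6)
n_songs, n_trials, T, tau = 4, 10, 500, 10.0

# toy spike trains: each song drives its own random rate profile
rates = rng.uniform(0, 0.1, size=(n_songs, T))
spikes = (rng.random((n_songs, n_trials, T)) < rates[:, None, :]).astype(float)

kernel = np.exp(-np.arange(int(5 * tau)) / tau)   # exponential smoothing kernel
smooth = np.apply_along_axis(lambda s: np.convolve(s, kernel)[:T], -1, spikes)

correct = 0
for song in range(n_songs):
    for trial in range(n_trials):
        # leave-one-out templates: mean smoothed response per song
        templates = smooth.mean(axis=1)
        templates[song] = (smooth[song].sum(axis=0) - smooth[song, trial]) / (n_trials - 1)
        d = ((templates - smooth[song, trial]) ** 2).sum(axis=1)
        correct += d.argmin() == song
print(f"decoding accuracy: {correct / (n_songs * n_trials):.2f}")
```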

14.
Female choice plays a critical role in the evolution of male acoustic displays. Yet there is limited information on the neurophysiological basis of female songbirds' auditory recognition systems. To understand the neural mechanisms by which non-singing female songbirds perceive behaviorally relevant vocalizations, we recorded responses of single neurons to acoustic stimuli in two auditory forebrain regions, the caudal lateral mesopallium (CLM) and Field L, in anesthetized adult female zebra finches (Taeniopygia guttata). Using various metrics of response selectivity, we found consistently higher response strengths for unfamiliar conspecific songs compared to tone pips and white noise in Field L but not in CLM. We also found that neurons in the left auditory forebrain had lower response strengths to synthetic sounds, leading to overall higher neural selectivity for song in neurons of the left hemisphere. This laterality effect is consistent with previously published behavioral data in zebra finches. Overall, our results from Field L parallel, and those from CLM contrast with, the patterns of response selectivity reported for conspecific songs over synthetic sounds in male zebra finches, suggesting some degree of sexual dimorphism in the auditory perception mechanisms of songbirds.

15.
The process through which young male songbirds learn the characteristics of the songs of an adult male of their own species has strong similarities with speech acquisition in human infants. Both involve two phases: a period of auditory memorization followed by a period during which the individual develops its own vocalizations. The avian 'song system', a network of brain nuclei, is the probable neural substrate for the second phase of sensorimotor learning. By contrast, the neural representation of song memory acquired in the first phase is localized outside the song system, in different regions of the avian equivalent of the human auditory association cortex.

16.
Sparse representation of sounds in the unanesthetized auditory cortex
How do neuronal populations in the auditory cortex represent acoustic stimuli? Although sound-evoked neural responses in the anesthetized auditory cortex are mainly transient, recent experiments in the unanesthetized preparation have emphasized subpopulations with other response properties. To quantify the relative contributions of these different subpopulations in the awake preparation, we have estimated the representation of sounds across the neuronal population using a representative ensemble of stimuli. We used cell-attached recording with a glass electrode, a method for which single-unit isolation does not depend on neuronal activity, to quantify the fraction of neurons engaged by acoustic stimuli (tones, frequency modulated sweeps, white-noise bursts, and natural stimuli) in the primary auditory cortex of awake head-fixed rats. We find that the population response is sparse, with stimuli typically eliciting high firing rates (>20 spikes/second) in less than 5% of neurons at any instant. Some neurons had very low spontaneous firing rates (<0.01 spikes/second). At the other extreme, some neurons had driven rates in excess of 50 spikes/second. Interestingly, the overall population response was well described by a lognormal distribution, rather than the exponential distribution that is often reported. Our results represent, to our knowledge, the first quantitative evidence for sparse representations of sounds in the unanesthetized auditory cortex. Our results are compatible with a model in which most neurons are silent much of the time, and in which representations are composed of small dynamic subsets of highly active neurons.
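The distributional claim can be illustrated by fitting exponential and lognormal densities to a set of firing rates and comparing likelihoods; the rates below are synthetic placeholders for the cell-attached data.

```python
# Compare exponential vs. lognormal fits to (toy) population firing rates.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
rates = rng.lognormal(mean=-1.0, sigma=1.5, size=300)   # toy rates (spikes/s)

# maximum-likelihood fits, then total log-likelihood of each candidate
ll_expon = stats.expon.logpdf(rates, *stats.expon.fit(rates, floc=0)).sum()
shape, loc, scale = stats.lognorm.fit(rates, floc=0)
ll_lognorm = stats.lognorm.logpdf(rates, shape, loc, scale).sum()

frac_active = (rates > 20).mean()   # fraction of highly active (>20 spikes/s) neurons
print(f"logL exponential={ll_expon:.1f}, lognormal={ll_lognorm:.1f}, "
      f"fraction >20 spk/s: {frac_active:.3f}")
```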

17.

Background

Singing in songbirds is a complex, learned behavior which shares many parallels with human speech. The avian vocal organ (syrinx) has two potential sound sources, and each sound generator is under unilateral, ipsilateral neural control. Different songbird species vary in their use of bilateral or unilateral phonation (lateralized sound production) and rapid switching between left and right sound generation (interhemispheric switching of motor control). Bengalese finches (Lonchura striata domestica) have received considerable attention, because they rapidly modify their song in response to manipulations of auditory feedback. However, how the left and right sides of the syrinx contribute to acoustic control of song has not been studied.

Methodology

Three manipulations of lateralized syringeal control of sound production were conducted. First, unilateral syringeal muscular control was eliminated by resection of the left or right tracheosyringeal portion of the hypoglossal nerve, which provides neuromuscular innervation of the syrinx. Spectral and temporal features of song were compared before and after lateralized nerve injury. In a second experiment, either the left or right sound source was devoiced to confirm the role of each sound generator in the control of acoustic phonology. Third, air pressure was recorded before and after unilateral denervation to enable quantification of acoustic change within individual syllables following lateralized nerve resection.

Significance

These experiments demonstrate that the left sound source produces louder, higher frequency, lower entropy sounds, and the right sound generator produces lower amplitude, lower frequency, higher entropy sounds. The bilateral division of labor is complex, and the frequency specialization is the opposite of the pattern observed in most songbirds. Further, there is evidence for rapid interhemispheric switching during song production. Lateralized control of song production in Bengalese finches may enhance the acoustic complexity of song and facilitate the rapid modification of sound production following manipulations of auditory feedback.

18.
The zebra finch learns his song by memorizing a tutor's vocalization and then using auditory feedback to match his current vocalization to this memory, or template. The neural song system of adult and young birds responds to auditory stimuli, and exhibits selective tuning to the bird's own song (BOS). We have directly examined the development of neural tuning in the song motor system. We measured song system responses to vocalizations produced at various ages during sleep. We now report that the auditory response of the song motor system and motor output are linked early in song development. During sleep, playback of the current BOS induced a response in the song nucleus HVC during the song practice period, even when the song consisted of little more than repeated begging calls. Halfway through the sensorimotor period when the song was not yet in its final form, the response to BOS already exceeded that to all other auditory stimuli tested. Moreover, responses to previous, plastic versions of BOS decayed over time. This indicates that selective tuning to BOS mirrors the vocalization that the bird is currently producing.

19.
Vocal learning in songbirds and humans occurs by imitation of adult vocalizations. In both groups, vocal learning includes a perceptual phase during which juvenile birds and infants memorize adult vocalizations. Despite intensive research, the neural mechanisms supporting this auditory memory are still poorly understood. The present functional MRI study demonstrates that in adult zebra finches, the right auditory midbrain nucleus responds selectively to the copied vocalizations. The selective signal is distinct from selectivity for the bird's own song and does not simply reflect acoustic differences between the stimuli. Furthermore, the amplitude of the selective signal is positively correlated with the strength of vocal learning, measured by the amount of song that experimental birds copied from the adult model. These results indicate that early sensory experience can generate a long-lasting memory trace in the auditory midbrain of songbirds that may support song learning.

20.