Similar Documents
 20 similar documents found (search time: 31 ms)
1.
Sayles M, Winter IM. Neuron, 2008, 58(5): 789-801
Accurate neural coding of the pitch of complex sounds is an essential part of auditory scene analysis; differences in pitch help segregate concurrent sounds, while similarities in pitch can help group sounds from a common source. In quiet, nonreverberant backgrounds, pitch can be derived from timing information in broadband high-frequency auditory channels and/or from frequency and timing information carried in narrowband low-frequency auditory channels. Recording from single neurons in the cochlear nucleus of anesthetized guinea pigs, we show that the neural representation of pitch based on timing information is severely degraded in the presence of reverberation. This degradation increases with both increasing reverberation strength and channel bandwidth. In a parallel human psychophysical pitch-discrimination task, reverberation impaired the ability to distinguish a high-pass harmonic sound from noise. Together, these findings explain the origin of perceptual difficulties experienced by both normal-hearing and hearing-impaired listeners in reverberant spaces.
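The timing-based pitch representation described here can be illustrated with a toy autocorrelation analysis (a standard textbook construction, not the authors' unit-recording paradigm; sampling rate, harmonics, and the reverberation impulse response are all assumptions):

```python
import numpy as np

def autocorr_pitch(x, fs, fmin=80.0, fmax=500.0):
    """Estimate pitch from the largest autocorrelation peak in a lag range."""
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return fs / lag

fs = 16000
t = np.arange(int(0.25 * fs)) / fs
f0 = 200.0
# Harmonic complex: harmonics 1-10 of a 200 Hz fundamental
dry = sum(np.sin(2 * np.pi * k * f0 * t) for k in range(1, 11))

# Crude reverberation: convolve with an exponentially decaying noise tail
rng = np.random.default_rng(0)
n_ir = int(0.3 * fs)
ir = rng.standard_normal(n_ir) * np.exp(-np.arange(n_ir) / (0.05 * fs))
ir[0] = 1.0   # direct sound
wet = np.convolve(dry, ir)[: len(dry)]

print(autocorr_pitch(dry, fs))   # 200 Hz from timing information alone
print(autocorr_pitch(wet, fs))
```

Convolving with the decaying-noise impulse response smears the temporal fine structure, which is the kind of degradation of timing-based pitch cues the recordings document.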

2.
By formulating Helmholtz's ideas about perception in terms of modern-day theories, one arrives at a model of perceptual inference and learning that can explain a remarkable range of neurobiological facts: using constructs from statistical physics, the problems of inferring the causes of sensory input and learning the causal structure of their generation can be resolved using exactly the same principles. Furthermore, inference and learning can proceed in a biologically plausible fashion. The ensuing scheme rests on Empirical Bayes and hierarchical models of how sensory input is caused. The use of hierarchical models enables the brain to construct prior expectations in a dynamic and context-sensitive fashion. This scheme provides a principled way to understand many aspects of cortical organisation and responses. In this paper, we show that these perceptual processes are just one aspect of emergent behaviours of systems that conform to a free energy principle. The free energy considered here measures the difference between the probability distribution of environmental quantities that act on the system and an arbitrary distribution encoded by its configuration. The system can minimise free energy by changing its configuration to affect the way it samples the environment or change the distribution it encodes. These changes correspond to action and perception, respectively, and lead to an adaptive exchange with the environment that is characteristic of biological systems. This treatment assumes that the system's state and structure encode an implicit and probabilistic model of the environment. We will look at the models entailed by the brain and how minimisation of its free energy can explain its dynamics and structure.
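A one-dimensional caricature of this scheme (our illustration, with arbitrary Gaussian assumptions, not the paper's formulation) casts perception as gradient descent on free energy, i.e. on precision-weighted prediction error:

```python
# Free-energy sketch: a single latent cause mu generates the sensation
# s = mu + noise under a Gaussian prior on mu. Perception = gradient
# descent on free energy, here the sum of squared, precision-weighted
# prediction errors. All numeric values are illustrative assumptions.
s = 2.0                   # observed sensory input
eta = 0.0                 # prior expectation of the cause
var_s, var_p = 1.0, 1.0   # sensory and prior variances

def free_energy(mu):
    return 0.5 * ((s - mu) ** 2 / var_s + (mu - eta) ** 2 / var_p)

mu = 0.0
for _ in range(1000):
    grad = -(s - mu) / var_s + (mu - eta) / var_p
    mu -= 0.05 * grad     # perceptual inference as gradient descent

# With equal precisions the free-energy minimum is the posterior mean
print(mu)                 # converges to (s + eta) / 2 = 1.0
```

The same quantity could instead be reduced by changing s (sampling the environment differently), which is the "action" half of the exchange the abstract describes.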

3.
Pitch is one of the most important features of natural sounds, underlying the perception of melody in music and prosody in speech. However, the temporal dynamics of pitch processing are still poorly understood. Previous studies suggest that the auditory system uses a wide range of time scales to integrate pitch-related information and that the effective integration time is both task- and stimulus-dependent. None of the existing models of pitch processing can account for such task- and stimulus-dependent variations in processing time scales. This study presents an idealized neurocomputational model, which provides a unified account of the multiple time scales observed in pitch perception. The model is evaluated using a range of perceptual studies, which have not previously been accounted for by a single model, and new results from a neurophysiological experiment. In contrast to other approaches, the current model contains a hierarchy of integration stages and uses feedback to adapt the effective time scales of processing at each stage in response to changes in the input stimulus. The model has features in common with a hierarchical generative process and suggests a key role for efferent connections from central to sub-cortical areas in controlling the temporal dynamics of pitch processing.
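The key idea of feedback-adapted time scales can be sketched with a single leaky integrator (our toy construction, not the paper's hierarchical model): feedback shortens the effective time constant whenever the prediction error is large, so the estimate re-converges quickly after the input changes but remains stable otherwise.

```python
import numpy as np

def adaptive_integrator(x, tau_slow=50.0, tau_fast=2.0, thresh=0.5):
    """Leaky integration of x with a feedback-controlled time constant."""
    est = np.zeros(len(x))
    for n in range(1, len(x)):
        err = x[n] - est[n - 1]
        # Feedback: large error -> short integration time scale
        tau = tau_fast if abs(err) > thresh else tau_slow
        est[n] = est[n - 1] + err / tau
    return est

# A "pitch track" that steps from 100 to 150 (arbitrary units)
x = np.concatenate([np.full(200, 100.0), np.full(200, 150.0)])
est = adaptive_integrator(x)
print(est[199], est[399])   # settled near 100, then near 150
```

A fixed long time constant would track slowly after the step; a fixed short one would be noise-sensitive. Adapting tau gives both stability and fast re-convergence, the trade-off the model attributes to efferent feedback.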

4.
Pitch perception is important for understanding speech prosody, music perception, recognizing tones in tonal languages, and perceiving speech in noisy environments. The two principal pitch perception theories consider the place of maximum neural excitation along the auditory nerve and the temporal pattern of the auditory neurons’ action potentials (spikes) as pitch cues. This paper describes a biophysical mechanism by which fine-structure temporal information can be extracted from the spikes generated at the auditory periphery. Deriving meaningful pitch-related information from spike times requires neural structures specialized in capturing synchronous or correlated activity from amongst neural events. The emergence of such pitch-processing neural mechanisms is described through a computational model of auditory processing. Simulation results show that a correlation-based, unsupervised, spike-based form of Hebbian learning can explain the development of neural structures required for recognizing the pitch of simple and complex tones, with or without the fundamental frequency. The temporal code is robust to variations in the spectral shape of the signal and thus can explain the phenomenon of pitch constancy.
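Correlation-based Hebbian learning of the kind invoked here can be sketched in a few lines (an illustrative toy, not the paper's biophysical model; fiber counts, rates, and learning rate are assumptions): inputs that fire synchronously, phase-locked to a common cycle, come to dominate a downstream neuron.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two groups of input "fibers" drive one threshold neuron. Group A fires
# phase-locked to a common 5 ms cycle (200 Hz); group B fires at random
# with a matched average rate. Hebbian updates reward pre/post coincidence.
T = 2000                      # time steps (1 ms bins)
n_a, n_b = 5, 5
spikes = np.zeros((n_a + n_b, T))
for i in range(n_a):          # phase-locked inputs, p = 0.8 per cycle
    spikes[i, ::5] = rng.random(spikes[i, ::5].shape[0]) < 0.8
for i in range(n_a, n_a + n_b):   # uncorrelated inputs, matched mean rate
    spikes[i] = rng.random(T) < 0.16

w = np.full(n_a + n_b, 0.1)
for t in range(T):
    drive = w @ spikes[:, t]
    post = float(drive > 0.3)             # crude threshold neuron
    w += 0.001 * post * spikes[:, t]      # Hebbian: pre AND post active
    w = np.clip(w, 0.0, 1.0)

print(w[:n_a].mean(), w[n_a:].mean())     # phase-locked synapses win
```

The neuron ends up wired to the synchronous (periodicity-carrying) fibers without any supervision, which is the developmental mechanism the simulations in the paper exploit.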

5.
When studying animal perception, one normally has the chance of localizing perceptual events in time, that is via behavioural responses time-locked to the stimuli. With multistable stimuli, however, perceptual changes occur despite stationary stimulation. Here, the challenge is to infer these not directly observable perceptual states indirectly from the behavioural data. This estimation is complicated by the fact that an animal's performance is contaminated by errors. We propose a two-step approach to overcome this difficulty: First, one sets up a generative, stochastic model of the behavioural time series based on the relevant parameters, including the probability of errors. Second, one performs a model-based maximum-likelihood estimation on the data in order to extract the non-observable perceptual state transitions. We illustrate this methodology for data from experiments on perception of bistable apparent motion in pigeons. The observed behavioural time series is analysed and explained by a combination of a Markovian perceptual dynamics with a renewal process that governs the motor response. We propose a hidden Markov model in which non-observable states represent both the perceptual states and the states of the renewal process of the motor dynamics, while the observable states account for overt pecking performance. Showing that this constitutes an appropriate phenomenological model of the time series of observable pecking events, we use it subsequently to obtain an estimate of the internal (and thus covert) perceptual reversals. These may directly correspond to changes in the activity of mutually inhibitory populations of motion selective neurones tuned to orthogonal directions.
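The decoding step can be sketched with a minimal two-state hidden Markov model (illustrative probabilities; the paper's model additionally includes renewal-process states for the motor dynamics, omitted here): sticky perceptual states emit error-prone pecks, and Viterbi decoding recovers the covert state sequence.

```python
import numpy as np

A = np.array([[0.95, 0.05], [0.05, 0.95]])   # perceptual transition matrix
B = np.array([[0.9, 0.1], [0.1, 0.9]])       # emission: P(peck | state), 10% errors
pi = np.array([0.5, 0.5])

def viterbi(obs):
    """Most likely hidden state sequence given the observed pecks."""
    T = len(obs)
    logd = np.log(pi) + np.log(B[:, obs[0]])
    back = np.zeros((T, 2), dtype=int)
    for t in range(1, T):
        scores = logd[:, None] + np.log(A)    # scores[i, j]: state i -> j
        back[t] = scores.argmax(axis=0)
        logd = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):             # trace best predecessors back
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hidden truth: percept 0 for 10 steps, then one perceptual reversal to 1;
# observed pecks report the percept with 10% errors.
rng = np.random.default_rng(2)
true = [0] * 10 + [1] * 10
obs = [s if rng.random() < 0.9 else 1 - s for s in true]
est = viterbi(obs)
print(est)
```

In the real analysis the model parameters themselves would first be fit by maximum likelihood (e.g. Baum-Welch); here they are simply assumed.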

6.
The perception of music depends on many culture-specific factors, but is also constrained by properties of the auditory system. This has been best characterized for those aspects of music that involve pitch. Pitch sequences are heard in terms of relative as well as absolute pitch. Pitch combinations give rise to emergent properties not present in the component notes. In this review we discuss the basic auditory mechanisms contributing to these and other perceptual effects in music.

7.
The question of which strategy is employed in human decision making has been studied extensively in the context of cognitive tasks; however, this question has not been investigated systematically in the context of perceptual tasks. The goal of this study was to gain insight into the decision-making strategy used by human observers in a low-level perceptual task. Data from more than 100 individuals who participated in an auditory-visual spatial localization task were evaluated to examine which of three plausible strategies best accounted for each observer's behavior. This task is very suitable for exploring this question because it involves an implicit inference about whether the auditory and visual stimuli were caused by the same object or independent objects, and different strategies for using the inferred causes lead to distinctly different spatial estimates and response patterns. For example, employing the commonly used cost function of minimizing the mean squared error of spatial estimates would result in a weighted averaging of estimates corresponding to different causal structures. A strategy that would minimize the error in the inferred causal structure would result in the selection of the most likely causal structure and sticking with it in the subsequent inference of location—“model selection.” A third strategy is one that selects a causal structure in proportion to its probability, thus attempting to match the probability of the inferred causal structure. This type of probability matching strategy has been reported to be used by participants predominantly in cognitive tasks. Comparing these three strategies, the behavior of the vast majority of observers in this perceptual task was most consistent with probability matching. While this appears to be a suboptimal strategy and hence a surprising choice for the perceptual system to adopt, we discuss potential advantages of such a strategy for perception.
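The three candidate strategies can be written down directly (illustrative numbers, not the study's fitted model): given a posterior probability that sound and light share one cause, each strategy combines the fused and independent-cause location estimates differently.

```python
import numpy as np

rng = np.random.default_rng(3)

p_c1 = 0.7          # assumed posterior P(common cause | signals)
s_common = 10.0     # location estimate under a single shared cause (fused)
s_indep = 4.0       # location estimate under independent causes

averaging = p_c1 * s_common + (1 - p_c1) * s_indep       # model averaging
selection = s_common if p_c1 > 0.5 else s_indep          # model selection
matching = s_common if rng.random() < p_c1 else s_indep  # probability matching

# Over many trials, probability matching picks the common-cause estimate on
# about p_c1 of trials, producing a bimodal response pattern, unlike the
# single intermediate response of averaging or the winner-take-all of selection.
choices = np.where(rng.random(10000) < p_c1, s_common, s_indep)
frac_common = float((choices == s_common).mean())
print(averaging, selection, matching, frac_common)
```

The distinct trial-by-trial response distributions are what allow the three strategies to be told apart in the observers' data.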

8.
Speech perception at the interface of neurobiology and linguistics
Speech perception consists of a set of computations that take continuously varying acoustic waveforms as input and generate discrete representations that make contact with the lexical representations stored in long-term memory as output. Because the perceptual objects that are recognized by the speech perception system enter into subsequent linguistic computation, the format that is used for lexical representation and processing fundamentally constrains the speech perceptual processes. Consequently, theories of speech perception must, at some level, be tightly linked to theories of lexical representation. Minimally, speech perception must yield representations that smoothly and rapidly interface with stored lexical items. Adopting the perspective of Marr, we argue and provide neurobiological and psychophysical evidence for the following research programme. First, at the implementational level, speech perception is a multi-time resolution process, with perceptual analyses occurring concurrently on at least two time scales (approx. 20-80 ms, approx. 150-300 ms), commensurate with (sub)segmental and syllabic analyses, respectively. Second, at the algorithmic level, we suggest that perception proceeds on the basis of internal forward models, or uses an 'analysis-by-synthesis' approach. Third, at the computational level (in the sense of Marr), the theory of lexical representation that we adopt is principally informed by phonological research and assumes that words are represented in the mental lexicon in terms of sequences of discrete segments composed of distinctive features. One important goal of the research programme is to develop linking hypotheses between putative neurobiological primitives (e.g. temporal primitives) and those primitives derived from linguistic inquiry, to arrive ultimately at a biologically sensible and theoretically satisfying model of representation and computation in speech.
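The multi-time-resolution idea can be sketched as two concurrent short-time analyses of the same waveform (our illustration, not the authors' implementation; window and hop sizes are assumptions within the stated ranges): a short ~25 ms window for (sub)segmental detail and a long ~200 ms window for syllabic-scale structure, trading temporal against frequency resolution.

```python
import numpy as np

fs = 16000
x = np.random.default_rng(4).standard_normal(fs)   # 1 s stand-in signal

def stft_mag(x, win_ms, hop_ms, fs):
    """Magnitude short-time spectra with a Hann window."""
    win, hop = int(fs * win_ms / 1000), int(fs * hop_ms / 1000)
    frames = [x[i:i + win] * np.hanning(win)
              for i in range(0, len(x) - win + 1, hop)]
    return np.abs(np.fft.rfft(frames, axis=1))

short = stft_mag(x, 25, 10, fs)    # fine in time, coarse in frequency
long_ = stft_mag(x, 200, 50, fs)   # coarse in time, fine in frequency
# Frequency bin spacing = fs / win: 40 Hz (short) vs 5 Hz (long) here
print(short.shape, long_.shape)
```

Running both analyses in parallel, rather than picking one window, is the computational analogue of the concurrent segmental and syllabic time scales proposed at the implementational level.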

9.
Language and music epitomize the complex representational and computational capacities of the human mind. The two are strikingly similar in their structural and expressive features, and a longstanding question is whether the perceptual and cognitive mechanisms underlying these abilities are shared or distinct – either from each other or from other mental processes. One prominent feature shared between language and music is signal encoding using pitch, conveying pragmatics and semantics in language and melody in music. We investigated how pitch processing is shared between language and music by measuring consistency in individual differences in pitch perception across language, music, and three control conditions intended to assess basic sensory and domain-general cognitive processes. Individuals’ pitch perception abilities in language and music were most strongly related, even after accounting for performance in all control conditions. These results provide behavioral evidence, based on patterns of individual differences, that is consistent with the hypothesis that cognitive mechanisms for pitch processing may be shared between language and music.

10.
In wave-type weakly electric fish, two distinct types of primary afferent fibers are specialized for separately encoding modulations in the amplitude and phase (timing) of electrosensory stimuli. Time-coding afferents phase lock to periodic stimuli and respond to changes in stimulus phase with shifts in spike timing. Amplitude-coding afferents fire sporadically to periodic stimuli. Their probability of firing in a given cycle, and therefore their firing rate, is proportional to stimulus amplitude. However, the spike times of time-coding afferents are also affected by changes in amplitude; similarly, the firing rates of amplitude-coding afferents are also affected by changes in phase. Because identical changes in the activity of an individual primary afferent can be caused by modulations in either the amplitude or phase of stimuli, there is ambiguity regarding the information content of primary afferent responses that can result in ‘phantom’ modulations not present in an actual stimulus. Central electrosensory neurons in the hindbrain and midbrain respond to these phantom modulations. Phantom modulations can also elicit behavioral responses, indicating that ambiguity in the encoding of amplitude and timing information ultimately distorts electrosensory perception. A lack of independence in the encoding of multiple stimulus attributes can therefore result in perceptual illusions. Similar effects may occur in other sensory systems as well. In particular, the vertebrate auditory system is thought to be phylogenetically related to the electrosensory system and it encodes information about amplitude and timing in similar ways. It has been well established that pitch perception and loudness perception are both affected by the frequency and intensity of sounds, raising the intriguing possibility that auditory perception may also be affected by ambiguity in the encoding of sound amplitude and timing.
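The amplitude/timing confound has a simple mechanistic reading (our toy model, not the physiological data): if a "time-coding" afferent fires when the stimulus crosses a fixed threshold, raising the amplitude moves the crossing earlier in the cycle, mimicking a phase advance even though stimulus timing never changed.

```python
import numpy as np

fs, f = 100000, 1000.0
t = np.arange(int(fs / f)) / fs            # one 1 kHz stimulus cycle
thresh = 0.5                               # assumed fixed spike threshold

def first_crossing(amplitude):
    """Time of the first sample at or above threshold (the 'spike')."""
    x = amplitude * np.sin(2 * np.pi * f * t)
    return t[np.argmax(x >= thresh)]

t_low, t_high = first_crossing(1.0), first_crossing(2.0)
print(t_low, t_high)   # higher amplitude -> earlier threshold crossing
```

A downstream decoder reading only spike times cannot distinguish this amplitude-driven shift from a genuine phase modulation, which is exactly the 'phantom' ambiguity described above.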

11.
A number of accounts of human auditory perception assume that listeners use prior stimulus context to generate predictions about future stimulation. Here, we tested an auditory pitch-motion hypothesis that was developed from this perspective. Listeners judged either the time change (i.e., duration) or pitch change of a comparison frequency glide relative to a standard (referent) glide. Under a constant-velocity assumption, listeners were hypothesized to use the pitch velocity (Δf/Δt) of the standard glide to generate predictions about the pitch velocity of the comparison glide, leading to perceptual distortions along the to-be-judged dimension when the velocities of the two glides differed. These predictions were borne out in the pattern of relative points of subjective equality by a significant three-way interaction between the velocities of the two glides and task. In general, listeners’ judgments along the task-relevant dimension (pitch or time) were affected by expectations generated by the constant-velocity standard, but in an opposite manner for the two stimulus dimensions. When the comparison glide velocity was faster than the standard, listeners overestimated time change, but underestimated pitch change, whereas when the comparison glide velocity was slower than the standard, listeners underestimated time change, but overestimated pitch change. Perceptual distortions were least evident when the velocities of the standard and comparison glides were matched. Fits of an imputed velocity model further revealed increasingly larger distortions at faster velocities. The present findings provide support for the auditory pitch-motion hypothesis and add to a larger body of work revealing a role for active prediction in human auditory perception.
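The direction of the reported distortions follows from a few lines of arithmetic (our worked illustration with arbitrary numbers and a hypothetical mixing weight, not the fitted imputed velocity model): impute the standard's velocity to the comparison glide and blend the resulting expectation with the actual stimulus.

```python
# Constant-velocity expectation: the standard glide's velocity is imputed
# to the comparison glide, and the percept blends evidence and expectation.
v_std = 10.0                       # standard glide velocity (semitones/s)
dt_actual, df_actual = 0.5, 8.0    # comparison: 8 semitones in 0.5 s (faster)

dt_expected = df_actual / v_std    # 0.8 s, if the comparison matched v_std
df_expected = v_std * dt_actual    # 5 semitones, if it matched v_std

w = 0.3                            # assumed weight on the expectation
dt_perceived = (1 - w) * dt_actual + w * dt_expected
df_perceived = (1 - w) * df_actual + w * df_expected

# Faster-than-standard comparison: time change is overestimated while
# pitch change is underestimated, matching the reported pattern.
print(dt_perceived, df_perceived)
```

Reversing the sign of the velocity mismatch flips both biases, reproducing the opposite distortions for slower-than-standard comparisons.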

12.
The past 30 years have seen a remarkable development in our understanding of how the auditory system--particularly the peripheral system--processes complex sounds. Perhaps the most significant advance has been our understanding of the mechanisms underlying auditory frequency selectivity and their importance for normal and impaired auditory processing. Physiologically vulnerable cochlear filtering can account for many aspects of our normal and impaired psychophysical frequency selectivity with important consequences for the perception of complex sounds. For normal hearing, remarkable mechanisms in the organ of Corti, involving enhancement of mechanical tuning (in mammals probably by feedback of electro-mechanically generated energy from the hair cells), produce exquisite tuning, reflected in the tuning properties of cochlear nerve fibres. Recent comparisons of physiological (cochlear nerve) and psychophysical frequency selectivity in the same species indicate that the ear's overall frequency selectivity can be accounted for by this cochlear filtering, at least in bandwidth terms. Because this cochlear filtering is physiologically vulnerable, it deteriorates in deleterious conditions of the cochlea--hypoxia, disease, drugs, noise overexposure, mechanical disturbance--and is reflected in impaired psychophysical frequency selectivity. This is a fundamental feature of sensorineural hearing loss of cochlear origin, and is of diagnostic value. This cochlear filtering, particularly as reflected in the temporal patterns of cochlear fibres to complex sounds, is remarkably robust over a wide range of stimulus levels. Furthermore, cochlear filtering properties are a prime determinant of the 'place' and 'time' coding of frequency at the cochlear nerve level, both of which appear to be involved in pitch perception. The problem of how the place and time coding of complex sounds is effected over the ear's remarkably wide dynamic range is briefly addressed.
The auditory brainstem, particularly the dorsal cochlear nucleus, contains inhibitory mechanisms responsible for enhancing the spectral and temporal contrasts in complex sounds. These mechanisms are now being dissected neuropharmacologically. At the cortical level, mechanisms are evident that are capable of abstracting biologically relevant features of complex sounds. Fundamental studies of how the auditory system encodes and processes complex sounds are vital to promising recent applications in the diagnosis and rehabilitation of the hearing impaired.

13.
Wile D, Balaban E. PLoS ONE, 2007, 2(4): e369
Current theories of auditory pitch perception propose that cochlear place (spectral) and activity timing pattern (temporal) information are somehow combined within the brain to produce holistic pitch percepts, yet the neural mechanisms for integrating these two kinds of information remain obscure. To examine this process in more detail, stimuli made up of three pure tones whose components are individually resolved by the peripheral auditory system, but that nonetheless elicit a holistic, "missing fundamental" pitch percept, were played to human listeners. A technique was used to separate neural timing activity related to individual components of the tone complexes from timing activity related to an emergent feature of the complex (the envelope), and the region of the tonotopic map where information could originate from was simultaneously restricted by masking noise. Pitch percepts were mirrored to a very high degree by a simple combination of component-related and envelope-related neural responses with similar timing that originate within higher-frequency regions of the tonotopic map where stimulus components interact. These results suggest a coding scheme for holistic pitches whereby limited regions of the tonotopic map (spectral places) carrying envelope- and component-related activity with similar timing patterns selectively provide a key source of neural pitch information. A similar mechanism of integration between local and emergent object properties may contribute to holistic percepts in a variety of sensory systems.  
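The stimulus class used here is easy to reproduce (a generic construction, not the authors' masking paradigm; frequencies and sampling rate are assumptions): three resolved tones at 1800/2000/2200 Hz contain no energy at 200 Hz, yet where they interact the waveform envelope repeats at their 200 Hz spacing, and a periodicity analysis of the rectified signal recovers that missing-fundamental pitch.

```python
import numpy as np

fs = 16000
t = np.arange(int(0.2 * fs)) / fs
# Three resolved components, 200 Hz apart; no 200 Hz component is present
x = sum(np.sin(2 * np.pi * f * t) for f in (1800.0, 2000.0, 2200.0))

env_cue = np.maximum(x, 0.0)                  # crude half-wave rectification
ac = np.correlate(env_cue, env_cue, "full")[len(t) - 1:]
lo, hi = int(fs / 500), int(fs / 80)          # search pitches of 80-500 Hz
lag = lo + int(np.argmax(ac[lo:hi]))
pitch_est = fs / lag
print(pitch_est)                              # 200 Hz "missing fundamental"
```

Rectification is a stand-in for the peripheral nonlinearity that makes envelope-related timing available alongside component-related timing, the two cue types the study separates.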

14.
This work analyzed the perceptual attributes of natural dynamic audiovisual scenes. We presented thirty participants with 19 natural scenes in a similarity categorization task, followed by a semi-structured interview. The scenes were reproduced with an immersive audiovisual display. Natural scene perception has been studied mainly with unimodal settings, which have identified motion as one of the most salient attributes related to visual scenes, and sound intensity along with pitch trajectories related to auditory scenes. However, controlled laboratory experiments with natural multimodal stimuli are still scarce. Our results show that humans pay attention to similar perceptual attributes in natural scenes, and a two-dimensional perceptual map of the stimulus scenes and perceptual attributes was obtained in this work. The exploratory results show the amount of movement, perceived noisiness, and eventfulness of the scene to be the most important perceptual attributes in naturalistically reproduced real-world urban environments. We found the scene gist properties openness and expansion to remain as important factors in scenes with no salient auditory or visual events. We propose that the study of scene perception should move forward to understand better the processes behind multimodal scene processing in real-world environments. We publish our stimulus scenes as spherical video recordings and sound field recordings in a publicly available database.

15.
The internal representation of solid shape with respect to vision
It is argued that the internal model of any object must take the form of a function, such that for any intended action the resulting reafference is predictable. This function can be derived explicitly for the case of visual perception of rigid bodies by ambulant observers. The function depends on physical causation, not physiology; consequently, one can make a priori statements about possible internal models. A posteriori it seems likely that the orientation sensitive units described by Hubel and Wiesel constitute a physiological substrate subserving the extraction of the invariants of this function. The function is used to define a measure for the visual complexity of solid shape. Relations with Gestalt theories of perception are discussed.

16.
Children using unilateral cochlear implants abnormally rely on tempo rather than mode cues to distinguish whether a musical piece is happy or sad. This led us to question how this judgment is affected by the type of experience in early auditory development. We hypothesized that judgments of the emotional content of music would vary by the type and duration of access to sound in early life due to deafness, altered perception of musical cues through new ways of using auditory prostheses bilaterally, and formal music training during childhood. Seventy-five participants completed the Montreal Emotion Identification Test. Thirty-three had normal hearing (aged 6.6 to 40.0 years) and 42 children had hearing loss and used bilateral auditory prostheses (31 bilaterally implanted and 11 unilaterally implanted with contralateral hearing aid use). Reaction time and accuracy were measured. Accurate judgment of emotion in music was achieved across ages and musical experience. Musical training accentuated the reliance on mode cues which developed with age in the normal hearing group. Degrading pitch cues through cochlear implant-mediated hearing induced greater reliance on tempo cues, but mode cues grew in salience when at least partial acoustic information was available through some residual hearing in the contralateral ear. Finally, when pitch cues were experimentally distorted to represent cochlear implant hearing, individuals with normal hearing (including those with musical training) switched to an abnormal dependence on tempo cues. The data indicate that, in a western culture, access to acoustic hearing in early life promotes a preference for mode rather than tempo cues which is enhanced by musical training. 
The challenge to these preferred strategies during cochlear implant hearing (simulated and real), regardless of musical training, suggests that access to pitch cues for children with hearing loss must be improved by preservation of residual hearing and improvements in cochlear implant technology.

17.
The process of recognition or isolation of one or several entities from among many possible entities is termed intellego perception. It is shown that not only are many of our everyday percepts of this type, but perception of microscopic events using the methods of quantum mechanics is also intellego in nature. Information theory seems to be a natural language in which to express perceptual activity of this type. It is argued that the biological organism quantifies its sensations using an information theoretical measure. This, in turn, sets the stage for a mathematical theory of sensory perception.

18.
Pitch and timbre perception are both based on the frequency content of sound, but previous perceptual experiments have disagreed about whether these two dimensions are processed independently from each other. We tested the interaction of pitch and timbre variations using sequential comparisons of sound pairs. Listeners judged whether two sequential sounds were identical along the dimension of either pitch or timbre, while the perceptual distances along both dimensions were parametrically manipulated. Pitch and timbre variations perceptually interfered with each other and the degree of interference was modulated by the magnitude of changes along the un-attended dimension. These results show that pitch and timbre are not orthogonal to each other when both are assessed with parametrically controlled variations.

19.
The present study investigated whether and to which extent temporal integration in bats is influenced by echolocation behavior. One way to quantify temporal integration is to measure the detection threshold for a pair of short tone pips as a function of the temporal separation between the pips. To assess the effect of preceding sonar emission on temporal integration in the bat, Megaderma lyra, the detection thresholds of identical subjects were measured in a passive as well as in an active paradigm. In the passive paradigm, the presentation of the pip pairs was independent of the bats' sonar emissions; in the active paradigm, the presentation was triggered by the bats' sonar emissions. In both cases, the bats showed a very short integration time in the range of 100–200 µs. Moreover, the comparison of the active and passive results within each bat revealed no systematic differences in the two measuring paradigms. These results indicate that temporal integration is not influenced by echolocation. Simulations with a computer model of cochlear filtering based on measurements of M. lyra cochlear tuning suggest that the perceptual temporal integration is dominated by the integration of the cochlear filters.
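How cochlear filters can set the integration window can be sketched with a gammatone-like filter (illustrative parameters, not the measured M. lyra tuning): when the gap between two brief clicks is shorter than the filter's ringing time, their responses overlap and the peak filter output exceeds that for a single click, i.e. the filter integrates the pair.

```python
import numpy as np

fs = 400000                                 # 400 kHz sampling (assumed)
t = np.arange(int(0.002 * fs)) / fs
fc, bw = 60000.0, 12000.0                   # assumed center freq / bandwidth
# Gammatone-like impulse response: t^3 envelope times a cosine carrier
gt = t**3 * np.exp(-2 * np.pi * bw * t) * np.cos(2 * np.pi * fc * t)
gt /= np.abs(gt).max()                      # unit peak response to one click

def peak_output(gap_s):
    """Peak filter output for a pair of unit clicks separated by gap_s."""
    x = np.zeros(int(0.004 * fs))
    x[0] = 1.0
    x[int(gap_s * fs)] = 1.0
    return np.abs(np.convolve(x, gt)).max()

# 50 microsecond gap: responses overlap (integration); 1 ms gap: they don't
print(peak_output(50e-6), peak_output(1000e-6))
```

A broader filter rings more briefly, so broad bat cochlear filters predict exactly the very short (~100-200 µs) integration times measured behaviorally.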

20.
Human psychoacoustical studies have been the main sources of information from which the brain mechanisms of sound localization are inferred. The value of animal models would be limited, if humans and the animals did not share the same perceptual experience and the neural mechanisms for it. Barn owls and humans use the same method of computing interaural time differences for localization in the horizontal plane. The behavioral performance of owls and its neural bases are consistent with some of the theories developed for human sound localization. Neural theories of sound localization largely owe their origin to the study of sound localization by humans, even though little is known about the physiological properties of the human auditory system. One of these ideas is binaural cross-correlation which assumes that the human brain performs a process similar to mathematical cross-correlation to measure the interaural time difference for localization in the horizontal plane. The most complete set of neural evidence for this theory comes from the study of sound localization and its brain mechanisms in barn owls, although partial support is also available from studies on laboratory mammals. Animal models of human sensory perception make two implicit assumptions; animals and humans experience the same percept and the same neural mechanism underlies the creation of the percept. These assumptions are hard to prove for obvious reasons. This article reviews several lines of evidence that similar neural mechanisms must underlie the perception of sound locations in humans and owls.
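The binaural cross-correlation idea itself is a standard construction that can be written in a few lines (our sketch, not a model of owl circuitry; signal and ITD values are illustrative): the interaural time difference is read out as the lag maximizing the cross-correlation of the two ear signals.

```python
import numpy as np

fs = 100000
rng = np.random.default_rng(5)
sig = rng.standard_normal(fs // 100)            # 10 ms of broadband sound

itd_samples = 20                                # true ITD: 200 microseconds
left = sig
right = np.concatenate([np.zeros(itd_samples), sig])[: len(sig)]

xc = np.correlate(left, right, mode="full")
itd_est = len(sig) - 1 - int(np.argmax(xc))     # best lag, in samples
print(itd_est / fs * 1e6)                       # estimated ITD in microseconds
```

In the Jeffress-style reading of this computation, an array of coincidence detectors fed by delay lines evaluates this correlation in parallel, with each neuron's best delay standing for one candidate azimuth.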


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号