Similar literature
Found 20 similar articles.
1.
Timbre is the attribute of sound that allows humans and other animals to distinguish among different sound sources. Studies based on psychophysical judgments of musical timbre, ecological analyses of sound's physical characteristics as well as machine learning approaches have all suggested that timbre is a multifaceted attribute that invokes both spectral and temporal sound features. Here, we explored the neural underpinnings of musical timbre. We used a neuro-computational framework based on spectro-temporal receptive fields, recorded from over a thousand neurons in the mammalian primary auditory cortex as well as from simulated cortical neurons, augmented with a nonlinear classifier. The model was able to perform robust instrument classification irrespective of pitch and playing style, with an accuracy of 98.7%. Using the same front end, the model was also able to reproduce perceptual distance judgments between timbres as perceived by human listeners. The study demonstrates that joint spectro-temporal features, such as those observed in the mammalian primary auditory cortex, are critical to provide the rich-enough representation necessary to account for perceptual judgments of timbre by human listeners, as well as recognition of musical instruments.

2.
The perception of music depends on many culture-specific factors, but is also constrained by properties of the auditory system. This has been best characterized for those aspects of music that involve pitch. Pitch sequences are heard in terms of relative as well as absolute pitch. Pitch combinations give rise to emergent properties not present in the component notes. In this review we discuss the basic auditory mechanisms contributing to these and other perceptual effects in music.

3.
Examination of the cortical auditory evoked potentials to complex tones changing in pitch and timbre suggests a useful new method for investigating higher auditory processes, in particular those concerned with 'streaming' and auditory object formation. The main conclusions were: (i) the N1 evoked by a sudden change in pitch or timbre was more posteriorly distributed than the N1 at the onset of the tone, indicating at least partial segregation of the neuronal populations responsive to sound onset and spectral change; (ii) the T-complex was consistently larger over the right hemisphere, consistent with clinical and PET evidence for particular involvement of the right temporal lobe in the processing of timbral and musical material; (iii) responses to timbral change were relatively unaffected by increasing the rate of interspersed changes in pitch, suggesting a mechanism for detecting the onset of a new voice in a constantly modulated sound stream; (iv) responses to onset, offset and pitch change of complex tones were relatively unaffected by interfering tones when the latter were of a different timbre, suggesting these responses must be generated subsequent to auditory stream segregation.

4.
A number of accounts of human auditory perception assume that listeners use prior stimulus context to generate predictions about future stimulation. Here, we tested an auditory pitch-motion hypothesis that was developed from this perspective. Listeners judged either the time change (i.e., duration) or pitch change of a comparison frequency glide relative to a standard (referent) glide. Under a constant-velocity assumption, listeners were hypothesized to use the pitch velocity (Δf/Δt) of the standard glide to generate predictions about the pitch velocity of the comparison glide, leading to perceptual distortions along the to-be-judged dimension when the velocities of the two glides differed. These predictions were borne out in the pattern of relative points of subjective equality by a significant three-way interaction between the velocities of the two glides and task. In general, listeners’ judgments along the task-relevant dimension (pitch or time) were affected by expectations generated by the constant-velocity standard, but in an opposite manner for the two stimulus dimensions. When the comparison glide velocity was faster than the standard, listeners overestimated time change, but underestimated pitch change, whereas when the comparison glide velocity was slower than the standard, listeners underestimated time change, but overestimated pitch change. Perceptual distortions were least evident when the velocities of the standard and comparison glides were matched. Fits of an imputed velocity model further revealed increasingly larger distortions at faster velocities. The present findings provide support for the auditory pitch-motion hypothesis and add to a larger body of work revealing a role for active prediction in human auditory perception.

5.
Music processing is influenced by pitch perception and memory. Additionally, these features interact, with pitch memory performance decreasing as the perceived distance between two pitches decreases. This study examined whether or not the difficulty of pitch discrimination influences pitch retention by testing individuals with congenital amusia. Pitch discrimination difficulty was equated by determining an individual’s threshold with a two-down, one-up staircase procedure and using this to create conditions where two pitches (the standard and the comparison tones) differed by 1x, 2x, and 3x the threshold setting. For comparison with the literature, a condition that employed a constant pitch difference of four semitones was also included. The results showed that pitch memory performance improved as the discrimination between the standard and the comparison tones was made easier for both amusic and control groups, and more importantly, that amusics did not show any pitch retention deficits when the discrimination difficulty was equated. In contrast, consistent with previous literature, amusics performed worse than controls when the physical pitch distance was held constant at four semitones. This impaired performance has been interpreted as evidence for pitch memory impairment in the past. However, employing a constant pitch distance always makes the difference closer to the discrimination threshold for the amusic group than for the control group. Therefore, reduced performance in this condition may simply reflect differences in the perceptual difficulty of the discrimination. The findings indicate the importance of equating the discrimination difficulty when investigating memory.
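
As an illustration of the two-down, one-up staircase rule named in this abstract, here is a minimal sketch. The function name, the fixed step size, and the floor parameter are assumptions made for the example and are not taken from the study; the rule itself (make the task harder after two consecutive correct responses, easier after any error) is the standard form of this procedure, which tracks roughly the 70.7% correct point.

```python
def two_down_one_up(responses, start, step, floor=0.0):
    """Return the stimulus level before each trial and after the last one.

    responses: iterable of booleans (True = correct response).
    start, step, floor: hypothetical parameters chosen for illustration.
    """
    level = start
    correct_run = 0          # consecutive correct responses so far
    track = [level]
    for correct in responses:
        if correct:
            correct_run += 1
            if correct_run == 2:
                # Two correct in a row: shrink the pitch difference (harder).
                level = max(floor, level - step)
                correct_run = 0
        else:
            # Any error: enlarge the pitch difference (easier).
            level += step
            correct_run = 0
        track.append(level)
    return track


if __name__ == "__main__":
    print(two_down_one_up([True, True, False, True, True], start=4.0, step=1.0))
```

In practice the threshold is usually estimated by averaging the levels at the last few reversals of the track; that step is omitted here to keep the sketch short.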

6.
Timbre and pitch are two independent perceptual qualities of sounds closely related to the spectral envelope and to the fundamental frequency of periodic temporal envelope fluctuations, respectively. To a first approximation, the spectral and temporal tuning properties of neurons in the auditory midbrain of various animals are independent, with layouts of these tuning properties in approximately orthogonal tonotopic and periodotopic maps. For the first time we demonstrate by means of magnetoencephalography a periodotopic organization of the human auditory cortex and analyse its spatial relationship to the tonotopic organization by using a range of stimuli with different temporal envelope fluctuations and spectra and a magnetometer providing high spatial resolution. We demonstrate an orthogonal arrangement of tonotopic and periodotopic gradients. Our results are in line with the organization of such maps in animals and closely match the perceptual orthogonality of timbre and pitch in humans. Accepted: 25 July 1997

7.
This paper introduces a model that accounts quantitatively for a phenomenon of perceptual segregation, the simultaneous perception of more than one pitch in a single complex sound. The method is based on a characterization of the time-varying spike probability generated by a model of cochlear responses to sounds. It demonstrates how the autocorrelation theories of pitch perception contain the necessary elements to define a specific measure in the phase space of the simulated auditory nerve probability of firing time series. This measure was motivated in the first instance by the correlation dimension of the attractor; however, it has been modified in several ways in order to increase the neurobiological plausibility. This quantity characterizes each of the cochlear frequency channels and gives rise to a channel clustering criterion. The model computes the clusters and the pitch estimates simultaneously using the same processing mechanisms of delay lines; therefore, it respects the biological constraints in a similar way to temporal theories of pitch. The model successfully explains a wide range of perceptual experiments.

8.
Pitch is one of the most important features of natural sounds, underlying the perception of melody in music and prosody in speech. However, the temporal dynamics of pitch processing are still poorly understood. Previous studies suggest that the auditory system uses a wide range of time scales to integrate pitch-related information and that the effective integration time is both task- and stimulus-dependent. None of the existing models of pitch processing can account for such task- and stimulus-dependent variations in processing time scales. This study presents an idealized neurocomputational model, which provides a unified account of the multiple time scales observed in pitch perception. The model is evaluated using a range of perceptual studies, which have not previously been accounted for by a single model, and new results from a neurophysiological experiment. In contrast to other approaches, the current model contains a hierarchy of integration stages and uses feedback to adapt the effective time scales of processing at each stage in response to changes in the input stimulus. The model has features in common with a hierarchical generative process and suggests a key role for efferent connections from central to sub-cortical areas in controlling the temporal dynamics of pitch processing.

9.
Recent studies have investigated the structure of perceptual relations among musical instrument timbres by multidimensional scaling (MDS) techniques. These studies have employed both acoustically produced tones and digitally synthesized imitations and hybrids of acoustic instrument tones. The analyses of dissimilarity ratings for all pairs of a set of tones are usually represented as geometrical structures in a two- or three-dimensional Euclidean space in which the shared 'perceptual' axes are shown to have a qualitative correspondence to acoustic properties such as spectral energy distribution, onset characteristics and degree of change in spectral distribution over the duration of the tone. The present study took as a point of departure an MDS analysis for complex, synthetic tones with the aim of testing whether musician and non-musician listeners used the relations defined by the perceptual space to perform an analogies task of the sort: timbre A is to timbre B as timbre C is to which of two possible timbres, D or D'? A parallelogram model was used to select the D timbres: if the relation between A and B is represented as a vector with both magnitude and direction components, then the appropriate D should form a vector with C having similar magnitude and direction in the timbre space. Aside from conceptual difficulties with the task for both non-musicians and composers, choices for both groups provide support for the parallelogram model indicating a capacity in listeners to perceive abstract relations among the timbres of complex sounds without specific training in such a task.
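
The parallelogram model described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the coordinates are made up, the function names are hypothetical, and picking the candidate nearest to the ideal point by Euclidean distance is one plausible reading of "similar magnitude and direction", not necessarily the selection rule used in the study.

```python
import math


def parallelogram_target(a, b, c):
    """Ideal D such that the vector C->D equals the vector A->B."""
    return tuple(ci + (bi - ai) for ai, bi, ci in zip(a, b, c))


def choose_candidate(a, b, c, candidates):
    """Pick the candidate timbre closest to the ideal D (Euclidean distance)."""
    target = parallelogram_target(a, b, c)
    return min(candidates, key=lambda d: math.dist(d, target))


if __name__ == "__main__":
    # Made-up positions in a two-dimensional timbre space.
    A, B, C = (0.0, 0.0), (1.0, 2.0), (3.0, 1.0)
    D, D_prime = (4.2, 3.1), (1.0, 0.0)
    print(parallelogram_target(A, B, C))
    print(choose_candidate(A, B, C, [D, D_prime]))
```

The same functions work unchanged for a three-dimensional space, since they operate on coordinate tuples of any matching length.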

10.
11.
Most perceived parameters of sound (e.g. pitch, duration, timbre) can also be imagined in the absence of sound. These parameters are imagined more veridically by expert musicians than non-experts. Evidence for whether loudness is imagined, however, is conflicting. In music, the question of whether loudness is imagined is particularly relevant due to its role as a principal parameter of performance expression. This study addressed the hypothesis that the veridicality of imagined loudness improves with increasing musical expertise. Experts, novices and non-musicians imagined short passages of well-known classical music under two counterbalanced conditions: 1) while adjusting a slider to indicate imagined loudness of the music and 2) while tapping out the rhythm to indicate imagined timing. Subtests assessed music listening abilities and working memory span to determine whether these factors, also hypothesised to improve with increasing musical expertise, could account for imagery task performance. Similarity between each participant’s imagined and listening loudness profiles and reference recording intensity profiles was assessed using time series analysis and dynamic time warping. The results suggest a widespread ability to imagine the loudness of familiar music. The veridicality of imagined loudness tended to be greatest for the expert musicians, supporting the predicted relationship between musical expertise and musical imagery ability.

12.
Pitch changes that occur in speech and melodies can be described in terms of contour patterns of rises and falls in pitch and the actual pitches at each point in time. This study investigates whether training can improve the perception of these different features. One group of ten adults trained on a pitch-contour discrimination task, a second group trained on an actual-pitch discrimination task, and a third group trained on a contour comparison task between pitch sequences and their visual analogs. A fourth group did not undergo training. It was found that training on pitch sequence comparison tasks gave rise to improvements in pitch-contour perception. This occurred irrespective of whether the training task required the discrimination of contour patterns or the actual pitch details. In contrast, none of the training tasks were found to improve the perception of the actual pitches in a sequence. The results support psychological models of pitch processing where contour processing is an initial step before actual pitch details are analyzed. Further studies are required to determine whether pitch-contour training is effective in improving speech and melody perception.

13.
Pitch perception is important for understanding speech prosody, music perception, recognizing tones in tonal languages, and perceiving speech in noisy environments. The two principal pitch perception theories consider the place of maximum neural excitation along the auditory nerve and the temporal pattern of the auditory neurons’ action potentials (spikes) as pitch cues. This paper describes a biophysical mechanism by which fine-structure temporal information can be extracted from the spikes generated at the auditory periphery. Deriving meaningful pitch-related information from spike times requires neural structures specialized in capturing synchronous or correlated activity from amongst neural events. The emergence of such pitch-processing neural mechanisms is described through a computational model of auditory processing. Simulation results show that a correlation-based, unsupervised, spike-based form of Hebbian learning can explain the development of neural structures required for recognizing the pitch of simple and complex tones, with or without the fundamental frequency. The temporal code is robust to variations in the spectral shape of the signal and thus can explain the phenomenon of pitch constancy.

14.
Gao S, Hu J, Gong D, Chen S, Kendrick KM, Yao D. PLoS ONE 2012, 7(5): e38289
Consonants, unlike vowels, are thought to be speech specific and therefore no interactions would be expected between consonants and pitch, a basic element for musical tones. The present study used an electrophysiological approach to investigate whether, contrary to this view, there is integrative processing of consonants and pitch by measuring additivity of changes in the mismatch negativity (MMN) of evoked potentials. The MMN is elicited by discriminable variations occurring in a sequence of repetitive, homogeneous sounds. In the experiment, event-related potentials (ERPs) were recorded while participants heard frequently sung consonant-vowel syllables and rare stimuli deviating in either consonant identity only, pitch only, or in both dimensions. Every type of deviation elicited a reliable MMN. As expected, the two single-deviant MMNs had similar amplitudes, but that of the double-deviant MMN was also not significantly different from them. This absence of additivity in the double-deviant MMN suggests that consonant and pitch variations are processed, at least at a pre-attentive level, in an integrated rather than independent way. Domain-specificity of consonants may depend on higher-level processes in the hierarchy of speech perception.

15.
A new method and application is proposed to characterize intensity and pitch of human heart sounds and murmurs. Using recorded heart sounds from the library of one of the authors, a visual map of heart sound energy was established. Both normal and abnormal heart sound recordings were studied. Representation is based on Wigner-Ville joint time-frequency transformations. The proposed methodology separates acoustic contributions of cardiac events simultaneously in pitch, time and energy. The resolution accuracy is superior to any other existing spectrogram method. The characteristic energy signature of the innocent heart murmur in a child with the S3 sound is presented. It allows clear detection of S1, S2 and S3 sounds, S2 split, systolic murmur, and intensity of these components. The original signal, heart sound power change with time, time-averaged frequency, energy density spectra and instantaneous variations of power and frequency/pitch with time, are presented. These data allow full quantitative characterization of heart sounds and murmurs. High accuracy in both time and pitch resolution is demonstrated. Resulting visual images have self-referencing quality, whereby individual features and their changes become immediately obvious.

16.
A common approach for determining musical competence is to rely on information about individuals’ extent of musical training, but relying on musicianship status fails to identify musically untrained individuals with musical skill, as well as those who, despite extensive musical training, may not be as skilled. To counteract this limitation, we developed a new test battery (Profile of Music Perception Skills; PROMS) that measures perceptual musical skills across multiple domains: tonal (melody, pitch), qualitative (timbre, tuning), temporal (rhythm, rhythm-to-melody, accent, tempo), and dynamic (loudness). The PROMS has satisfactory psychometric properties for the composite score (internal consistency and test-retest r > .85) and fair to good coefficients for the individual subtests (.56 to .85). Convergent validity was established with the relevant dimensions of Gordon’s Advanced Measures of Music Audiation and Musical Aptitude Profile (melody, rhythm, tempo), the Musical Ear Test (rhythm), and sample instrumental sounds (timbre). Criterion validity was evidenced by consistently sizeable and significant relationships between test performance and external musical proficiency indicators in all three studies (.38 to .62, p < .05 to p < .01). An absence of correlations between test scores and a nonmusical auditory discrimination task supports the battery’s discriminant validity (−.05, ns). The interrelationships among the various subtests could be accounted for by two higher order factors, sequential and sensory music processing. A brief version of the full PROMS is introduced as a time-efficient approximation of the full version of the battery.

17.
This work analyzed the perceptual attributes of natural dynamic audiovisual scenes. We presented thirty participants with 19 natural scenes in a similarity categorization task, followed by a semi-structured interview. The scenes were reproduced with an immersive audiovisual display. Natural scene perception has been studied mainly with unimodal settings, which have identified motion as one of the most salient attributes related to visual scenes, and sound intensity along with pitch trajectories related to auditory scenes. However, controlled laboratory experiments with natural multimodal stimuli are still scarce. Our results show that humans pay attention to similar perceptual attributes in natural scenes, and a two-dimensional perceptual map of the stimulus scenes and perceptual attributes was obtained in this work. The exploratory results show the amount of movement, perceived noisiness, and eventfulness of the scene to be the most important perceptual attributes in naturalistically reproduced real-world urban environments. We found the scene gist properties openness and expansion to remain as important factors in scenes with no salient auditory or visual events. We propose that the study of scene perception should move forward to understand better the processes behind multimodal scene processing in real-world environments. We publish our stimulus scenes as spherical video recordings and sound field recordings in a publicly available database.

18.
The presence of an intonational phrase boundary is often marked by three major acoustic cues: pause, final lengthening, and pitch reset. The present study investigates how these three acoustic cues are weighted in the perception of intonational phrase boundaries in two experiments. Sentences that contained two intonational phrases with a critical boundary between them were used as the experimental stimuli. The roles of the three acoustic cues at the critical boundary were manipulated in five conditions. The first condition featured none of the acoustic cues. The following three conditions featured only one cue each: pause, final lengthening, and pitch reset, respectively. The fifth condition featured both pause duration and pre-final lengthening. A baseline condition was also included in which all three acoustic cues were preserved intact. Listeners were asked to detect the presence of the critical boundaries in Experiment 1 and judge the strength of the critical boundaries in Experiment 2. The results of both experiments showed that listeners used all three acoustic cues in the perception of prosodic boundaries. More importantly, these acoustic cues were weighted differently across the two experiments: Pause was a more powerful perceptual cue than both final lengthening and pitch reset, with the latter two cues perceptually equivalent; the effect of pause and the effects of the other two acoustic cues were not additive. These results suggest that the weighting of acoustic cues contributes significantly to the perceptual differences of intonational phrase boundary.

19.
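The timbre of the calling song of the black cicada (蚱蝉) can be classified as single-timbre, double-timbre, or triple-timbre. This paper further clarifies how each timbre type varies and how high-amplitude pulses affect the energy of the main timbre. Variation in the timbre of the calling song refers mainly to marked changes in the main timbre frequency (MTF) of the spectrum. The MTF of single-timbre calling songs varies mainly within the 4.1 to 5.8 kHz band. In double-timbre calling songs the main and secondary timbre frequencies can swap places, with the MTF lying mainly between 3.6 and 5.4 kHz. In triple-timbre calling songs the MTF falls within a relatively narrow band of 3.5 to 4.5 kHz, but the energies of the three timbre peaks are very close to one another, revealing three timbre components. Within the calling song of a single cicada, different song segments have approximately equal maximum amplitudes but differ in the number of high-amplitude pulses, and the energy of the main timbre corresponds to the number of these pulses.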

20.
Language and music epitomize the complex representational and computational capacities of the human mind. Strikingly similar in their structural and expressive features, a longstanding question is whether the perceptual and cognitive mechanisms underlying these abilities are shared or distinct – either from each other or from other mental processes. One prominent feature shared between language and music is signal encoding using pitch, conveying pragmatics and semantics in language and melody in music. We investigated how pitch processing is shared between language and music by measuring consistency in individual differences in pitch perception across language, music, and three control conditions intended to assess basic sensory and domain-general cognitive processes. Individuals’ pitch perception abilities in language and music were most strongly related, even after accounting for performance in all control conditions. These results provide behavioral evidence, based on patterns of individual differences, that is consistent with the hypothesis that cognitive mechanisms for pitch processing may be shared between language and music.
