Similar Documents
20 similar documents found.
1.
Models of speech production typically assume that control over the timing of speech movements is governed by the selection of higher-level linguistic units, such as segments or syllables. This study used real-time magnetic resonance imaging of the vocal tract to investigate the anticipatory movements speakers make prior to producing a vocal response. Two factors were varied: preparation (whether or not speakers had foreknowledge of the target response) and pre-response constraint (whether or not speakers were required to maintain a specific vocal tract posture prior to the response). In prepared responses, many speakers were observed to produce pre-response anticipatory movements with a variety of articulators, showing that speech movements can be readily dissociated from higher-level linguistic units. Substantial variation was observed across speakers with regard to the articulators used for anticipatory posturing and the contexts in which anticipatory movements occurred. The findings of this study have important consequences for models of speech production and for our understanding of the normal range of variation in anticipatory speech behaviors.

2.
Mochida T, Gomi H, Kashino M. PLoS ONE. 2010;5(11):e13866.

Background

There has been plentiful evidence of kinesthetically induced rapid compensation for unanticipated perturbation in speech articulatory movements. However, the role of auditory information in stabilizing articulation has been little studied except for the control of voice fundamental frequency, voice amplitude and vowel formant frequencies. Although the influence of auditory information on the articulatory control process is evident in unintended speech errors caused by delayed auditory feedback, the direct and immediate effect of auditory alteration on the movements of articulators has not been clarified.

Methodology/Principal Findings

This work examined whether temporal changes in the auditory feedback of bilabial plosives immediately affect the subsequent lip movement. We conducted experiments with an auditory feedback alteration system that enabled us to replace or block speech sounds in real time. Participants were asked to produce the syllable /pa/ repeatedly at a constant rate. During the repetition, normal auditory feedback was interrupted, and one of three pre-recorded syllables /pa/, /ɸa/, or /pi/, spoken by the same participant, was presented once at a timing offset from the anticipated production onset, while no feedback was presented for subsequent repetitions. Comparisons of the labial distance trajectories under altered and normal feedback conditions indicated that the movement quickened during the short period immediately after the alteration onset when /pa/ was presented 50 ms before the expected timing. No significant change was observed under the other feedback conditions tested.

Conclusions/Significance

The earlier articulation rapidly induced by the temporally advanced auditory input suggests that a compensatory mechanism helps to maintain a constant speech rate by detecting errors between the internally predicted and actually provided auditory information associated with self-produced movement. The timing- and context-dependent effects of feedback alteration suggest that this sensory error detection operates within a temporally asymmetric window in which acoustic features of the syllable to be produced may be coded.

3.
The potential role of a size-scaling principle in orofacial movements for speech was examined by using between-group (adults vs. 5-yr-old children) as well as within-group correlational analyses. Movements of the lower lip and jaw were recorded during speech production, and anthropometric measures of orofacial structures were made. Adult women produced speech movements of equal amplitude and velocity to those of adult men. The children produced speech movement amplitudes equal to those of adults, but they had significantly lower peak velocities of orofacial movement. Thus we found no evidence supporting a size-scaling principle for orofacial speech movements. Young children have a relatively large-amplitude, low-velocity movement strategy for speech production compared with young adults. This strategy may reflect the need for more time to plan speech movement sequences and an increased reliance on sensory feedback as young children develop speech motor control processes.

4.
According to a prominent view of sensorimotor processing in primates, selection and specification of possible actions are not sequential operations. Rather, a decision for an action emerges from competition between different movement plans, which are specified and selected in parallel. For action choices which are based on ambiguous sensory input, the frontoparietal sensorimotor areas are considered part of the common underlying neural substrate for selection and specification of action. These areas have been shown capable of encoding alternative spatial motor goals in parallel during movement planning, and show signatures of competitive value-based selection among these goals. Since the same network is also involved in learning sensorimotor associations, competitive action selection (decision making) should not only be driven by the sensory evidence and expected reward in favor of either action, but also by the subject's learning history of different sensorimotor associations. Previous computational models of competitive neural decision making used predefined associations between sensory input and corresponding motor output. Such hard-wiring does not allow modeling of how decisions are influenced by sensorimotor learning or by changing reward contingencies. We present a dynamic neural field model which learns arbitrary sensorimotor associations with a reward-driven Hebbian learning algorithm. We show that the model accurately simulates the dynamics of action selection with different reward contingencies, as observed in monkey cortical recordings, and that it correctly predicted the pattern of choice errors in a control experiment. With our adaptive model we demonstrate how network plasticity, which is required for association learning and adaptation to new reward contingencies, can influence choice behavior. The field model provides an integrated and dynamic account for the operations of sensorimotor integration, working memory and action selection required for decision making in ambiguous choice situations.
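
The abstract names its learning rule but not its form; a minimal sketch of a reward-driven Hebbian update is given below. This is an illustration of the general technique, not the authors' implementation, and every name, size, and parameter in it is an invented assumption.

```python
import numpy as np

def reward_hebbian_update(W, pre, post, reward, expected_reward, lr=0.01):
    """Reward-gated Hebbian rule: weight change is proportional to
    pre/post coactivity, scaled by the reward prediction error."""
    rpe = reward - expected_reward            # reward prediction error
    return W + lr * rpe * np.outer(post, pre)

# Toy task: one sensory cue, two competing motor goals; only goal 0 is
# rewarded, so its association with the cue strengthens over trials.
rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, size=(2, 4))         # 4 sensory cells -> 2 motor goals
cue = np.array([1.0, 0.0, 0.0, 0.0])
for trial in range(200):
    activation = W @ cue + rng.normal(0.0, 0.05, size=2)
    choice = int(np.argmax(activation))       # competitive selection
    reward = 1.0 if choice == 0 else 0.0      # current reward contingency
    post = np.zeros(2); post[choice] = 1.0
    W = reward_hebbian_update(W, cue, post, reward, expected_reward=0.5)
print(W @ cue)                                # goal 0 now dominates for this cue
```

Swapping the reward contingency mid-run re-biases the weights over subsequent trials, which is the kind of adaptation to changing contingencies the full field model is designed to capture.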

5.
Speech production has been studied predominantly from within two traditions, psycholinguistics and motor control. These traditions have rarely interacted, and the resulting chasm between these approaches seems to reflect a level of analysis difference: whereas motor control is concerned with lower-level articulatory control, psycholinguistics focuses on higher-level linguistic processing. However, closer examination of both approaches reveals a substantial convergence of ideas. The goal of this article is to integrate psycholinguistic and motor control approaches to speech production. The result of this synthesis is a neuroanatomically grounded, hierarchical state feedback control model of speech production.

6.
A central challenge for articulatory speech synthesis is the simulation of realistic articulatory movements, which is critical for the generation of highly natural and intelligible speech. This includes modeling coarticulation, i.e., the context-dependent variation of the articulatory and acoustic realization of phonemes, especially of consonants. Here we propose a method to simulate the context-sensitive articulation of consonants in consonant-vowel syllables. To achieve this, the vocal tract target shape of a consonant in the context of a given vowel is derived as the weighted average of three measured and acoustically-optimized reference vocal tract shapes for that consonant in the context of the corner vowels /a/, /i/, and /u/. The weights are determined by mapping the target shape of the given context vowel into the vowel subspace spanned by the corner vowels. The model was applied for the synthesis of consonant-vowel syllables with the consonants /b/, /d/, /g/, /l/, /r/, /m/, /n/ in all combinations with the eight long German vowels. In a perception test, the mean recognition rate for the consonants in the isolated syllables was 82.4%. This demonstrates the potential of the approach for highly intelligible articulatory speech synthesis.
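
The interpolation scheme is compact enough to sketch directly: express the context vowel's shape in the affine subspace spanned by the corner-vowel shapes, then blend the consonant's three reference shapes with the resulting weights. The sketch below follows that description under the assumption of simple shape vectors; the numbers are invented, not measured vocal tract data.

```python
import numpy as np

def corner_vowel_weights(v, a, i, u):
    """Express vowel shape v as an affine combination of the corner-vowel
    shapes: v ~ wa*a + wi*i + wu*u with wa + wi + wu = 1, solved by
    least squares in the basis spanned by (a - u) and (i - u)."""
    B = np.column_stack([a - u, i - u])
    w_ai, *_ = np.linalg.lstsq(B, v - u, rcond=None)
    return np.array([w_ai[0], w_ai[1], 1.0 - w_ai.sum()])

def consonant_target(w, shape_a, shape_i, shape_u):
    """Context-sensitive consonant target: weighted average of the three
    reference shapes of that consonant in /a/, /i/, /u/ context."""
    return w[0] * shape_a + w[1] * shape_i + w[2] * shape_u

# Invented 3-parameter "vocal tract shapes" for the corner vowels.
a = np.array([1.0, 0.0, 0.2])
i = np.array([0.0, 1.0, 0.1])
u = np.array([0.1, 0.1, 1.0])
e = 0.5 * a + 0.5 * i                          # a front vowel between /a/ and /i/
print(corner_vowel_weights(e, a, i, u))        # ~[0.5, 0.5, 0.0]
```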

7.
Speech perception is thought to be linked to speech motor production. This linkage is considered to mediate multimodal aspects of speech perception, such as audio-visual and audio-tactile integration. However, direct coupling between articulatory movement and auditory perception has been little studied. The present study reveals a clear dissociation between the effects of a listener’s own speech action and the effects of viewing another’s speech movements on the perception of auditory phonemes. We assessed the intelligibility of the syllables [pa], [ta], and [ka] when listeners silently and simultaneously articulated syllables that were congruent/incongruent with the syllables they heard. The intelligibility was compared with a condition where the listeners simultaneously watched another’s mouth producing congruent/incongruent syllables, but did not articulate. The intelligibility of [ta] and [ka] was degraded by articulating [ka] and [ta] respectively, which are associated with the same primary articulator (tongue) as the heard syllables, but was not affected by articulating [pa], which is associated with a different primary articulator (lips) from the heard syllables. In contrast, the intelligibility of [ta] and [ka] was degraded by watching the production of [pa]. These results indicate that the articulatory-induced distortion of speech perception occurs in an articulator-specific manner while visually induced distortion does not. The articulator-specific nature of the auditory-motor interaction in speech perception suggests that speech motor processing directly contributes to our ability to hear speech.

8.
9.
Natural rodent grooming and other instinctive behaviors serve as a natural model of complex movement sequences. Rodent grooming comprises both syntactic (rule-driven) sequences and more random movement patterns. Both incorporate the same movements; only the serial structure differs. Recordings of neural activity in the dorsolateral striatum and the substantia nigra pars reticulata indicate preferential activation during syntactic sequences over more random sequences. Neurons that are responsive during syntactic grooming sequences are often unresponsive, or have reversed activation profiles, during kinematically similar movements that occur in flexible or random grooming sequences. Few neurons could be categorized as strictly movement-related; instead, they were activated only in the context of particular sequential patterns of movements. These patterns included "syntactic chain" grooming sequences of paw, head, and body movements and also "warm-up" sequences, which consist of head and body/limb movements that precede locomotion after a period of quiet resting (Golani 1992). Activation during warm-up was less intense and less frequent than during grooming sequences, but both sequences activated neurons above baseline levels, and the same neurons sometimes responded to both sequences. The fact that striatal neurons encode two natural sequences made up of different constituent movements suggests that the basal ganglia may have a generalized role in sequence control. The basal ganglia are modulated by the context of the sequence and may play an executive function in the complex natural patterns of sequenced behavior.

10.
This article describes a neural network model that addresses the acquisition of speaking skills by infants and subsequent motor equivalent production of speech sounds. The model learns two mappings during a babbling phase. A phonetic-to-orosensory mapping specifies a vocal tract target for each speech sound; these targets take the form of convex regions in orosensory coordinates defining the shape of the vocal tract. The babbling process wherein these convex region targets are formed explains how an infant can learn phoneme-specific and language-specific limits on acceptable variability of articulator movements. The model also learns an orosensory-to-articulatory mapping wherein cells coding desired movement directions in orosensory space learn articulator movements that achieve these orosensory movement directions. The resulting mapping provides a natural explanation for the formation of coordinative structures. This mapping also makes efficient use of redundancy in the articulator system, thereby providing the model with motor equivalent capabilities. Simulations verify the model's ability to compensate for constraints or perturbations applied to the articulators automatically and without new learning and to explain contextual variability seen in human speech production. (Supported in part by AFOSR F49620-92-J-0499.)
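
The motor-equivalence claim, that redundancy lets the system absorb constraints on individual articulators without new learning, can be caricatured with a linear toy model. The model in the article learns its directional mapping during babbling; the sketch below instead hand-picks a Jacobian and uses its pseudoinverse, so treat it as an illustration of the principle rather than the model itself.

```python
import numpy as np

def articulator_step(J, oro_error, blocked=None, gain=0.1):
    """Map a desired orosensory-space direction to articulator velocities
    via the Jacobian pseudoinverse. With more articulators than orosensory
    dimensions, a blocked articulator is compensated for automatically."""
    J = J.copy()
    if blocked is not None:
        J[:, blocked] = 0.0                    # perturbation: articulator frozen
    return gain * np.linalg.pinv(J) @ oro_error

# Invented linear forward model: 2 orosensory dims, 4 articulators.
J = np.array([[1.0, 0.5, 0.0, 0.2],
              [0.0, 0.3, 1.0, 0.4]])
x = np.zeros(4)                                # articulator configuration
target = np.array([0.8, -0.3])                 # orosensory target
for _ in range(200):
    x += articulator_step(J, target - J @ x, blocked=1)
print(np.allclose(J @ x, target, atol=1e-3))   # True: goal reached without articulator 1
```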

11.
Tongue movements during speech production have been investigated by means of a simple yet realistic biomechanical model, based on a finite element modeling of soft tissues, in the framework of the equilibrium point hypothesis (λ-model) of motor control. In particular, the model has been applied to the estimation of the “central” control commands issued to the muscles, for a data set of mid-sagittal digitized tracings of vocal tract shape, recorded by means of low-intensity X-ray cineradiographies during speech. In spite of the highly non-linear mapping between the shape of the oral cavity and its acoustic consequences, the organization of control commands preserves the peculiar spatial organization of vowel phonemes in acoustic space. A factor analysis of control commands, which have been decomposed into independent or “orthogonal” muscle groups, has shown that, in spite of the great mobility of the tongue and the highly complex arrangement of tongue muscles, its movements can be explained in terms of the activation of a small number of independent muscle groups, each corresponding to an elementary or “primitive” movement. These results are consistent with the hypothesis that the tongue is controlled by a small number of independent “articulators”, for which a precise biomechanical substrate is provided. The influence of jaw and hyoid movements on tongue equilibrium has also been evaluated, suggesting that the bony structures cannot be considered as a moving frame of reference; indeed, there may be a substantial interaction between them and the tongue that may only be accounted for by a “global” model. The reported results also define a simple control model for the tongue and, in analogy with similar modelling studies, they suggest that, because of the peculiar geometrical arrangement of tongue muscles, the central nervous system (CNS) may not need a detailed representation of tongue mechanics but rather may make use of a relatively small number of muscle synergies that are invariant over the whole space of tongue configurations. Received: 27 August 1996 / Accepted in revised form: 25 February 1997
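
As a stand-in for the factor analysis step, the sketch below extracts a few orthogonal "synergies" from a matrix of control commands with PCA via the SVD. PCA is assumed here for simplicity and is not necessarily the decomposition used in the paper; the data are synthetic.

```python
import numpy as np

def extract_synergies(commands, n_synergies):
    """Decompose a (samples x muscles) command matrix into a small number
    of orthogonal synergies and their activation time courses."""
    X = commands - commands.mean(axis=0)       # center each muscle channel
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    synergies = Vt[:n_synergies]               # muscle-space basis vectors
    activations = X @ synergies.T              # time course of each synergy
    explained = (s[:n_synergies] ** 2).sum() / (s ** 2).sum()
    return synergies, activations, explained

# Synthetic data: 500 samples of 8 "muscle" commands driven by 2 latent factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 8))
data = latent @ mixing + 0.05 * rng.normal(size=(500, 8))
syn, act, ev = extract_synergies(data, n_synergies=2)
print(f"variance explained by 2 synergies: {ev:.3f}")   # close to 1.0
```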

12.
Pei X, Hill J, Schalk G. IEEE Pulse. 2012;3(1):43-46.
From the 1980s movie Firefox to the more recent Avatar, popular science fiction has speculated about the possibility of a person's thoughts being read directly from his or her brain. Such brain-computer interfaces (BCIs) might allow people who are paralyzed to communicate with and control their environment, and there might also be applications in military situations where silent user-to-user communication is desirable. Previous studies have shown that BCI systems can use brain signals related to movements and movement imagery or attention-based character selection. Although these systems have successfully demonstrated the possibility of controlling devices using brain function, directly inferring which word a person intends to communicate has been elusive. A BCI using imagined speech might provide such a practical, intuitive device. Toward this goal, our studies to date addressed two scientific questions: (1) Can brain signals accurately characterize different aspects of speech? (2) Is it possible to predict spoken or imagined words or their components using brain signals?

13.
14.
The study of the production of co-speech gestures (CSGs), i.e., meaningful hand movements that often accompany speech during everyday discourse, provides an important opportunity to investigate the integration of language, action, and memory because of the semantic overlap between gesture movements and speech content. Behavioral studies of CSGs and speech suggest that they have a common base in memory and predict that overt production of both speech and CSGs would be preceded by neural activity related to memory processes. However, to date the neural correlates and timing of CSG production are still largely unknown. In the current study, we addressed these questions with magnetoencephalography and a semantic association paradigm in which participants overtly produced speech or gesture responses that were either meaningfully related to a stimulus or not. Using spectral and beamforming analyses to investigate the neural activity preceding the responses, we found a desynchronization in the beta band (15–25 Hz), which originated 900 ms prior to the onset of speech and was localized to motor and somatosensory regions in the cortex and cerebellum, as well as right inferior frontal gyrus. Beta desynchronization is often seen as an indicator of motor processing and thus reflects motor activity related to the hand movements that gestures add to speech. Furthermore, our results show oscillations in the high gamma band (50–90 Hz), which originated 400 ms prior to speech onset and were localized to the left medial temporal lobe. High gamma oscillations have previously been found to be involved in memory processes and we thus interpret them to be related to contextual association of semantic information in memory. The results of our study show that high gamma oscillations in medial temporal cortex play an important role in the binding of information in human memory during speech and CSG production.
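
A standard way to quantify band-limited (de)synchronization of the kind reported here is a band-pass filter followed by a Hilbert amplitude envelope. The sketch below assumes that generic pipeline rather than the authors' beamforming analysis; the signal and timings are synthetic, and the same function covers other bands (e.g., high gamma) by changing the corner frequencies.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_power_envelope(x, fs, lo=15.0, hi=25.0, order=4):
    """Band power over time: band-pass filter, then squared Hilbert
    amplitude. Desynchronization appears as a drop relative to baseline."""
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return np.abs(hilbert(filtfilt(b, a, x))) ** 2

# Synthetic signal: a 20 Hz rhythm whose amplitude drops 900 ms before
# "speech onset" at t = 0, mimicking pre-response beta desynchronization.
fs = 1000
t = np.arange(-2.0, 1.0, 1.0 / fs)
amp = np.where(t < -0.9, 1.0, 0.3)
rng = np.random.default_rng(2)
x = amp * np.sin(2 * np.pi * 20 * t) + 0.1 * rng.normal(size=t.size)
power = band_power_envelope(x, fs)
baseline = power[t < -1.5].mean()
print(f"beta power just before onset / baseline: "
      f"{power[(t > -0.5) & (t < 0)].mean() / baseline:.2f}")   # well below 1
```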

15.
Existing theories of movement planning suggest that it takes time to select and prepare the actions required to achieve a given goal. These theories often appeal to circumstances where planning apparently goes awry. For instance, if reaction times are forced to be very low, movement trajectories are often directed between two potential targets. These intermediate movements are generally interpreted as errors of movement planning, arising either from planning being incomplete or from parallel movement plans interfering with one another. Here we present an alternative view: that intermediate movements reflect uncertainty about movement goals. We show how intermediate movements are predicted by an optimal feedback control model that incorporates an ongoing decision about movement goals. According to this view, intermediate movements reflect an exploitation of compatibility between goals. Consequently, reducing the compatibility between goals should reduce the incidence of intermediate movements. In human subjects, we varied the compatibility between potential movement goals in two distinct ways: by varying the spatial separation between targets and by introducing a virtual barrier constraining trajectories to the target and penalizing intermediate movements. In both cases we found that decreasing goal compatibility led to a decreasing incidence of intermediate movements. Our results and theory suggest a more integrated view of decision-making and movement planning in which the primary bottleneck to generating a movement is deciding upon task goals. Determining how to move to achieve a given goal is rapid and automatic.
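
One way to make the "uncertain goals" account concrete: for a quadratic endpoint cost, the aim point that minimizes expected cost is the probability-weighted average of the candidate targets, so an intermediate movement is simply the optimal initial aim while the goal is undecided. The sketch below shows only this averaging step, not the paper's full optimal feedback control model.

```python
import numpy as np

def interim_aim(targets, probs):
    """Expected-cost-minimizing aim under goal uncertainty (quadratic
    endpoint cost): the probability-weighted average of the targets."""
    return (np.asarray(probs)[:, None] * np.asarray(targets, float)).sum(axis=0)

targets = np.array([[-5.0, 20.0], [5.0, 20.0]])   # two potential reach goals
print(interim_aim(targets, [0.5, 0.5]))           # [0, 20]: between the goals
print(interim_aim(targets, [0.9, 0.1]))           # close to the likely goal
```

Widening the target separation or penalizing the region between the goals (the paper's virtual barrier) makes the averaged aim point costly, which is exactly why those manipulations should reduce intermediate movements.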

16.
Methods for modeling sets of complex curves where the curves must be aligned in time (or in another continuous predictor) fall into the general class of functional data analysis and include self-modeling regression and time-warping procedures. Self-modeling regression (SEMOR), also known as a shape invariant model (SIM), assumes the curves have a common shape, modeled nonparametrically, and curve-specific differences in amplitude and timing, traditionally modeled by linear transformations. When curves contain multiple features that need to be aligned in time, SEMOR may be inadequate since a linear time transformation generally cannot align more than one feature. Time warping procedures focus on timing variability and on finding flexible time warps to align multiple data features. We draw on these methods to develop a SIM that models the time transformations as random, flexible, monotone functions. The model is motivated by speech movement data from the University of Wisconsin X-ray microbeam speech production project and is applied to these data to test the effect of different speaking conditions on the shape and relative timing of movement profiles.
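
The model structure, a common shape composed with curve-specific amplitude, offset, and a flexible monotone time warp, can be written down directly. In the sketch below, monotonicity is enforced by building the warp from cumulative positive increments, a standard construction; the shape function and all parameter values are illustrative.

```python
import numpy as np

def monotone_warp(t, log_increments):
    """Flexible monotone time transformation on [0, 1]: cumulative sums of
    positive increments, rescaled so that w(0) = 0 and w(1) = 1."""
    inc = np.exp(log_increments)               # positivity => monotonicity
    w = np.concatenate([[0.0], np.cumsum(inc)])
    w /= w[-1]
    return np.interp(t, np.linspace(0.0, 1.0, w.size), w)

def sim_curve(t, shape, amp, offset, log_increments):
    """Shape-invariant model: offset + amp * f(w(t)), where f is the common
    shape and w is the curve-specific monotone warp."""
    return offset + amp * shape(monotone_warp(t, log_increments))

t = np.linspace(0.0, 1.0, 200)
shape = lambda s: np.sin(2 * np.pi * s) + 0.5 * np.sin(4 * np.pi * s)  # two features
rng = np.random.default_rng(3)
curve = sim_curve(t, shape, amp=1.2, offset=0.3,
                  log_increments=rng.normal(0.0, 0.3, size=20))
```

Because the warp is nonlinear, the two features of the common shape can be shifted by different amounts, which is precisely what a linear time transformation cannot do.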

17.
The differentiation of discrete and continuous movement is one of the pillars of motor behavior classification. Discrete movements have a definite beginning and end, whereas continuous movements do not have such discriminable end points. In the past decade there has been vigorous debate over whether this classification implies different control processes. To date, this debate has been empirically based. Here, we present an unambiguous non-empirical classification based on theorems in dynamical systems theory that sets discrete and continuous movements apart. Through computational simulations of representative modes of each class and topological analysis of the flow in state space, we show that distinct control mechanisms underlie discrete and fast rhythmic movements. In particular, we demonstrate that discrete movements require a timekeeper while fast rhythmic movements do not. We validate our computational findings experimentally using a behavioral paradigm in which human participants performed finger flexion-extension movements at various movement paces and under different instructions. Our results demonstrate that the human motor system employs different timing control mechanisms (presumably via differential recruitment of neural subsystems) to accomplish varying behavioral functions such as meeting speed constraints.
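
The two representative modes are easy to simulate: a damped spring whose flow has a fixed-point attractor at the target (discrete) and a van der Pol oscillator whose flow has a stable limit cycle and therefore keeps its own time (rhythmic). Parameters below are illustrative, not those of the paper.

```python
import numpy as np

def simulate(flow, x0, dt=0.001, steps=4000):
    """Euler-integrate a 2-D flow over (position, velocity)."""
    x = np.array(x0, dtype=float)
    traj = np.empty((steps, 2))
    for k in range(steps):
        traj[k] = x
        x = x + dt * flow(x)
    return traj

def point_attractor(x, target=1.0, k=100.0, b=20.0):
    """Discrete mode: damped spring; every trajectory ends at the target."""
    pos, vel = x
    return np.array([vel, -k * (pos - target) - b * vel])

def limit_cycle(x, omega=10.0, gamma=5.0):
    """Rhythmic mode: van der Pol oscillator; self-sustained oscillation
    that needs no external timekeeper."""
    pos, vel = x
    return np.array([vel, -omega**2 * pos + gamma * (1.0 - pos**2) * vel])

discrete = simulate(point_attractor, [0.0, 0.0])   # settles at the target
rhythmic = simulate(limit_cycle, [0.1, 0.0])       # converges to a stable cycle
```

Topologically, the first flow has only a fixed point, so initiating each movement requires an external timing signal; the second has a closed orbit, so timing is intrinsic to the dynamics. That is the distinction the theorems formalize.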

18.
Different kinds of articulators, such as the upper and lower lips, jaw, and tongue, are precisely coordinated in speech production. Based on a perturbation study of the production of a fricative consonant using the upper and lower lips, it has been suggested that increasing the stiffness of the muscle linkage between the upper lip and jaw is beneficial for maintaining the constriction area between the lips (Gomi et al. 2002). This hypothesis is crucial for examining the mechanism of speech motor control, that is, whether mechanical impedance is controlled for speech motor coordination. To test this hypothesis, in the current study we performed a dynamical simulation of lip compensatory movements based on a muscle linkage model and then evaluated the performance of the compensatory movements. The temporal pattern of muscle-linkage stiffness was obtained from the electromyogram (EMG) of the orbicularis oris superior (OOS) muscle by using a temporal transformation (second-order dynamics with time delay) from EMG to stiffness, whose parameters were experimentally determined. The dynamical simulation using stiffness estimated from empirical EMG successfully reproduced the temporal profile of the upper lip compensatory articulations. Moreover, the estimated stiffness variation contributed significantly to reproducing a functional modulation of the compensatory response. This result supports the idea that mechanical impedance contributes substantially to organizing coordination among the lips and jaw. The motor command would thus be programmed not only to generate movement in each articulator but also to regulate mechanical impedance among articulators for robust coordination in speech motor control.
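
The named transformation, second-order dynamics with a pure time delay, can be integrated directly. In the sketch below the gain, natural frequency, damping, and delay are placeholders, not the experimentally determined parameters from the study.

```python
import numpy as np

def emg_to_stiffness(emg, fs, gain=1.0, wn=2 * np.pi * 5.0, zeta=1.0, delay_s=0.05):
    """Second-order dynamics with time delay (forward-Euler integration):
    K'' + 2*zeta*wn*K' + wn^2*K = gain * wn^2 * emg(t - delay)."""
    dt = 1.0 / fs
    d = int(round(delay_s * fs))
    K, dK = 0.0, 0.0
    out = np.zeros(emg.size)
    for n in range(emg.size):
        u = emg[n - d] if n >= d else 0.0      # delayed EMG drive
        ddK = gain * wn**2 * u - 2.0 * zeta * wn * dK - wn**2 * K
        dK += dt * ddK
        K += dt * dK
        out[n] = K
    return out

fs = 1000
emg = np.zeros(1000)
emg[200:400] = 1.0                             # a 200-ms burst of OOS activity
stiffness = emg_to_stiffness(emg, fs)          # delayed, smoothed stiffness rise
```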

19.
Transcranial magnetic stimulation (TMS) has proven to be a useful tool in investigating the role of the articulatory motor cortex in speech perception. Researchers have used single-pulse and repetitive TMS to stimulate the lip representation in the motor cortex. The excitability of the lip motor representation can be investigated by applying single TMS pulses over this cortical area and recording TMS-induced motor evoked potentials (MEPs) via electrodes attached to the lip muscles (electromyography; EMG). Larger MEPs reflect increased cortical excitability. Studies have shown that excitability increases during listening to speech as well as during viewing speech-related movements. TMS can be used also to disrupt the lip motor representation. A 15-min train of low-frequency sub-threshold repetitive stimulation has been shown to suppress motor excitability for a further 15-20 min. This TMS-induced disruption of the motor lip representation impairs subsequent performance in demanding speech perception tasks and modulates auditory-cortex responses to speech sounds. These findings are consistent with the suggestion that the motor cortex contributes to speech perception. This article describes how to localize the lip representation in the motor cortex and how to define the appropriate stimulation intensity for carrying out both single-pulse and repetitive TMS experiments.
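
On the analysis side, excitability is usually quantified as the peak-to-peak MEP amplitude in a fixed window after each TMS pulse. A minimal sketch follows; the window bounds are illustrative assumptions, since the article does not fix them here.

```python
import numpy as np

def mep_amplitude(emg, fs, pulse_idx, win_ms=(10, 50)):
    """Peak-to-peak MEP amplitude in a post-pulse window (ms). Lip-muscle
    MEP latencies are short, so the window opens soon after the pulse."""
    lo = pulse_idx + int(win_ms[0] * fs / 1000)
    hi = pulse_idx + int(win_ms[1] * fs / 1000)
    seg = emg[lo:hi]
    return seg.max() - seg.min()

# Averaging mep_amplitude over trials per condition (e.g., listening to
# speech vs. rest) gives the excitability comparison described above.
```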

20.
Yamazaki T, Nagao S. PLoS ONE. 2012;7(3):e33319.
Precise gain and timing control is the goal of cerebellar motor learning. Because the basic neural circuitry of the cerebellum is homogeneous throughout the cerebellar cortex, a single computational mechanism may be used for simultaneous gain and timing control. Although many computational models of the cerebellum have been proposed for either gain or timing control, few models have aimed to unify them. In this paper, we hypothesize that gain and timing control can be unified by learning of the complete waveform of the desired movement profile instructed by climbing fiber signals. To justify our hypothesis, we adapted a large-scale spiking network model of the cerebellum, originally developed to explain cerebellar timing mechanisms in Pavlovian delay eyeblink conditioning, to the gain adaptation of optokinetic response (OKR) eye movements. By conducting large-scale computer simulations, we could reproduce several features of OKR adaptation, such as the learning-related change in simple spike firing of model Purkinje cells and vestibular nuclear neurons, the simulated gain increase, and the frequency-dependent gain increase. These results suggest that the cerebellum may use a single computational mechanism to control gain and timing simultaneously.
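
The unifying hypothesis, learning the complete desired waveform from climbing-fiber error signals, can be caricatured with a rate-based stand-in for the spiking network: a weighted sum over a temporal basis (playing the role of granule-cell activity) is adjusted until the output matches a target waveform that carries both gain and timing information. Everything below is a deliberate simplification, not the authors' model.

```python
import numpy as np

def cf_learning(basis, desired, epochs=500, lr=2.0):
    """Adjust parallel-fiber weights so the output traces the full desired
    waveform; the climbing-fiber signal acts as the pointwise error."""
    w = np.zeros(basis.shape[0])
    for _ in range(epochs):
        error = desired - w @ basis            # climbing fiber teaching signal
        w += lr * basis @ error / basis.shape[1]
    return w

t = np.linspace(0.0, 1.0, 500)
centers = np.linspace(0.0, 1.0, 30)            # Gaussian bumps tiling the trial
basis = np.exp(-((t[None, :] - centers[:, None]) ** 2) / (2 * 0.03**2))
desired = 0.8 * np.sin(2 * np.pi * t) * (t < 0.5)   # encodes gain and timing
w = cf_learning(basis, desired)
print(f"max waveform error: {np.abs(w @ basis - desired).max():.3f}")
```

Because the learned waveform fixes both how large and when the output is, gain control and timing control fall out of the same update rule, which is the point of the hypothesis.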
