Similar articles
 20 similar articles found (search time: 828 ms)
1.
The perception of a letter in the context of a word is easier than in the context of a random letter sequence. It appears that our knowledge about words can influence our perception process. McClelland and Rumelhart (1981) propose an interactive activation model to account for the interaction between our knowledge about words and our visual input. They use their model to explain how these interactions facilitate perception. In their account, the word context effect is a constant independent of the identity of the words. In this paper, we propose the use of information theory to quantify the word context effect. In this way, the strength of the word context effect will depend on the identity of the words. We apply the method to quantify the word context effect in Chinese words. This knowledge is encoded in an artificial neural network using the interactive activation and competition model. The network is used to recognize Chinese characters and we are able to achieve a high recognition rate.

2.
Herbert C, Kübler A. PLoS One 2011, 6(10): e25574
The present study investigated event-related brain potentials elicited by true and false negated statements to evaluate whether discrimination of the truth value of negated information relies on conscious processing and requires higher-order cognitive processing in healthy subjects across different levels of stimulus complexity. The stimulus material consisted of true and false negated sentences (sentence level) and prime-target expressions (word level). Stimuli were presented acoustically and no overt behavioral response of the participants was required. Event-related brain potentials to target words preceded by true and false negated expressions were analyzed both within group and at the single-subject level. Across the different processing conditions (word pairs and sentences), target words elicited a frontal negativity and a late positivity in the time window from 600-1000 ms post target word onset. Amplitudes of both brain potentials varied as a function of the truth value of the negated expressions. Results were confirmed at the single-subject level. In sum, our results support recent suggestions according to which evaluation of the truth value of a negated expression is a time- and cognitively demanding process that cannot be solved automatically, and thus requires conscious processing. Our paradigm provides insight into higher-order processing related to language comprehension and reasoning in healthy subjects. Future studies are needed to evaluate whether our paradigm also proves sensitive for the detection of consciousness in non-responsive patients.

3.
According to the complementary learning systems (CLS) account of word learning, novel words are rapidly acquired (learning system 1), but slowly integrated into the mental lexicon (learning system 2). This two-step learning process has been shown to apply to novel word forms. In this study, we investigated whether novel word meanings are also gradually integrated after acquisition by measuring the extent to which newly learned words were able to prime semantically related words at two different time points. In addition, we investigated whether modality at study modulates this integration process. Sixty-four adult participants studied novel words together with written or spoken definitions. These words did not prime semantically related words directly following study, but did so after a 24-hour delay. This significant increase in the magnitude of the priming effect suggests that semantic integration occurs over time. Overall, words that were studied with a written definition showed larger priming effects, suggesting greater integration for the written study modality. Although the process of integration, reflected as an increase in the priming effect over time, did not significantly differ between study modalities, words studied with a written definition showed the most prominent positive effect after a 24-hour delay. Our data suggest that semantic integration requires time, and that studying in written format benefits semantic integration more than studying in spoken format. These findings are discussed in light of the CLS theory of word learning.

4.
Language as an evolving word web. (Cited by: 4; self-citations: 0; citations by others: 4)
Human language may be described as a complex network of linked words. In such a treatment, each distinct word in language is a vertex of this web, and interacting words in sentences are connected by edges. The empirical distribution of the number of connections of words in this network is of a peculiar form that includes two pronounced power-law regions. Here we propose a theory of the evolution of language, which treats language as a self-organizing network of interacting words. In the framework of this concept, we completely describe the observed word web structure without any fitting. We show that the two regimes in the distribution naturally emerge from the evolutionary dynamics of the word web. It follows from our theory that the size of the core part of language, the 'kernel lexicon', does not vary as language evolves.
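The network view in this abstract can be made concrete with a small sketch. It builds a word web from a few toy sentences and tabulates the degree distribution. The abstract does not say what counts as "interacting" words, so linking adjacent words is an assumption made here for illustration, and the function names and toy sentences are hypothetical.

```python
from collections import Counter, defaultdict


def build_word_web(sentences):
    """Link words that appear next to each other in a sentence.

    Adjacency is an assumed stand-in for "interacting words";
    the abstract does not define the interaction precisely.
    """
    neighbors = defaultdict(set)
    for sentence in sentences:
        words = sentence.lower().split()
        for left, right in zip(words, words[1:]):
            if left != right:
                neighbors[left].add(right)
                neighbors[right].add(left)
    return neighbors


def degree_distribution(neighbors):
    """Map each degree k to the number of words with exactly k distinct neighbors."""
    return Counter(len(linked) for linked in neighbors.values())


# Toy corpus (hypothetical); a real study would use a large text collection.
toy_sentences = ["the cat sat", "the dog sat", "a cat ran"]
web = build_word_web(toy_sentences)
distribution = degree_distribution(web)
```

On a large corpus, plotting `distribution` on log-log axes is the usual way to look for power-law regions such as the two regimes the abstract mentions.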

5.
WSE, a new sequence distance measure based on word frequencies (Cited by: 1; self-citations: 0; citations by others: 1)
In this article, we present a new distance metric, the Weighted Sequence Entropy (WSE), based on the short word composition of biological sequences. As a revision of the classical relative entropy (RE), our metric (1) works equivalently with RE in the case of small k, and (2) avoids the degeneracy when some word types are absent in one sequence but not in the other. Experiments on 25 viruses including SARS-CoVs show that our method and RE give exactly the same phylogenetic tree when word length k ≤ 3. When k > 3, our method still works and gets convergent phylogenetic topology but the RE gives degenerate results.
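The degeneracy of relative entropy mentioned in this abstract can be illustrated with a short sketch. It computes k-word frequencies for two toy sequences and evaluates the classical relative entropy (Kullback-Leibler divergence), which becomes infinite as soon as a word present in one sequence is absent from the other. The WSE formula itself is not given in the abstract and is not reproduced here; the sequences and function names are hypothetical.

```python
import math
from collections import Counter


def kmer_frequencies(sequence, k):
    """Relative frequencies of all overlapping words of length k."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def relative_entropy(p, q):
    """Sum of p(w) * log2(p(w) / q(w)); infinite if q lacks a word that p contains."""
    total = 0.0
    for word, p_w in p.items():
        q_w = q.get(word, 0.0)
        if q_w == 0.0:
            return math.inf  # the degenerate case described in the abstract
        total += p_w * math.log2(p_w / q_w)
    return total


# Toy sequences (hypothetical), chosen so that they share single-letter
# composition but not all 3-letter words.
seq_a = "ACGTACGTAC"
seq_b = "ACGTTGCAAC"
re_k1 = relative_entropy(kmer_frequencies(seq_a, 1), kmer_frequencies(seq_b, 1))
re_k3 = relative_entropy(kmer_frequencies(seq_a, 3), kmer_frequencies(seq_b, 3))
```

With k = 1 both sequences have the same letter composition, so the divergence is finite; with k = 3 the word GTA occurs only in `seq_a`, so the divergence is infinite. Longer words make such absences more likely, which is consistent with the abstract's remark about k > 3.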

6.
We conduct a detailed investigation of the relationship among the obesity rate of urban areas and expressions of happiness, diet and physical activity on social media. We do so by analyzing a massive, geo-tagged data set comprising over 200 million words generated over the course of 2012 and 2013 on the social network service Twitter. Among many results, we show that areas with lower obesity rates: (1) have happier tweets and frequently discuss (2) food, particularly fruits and vegetables, and (3) physical activities of any intensity. Additionally, we provide evidence that each of these results offers different and unique insight into the variation of the obesity rate in urban areas within the United States. Our work shows how the contents of social media may potentially be used to estimate real-time, population-scale measures of factors related to obesity.

7.

Background

Word frequency is the most important variable in language research. However, despite the growing interest in the Chinese language, there are only a few sources of word frequency measures available to researchers, and the quality is less than what researchers in other languages are used to.

Methodology

Following recent work by New, Brysbaert, and colleagues in English, French and Dutch, we assembled a database of word and character frequencies based on a corpus of film and television subtitles (46.8 million characters, 33.5 million words). In line with what has been found in the other languages, the new word and character frequencies explain significantly more of the variance in Chinese word naming and lexical decision performance than measures based on written texts.

Conclusions

Our results confirm that word frequencies based on subtitles are a good estimate of daily language exposure and capture much of the variance in word processing efficiency. In addition, our database is the first to include information about the contextual diversity of the words and to provide good frequency estimates for multi-character words and the different syntactic roles in which the words are used. The word frequencies are freely available for research purposes.
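Two of the measures named in this abstract, word frequency and contextual diversity, can be sketched in a few lines. The example assumes the subtitles have already been segmented into words and grouped by film; it then reports frequency per million words and the share of films in which each word occurs. The data layout, function name and toy tokens are illustrative assumptions, not the database's actual format.

```python
from collections import Counter


def frequency_norms(documents):
    """Compute per-million frequency and contextual diversity for each word.

    documents: list of token lists, one list per film or episode
    (assumed to be segmented into words already).
    """
    token_counts = Counter()
    document_counts = Counter()
    for tokens in documents:
        token_counts.update(tokens)
        document_counts.update(set(tokens))  # count each film at most once
    total_tokens = sum(token_counts.values())
    return {
        word: {
            "per_million": count * 1_000_000 / total_tokens,
            "contextual_diversity": document_counts[word] / len(documents),
        }
        for word, count in token_counts.items()
    }


# Three toy "films" (hypothetical data).
toy_films = [
    ["我", "喜欢", "电影"],
    ["我", "看", "电影", "电影"],
    ["你", "好"],
]
norms = frequency_norms(toy_films)
```

Here "电影" occurs three times among nine tokens but in only two of the three films, which shows how raw frequency and contextual diversity can diverge for the same word.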

8.
We propose a new model-based approach linking word learning to the age of acquisition (AoA) of words, a new computational tool for understanding the relationships among word learning processes, psychological attributes, and word AoAs as measures of vocabulary growth. The computational model developed describes the distinct statistical relationships between three theoretical factors underpinning word learning and AoA distributions. Simply put, this model formulates how different learning processes, characterized by change in learning rate over time and/or by the number of exposures required to acquire a word, likely result in different AoA distributions depending on word type. We tested the model in three respects. The first analysis showed that the proposed model accounts for empirical AoA distributions better than a standard alternative. The second analysis demonstrated that the estimated learning parameters predicted the psychological attributes of words, such as frequency and imageability, well. The third analysis illustrated that the developmental trend predicted by our estimated learning parameters was consistent with relevant findings in the developmental literature on word learning in children. We further discuss the theoretical implications of our model-based approach.

9.
What is abstraction? In our view, abstraction is generalization. Specifically, we propose that abstract concepts emerge as the natural product of associative learning and generalization by similarity. We support this proposal by presenting evidence for two ideas: first, that children's knowledge about how categories are organized and how words refer to them can be explained as learned generalizations over specific experiences of words referring to categories; and second, that the path of concepts from concrete to more abstract can be observed throughout development and that even in their more abstract form, concepts retain some of their original sensory basis. We illustrate these two facts by examining, in two kinds of learners (networks and young children), the development of three abstract ideas: (i) the idea of word; (ii) the idea of object; and (iii) the idea of substance.

10.
Biological macromolecules such as DNA, RNA, and proteins can be regarded as finite sequences of symbols (or words) over a finite alphabet. In this paper, we refer to DNA (RNA) sequences, which are words on a four-letter alphabet. A comparison is made between some "genes", or fragments of them, with random sequences or random reshuffled sequences on the same alphabet and having the same length. Some combinatorial techniques of analysis of finite words are developed. A crucial role in the comparison is played by the so-called special factors of a given word. In all the analysed DNA (RNA) fragments the distribution on the length of the number of right (left) special factors differs, in a very typical way, from the corresponding distribution in a string on the same alphabet and having the same length generated by a random source or obtained by making a random alteration (=shuffling) of the original string. This kind of change is irrespective of the length in the range that we have considered (<2650 bp) and of the phylogenetic origin of the fragment.
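The special factors mentioned in this abstract have a standard definition in combinatorics on words: a factor u of a word w is right special if ua and ub both occur in w for two distinct letters a and b, and left special if au and bu both occur. A minimal sketch under that definition follows; the function names and the toy sequence are hypothetical, and the paper's own counting procedure may differ.

```python
from collections import defaultdict


def right_special_factors(word, length):
    """Factors of the given length that are followed by at least two distinct letters."""
    followers = defaultdict(set)
    for i in range(len(word) - length):
        followers[word[i:i + length]].add(word[i + length])
    return {factor for factor, letters in followers.items() if len(letters) >= 2}


def left_special_factors(word, length):
    """Factors preceded by at least two distinct letters, found by mirroring the word."""
    mirrored = right_special_factors(word[::-1], length)
    return {factor[::-1] for factor in mirrored}


def special_factor_profile(word, max_length):
    """Number of right special factors for each length from 1 to max_length."""
    return [len(right_special_factors(word, n)) for n in range(1, max_length + 1)]


# Toy DNA fragment (hypothetical).
fragment = "ACGACT"
profile = special_factor_profile(fragment, 3)
```

Comparing such a profile for a real fragment with the profile of a shuffled copy of the same fragment is one way to reproduce the kind of comparison the abstract describes.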

11.
This article aims at investigating the linguistic criteria to determine what a word is in Wichi (Matacoan), a polysynthetic and agglutinative language spoken in the Gran Chaco Region, in South America. The main phonological criteria proposed are phonological rules and stress. We also apply some grammatical criteria that have been proposed cross-linguistically, some of which are useful to determine the boundaries of grammatical words in Wichi. Finally, we explore the relationship between the phonological and grammatical word with the written word. We base our analysis of written words on a textbook (Tsalanawu) used in many bilingual schools in Northeastern Argentina.

12.
Rapid identification of facial expressions can profoundly affect social interactions, yet most research to date has focused on static rather than dynamic expressions. In four experiments, we show that when a non-expressive face becomes expressive, happiness is detected more rapidly than anger. When the change occurs peripheral to the focus of attention, however, dynamic anger is better detected when it appears in the left visual field (LVF), whereas dynamic happiness is better detected in the right visual field (RVF), consistent with hemispheric differences in the processing of approach- and avoidance-relevant stimuli. The central advantage for happiness is nevertheless the more robust effect, persisting even when information of either high or low spatial frequency is eliminated. Indeed, a survey of past research on the visual search for emotional expressions finds better support for a happiness detection advantage, and the explanation may lie in the coevolution of the signal and the receiver.

13.
Patterns of word use both reflect and influence a myriad of human activities and interactions. Like other entities that are reproduced and evolve, words rise or decline depending upon a complex interplay between their intrinsic properties and the environments in which they function. Using Internet discussion communities as model systems, we define the concept of a word niche as the relationship between the word and the characteristic features of the environments in which it is used. We develop a method to quantify two important aspects of the size of the word niche: the range of individuals using the word and the range of topics it is used to discuss. Controlling for word frequency, we show that these aspects of the word niche are strong determinants of changes in word frequency. Previous studies have already indicated that word frequency itself is a correlate of word success at historical time scales. Our analysis of changes in word frequencies over time reveals that the relative sizes of word niches are far more important than word frequencies in the dynamics of the entire vocabulary at shorter time scales, as the language adapts to new concepts and social groupings. We also distinguish endogenous versus exogenous factors as additional contributors to the fates of words, and demonstrate the force of this distinction in the rise of novel words. Our results indicate that short-term nonstationarity in word statistics is strongly driven by individual proclivities, including inclinations to provide novel information and to project a distinctive social identity.

14.
The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than, motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs.
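To show what modeling a motif with a full PWM means in practice, the sketch below scores every window of a sequence against a small position weight matrix using log-odds against a uniform background, then reports the best-scoring window. This is a generic PWM scan, not the MotifSpec algorithm; the matrix values, the background of 0.25 and the function names are assumptions for illustration.

```python
import math


def log_odds_score(pwm, window, background=0.25):
    """Sum of log2(p_position(base) / background) over the window.

    pwm is a list of dicts, one per motif position, mapping base -> probability.
    Probabilities are assumed to be non-zero (e.g. after adding pseudocounts).
    """
    return sum(math.log2(column[base] / background) for column, base in zip(pwm, window))


def best_match(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    scored = [
        (log_odds_score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)


# Hypothetical 3-column PWM that prefers the word ACG.
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
score, start = best_match(toy_pwm, "TTACGTT")
```

A classifier in the spirit of the abstract would call a sequence bound when its best score exceeds a threshold; how that threshold is learned is specific to the paper and is not sketched here.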

15.
In this paper, we propose two metrics to compare DNA and protein sequences based on a Poisson model of word occurrences. Instead of comparing the frequencies of all fixed-length words in two sequences, we consider (1) the probability of ‘generating’ one sequence under the Poisson model estimated from the other; (2) their different expression levels of words. Phylogenetic trees of 25 viruses including SARS-CoVs are constructed to illustrate our approach.
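One plausible reading of the first metric in this abstract can be sketched as follows: estimate a Poisson rate for each k-letter word from a reference sequence, then evaluate the log-likelihood of the word counts observed in a target sequence. The abstract does not give the exact definition, so the per-window rate, the pseudocount of 0.5 and all names below are assumptions for illustration only.

```python
import math
from collections import Counter
from itertools import product


def word_counts(sequence, k):
    """Counts of all overlapping words of length k."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def poisson_log_likelihood(target, reference, k, alphabet="ACGT", pseudocount=0.5):
    """Log-likelihood of the target's word counts under rates estimated from the reference.

    The pseudocount (an assumption) keeps every rate positive, so the
    logarithm below is always defined.
    """
    ref_counts = word_counts(reference, k)
    tgt_counts = word_counts(target, k)
    ref_windows = len(reference) - k + 1
    tgt_windows = len(target) - k + 1
    total = 0.0
    for letters in product(alphabet, repeat=k):
        word = "".join(letters)
        rate = (ref_counts[word] + pseudocount) / ref_windows  # occurrences per window
        mean = rate * tgt_windows
        observed = tgt_counts[word]
        # log of the Poisson probability mass function
        total += observed * math.log(mean) - mean - math.lgamma(observed + 1)
    return total


# Toy sequences (hypothetical).
seq_x = "ACGTACGTACGT"
seq_y = "AAAAAAAAAAAA"
ll_self = poisson_log_likelihood(seq_x, seq_x, 2)
ll_other = poisson_log_likelihood(seq_x, seq_y, 2)
```

A dissimilarity for tree building could be derived from such log-likelihoods, for example by symmetrizing over the two directions, but that step is a further assumption beyond what the abstract states.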

16.
Kim AS, Binns MA, Alain C. PLoS One 2012, 7(4): e34856
Although many types of learning require associations to be formed, little is known about the brain mechanisms engaged in association formation. In the present study, we measured event-related potentials (ERPs) while participants studied pairs of semantically related words, with each word of a pair presented sequentially. To narrow in on the associative component of the signal, the ERP difference between the first and second words of a pair (Word2-Word1) was derived separately for subsequently recalled and subsequently not-recalled pairs. When the resulting difference waveforms were contrasted, a parietal positivity was observed for subsequently recalled pairs around 460 ms after the word presentation onset, followed by a positive slow wave that lasted until around 845 ms. Together these results suggest that associations formed between semantically related words are correlated with a specific neural signature that is reflected in scalp recordings over the parietal region.

17.
Protein point mutations are an essential component of the evolutionary and experimental analysis of protein structure and function. While many manually curated databases attempt to index point mutations, most experimentally generated point mutations and the biological impacts of the changes are described in the peer-reviewed published literature. We describe an application, Mutation GraB (Graph Bigram), that identifies, extracts, and verifies point mutations from biomedical literature. The principal problem of point mutation extraction is to link the point mutation with its associated protein and organism of origin. Our algorithm uses a graph-based bigram traversal to identify these relevant associations and exploits the Swiss-Prot protein database to verify this information. The graph bigram method is different from other models for point mutation extraction in that it incorporates frequency and positional data of all terms in an article to drive the point mutation–protein association. Our method was tested on 589 articles describing point mutations from the G protein–coupled receptor (GPCR), tyrosine kinase, and ion channel protein families. We evaluated our graph bigram metric against a word-proximity metric for term association on datasets of full-text literature in these three different protein families. Our testing shows that the graph bigram metric achieves a higher F-measure for the GPCRs (0.79 versus 0.76), protein tyrosine kinases (0.72 versus 0.69), and ion channel transporters (0.76 versus 0.74). Importantly, in situations where more than one protein can be assigned to a point mutation and disambiguation is required, the graph bigram metric achieves a precision of 0.84 compared with the word distance metric precision of 0.73. We believe the graph bigram search metric to be a significant improvement over previous search metrics for point mutation extraction and to be applicable to text-mining applications requiring the association of words.

18.
Bilingualism provides a unique opportunity for understanding the relative roles of proficiency and order of acquisition in determining how the brain represents language. In a previous study, we combined magnetoencephalography (MEG) and magnetic resonance imaging (MRI) to examine the spatiotemporal dynamics of word processing in a group of Spanish-English bilinguals who were more proficient in their native language. We found that from the earliest stages of lexical processing, words in the second language evoke greater activity in bilateral posterior visual regions, while activity to the native language is largely confined to classical left hemisphere fronto-temporal areas. In the present study, we sought to examine whether these effects relate to language proficiency or order of language acquisition by testing Spanish-English bilingual subjects who had become dominant in their second language. Additionally, we wanted to determine whether activity in bilateral visual regions was related to the presentation of written words in our previous study, so we presented subjects with both written and auditory words. We found greater activity for the less proficient native language in bilateral posterior visual regions for both the visual and auditory modalities, which started during the earliest word encoding stages and continued through lexico-semantic processing. In classical left fronto-temporal regions, the two languages evoked similar activity. Therefore, it is the lack of proficiency rather than secondary acquisition order that determines the recruitment of non-classical areas for word processing.

19.
Efficient detection of unusual words. (Cited by: 3; self-citations: 0; citations by others: 3)
Words that are, by some measure, over- or underrepresented in the context of larger sequences have been variously implicated in biological functions and mechanisms. In most approaches to such anomaly detections, the words (up to a certain length) are enumerated more or less exhaustively and are individually checked in terms of observed and expected frequencies, variances, and scores of discrepancy and significance thereof. Here we take the global approach of annotating the suffix tree of a sequence with some such values and scores, having in mind to use it as a collective detector of all unexpected behaviors, or perhaps just as a preliminary filter for words suspicious enough to undergo a more accurate scrutiny. We consider in depth the simple probabilistic model in which sequences are produced by a random source emitting symbols from a known alphabet independently and according to a given distribution. Our main result consists of showing that, within this model, full tree annotations can be carried out in a time-and-space optimal fashion for the mean, variance and some of the adopted measures of significance. This result is achieved by an ad hoc embedding in statistical expressions of the combinatorial structure of the periods of a string. Specifically, we show that the expected value and variance of all substrings in a given sequence of n symbols can be computed and stored in (optimal) O(n^2) overall worst-case, O(n log n) expected time and space. The O(n^2) time bound constitutes an improvement by a linear factor over direct methods. Moreover, we show that under several accepted measures of deviation from expected frequency, the candidate over- or underrepresented words are restricted to the O(n) words that end at internal nodes of a compact suffix tree, as opposed to the Θ(n^2) possible substrings. This surprising fact is a consequence of properties in the form that if a word that ends in the middle of an arc is, say, overrepresented, then its extension to the nearest node of the tree is even more so. Based on this, we design global detectors of favored and unfavored words for our probabilistic framework in overall linear time and space, discuss related software implementations and display the results of preliminary experiments.
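Under the probabilistic model described in this abstract (symbols emitted independently from a fixed distribution), the expected number of occurrences of a word of length m in a text of length n is (n - m + 1) multiplied by the product of its symbol probabilities. The sketch below computes that expectation and the observed count by direct enumeration. It is the naive per-word check, not the suffix-tree annotation from the paper, and it leaves out the variance, which depends on how a word overlaps with itself. The names and toy data are hypothetical.

```python
import math


def expected_count(word, text_length, symbol_probs):
    """Expected occurrences of word in a random text of the given length (i.i.d. symbols)."""
    positions = max(text_length - len(word) + 1, 0)
    return positions * math.prod(symbol_probs[symbol] for symbol in word)


def observed_count(word, text):
    """Number of (possibly overlapping) occurrences of word in text."""
    return sum(1 for i in range(len(text) - len(word) + 1) if text.startswith(word, i))


# Toy text with a uniform symbol distribution (both hypothetical).
toy_text = "ACACACGT"
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
expected_ac = expected_count("AC", len(toy_text), uniform)
observed_ac = observed_count("AC", toy_text)
ratio_ac = observed_ac / expected_ac
```

A ratio well above 1 flags a candidate overrepresented word. Repeating this check for every substring is the exhaustive approach that the suffix-tree method in the abstract is designed to avoid.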

20.
Despite a growing number of studies, the neurophysiology of adult vocabulary acquisition is still poorly understood. One reason is that paradigms that can easily be combined with neuroscientific methods are rare. Here, we tested the efficiency of two paradigms for vocabulary (re-) acquisition, and compared the learning of novel words for actions and objects. Cortical networks involved in adult native-language word processing are widespread, with differences postulated between words for objects and actions. Words and what they stand for are supposed to be grounded in perceptual and sensorimotor brain circuits depending on their meaning. If there are specific brain representations for different word categories, we hypothesized behavioural differences in the learning of action-related and object-related words. Paradigm A, with the learning of novel words for body-related actions spread out over a number of days, revealed fast learning of these new action words, and stable retention up to 4 weeks after training. The single-session Paradigm B employed objects and actions. Performance during acquisition did not differ between action-related and object-related words (time*word category: p = 0.01), but the translation rate was clearly better for object-related (79%) than for action-related words (53%, p = 0.002). Both paradigms yielded robust associative learning of novel action-related words, as previously demonstrated for object-related words. Translation success differed for action- and object-related words, which may indicate different neural mechanisms. The paradigms tested here are well suited to investigate such differences with neuroscientific means. Given the stable retention and minimal requirements for conscious effort, these learning paradigms are promising for vocabulary re-learning in brain-lesioned people. In combination with neuroimaging, neuro-stimulation or pharmacological intervention, they may well advance the understanding of language learning to optimize therapeutic strategies.
