Similar literature
Found 20 similar articles (search time: 402 ms)
1.
ABSTRACT: BACKGROUND: Most major genome projects and sequence databases provide a GO annotation of their data, either automatically or through human annotators, creating a large corpus of data written in the language of GO. Texts written in natural language show a statistical power law behaviour, Zipf's law, the exponent of which can provide useful information on the nature of the language being used. We have therefore explored the hypothesis that collections of GO annotations will show similar statistical behaviours to natural language. RESULTS: Annotations from the Gene Ontology Annotation project were found to follow Zipf's law. Surprisingly, the measured power law exponents were consistently different between annotation captured using the three GO sub-ontologies in the corpora (function, process and component). On filtering the corpora using GO evidence codes we found that the value of the measured power law exponent responded in a predictable way as a function of the evidence codes used to support the annotation. CONCLUSIONS: Techniques from computational linguistics can provide new insights into the annotation process. GO annotations show similar statistical behaviours to those seen in natural language with measured exponents that provide a signal which correlates with the nature of the evidence codes used to support the annotations, suggesting that the measured exponent might provide a signal regarding the information content of the annotation.
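As a rough illustration of what "measuring the power law exponent" of a corpus can mean, the sketch below ranks terms by frequency and fits a straight line to log(frequency) against log(rank). This is only one simple estimator and is not claimed to be the procedure used in the abstract above; maximum-likelihood fits are often preferred for real data. The function name `zipf_exponent` and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def zipf_exponent(tokens):
    """Estimate a Zipf exponent by least squares on the log-log rank-frequency plot.

    Assumption: a plain log-log regression is good enough for illustration.
    """
    # Frequencies sorted from most to least common; rank 1 is the most common term.
    counts = sorted(Counter(tokens).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct terms")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is negative for a decaying law; report its magnitude.
    return -sxy / sxx


# Toy corpus (made up): term i occurs 840 / i times, i.e. frequency ~ 1 / rank.
toy = [f"term{i}" for i in range(1, 9) for _ in range(840 // i)]
print(zipf_exponent(toy))
```

Applied to annotation data, `tokens` would be the list of annotation terms in a corpus, one entry per use; comparing the value across sub-corpora is the kind of comparison the abstract describes.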

2.
Although many species possess rudimentary communication systems, humans seem to be unique with regard to making use of syntax and symbolic reference. Recent approaches to the evolution of language formalize why syntax is selectively advantageous compared with isolated signal communication systems, but do not explain how signals naturally combine. Even more recent work has shown that if a communication system maximizes communicative efficiency while minimizing the cost of communication, or if a communication system constrains ambiguity in a non-trivial way while a certain entropy is maximized, signal frequencies will be distributed according to Zipf's law. Here we show that such communication principles give rise not only to signals that have many traits in common with the linking words in real human languages, but also to a rudimentary sort of syntax and symbolic reference.

3.
It is well known that word frequencies arrange themselves according to Zipf's law. However, little is known about the dependency between the parameters of the law and the complexity of a communication system. Many models of the evolution of language assume that the exponent of the law remains constant as the complexity of a communication system increases. Using longitudinal studies of child language, we analysed the word rank distribution for the speech of children and adults participating in conversations. The adults typically included family members (e.g., parents) or the investigators conducting the research. Our analysis of the evolution of Zipf's law yields two main unexpected results. First, in children the exponent of the law tends to decrease over time while this tendency is weaker in adults, thus suggesting this is not a mere mirror effect of adult speech. Second, although the exponent of the law is more stable in adults, their exponents fall below 1, which is the typical value of the exponent assumed in both children and adults. Our analysis also shows a tendency of the mean length of utterances (MLU), a simple estimate of syntactic complexity, to increase as the exponent decreases. The parallel evolution of the exponent and a simple indicator of syntactic complexity (MLU) supports the hypothesis that the exponent of Zipf's law and linguistic complexity are inter-related. The assumption that Zipf's law for word ranks is a power law with a constant exponent of one in both adults and children needs to be revised.

4.
Chen Y. PLoS ONE, 2011, 6(9): e24791
Zipf's law is one of the most conspicuous empirical facts for cities; however, there is no convincing explanation for the scaling relation between rank and size and its scaling exponent. Using ideas from general fractals and scaling, I propose a dual competition hypothesis of city development to explain the value intervals and the special value, 1, of the power exponent. Zipf's law and Pareto's law can be mathematically transformed into one another, but they represent different processes of urban evolution. Based on the Pareto distribution, a frequency correlation function can be constructed. By scaling analysis and the multifractal spectrum, the parameter interval of the Pareto exponent is derived as (0.5, 1]. Based on the Zipf distribution, a size correlation function can be built, and it is opposite to the first one. By the second correlation function and the multifractal notion, the Pareto exponent interval is derived as [1, 2). Thus the process of urban evolution falls into two effects: one is the Pareto effect, indicating city number increase (external complexity), and the other is the Zipf effect, indicating city size growth (internal complexity). Because of the struggle between the two effects, the scaling exponent varies from 0.5 to 2; but if the two effects reach equilibrium with each other, the scaling exponent approaches 1. A series of mathematical experiments on hierarchical correlation are employed to verify the models, and a conclusion can be drawn that if cities in a given region follow Zipf's law, the frequency and size correlations will follow the scaling law. This theory can be generalized to interpret the inverse power-law distributions in various fields of physical and social sciences.

5.
The evolutionary language game. (Total citations: 1; self-citations: 0; citations by others: 1)
We explore how evolutionary game dynamics have to be modified to accommodate a mathematical framework for the evolution of language. In particular, we are interested in the evolution of vocabulary, that is, associations between signals and objects. We assume that successful communication contributes to biological fitness: individuals who communicate well leave more offspring. Children inherit from their parents a strategy for language learning (a language acquisition device). We consider three mechanisms whereby language is passed from one generation to the next: (i) parental learning: children learn the language of their parents; (ii) role model learning: children learn the language of individuals with a high payoff; and (iii) random learning: children learn the language of randomly chosen individuals. We show that parental and role model learning outperform random learning. Then we introduce mistakes in language learning and study how this process changes language over time. Mistakes increase the overall efficacy of parental and role model learning: in a world with errors evolutionary adaptation is more efficient. Our model also provides a simple explanation why homonymy is common while synonymy is rare.

6.
Language is about words and rules. While there is some discussion to what extent rules are learned or innate, it is clear that words have to be learned. Here I construct a mathematical framework for the population dynamics of language evolution with particular emphasis on how words are propagated over generations. I define the basic reproductive ratio of words, R, and show that R > 1 is required for words to be maintained in the lexicon of a language. Assuming that the frequency distribution of words follows Zipf's law, an upper limit is obtained for the number of words in a language that relies exclusively on oral transmission.

7.
Human language is a complex communication system with unlimited expressibility. Children spontaneously develop a native language by exposure to linguistic data from their speech community. Over historical time, languages change dramatically and unpredictably by accumulation of small changes and by interaction with other languages. We have previously developed a mathematical model for the acquisition and evolution of language in heterogeneous populations of speakers. This model is based on game dynamical equations with learning. Here, we show that simple examples of such equations can display complex limit cycles and chaos. Hence, language dynamical equations mimic complicated and unpredictable changes of languages over time. In terms of evolutionary game theory, we note that imperfect learning can induce chaotic switching among strict Nash equilibria.

8.
Task-optimized convolutional neural networks (CNNs) show striking similarities to the ventral visual stream. However, human-imperceptible image perturbations can cause a CNN to make incorrect predictions. Here we provide insight into this brittleness by investigating the representations of models that are either robust or not robust to image perturbations. Theory suggests that the robustness of a system to these perturbations could be related to the power law exponent of the eigenspectrum of its set of neural responses, where power law exponents closer to and larger than one would indicate a system that is less susceptible to input perturbations. We show that neural responses in mouse and macaque primary visual cortex (V1) obey the predictions of this theory, where their eigenspectra have power law exponents of at least one. We also find that the eigenspectra of model representations decay slowly relative to those observed in neurophysiology and that robust models have eigenspectra that decay slightly faster and have higher power law exponents than those of non-robust models. The slow decay of the eigenspectra suggests that substantial variance in the model responses is related to the encoding of fine stimulus features. We therefore investigated the spatial frequency tuning of artificial neurons and found that a large proportion of them preferred high spatial frequencies and that robust models had preferred spatial frequency distributions more aligned with the measured spatial frequency distribution of macaque V1 cells. Furthermore, robust models were quantitatively better models of V1 than non-robust models. Our results are consistent with other findings that there is a misalignment between human and machine perception. They also suggest that it may be useful to penalize slow-decaying eigenspectra or to bias models to extract features of lower spatial frequencies during task-optimization in order to improve robustness and V1 neural response predictivity.

9.
In this review, three main experimental approaches for studying animal language behaviour are compared: (1) direct decoding of animals’ communication, (2) the use of intermediary languages to communicate with animals and (3) application of ideas and methods of Information Theory for studying quantitative characteristics of animal communication. Each of the three methodological approaches has its specific power and specific limitations. Deciphering animals’ signals reveals a complex picture of natural communication in its evolutionary perspective, but only a fragmentary one because of many methodological barriers, among which low repeatability of standard living situations seems to be a bottleneck. Language-training experiments are of great help for discovering potentials of animal language behaviour but leave characteristics of their natural communications unclear. The use of the methods of Information Theory is based on measuring the time duration that animals spend on transmitting messages of definite information content and complexity. This approach, although it does not reveal the nature of animals’ signals, provides a new dimension for studying important characteristics of natural communication systems, which have not been available before. First of all, this approach enables explorers of animals’ language behaviour to obtain knowledge about the ability of subjects to transfer meaningful messages. Besides, important properties of animal communication and intelligence can be evaluated, such as the rate of information transmission, the complexity of transferred information and the potential flexibility of communication systems.

10.
How communication systems emerge and remain stable is an important question in both cognitive science and evolutionary biology. For communication to arise, not only must individuals cooperate by signaling reliable information, but they must also coordinate and perpetuate signals. Most studies on the emergence of communication in humans typically consider scenarios where individuals implicitly share the same interests. Likewise, most studies on human cooperation consider scenarios where shared conventions of signals and meanings cannot be developed de novo. Here, we combined both approaches with an economic experiment where participants could develop a common language, but under different conditions fostering or hindering cooperation. Participants endeavored to acquire a resource through a learning task in a computer-based environment. After this task, participants had the option to transmit a signal (a color) to a fellow group member, who would subsequently play the same learning task. We varied the way participants competed with each other (either global scale or local scale) and the cost of transmitting a signal (either costly or noncostly) and tracked the way in which signals were used as communication among players. Under global competition, players signaled more often and more consistently, scored higher individual payoffs, and established shared associations of signals and meanings. In addition, costly signals were also more likely to be used under global competition; whereas under local competition, fewer signals were sent and no effective communication system was developed. Our results demonstrate that communication involves both a coordination and a cooperative dilemma and show the importance of studying language evolution under different conditions influencing human cooperation.

11.
Over the last million years, human language has emerged and evolved as a fundamental instrument of social communication and semiotic representation. People use language in part to convey emotional information, leading to the central and contingent questions: (1) What is the emotional spectrum of natural language? and (2) Are natural languages neutrally, positively, or negatively biased? Here, we report that the human-perceived positivity of over 10,000 of the most frequently used English words exhibits a clear positive bias. More deeply, we characterize and quantify distributions of word positivity for four large and distinct corpora, demonstrating that their form is broadly invariant with respect to frequency of word use.

12.
Duplication models for biological networks. (Total citations: 11; self-citations: 0; citations by others: 11)
Are biological networks different from other large complex networks? Both large biological and nonbiological networks exhibit power-law graphs (number of nodes with degree k, N(k) ~ k^(-beta)), yet the exponents, beta, fall into different ranges. This may be because duplication of the information in the genome is a dominant evolutionary force in shaping biological networks (like gene regulatory networks and protein-protein interaction networks) and is fundamentally different from the mechanisms thought to dominate the growth of most nonbiological networks (such as the Internet). The preferential choice models used for nonbiological networks like web graphs can only produce power-law graphs with exponents greater than 2. We use combinatorial probabilistic methods to examine the evolution of graphs by node duplication processes and derive exact analytical relationships between the exponent of the power law and the parameters of the model. Both full duplication of nodes (with all their connections) as well as partial duplication (with only some connections) are analyzed. We demonstrate that partial duplication can produce power-law graphs with exponents less than 2, consistent with current data on biological networks. The power-law exponent for large graphs depends only on the growth process, not on the starting graph.
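The growth process described in this abstract can be sketched as a small simulation. The version below is a minimal reading of "partial duplication", assuming that each step copies a uniformly chosen node and keeps each of its edges independently with probability `p` (with `p = 1.0` giving full duplication). The function name, the parameter `p`, the single-edge starting graph and the uniform choice of the node to copy are all assumptions made for illustration, not details taken from the paper.

```python
import random
from collections import Counter


def partial_duplication(n_nodes, p, seed=0):
    """Grow an undirected graph by repeated node duplication.

    Returns an adjacency dict {node: set_of_neighbours}. Hypothetical sketch:
    start from one edge (0-1); each new node copies every edge of a randomly
    chosen existing node independently with probability p.
    """
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}
    while len(adj) < n_nodes:
        source = rng.randrange(len(adj))  # nodes are labelled 0..len(adj)-1
        new = len(adj)
        # Decide which of the source's edges the copy inherits.
        kept = {v for v in adj[source] if rng.random() < p}
        adj[new] = kept
        for v in kept:
            adj[v].add(new)
    return adj


graph = partial_duplication(500, 0.5)
# N(k): number of nodes with degree k, the quantity whose tail the abstract discusses.
degree_counts = Counter(len(neighbours) for neighbours in graph.values())
for k in sorted(degree_counts)[:10]:
    print(k, degree_counts[k])
```

Plotting `degree_counts` on log-log axes for several values of `p` is one way to explore how the degree distribution changes with the retention probability; estimating an exponent reliably would need much larger graphs than this toy run.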

13.
Discriminative touch relies on afferent information carried to the central nervous system by action potentials (spikes) in ensembles of primary afferents bundled in peripheral nerves. These sensory quanta are first processed by the cuneate nucleus before the afferent information is transmitted to brain networks serving specific perceptual and sensorimotor functions. Here we report data on the integration of primary afferent synaptic inputs obtained with in vivo whole cell patch clamp recordings from the neurons of this nucleus. We find that the synaptic integration in individual cuneate neurons is dominated by 4–8 primary afferent inputs with large synaptic weights. In a simulation we show that the arrangement with a low number of primary afferent inputs can maximize transfer over the cuneate nucleus of information encoded in the spatiotemporal patterns of spikes generated when a human fingertip contacts objects. Hence, the observed distributions of synaptic weights support high fidelity transfer of signals from ensembles of tactile afferents. Various anatomical estimates suggest that a cuneate neuron may receive hundreds of primary afferents rather than 4–8. Therefore, we discuss the possibility that adaptation of synaptic weight distribution, possibly involving silent synapses, may function to maximize information transfer in somatosensory pathways.

14.
Habitat fragmentation accompanies habitat loss, and drives additional biodiversity change; but few global biodiversity models explicitly analyse the effects of both fragmentation and loss. Here we propose and test the hypothesis that, as fragment area increases, species density (the number of species in a standardised plot) will scale with an exponent given by the difference between the exponents of the species–area relationships for islands (z ~ 0.25) and in contiguous habitat (z ~ 0.15), and test whether scaling varies between land uses. We also investigate the scaling of overall abundance and rarefaction‐based richness, as some mechanisms make different predictions about how fragment area should affect them. The relevant data from the taxonomically and geographically broad PREDICTS database were used to model the three diversity measures, testing their scaling with fragment area and whether the scaling exponent varied among land uses (primary forest, secondary forest, plantation forest, cropland and pasture). In addition, the consistency of the response of species density to fragment area was tested across three well represented taxa (Magnoliopsida, Hymenoptera and ‘herptiles’). Species density and total abundance showed area‐scaling exponents of 0.07 and 0.16, respectively, and these exponents did not vary significantly among land uses; rarefaction‐based richness by contrast did not increase consistently with area. These results suggest that the area‐scaling of species density is driven by the area‐scaling of total abundance, with additive edge effects (species moving into the small fragments from the surroundings) opposing – but not fully overcoming – the effect of fragment area on overall density of individuals. The interaction between fragment area and higher taxon (plants, vertebrates and invertebrates), which remained in the rarefied richness model, indicates that mechanisms may vary among groups.

15.
On the evolutionary trajectory that led to human language there must have been a transition from a fairly limited to an essentially unlimited communication system. The structure of modern human languages reveals at least two steps that are required for such a transition: in all languages (i) a small number of phonemes are used to generate a large number of words; and (ii) a large number of words are used to produce an unlimited number of sentences. The first (and simpler) step is the topic of the current paper. We study the evolution of communication in the presence of errors and show that this limits the number of objects (or concepts) that can be described by a simple communication system. The evolutionary optimum is achieved by using only a small number of signals to describe a few valuable concepts. Adding more signals does not increase the fitness of a language. This represents an error limit for the evolution of communication. We show that this error limit can be overcome by combining signals (phonemes) into words. The transition from an analogue to a digital system was a necessary step toward the evolution of human language.

16.

Background  

Normalization is the process of removing non-biological sources of variation between array experiments. Recent investigations of data in gene expression databases for varying organisms and tissues have shown that the majority of expressed genes exhibit a power-law distribution with an exponent close to -1 (i.e. obey Zipf's law). Based on the observation that our single channel and two channel microarray data sets also followed a power-law distribution, we were motivated to develop a normalization method based on this law, and examine how it compares with existing published techniques. A computationally simple and intuitively appealing technique based on this observation is presented.

17.
Language transfers information on at least three levels: (1) what is said, (2) how it is said (what language is used), and (3) that it is said (that speaker and listener both possess the ability to use language). The use of language is a form of honest cooperation on two of these levels; not necessarily on what is said, which can be deceitful, but always on how it is said and that it is said. This means that the language encoding and decoding systems had to evolve simultaneously, through mutual fitness benefits. Theoretical problems surrounding the evolution of cooperation disappear if a recognition system is present enabling cooperating individuals to identify each other – if they are equipped with “green beards”. Here, I outline how both the biological and cultural aspects of language are bestowed with such recognition systems. The biological capacities required for language signal their presence through speech and understanding. This signaling cannot be invaded by “false green beards” because the traits and the signal of their presence are one and the same. However, the real usefulness of language comes from its potential to convey an infinite number of meanings through the dynamic handling of symbols – through language itself. But any specific language also signals its presence to others through usage and understanding. Thus, languages themselves cannot be invaded by “false green beards” because, again, the trait and the signal of its presence are one and the same. These twin green beards, in both the biological and cultural realms, are unique to language.

18.
19.
Mody, Istvan. Neurochemical Research, 2001, 26(8-9): 907-913
Cell-to-cell communication in the mammalian nervous system does not solely involve direct synaptic transmission. There is considerable evidence for a type of communication between neurons through chemical means that lies somewhere between the rapid synaptic information transfer and the relatively non-specific neuroendocrine secretion. Here I review some of the experimental evidence accumulated for the GABA system indicating that GABA(A) receptor-gated Cl⁻ channels localized at synapses differ significantly from those found extrasynaptically. These two types of GABA(A) receptor are involved in generating distinctly different conductances. Thus, the development and search for pharmacological agents specifically aimed at selectively altering synaptic and extrasynaptic GABA(A) conductances is within reach, and is expected to provide novel insights into the regulation of neuronal excitability.

20.
Communication in nature is not restricted to the transmitter-receiver pair. Unintended listeners, or eavesdroppers, can intercept the signal and possibly utilize the received information to their benefit, which may confer a certain cost to the communicating pair. In this paper we explore (computationally and mathematically) such situations with the goal of uncovering their effect on language evolution. We find that in the presence of eavesdropping, languages exhibit a tendency to become more complex. On the other hand, if eavesdroppers belong to a different (competing) population, the languages used by the two populations tend to converge, if the cost of eavesdropping is sufficiently high; otherwise the languages synchronize. These findings are discussed in the context of animal communication and human language. In particular, the emergence of synonyms is predicted. We demonstrate that a small associated cost can suppress synonyms in the absence of eavesdropping, but that their likelihood increases strongly with the probability of eavesdropping.
