20 similar articles found (search time: 0 ms)
1.
ABSTRACT: BACKGROUND: Most major genome projects and sequence databases provide a GO annotation of their data, either automatically or through human annotators, creating a large corpus of data written in the language of GO. Texts written in natural language show a statistical power law behaviour, Zipf's law, the exponent of which can provide useful information on the nature of the language being used. We have therefore explored the hypothesis that collections of GO annotations will show similar statistical behaviours to natural language. RESULTS: Annotations from the Gene Ontology Annotation project were found to follow Zipf's law. Surprisingly, the measured power law exponents were consistently different between annotations captured using the three GO sub-ontologies in the corpora (function, process and component). On filtering the corpora using GO evidence codes we found that the value of the measured power law exponent responded in a predictable way as a function of the evidence codes used to support the annotation. CONCLUSIONS: Techniques from computational linguistics can provide new insights into the annotation process. GO annotations show similar statistical behaviours to those seen in natural language, with measured exponents that provide a signal which correlates with the nature of the evidence codes used to support the annotations, suggesting that the measured exponent might provide a signal regarding the information content of the annotation.
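As an illustration of what measuring a Zipf exponent involves, the sketch below ranks terms by frequency and fits a straight line to log(frequency) against log(rank) by ordinary least squares. This is only one common estimator and is an assumption here: the abstract does not say which fitting method the authors used. The function name `zipf_exponent` and the input format (a flat list of term identifiers) are hypothetical.

```python
import math
from collections import Counter


def zipf_exponent(tokens):
    """Estimate a Zipf exponent from a list of terms.

    Assumption: ordinary least squares on the log-log rank-frequency
    plot; the abstract does not state the authors' fitting method.
    """
    # Frequencies sorted from most to least common; rank 1 is the top term.
    counts = sorted(Counter(tokens).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct terms")

    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # Zipf's law: frequency ~ rank ** (-exponent), so the exponent is -slope.
    return -sxy / sxx
```

Applied separately to the annotations of each sub-ontology, or to subsets filtered by evidence code, a function like this would give one exponent per subset for comparison. Least-squares fits on log-log plots are known to be biased for heavy-tailed data, so the result is best read as a rough indicator.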
2.
Piqueira JR Fagali GM de Pinho M Sanches RF Del-Ben CM Zuardi AW 《Journal of theoretical biology》2007,247(1):182-185
The present study is an attempt to express the whole state of a psychiatric ward as a linear combination of base states in a Hilbert space. Real data were collected by observing the behavior of the patients in the psychiatric ward of the Clinical Hospital of the Faculdade de Medicina de Ribeirão Preto for 12 days and, according to standard procedures, 18 behavioral parameters were measured daily for each patient. The whole data set was analyzed and, by taking the standard grades as eigenstates, the state of the ward was expressed each day as a linear combination of them, allowing the estimation of state transition matrices and of the quantum variability measure. Coefficients of the linear combination can be interpreted as square roots of probabilities, and an informational entropy is associated with each state, resulting in the classical variability measure. Temporal evolutions of the classical and quantum variability measures are plotted in an attempt to relate them to the behavioral state of the whole ward.
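The relation between coefficients, probabilities and entropy described above can be sketched as follows. The sketch assumes the probabilities are the relative frequencies of each grade on a given day and uses the Shannon entropy in bits. The abstract does not give the paper's exact definitions of the classical and quantum variability measures, so the function names and the input format are hypothetical.

```python
import math


def amplitudes_from_counts(counts):
    """Coefficients of the linear combination, taken as square roots of
    the relative frequencies of each grade (an assumed reading of the
    abstract)."""
    total = sum(counts)
    return [math.sqrt(count / total) for count in counts]


def shannon_entropy(amplitudes):
    """Shannon entropy in bits of the probabilities p_i = a_i ** 2."""
    probabilities = [a * a for a in amplitudes]
    return -sum(p * math.log2(p) for p in probabilities if p > 0)
```

For four grades observed equally often, each coefficient is 0.5 and the entropy is 2 bits, the maximum for four states. A ward concentrated in a single grade gives an entropy of 0.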
3.
4.
Background
Zipf's law and Heaps' law are observed in disparate complex systems. Of particular interest, these two laws often appear together. Many theoretical models and analyses have been developed to understand their co-occurrence in real systems, but a clear picture of their relation is still lacking.
Methodology/Principal Findings
We show that Heaps' law can be considered a derivative phenomenon if the system obeys Zipf's law. Furthermore, we refine the known approximate solution for the Heaps' exponent given the Zipf's exponent. We show that the approximate solution is indeed an asymptotic solution for infinite systems, while in a finite-size system the Heaps' exponent is sensitive to the system size. Extensive empirical analysis of tens of disparate systems demonstrates that our refined results better capture the relation between the Zipf's and Heaps' exponents.
Conclusions/Significance
The present analysis provides a clear picture of the relation between Zipf's law and Heaps' law without the help of any specific stochastic model: Heaps' law is indeed a derivative phenomenon of Zipf's law. The presented numerical method gives a considerably better estimate of the Heaps' exponent given the Zipf's exponent and the system size. Our analysis provides some insights into, and implications for, real complex systems. For example, one can naturally obtain a better explanation of the accelerated growth of scale-free networks.
5.
6.
Ferrer i Cancho R Riordan O Bollobás B 《Proceedings. Biological sciences / The Royal Society》2005,272(1562):561-565
Although many species possess rudimentary communication systems, humans seem to be unique with regard to making use of syntax and symbolic reference. Recent approaches to the evolution of language formalize why syntax is selectively advantageous compared with isolated signal communication systems, but do not explain how signals naturally combine. Even more recent work has shown that if a communication system maximizes communicative efficiency while minimizing the cost of communication, or if a communication system constrains ambiguity in a non-trivial way while a certain entropy is maximized, signal frequencies will be distributed according to Zipf's law. Here we show that such communication principles give rise not only to signals that have many traits in common with the linking words in real human languages, but also to a rudimentary sort of syntax and symbolic reference.
7.
Urban scaling relations characterizing how diverse properties of cities vary on average with their population size have recently been shown to be a general quantitative property of many urban systems around the world. However, in previous studies the statistics of urban indicators were not analyzed in detail, raising important questions about the full characterization of urban properties and how scaling relations may emerge in these larger contexts. Here, we build a self-consistent statistical framework that characterizes the joint probability distributions of urban indicators and city population sizes across an urban system. To develop this framework empirically we use one of the most granular and stochastic urban indicators available, specifically measuring homicides in cities of Brazil, Colombia and Mexico, three nations with high and fast changing rates of violent crime. We use these data to derive the conditional probability of the number of homicides per year given the population size of a city. To do this we use Bayes' rule together with the estimated conditional probability of city size given their number of homicides and the distribution of total homicides. We then show that scaling laws emerge as expectation values of these conditional statistics. Knowledge of these distributions implies, in turn, a relationship between scaling and population size distribution exponents that can be used to predict Zipf's exponent from urban indicator statistics. Our results also suggest how a general statistical theory of urban indicators may be constructed from the stochastic dynamics of social interaction processes in cities.
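The Bayes' rule step described above can be sketched for a single city-size bin. Given P(size | homicides) and the distribution of homicide counts P(homicides), the posterior P(homicides | size) follows by normalization, and its expectation is the quantity that the abstract says yields the scaling law. The function names and the dictionary-based representation are hypothetical, and any numbers fed to it would be illustrative rather than data from the study.

```python
def posterior_given_size(likelihood, prior):
    """Bayes' rule for one fixed city-size bin N.

    likelihood[y] = P(N | y): probability of this size bin given y homicides.
    prior[y]      = P(y): distribution of homicide counts over all cities.
    Returns P(y | N) as a dict keyed by homicide count y.
    """
    joint = {y: likelihood[y] * prior[y] for y in prior}
    evidence = sum(joint.values())  # P(N), the normalizing constant
    if evidence == 0:
        raise ValueError("size bin has zero probability under the inputs")
    return {y: value / evidence for y, value in joint.items()}


def expected_value(distribution):
    """Expectation E[y | N] of a discrete distribution {y: probability}."""
    return sum(y * p for y, p in distribution.items())
```

Repeating the calculation across size bins and plotting the expected value against N on log-log axes would show whether it follows a power law in N, which is the sense in which scaling laws emerge as expectation values.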
8.
Using a measure of how differentially expressed a gene is in two biochemically/phenotypically different conditions, we can rank all genes in a microarray dataset. We have shown that the falling-off of this measure (normalized maximum likelihood in a classification model such as logistic regression) as a function of the rank is typically a power-law function. In other similar ranked plots this power-law function is known as Zipf's law, which is observed in many natural and social phenomena. The presence of this power-law function rules out an intrinsic cutoff point between the "important" genes and "irrelevant" genes. We have shown that similar power-law functions are also present in permuted datasets, and provide an explanation based on the well-known chi-squared distribution of likelihood ratios. We discuss the implications of this Zipf's law for gene selection in microarray data analysis, as well as other characterizations of the ranked likelihood plots such as the rate of fall-off of the likelihood.
9.
10.
Because admission to a regional child and adolescent psychiatric unit is often fraught with difficulties, children with psychiatric disorders were admitted to a general children's ward. Over the four years (1980-4) 24 patients accounted for 31 admissions. Of these, five had feeding disorders (anorexia, bulimia), seven neuroses, three psychoses, four elimination disorders, and five other diagnoses. All the children were later discharged to their homes, most having appreciably improved. Because of the proximity of the hospital to the child's natural environment, work with the families and schools was not interrupted by the admission. The results of this approach are encouraging and could have implications for future planning of services for this category of patients.
11.
Timbre is a key perceptual feature that allows discrimination between different sounds. Timbral sensations are highly dependent on the temporal evolution of the power spectrum of an audio signal. In order to quantitatively characterize such sensations, the shape of the power spectrum has to be encoded in a way that preserves certain physical and perceptual properties. Therefore, it is common practice to encode short-time power spectra using psychoacoustical frequency scales. In this paper, we study and characterize the statistical properties of such encodings, here called timbral code-words. In particular, we report on rank-frequency distributions of timbral code-words extracted from 740 hours of audio coming from disparate sources such as speech, music, and environmental sounds. Analogously to text corpora, we find a heavy-tailed Zipfian distribution with exponent close to one. Importantly, this distribution is found independently of different encoding decisions and regardless of the audio source. Further analysis of the intrinsic characteristics of the most and least frequent code-words reveals that the most frequent code-words tend to have a more homogeneous structure. We also find that speech and music databases have specific, distinctive code-words while, in the case of the environmental sounds, such database-specific code-words are not present. Finally, we find that a Yule-Simon process with memory provides a reasonable quantitative approximation for our data, suggesting the existence of a common simple generative mechanism for all considered sound sources.
12.
The frequency distribution of word usage in a word sequence generated by capping is estimated from the number of "hits" in web-page retrieval, in order to evaluate the structure of semantics proper not to a particular text but to a language. In particular, we compare the distributions of English sequences with Japanese ones and find that, for English and Japanese phonograms, the frequency of word usage against rank follows a power-law function with exponent 1, whereas for Japanese ideograms it follows a stretched exponential (Weibull distribution) function. We also discuss how such a difference can result from the difference between a phonogram-based language (English) and an ideogram-based language (Japanese).
13.
14.
D R Harper 《BMJ (Clinical research ed.)》1979,1(6164):647-649
Data relating to the cost of caring for individual patients were collected for all patients in a general surgical ward over a six-month period. From this the cost per patient was calculated for various diseases and was found to be related to duration of stay. Postoperative morbidity was important in determining cost. A system that calculates cost by means of units based on the use of resources rather than by cash cost accounting is probably the most suitable for a clinician who has to monitor resources.
15.
In March 1984 a short term respite care facility for handicapped children was opened in a children's ward catering primarily for acute medical and surgical problems. The facility was based on a four bedded room designed so that if beds became short in the main ward it could revert immediately to the care of acutely sick children. Three nurses were appointed specifically to staff the facility, the nursing budget for the rest of the ward being reduced proportionately. Conversions were funded by charities and some of the conversion work was done by volunteers. The main users were totally dependent children aged under 5 with severe mental and physical handicaps. Parents found the service invaluable, and in addition to planned admissions it was usually possible to accept a child at short notice, for example when some domestic crisis occurred. Only very rarely was admission impossible because of the needs of acutely ill children. A short term respite care facility not only helps parents cope and may provide beneficial experience for a handicapped child but is also a useful training ground for medical students and junior staff.
16.
17.
18.
19.
A ward has been set up for adolescents, who, being neither children nor adults, have special needs. It provides a pleasant and enthusiastic atmosphere that allows the patients to mix together socially alties is important, but not more than 20% should be long-stay patients. Those needing intensive care or specialised investigations and those likely to be a disruptive influence are excluded. No serious sexual problems have been encountered.
20.
R W Carslaw 《BMJ (Clinical research ed.)》1975,2(5971):617-618