Similar articles
1.
The concept of fractal dimension is applied to protein surfaces. Satellite tobacco necrosis virus, prealbumin, retinol binding protein and lysozyme have been studied. A residue fractal index has been defined, which provides a suitable colour code when using computer graphics for visualizing surfaces. Some provisions are made that render the MS algorithm useful to calculate protein surface fractal dimensions. It has been found that a correlation exists between regions of high fractal dimension and those involved in protein-protein interactions. The usefulness of surface fractality in this context is demonstrated by a molecular docking experiment.

2.
We consider the problem of comparing several nucleic acid sequences to identify words occurring imperfectly (patterns with no gap) with unusual frequency. Methods for computing, representing, and inspecting interactively the structure of such repeating motifs in nucleic acids and more generally any text are described. Multiple sequences are treated as one large concatenate. In a preprocessing step, a lexical index is created to provide rapid string matching for the enumeration of the words matching a pattern. For given word features (word length, minimal frequency), a sequence profile is displayed. The profile can be inspected interactively with on-line algorithms. Applications to the identification of regulatory elements in DNA regions involved in the control of gene expression are presented. Our program (‘DNA-Lexemics’) runs on the Macintosh.
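The indexing step described in this abstract can be illustrated with a minimal sketch. It is not the DNA-Lexemics implementation: the function names, the plain dictionary index, and the use of Hamming distance to model "imperfect" occurrences are all assumptions made for illustration. Sequences are indexed one by one here, rather than as a single concatenate, so that no word spans a sequence boundary.

```python
from collections import defaultdict


def build_word_index(sequences, word_length):
    """Map each word of the given length to its (sequence id, offset) occurrences."""
    index = defaultdict(list)
    for seq_id, seq in enumerate(sequences):
        for offset in range(len(seq) - word_length + 1):
            index[seq[offset:offset + word_length]].append((seq_id, offset))
    return index


def frequent_words(index, min_frequency):
    """Keep only the words that occur at least min_frequency times."""
    return {word: occ for word, occ in index.items() if len(occ) >= min_frequency}


def hamming(a, b):
    """Number of mismatching positions between two equal-length words."""
    return sum(x != y for x, y in zip(a, b))


def imperfect_matches(index, pattern, max_mismatches):
    """Hypothetical helper: occurrences of all indexed words within
    max_mismatches substitutions of the pattern (no gaps)."""
    hits = []
    for word, occurrences in index.items():
        if hamming(word, pattern) <= max_mismatches:
            hits.extend(occurrences)
    return sorted(hits)


index = build_word_index(["ACGTACGT", "TTACGA"], word_length=4)
print(frequent_words(index, min_frequency=2))
print(imperfect_matches(index, "ACGT", max_mismatches=1))
```

Scanning every indexed word for near matches is linear in the number of distinct words; a production tool would likely use a more compact structure, which the abstract does not specify.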

3.
This article aims to provide new theoretical insight into the fundamental question of the origin of truncated fractals in biological systems. It is well known that fractal geometry is one of the characteristics of living organisms. However, unlike mathematical fractals, which are self-similar at all scales, biological fractals are truncated, i.e. their self-similarity extends at most over a few orders of magnitude of separation. We show that nonlinear coupled oscillators, modeling one of the basic features of biological systems, may generate truncated fractals: a truncated fractal pattern for basin boundaries appears in a simple mathematical model of two coupled nonlinear oscillators with weak dissipation. This fractal pattern can be considered as a particular hidden fractal property. At a sufficiently fine level of precision, the truncated fractality acts as a simple structure, leading to predictability, but at a lower level of precision it is effectively fractal, limiting the predictability of the long-term behavior of biological systems. We point out the generic nature of our result.

4.
Protein sequences of the SWISS-PROT data bank were analysed by fractal techniques and harmonic analysis. In both cases, the results show the presence of self-affinity, a kind of self-similarity, in the sequences. Self-similarity is a sign of fractality and fractality is a consequence of a chaotic dynamical process. The evolution of the protein sequences is modelled as a dynamical system. The abundance of the fractal form in biology and creation of fractal forms as a result of “chaos” is already established. It may be noted that the word “chaos” here implies that most predictable processes can also become unpredictable under certain conditions, and that the most unpredictable processes are not as unpredictable as they are expected to be. In evolutionary dynamics, this allows scope for mutations and variations in otherwise predictable situations, potentially leading to increased diversity. Part of this work was presented at the National Symposium on Evolution of Life.

5.
Mathematical characterization of three-dimensional gene expression patterns   (total citations: 2; self-citations: 0; citations by others: 2)
MOTIVATION: The importance of a systematic methodology for the mathematical characterization of three-dimensional gene expression patterns in embryonic development. METHODS: By combining lacunarity and multiscale fractal dimension analyses with computer-based methods of three-dimensional reconstruction, it becomes possible to extract new information from in situ hybridization studies. Lacunarity and fractality are appropriate measures for the cloud-like gene activation signals in embryonic tissues. The newly introduced multiscale method provides a natural extension of the fractal dimension concept, being capable of characterizing the fractality of geometrical patterns in terms of spatial scale. This tool can be systematically applied to three-dimensional patterns of gene expression. RESULTS: Applications are illustrated using the three-dimensional expression patterns of the myogenic marker gene Myf5 in a series of differentiating somites of a mouse embryo.

6.
7.
In this study, we introduce an original distance definition for graphs, called the Markov-inverse-F measure (MiF). This measure enables the integration of classical graph theory indices with new knowledge pertaining to structural feature extraction from semantic networks. MiF improves the conventional Jaccard and/or Simpson indices, and reconciles both the geodesic information (random walk) and co-occurrence adjustment (degree balance and distribution). We measure the effectiveness of graph-based coefficients through the application of linguistic graph information for a neural activity recorded during conceptual processing in the human brain. Specifically, the MiF distance is computed between each of the nouns used in a previous neural experiment and each of the in-between words in a subgraph derived from the Edinburgh Word Association Thesaurus of English. From the MiF-based information matrix, a machine learning model can accurately obtain a scalar parameter that specifies the degree to which each voxel in (the MRI image of) the brain is activated by each word or each principal component of the intermediate semantic features. Furthermore, correlating the voxel information with the MiF-based principal components, a new computational neurolinguistics model with a network connectivity paradigm is created. This allows two dimensions of context space to be incorporated with both semantic and neural distributional representations.

8.
We derive the species-area relationship (SAR) expected from an assemblage of fractally distributed species. If species have truly fractal spatial distributions with different fractal dimensions, we show that the expected SAR is not the classical power-law function, as suggested recently in the literature. This analytically derived SAR has a distinctive shape that is not commonly observed in nature: upward-accelerating richness with increasing area (when plotted on log-log axes). This suggests that, in reality, most species depart from true fractal spatial structure. We demonstrate the fitting of a fractal SAR using two plant assemblages (Alaskan trees and British grasses). We show that in both cases, when modelled as fractal patterns, the modelled SAR departs from the observed SAR in the same way, in accord with the theory developed here. The challenge is to identify how species depart from fractality, either individually or within assemblages, and more importantly to suggest reasons why species distributions are not self-similar and what, if anything, this can tell us about the spatial processes involved in their generation.

9.
Because the volume of information available online is growing at breakneck speed, keeping up with the meaning and information communicated by the media and netizens is a new challenge both for scholars and for companies that must address public relations crises. Most current theories and tools are directed at identifying one website or one piece of online news and do not attempt to develop a rapid understanding of all websites and all news covering one topic. This paper represents an effort to integrate statistics, word segmentation, complex networks and visualization to analyze headline keywords and word relationships in online Chinese news using two samples: the 2011 Bohai Bay oil spill and the 2010 Gulf of Mexico oil spill. We gathered all the news headlines concerning the two trending events in the search results from Baidu, the most popular Chinese search engine. We used Simple Chinese Word Segmentation to segment all the headlines into words and then took words as nodes and considered adjacent relations as edges to construct word networks both using the whole sample and at the monthly level. Finally, we develop an integrated mechanism to analyze the features of word networks based on news headlines that can account for all the keywords in the news about a particular event and therefore track the evolution of news deeply and rapidly.
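The network construction described in this abstract (words as nodes, adjacency within a headline as edges) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the headlines are assumed to be already segmented into word lists (the segmentation step itself is not reproduced), edges are treated as undirected and weighted by co-adjacency count, and the function names are hypothetical.

```python
from collections import Counter


def build_word_network(segmented_headlines):
    """Return (node_counts, edge_weights) for a word adjacency network.

    Each headline is a list of words; two words share an edge when they
    appear next to each other in at least one headline.
    """
    node_counts = Counter()
    edge_weights = Counter()
    for words in segmented_headlines:
        node_counts.update(words)
        for left, right in zip(words, words[1:]):
            if left != right:  # skip self-loops from immediate repeats
                edge_weights[tuple(sorted((left, right)))] += 1
    return node_counts, edge_weights


def weighted_degree(edge_weights):
    """Sum of edge weights incident to each word."""
    degree = Counter()
    for (a, b), weight in edge_weights.items():
        degree[a] += weight
        degree[b] += weight
    return degree


# Toy example with English stand-ins for segmented Chinese headlines.
headlines = [["oil", "spill", "Bohai"], ["Bohai", "oil", "spill"]]
nodes, edges = build_word_network(headlines)
print(weighted_degree(edges).most_common())
```

Building one network per month, as the abstract mentions, would amount to calling `build_word_network` on each month's headlines separately.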

10.
11.
Automated extraction of information in molecular biology   (total citations: 3; self-citations: 0; citations by others: 3)
Andrade MA, Bork P. FEBS Letters, 2000, 476(1-2): 12-17
We review data mining techniques in molecular biology, specifically those that extract information from the scientific literature itself. As more of the biological literature is published electronically, there is an opportunity, and even a need, to automatically summarize the literature in a customized way, for example by associating keywords to a topic. These keywords can be extracted from relevant publications. The process of keyword extraction can be automated and optimized to keep literature pointers automatically up-to-date or to filter relevant information from the literature. To illustrate these points, OMIM (Online Mendelian Inheritance in Man), a database of human inherited diseases, was linked to the literature and keywords were derived that covered distinct aspects such as genetic information on the one hand and disease-specific protein and phenotypic information on the other. They were used to extract information that is helpful for keeping entries about disease up-to-date.

12.
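Fractal characteristics of the distribution pattern of Carpinus pubescens (云贵鹅耳枥) populations   (total citations: 12; self-citations: 2; citations by others: 12)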
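The box-counting dimension and the information dimension from fractal theory were used to examine the fractal characteristics of the distribution pattern of Carpinus pubescens populations on the karst mountains of Guiyang. The results show that the distribution pattern of the populations is fractal, with box-counting dimensions of 1.1853-1.7419 and information dimensions of 1.1961-1.7051. Both dimensions are higher for clumped populations than for randomly distributed ones. The box-counting dimension quantifies the ability of the population to occupy ecological space, while the information dimension reveals how strongly the pattern intensity changes with scale and characterizes the non-uniformity of the distribution of individuals. Both dimension methods are suitable for the quantitative description of the fractal characteristics of the population distribution pattern.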
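For readers unfamiliar with the two measures, the standard definitions can be sketched in a few lines. The code below is an illustration only, assuming individuals are given as 2-D coordinates and that both dimensions are estimated as least-squares slopes against log(1/box size); the function name, the choice of box sizes and the NumPy-based fitting are assumptions, not details taken from the study.

```python
import numpy as np


def box_dimensions(points, box_sizes):
    """Estimate (box-counting dimension, information dimension) of a point pattern.

    For each box size, points are binned on a grid. The box-counting
    dimension is the slope of log(occupied boxes) against log(1/size);
    the information dimension is the slope of the Shannon entropy of the
    box occupancy probabilities against log(1/size).
    """
    points = np.asarray(points, dtype=float)
    origin = points.min(axis=0)
    log_inv_size, log_count, entropy = [], [], []
    for size in box_sizes:
        cells = np.floor((points - origin) / size).astype(int)
        _, counts = np.unique(cells, axis=0, return_counts=True)
        p = counts / counts.sum()
        log_inv_size.append(np.log(1.0 / size))
        log_count.append(np.log(len(counts)))
        entropy.append(-np.sum(p * np.log(p)))
    box_dim = np.polyfit(log_inv_size, log_count, 1)[0]
    info_dim = np.polyfit(log_inv_size, entropy, 1)[0]
    return box_dim, info_dim


# Sanity check on a pattern that fills the plane uniformly: both values should be close to 2.
grid = [(i / 64, j / 64) for i in range(64) for j in range(64)]
print(box_dimensions(grid, box_sizes=[1 / 2, 1 / 4, 1 / 8, 1 / 16]))
```

A pattern confined to a line gives values near 1, so the reported range of roughly 1.2 to 1.7 lies between a line-like and a plane-filling distribution.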

13.
We have shown, in a previous paper, that tandem repeating sequences, especially triplet repeats, play a very important role in gene evolution. This result led to the formulation of the following hypothesis: most of the genomic sequences evolved through everlasting acts of tandem repeat expansions with subsequent accumulation of changes. In order to estimate how much of the observed sequences have a repeat origin, we describe the adaptation of a text segmentation algorithm, based on dynamic programming, to the mapping of the ancient expansion events. The algorithm maximizes the segmentation cost, calculated as the similarity of obtained fragments to the putative repeat sequence. In the first application of the algorithm to segmentations of genomic sequences, a significant difference between the natural sequences and the corresponding shuffled sequences is detected. The natural fragments are longer and more similar to the putative repeat sequences. As our analysis shows, the coding sequences allow for repeats only when the size of the repeated words is divisible by three. In contrast, in the non-coding sequences, all repeated word sizes are present. It was estimated that in the Escherichia coli K12 genome, about 35.5% of the sequence can be detectably traced to original simple repeat ancestors. The results shed light on the genomic sequence organization, and strongly confirm the hypothesis about the crucial role of triplet expansions in gene origin and evolution.

14.
We have developed Textpresso, a new text-mining system for scientific literature whose capabilities go far beyond those of a simple keyword search engine. Textpresso's two major elements are a collection of the full text of scientific articles split into individual sentences, and the implementation of categories of terms for which a database of articles and individual sentences can be searched. The categories are classes of biological concepts (e.g., gene, allele, cell or cell group, phenotype, etc.) and classes that relate two objects (e.g., association, regulation, etc.) or describe one (e.g., biological process, etc.). Together they form a catalog of types of objects and concepts called an ontology. After this ontology is populated with terms, the whole corpus of articles and abstracts is marked up to identify terms of these categories. The current ontology comprises 33 categories of terms. A search engine enables the user to search for one or a combination of these tags and/or keywords within a sentence or document, and as the ontology allows word meaning to be queried, it is possible to formulate semantic queries. Full text access increases recall of biological data types from 45% to 95%. Extraction of particular biological facts, such as gene-gene interactions, can be accelerated significantly by ontologies, with Textpresso automatically performing nearly as well as expert curators to identify sentences; in searches for two uniquely named genes and an interaction term, the ontology confers a 3-fold increase of search efficiency. Textpresso currently focuses on Caenorhabditis elegans literature, with 3,800 full text articles and 16,000 abstracts. The lexicon of the ontology contains 14,500 entries, each of which includes all versions of a specific word or phrase, and it includes all categories of the Gene Ontology database. 
Textpresso is a useful curation tool, as well as a search engine for researchers, and can readily be extended to other organism-specific corpora of text. Textpresso can be accessed at http://www.textpresso.org or via WormBase at http://www.wormbase.org.

15.
The effects of spatial selective attention upon ERPs associated with the processing of word stimuli were investigated. While subjects maintained central eye fixation, ERPs were recorded to words presented to the left and right visual fields. In each of 6 runs, subjects focussed attention to alternate fields to perform a category-detection task. Pairs of semantically related and repeated words were embedded in the word lists presented to the attended and unattended visual fields. Consistent with prior studies, the P1-N1 visual ERP was larger when elicited by words in attended spatial locations. A large negative slow wave identified as N400 was elicited by attended, but not unattended, words. For attended words, N400 was smaller for semantically primed or repeated words. We concluded that spatial selective attention can modulate the degree to which words are processed, and that the cognitive processes associated with N400 are not automatic.

16.
Huang Jingfei (黄京飞), Liu Ciquan (刘次全). Acta Zoologica Sinica (《动物学报》), 1992, 38(3): 334-338
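Based on the principles and methods of fractal theory, and after modifying the existing method for calculating the fractal dimension of nucleic acid sequences, this paper calculates the fractal dimensions of more than 80 5S rRNA sequences from various organisms and, drawing on dissipative structure theory, examines the relationship between fractal dimension and molecular evolution. The authors conclude that the relationship between the fractal dimension of 5S rRNA sequences and their molecular evolution is complex and nonlinear, and that during molecular evolution the fractal dimension of the sequences fluctuates randomly.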

17.
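Using multi-temporal remote sensing data, we extracted shoreline change information and fractal dimension values for five periods along the southern coast of Yingkou, and created evaluation units with the fishnet tool in ArcGIS 10.2. Landscape pattern indices were calculated and an index of human disturbance intensity was constructed, and the dynamic response of shoreline change and landscape pattern change to human activity was then examined. The results show that: (1) reclamation in the study area increased both shoreline length and shoreline fractal dimension; over the four intervals from 1990 to 2015 the annual shoreline growth rates were 0.52%, 0.53%, 4.98% and 0.96%; (2) the landscape pattern indices indicate that before 2005 the complexity of landscape boundaries and shapes and the degree of fragmentation increased, whereas after 2005 landscape shapes became more regular and land use more balanced; (3) during the study period the areas of both strong and weak disturbance increased, while the area of moderate disturbance decreased; (4) patch density, total edge length, edge density, landscape shape index and mean fractal dimension all changed in step with the mean disturbance intensity index, while the Shannon diversity index was inversely correlated with human disturbance intensity; (5) changes in shoreline length and in shoreline fractal dimension were both inversely correlated with the degree of human disturbance, with correlation coefficients of -0.97 and -0.98, respectively.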

18.
Hierarchical organization is prevalent in networks representing a wide range of systems in nature and society. An important example is given by the tag hierarchies extracted from large on-line data repositories such as scientific publication archives, file sharing portals, blogs, on-line news portals, etc. The tagging of the stored objects with informative keywords in such repositories has become very common, and in most cases the tags on a given item are free words chosen by the authors independently. Therefore, the relations among keywords appearing in an on-line data repository are unknown in general. However, in most cases the topics and concepts described by these keywords form a latent hierarchy, with the more general topics and categories at the top, and more specialized ones at the bottom. There are several algorithms available for deducing this hierarchy from the statistical features of the keywords. In the present work we apply a recent, co-occurrence-based tag hierarchy extraction method to sets of keywords obtained from four different on-line news portals. The resulting hierarchies show substantial differences not just in the topics rendered as important (being at the top of the hierarchy) or of less interest (categorized low in the hierarchy), but also in the underlying network structure. This reveals discrepancies between the plausible keyword association frameworks in the studied news portals.

19.
We assayed the diurnal concentrations of growth hormone (GH) and prolactin (PRL) in 6 healthy male volunteers to evaluate the self-similar features in the time series of each hormone on the basis of fractal theory and to determine the fractal dimension as an index of the complexity of the diurnal variation. In addition, we assessed the effects of a 6-hour delay in the sleep period on the complexity of the diurnal variation of these hormones. There was a statistically significant fractal feature in the serum levels of GH both under the nocturnal-sleep and delayed-sleep conditions in all subjects. The time series of the serum PRL concentrations also showed a statistically significant fractal feature under the nocturnal-sleep and delayed-sleep conditions in all subjects. The fractal dimensions of the patterns of the GH or PRL levels were 1.879 and 1.929 or 1.754 and 1.785 under the nocturnal-sleep and delayed-sleep conditions, respectively. Two-way ANOVA revealed no significant difference in the fractal dimension between the two sleep conditions but did reveal a significant difference between the fractal dimensions of the GH and PRL levels. These results showed (1) that delayed sleep had no significant effect on the complexity of the diurnal pattern of these hormones, and (2) that the diurnal pattern of the GH levels was more complex than that of the PRL levels.

20.
Gene function annotation remains a key challenge in modern biology. This is especially true for high-throughput techniques such as gene expression experiments. Vital information about genes is available electronically from biomedical literature in the form of full texts and abstracts. In addition, various publicly available databases (such as GenBank, Gene Ontology and Entrez) provide access to gene-related information at different levels of biological organization, granularity and data format. This information is being used to assess and interpret the results from high-throughput experiments. To improve keyword extraction for annotational clustering and other types of analyses, we have developed a novel text mining approach, which is based on keywords identified at the level of gene annotation sentences (in particular sentences characterizing biological function) instead of entire abstracts. Further, to improve the expressiveness and usefulness of gene annotation terms, we investigated the combination of sentence-level keywords with terms from the Medical Subject Headings (MeSH) and Gene Ontology (GO) resources. We find that sentence-level keywords combined with MeSH terms outperform the typical 'baseline' set-up (term frequencies at the level of abstracts) by a significant margin, whereas the addition of GO terms improves matters only marginally. We validated our approach on the basis of a manually annotated corpus of 200 abstracts generated on the basis of 2 cancer categories and 10 genes per category. We applied the method in the context of three sets of differentially expressed genes obtained from pediatric brain tumor samples. This analysis suggests novel interpretations of discovered gene expression patterns.
