1.
2.
3.
4.
Background
Existing clustering approaches for microarray data do not adequately differentiate between subsets of co-expressed genes. We devised a novel approach that integrates expression and sequence data in order to generate functionally coherent and biologically meaningful subclusters of genes. Specifically, the approach clusters co-expressed genes on the basis of similar content and distributions of predicted statistically significant sequence motifs in their upstream regions.
Results
We applied our method to several sets of co-expressed genes and were able to define subsets with enrichment in particular biological processes and specific upstream regulatory motifs.
Conclusions
These results show the potential of our technique for functional prediction and regulatory motif identification from microarray data.
5.
Weinberger ED 《Bio Systems》2002,66(3):105-119
'Standard' information theory says nothing about the semantic content of information. Nevertheless, applications such as evolutionary theory demand consideration of precisely this aspect of information, a need that has motivated a largely unsuccessful search for a suitable measure of an 'amount of meaning'. This paper represents an attempt to move beyond this impasse, based on the observation that the meaning of a message can only be understood relative to its receiver. Positing that the semantic value of information is its usefulness in making an informed decision, we define pragmatic information as the information gain in the probability distributions of the receiver's actions, both before and after receipt of a message in some pre-defined ensemble. We then prove rigorously that our definition is the only one that satisfies obvious desiderata, such as the additivity of information from logically independent messages. This definition, when applied to the information 'learned' by the time evolution of a process, defies the intuitions of the few previous researchers thinking along these lines by being monotonic in the uncertainty that remains after receipt of the message, but non-monotonic in the Shannon entropy of the input ensemble. It also follows that the pragmatic information of the genetic 'messages' in an evolving population is a global Lyapunov function for Eigen's quasi-species model of biological evolution. A concluding section argues that a theory such as ours must explicitly acknowledge purposeful action, or 'agency', in such diverse fields as evolutionary theory and finance.
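As a rough illustration of the definition in this abstract, the sketch below reads "information gain" as the Kullback-Leibler divergence between the receiver's action distribution after a message and the distribution before it, averaged over the message ensemble. The function names, the base-2 logarithm, and the toy numbers are assumptions made for illustration; the paper's exact formulation may differ.

```python
import math

def kl_divergence(p, q):
    """Relative entropy D(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def pragmatic_information(message_probs, prior, posteriors):
    """Expected information gain in the receiver's action distribution.

    message_probs[m] : probability of message m in the ensemble (hypothetical input)
    prior            : action distribution before any message is received
    posteriors[m]    : action distribution after receiving message m
    """
    return sum(w * kl_divergence(post, prior)
               for w, post in zip(message_probs, posteriors))

# Two equally likely messages, each of which fully determines a binary action.
decisive = pragmatic_information([0.5, 0.5], [0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])
# Two messages that leave the receiver's behaviour unchanged.
useless = pragmatic_information([0.5, 0.5], [0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]])
print(decisive, useless)
```

Under this reading, a message that does not change what the receiver does carries no pragmatic information, however many bits it contains in the Shannon sense.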
6.
Creane A Maher E Sultan S Hynes N Kelly DJ Lally C 《Biomechanics and modeling in mechanobiology》2012,11(6):869-882
Many soft biological tissues contain collagen fibres, which act as major load bearing constituents. The orientation and the dispersion of these fibres influence the macroscopic mechanical properties of the tissue and are therefore of importance in several areas of research including constitutive model development, tissue engineering and mechanobiology. Qualitative comparisons between these fibre architectures can be made using vector plots of mean orientations and contour plots of fibre dispersion but quantitative comparison cannot be achieved using these methods. We propose a 'remodelling metric' between two angular fibre distributions, which represents the mean rotational effort required to transform one into the other. It is an adaptation of the earth mover's distance, a similarity measure between two histograms/signatures used in image analysis, which represents the minimal cost of transforming one distribution into the other by moving distribution mass around. In this paper, its utility is demonstrated by considering the change in fibre architecture during a period of plaque growth in finite element models of the carotid bifurcation. The fibre architecture is predicted using a strain-based remodelling algorithm. We investigate the remodelling metric's potential as a clinical indicator of plaque vulnerability by comparing results between symptomatic and asymptomatic carotid bifurcations. Fibre remodelling was found to occur at regions of plaque burden. As plaque thickness increased, so did the remodelling metric. A measure of the total predicted fibre remodelling during plaque growth, TRM, was found to be higher in the symptomatic group than in the asymptomatic group. Furthermore, a measure of the total fibre remodelling per plaque size, TRM/TPB, was found to be significantly higher in the symptomatic vessels. The remodelling metric may prove to be a useful tool in other soft tissues and engineered scaffolds where fibre adaptation is also present.
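To make the idea of a "mean rotational effort" concrete, here is a minimal sketch of an earth mover's distance between two histograms defined on a periodic angular axis. It is a generic circular EMD, not the authors' remodelling metric. The function name, the equal-width binning over a 180° period, and the normalisation to unit mass are all assumptions.

```python
import numpy as np

def circular_emd(p, q, bin_width_deg):
    """Earth mover's distance between two histograms on a periodic axis.

    Both histograms are normalised to unit mass, so the result is the mean
    rotation (in degrees) needed to turn one distribution into the other.
    On a circle with equal-width bins, the optimal cost is the sum of absolute
    deviations of the cumulative difference from its median.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    d = np.cumsum(p - q)
    return float(bin_width_deg * np.abs(d - np.median(d)).sum())

# 18 bins of 10 degrees covering 0-180 degrees (fibre orientations are axial).
before = np.zeros(18); before[0] = 1.0    # all fibres near 5 degrees
after = np.zeros(18); after[17] = 1.0     # all fibres near 175 degrees
print(circular_emd(before, after, 10.0))  # neighbouring bins across the wrap-around
```

Because the axis wraps around, fibres at 5° and 175° are only one bin apart; a non-periodic EMD would report them as almost maximally different.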
7.
8.
9.
Dunn KW Kamocka MM McDonald JH 《American journal of physiology. Cell physiology》2011,300(4):C723-C742
Fluorescence microscopy is one of the most powerful tools for elucidating the cellular functions of proteins and other molecules. In many cases, the function of a molecule can be inferred from its association with specific intracellular compartments or molecular complexes, which is typically determined by comparing the distribution of a fluorescently labeled version of the molecule with that of a second, complementarily labeled probe. Although arguably the most common application of fluorescence microscopy in biomedical research, studies evaluating the "colocalization" of two probes are seldom quantified, despite a diversity of image analysis tools that have been specifically developed for that purpose. Here we provide a guide to analyzing colocalization in cell biology studies, emphasizing practical application of quantitative tools that are now widely available in commercial and free image analysis software.
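The abstract does not name specific measures, so the sketch below simply shows two that are commonly used to quantify colocalization: Pearson's correlation coefficient and Manders' overlap fractions. The function names, the threshold parameters, and the tiny example images are assumptions for illustration, and real images would first need background correction.

```python
import numpy as np

def pearson_coefficient(ch1, ch2):
    """Pearson correlation between the pixel intensities of two channels."""
    a = np.asarray(ch1, dtype=float).ravel()
    b = np.asarray(ch2, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def manders_coefficients(ch1, ch2, thresh1=0.0, thresh2=0.0):
    """Fraction of each channel's intensity found where the other channel is above threshold."""
    a = np.asarray(ch1, dtype=float)
    b = np.asarray(ch2, dtype=float)
    m1 = a[b > thresh2].sum() / a.sum()  # share of channel 1 signal overlapping channel 2
    m2 = b[a > thresh1].sum() / b.sum()  # share of channel 2 signal overlapping channel 1
    return float(m1), float(m2)

# Hypothetical 2x2 images: channel 2 is exactly half of channel 1.
red = np.array([[0.0, 2.0], [4.0, 0.0]])
green = red / 2.0
print(pearson_coefficient(red, green), manders_coefficients(red, green))
```

The two measures answer different questions: the correlation asks whether intensities vary together, while the overlap fractions ask how much of each probe sits where the other is present.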
10.
Background
Current technologies have led to the availability of multiple genomic data types in sufficient quantity and quality to serve as a basis for automatic global network inference. Accordingly, there is currently a large variety of network inference methods that learn regulatory networks to varying degrees of detail. These methods have different strengths and weaknesses and thus can be complementary. However, combining different methods in a mutually reinforcing manner remains a challenge.
Methodology
We investigate how three scalable methods can be combined into a useful network inference pipeline. The first is a novel t-test–based method that relies on a comprehensive steady-state knock-out dataset to rank regulatory interactions. The remaining two are previously published mutual information and ordinary differential equation based methods (tlCLR and Inferelator 1.0, respectively) that use both time-series and steady-state data to rank regulatory interactions; the latter has the added advantage of also inferring dynamic models of gene regulation which can be used to predict the system's response to new perturbations.
Conclusion/Significance
Our t-test based method proved powerful at ranking regulatory interactions, tying for first out of methods in the DREAM4 100-gene in-silico network inference challenge. We demonstrate complementarity between this method and the two methods that take advantage of time-series data by combining the three into a pipeline whose ability to rank regulatory interactions is markedly improved compared to either method alone. Moreover, the pipeline is able to accurately predict the response of the system to new conditions (in this case new double knock-out genetic perturbations). Our evaluation of the performance of multiple methods for network inference suggests avenues for future methods development and provides simple considerations for genomic experimental design. Our code is publicly available at http://err.bio.nyu.edu/inferelator/.
11.
A heuristic graph comparison algorithm and its application to detect functionally related enzyme clusters (total citations: 9; self-citations: 1; citations by others: 9)
The availability of computerized knowledge on biochemical pathways in the KEGG database opens new opportunities for developing computational methods to characterize and understand higher level functions of complete genomes. Our approach is based on the concept of graphs; for example, the genome is a graph with genes as nodes and the pathway is another graph with gene products as nodes. We have developed a simple method for graph comparison to identify local similarities, termed correlated clusters, between two graphs, which allows gaps and mismatches of nodes and edges and is especially suitable for detecting biological features. The method was applied to a comparison of the complete genomes of 10 microorganisms and the KEGG metabolic pathways, which revealed, not surprisingly, a tendency for formation of correlated clusters called FRECs (functionally related enzyme clusters). However, this tendency varied considerably depending on the organism. The relative number of enzymes in FRECs was close to 50% for Bacillus subtilis and Escherichia coli, but was <10% for Synechocystis and Saccharomyces cerevisiae. The FRECs collection is reorganized into a collection of ortholog group tables in KEGG, which represents conserved pathway motifs with the information about gene clusters in all the completely sequenced genomes.
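One simple way to picture a "correlated cluster" is sketched below. Each gene is paired with its product, two pairs are linked when the genes are close in the genome graph and the products are close in the pathway graph, and the linked groups are reported. This is an illustrative simplification, not the published heuristic. The function names, the `max_steps` gap parameter, and the toy graphs are assumptions.

```python
from collections import deque

def within(graph, source, max_steps):
    """Nodes reachable from source in at most max_steps edges (breadth-first search)."""
    seen = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if seen[node] == max_steps:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return set(seen)

def correlated_clusters(g1, g2, pairs, max_steps=1):
    """Group node pairs (u in g1, v in g2) that are close to each other in both graphs."""
    near1 = {u: within(g1, u, max_steps) for u, _ in pairs}
    near2 = {v: within(g2, v, max_steps) for _, v in pairs}
    unvisited = set(range(len(pairs)))
    clusters = []
    while unvisited:
        start = unvisited.pop()
        stack, members = [start], [start]
        while stack:
            u, v = pairs[stack.pop()]
            linked = [j for j in unvisited
                      if pairs[j][0] in near1[u] and pairs[j][1] in near2[v]]
            for j in linked:
                unvisited.discard(j)
                members.append(j)
                stack.append(j)
        if len(members) > 1:  # keep only groups with at least two pairs
            clusters.append(sorted(pairs[k] for k in members))
    return sorted(clusters)

# Toy genome: genes a-b-c-d-e in a row. Toy pathway: E1-E2-E3, with E9 unconnected.
genome = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
pathway = {"E1": ["E2"], "E2": ["E1", "E3"], "E3": ["E2"], "E9": []}
pairs = [("a", "E1"), ("b", "E2"), ("c", "E3"), ("e", "E9")]
print(correlated_clusters(genome, pathway, pairs))
```

Raising `max_steps` lets a cluster bridge intervening nodes, which is one way to mimic the tolerance for gaps and mismatches that the abstract describes.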
12.
Guoping Tang 《Global Ecology and Biogeography》2008,17(4):465-478
Aim To present a new metric, the 'opposite and identity' (OI) index, for evaluating the correspondence between two sets of simulated time-series dynamics of an ecological variable.
Innovation The OI index is introduced and its mathematical expression is defined using vectors to denote simulated variations of an ecological variable on the basis of the vector addition rule. The value of the OI index varies from 0 to 1 with a value 0 (or 1) indicating that compared simulations are opposite (or identical). An OI index with a value near 0.5 suggests that the difference in the amplitudes of variations between compared simulations is large. The OI index can be calculated in a grid cell, for a given biome and for time-series simulations. The OI indices calculated in each grid cell can be used to map the spatial agreement between compared simulations, allowing researchers to pinpoint the extent of agreement or disagreement between two simulations. The OI indices calculated for time-series simulations allow researchers to identify the time at which one simulation differs from another. A case study demonstrates the application and reliability of the OI index for comparing two simulated time-series dynamics of terrestrial net primary productivity in Asia from 1982 to 2000. In the case study, the OI index performs better than the correlation coefficient at accurately quantifying the agreement between two simulated time-series dynamics of terrestrial net primary productivity in Asia.
Main conclusions The OI index provides researchers with a useful tool and multiple flexible ways to compare two simulation results or to evaluate simulation results against observed spatiotemporal data. The OI index can, in some cases, quantify the agreement between compared spatiotemporal data more accurately than the correlation coefficient because of its insensitivity to influential data and outliers and the autocorrelation of simulated spatiotemporal data.
13.
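Evaluation model and empirical study of the "two-oriented society" in Wuhan based on the projection pursuit method (total citations: 2; self-citations: 0; citations by others: 2)
To overcome shortcomings of existing evaluations of the "two-oriented society" (a resource-conserving and environment-friendly society), such as strong subjectivity and difficulty in handling high-dimensional data, a new evaluation method for Wuhan's "two-oriented society" based on projection pursuit is proposed. Starting from four subsystems (resources, environment, economy and society), an evaluation index system for Wuhan's "two-oriented society" was constructed. Data for Wuhan from 2000 to 2009 were taken as the sample, and the multidimensional evaluation index values were projected onto one-dimensional projection data. An accelerated genetic algorithm was introduced to optimize the projection index function and find the best projection direction. The development of Wuhan's "two-oriented society" over 2000-2009 was compared according to the size of the projection values, and the information in the best projection direction was used to study how strongly each indicator influences that development. The results show that, from 2000 to 2009, construction of the "two-oriented society" in Wuhan followed a favourable development trend. The important driving factors were the green coverage rate of built-up areas, per capita public green space, the share of tertiary-industry value added in GDP, the rate of good air quality, hospital beds per thousand population, the reuse rate of industrial water, enrolled university students per ten thousand population, energy consumption per unit of GDP, and public transport vehicles per ten thousand population. Countermeasures and suggestions for building a "two-oriented society" in Wuhan are proposed on this basis. Finally, it is pointed out that the projection pursuit model offers a new method worth exploring for the comprehensive evaluation of urban "two-oriented society" development.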
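The projection step can be sketched as follows, under several assumptions. The projection index used here (spread of the projected values multiplied by their local density) is a form commonly seen in projection pursuit evaluation studies and may not be the one used in this paper. A plain random search stands in for the accelerated genetic algorithm, the search is restricted to non-negative unit directions, and the data are synthetic placeholders rather than Wuhan statistics.

```python
import numpy as np

def projection_index(direction, data, radius_factor=0.1):
    """Spread of the 1-D projections multiplied by their local density."""
    z = data @ direction                      # one projected value per sample (year)
    spread = z.std(ddof=1)
    radius = radius_factor * spread           # window radius for the density term
    gaps = np.abs(z[:, None] - z[None, :])    # pairwise distances between projections
    density = np.where(gaps < radius, radius - gaps, 0.0).sum()
    return float(spread * density)

def best_direction(data, n_trials=2000, seed=0):
    """Random search over non-negative unit vectors (a stand-in for a genetic algorithm)."""
    rng = np.random.default_rng(seed)
    best_a, best_q = None, -np.inf
    for _ in range(n_trials):
        a = np.abs(rng.normal(size=data.shape[1]))
        a /= np.linalg.norm(a)
        q = projection_index(a, data)
        if q > best_q:
            best_a, best_q = a, q
    return best_a, best_q

# Synthetic stand-in: 10 "years" by 4 indicators, already scaled to [0, 1].
data = np.random.default_rng(1).random((10, 4))
direction, score = best_direction(data, n_trials=500)
print(direction, data @ direction)  # indicator weights and one composite value per year
```

The components of the best direction play the role of indicator weights, and the projected values give one composite score per year, which is how the abstract's year-by-year comparison and driving-factor ranking can be read.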
14.
Samrat Mondol Navya R Vidya Athreya Kartik Sunagar Velu Mani Selvaraj Uma Ramakrishnan 《BMC genetics》2009,10(1):1-7
Background
Copy number variants (CNVs) have been identified in several studies to be associated with complex diseases. It is important, therefore, to understand the distribution of CNVs within and among populations. This study is the first report of a CNV map in African Americans.
Results
Employing a SNP platform with greater than 500,000 SNPs, a first-generation CNV map of the African American genome was generated using DNA from 385 healthy African American individuals, and compared to a sample of 435 healthy White individuals. A total of 1362 CNVs were identified within African Americans, which included two CNV regions that were significantly different in frequency between African Americans and Whites (17q21 and 15q11). In addition, a duplication was identified in 74% of DNAs derived from cell lines that was not present in any of the whole blood derived DNAs.
Conclusion
The Affymetrix 500 K array provides reliable CNV mapping information. However, using cell lines as a source of DNA may introduce artifacts. The duplication identified in high frequency in Whites and low frequency in African Americans on chromosome 17q21 reflects haplotype specific frequency differences between ancestral groups. The generation of the CNV map will be a valuable tool for identifying disease associated CNVs in African Americans.
15.
16.
It is well known that functionally related genes occur in a physically clustered form, especially operons in bacteria. Leveraging this fact, a recent problem formulation known as the gene team model searches for a set of genes that co-occur in a pair of closely related genomes. However, many gene teams, even experimentally verified operons, frequently scatter within other genomes. Thus, the gene team model should be refined to reflect this observation. In this paper, we generalize the gene team model, which looks for gene clusters in a physically clustered form, to multiple genome cases with relaxed constraints. We propose a novel hybrid pattern model that combines the set and the sequential pattern models. Our model searches for gene clusters with or without a physical proximity constraint. This model is implemented and tested with 97 genomes (120 replicons). The result was analyzed to show the usefulness of our model. We also compared the result from our hybrid model to those from the traditional gene team model. We also show that predicted gene teams can be used for various genome analyses: operon prediction, phylogenetic analysis of organisms, contextual sequence analysis and genome annotation. Our program is fast enough to provide a service on the web at http://platcom.informatics.indiana.edu/platcom/. Users can select any combination of 97 genomes to predict gene teams.
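For orientation, the traditional two-genome gene team model that this abstract builds on can be sketched as repeated splitting. A set of shared genes is a team when, in both genomes, consecutive members are never more than `delta` positions apart. The sketch below is not the hybrid model proposed in the paper. The function names, the one-position-per-gene assumption (no paralogs), and the toy positions are illustrative.

```python
def split_by_gaps(genes, pos, delta):
    """Split genes into runs whose consecutive positions differ by at most delta."""
    ordered = sorted(genes, key=pos.__getitem__)
    runs, current = [], [ordered[0]]
    for g in ordered[1:]:
        if pos[g] - pos[current[-1]] > delta:
            runs.append(current)
            current = [g]
        else:
            current.append(g)
    runs.append(current)
    return runs

def gene_teams(pos1, pos2, delta):
    """Maximal gene sets that stay within gap delta in both genomes (size >= 2)."""
    common = set(pos1) & set(pos2)
    pending = [list(common)] if common else []
    teams = []
    while pending:
        group = pending.pop()
        parts = split_by_gaps(group, pos1, delta)
        if len(parts) == 1:
            parts = split_by_gaps(group, pos2, delta)
        if len(parts) > 1:
            pending.extend(parts)        # a gap was found: re-check each piece
        elif len(group) > 1:
            teams.append(sorted(group))  # no gap in either genome: this is a team
    return sorted(teams)

# Hypothetical gene positions in two genomes.
genome1 = {"a": 1, "b": 2, "c": 3, "d": 10, "e": 11}
genome2 = {"a": 5, "b": 4, "c": 20, "d": 8, "e": 9}
print(gene_teams(genome1, genome2, delta=2))  # [['a', 'b'], ['d', 'e']]
```

Gene order inside a team is free to differ between the genomes (here `a` and `b` are swapped), which is the "set" aspect of the model that the abstract contrasts with sequential patterns.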
17.
Background
The development of high-throughput laboratory techniques created a demand for computer-assisted result analysis tools. Many of these techniques return lists of genes whose interpretation requires finding relevant biological roles for the problem at hand. The required information is typically available in public databases, and usually, this information must be manually retrieved to complement the analysis. This process is a very time-consuming task that should be automated as much as possible.
18.
19.