Similar articles
Found 20 similar articles (search time: 15 ms)
1.
2.
In poplar, genetic research on wood properties is very important for the improvement of wood quality. Studies of wood formation genes at each developmental stage using modern biotechnology have often been limited to several genes or gene families. Because of the complex regulatory network involved in the co-expression and interactions of thousands of genes, however, the genetic mechanisms of wood formation must be surveyed on a genome-wide scale. In this study, we identified wood formation-related genes using a differentially co-expressed (DCE) gene subset approach based on biological networks inferred from microarray data. Gene co-expression networks in leaf, root, and wood tissues were first constructed and topologically analyzed using microarray data collected from the Gene Expression Omnibus. The DCE gene modules in wood-forming tissue were then detected based on graph theory, which was followed by gene ontology (GO) enrichment analysis and GO annotation of probe sets. Finally, 72 probe sets were identified in the largest cohesive subgroup of the DCE gene network in wood tissue, with most of the probe sets associated with wood formation-related biological processes and GO cellular component categories. The approach described in this paper provides an effective strategy to identify wood formation genes in poplar and should contribute to a better understanding of the genetic and molecular mechanisms underlying wood properties in trees.

3.
4.
5.
SUMMARY: The NetAffx Gene Ontology (GO) Mining Tool is a web-based, interactive tool that permits traversal of the GO graph in the context of microarray data. It accepts a list of Affymetrix probe sets and renders a GO graph as a heat map colored according to significance measurements. The rendered graph is interactive, with nodes linked to public web sites and to lists of the relevant probe sets. The GO Mining Tool provides visualization combining biological annotation with expression data, encompassing thousands of genes in one interactive view. AVAILABILITY: GO Mining Tool is freely available at http://www.affymetrix.com/analysis/query/go_analysis.affx

6.
Large amounts of gene expression data from several different technologies are becoming available to the scientific community. A common practice is to use these data to calculate global gene coexpression for validation or integration of other "omic" data. To assess the utility of publicly available datasets for this purpose, we have analyzed Homo sapiens data from 1202 cDNA microarray experiments, 242 SAGE libraries, and 667 Affymetrix oligonucleotide microarray experiments. The three datasets compared demonstrate significant but low levels of global concordance (rc<0.11). Assessment against Gene Ontology (GO) revealed that all three platforms identify more coexpressed gene pairs with common biological processes than expected by chance. As the Pearson correlation for a gene pair increased, it was more likely to be confirmed by GO. The Affymetrix dataset performed best individually, with gene pairs of correlation 0.9-1.0 confirmed by GO in 74% of cases. However, in all cases, gene pairs confirmed by multiple platforms were more likely to be confirmed by GO. We show that combining results from different expression platforms increases reliability of coexpression. A comparison with other recently published coexpression studies found similar results in terms of performance against GO but with each method producing distinctly different gene pair lists.
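
The idea of keeping only gene pairs that are coexpressed on more than one platform can be illustrated with a minimal sketch. It assumes each platform is a dictionary of per-gene expression profiles and uses a fixed Pearson threshold; the data, the function names, and the 0.9 cutoff are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(sxx * syy)


def coexpressed_pairs(profiles, threshold):
    """Return the set of gene pairs with correlation >= threshold."""
    genes = sorted(profiles)
    pairs = set()
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if pearson(profiles[g], profiles[h]) >= threshold:
                pairs.add((g, h))
    return pairs


# Hypothetical toy profiles for two platforms (not real data).
platform_a = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 3, 2, 1]}
platform_b = {"g1": [1, 3, 2, 5], "g2": [2, 6, 4, 10], "g3": [1, 1, 2, 1]}

# Pairs supported by both platforms: the intersection of the two pair sets.
confirmed = coexpressed_pairs(platform_a, 0.9) & coexpressed_pairs(platform_b, 0.9)
```

Intersecting the pair sets is the simplest way to combine platforms; a real analysis would also have to map probe identifiers between platforms, which this sketch leaves out.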

7.
An understanding of heart development is critical in any systems biology approach to cardiovascular disease. The interpretation of data generated from high-throughput technologies (such as microarray and proteomics) is also essential to this approach. However, characterizing the role of genes in the processes underlying heart development and cardiovascular disease involves the non-trivial task of data analysis and integration of previous knowledge. The Gene Ontology (GO) Consortium provides structured controlled biological vocabularies that are used to summarize previous functional knowledge for gene products across all species. One aspect of GO describes biological processes, such as development and signaling. In order to support high-throughput cardiovascular research, we have initiated an effort to fully describe heart development in GO, expanding the number of GO terms describing heart development from 12 to over 280. This new ontology describes heart morphogenesis, the differentiation of specific cardiac cell types, and the involvement of signaling pathways in heart development. This work also aligns GO with the current views of the heart development research community and its representation in the literature. This extension of GO allows gene product annotators to comprehensively capture the genetic program leading to the developmental progression of the heart. This will enable users to integrate heart development data across species, resulting in the comprehensive retrieval of information about this subject. The revised GO structure, combined with gene product annotations, should improve the interpretation of data from high-throughput methods in a variety of cardiovascular research areas, including heart development, congenital cardiac disease, and cardiac stem cell research. Additionally, we invite the heart development community to contribute to the expansion of this important dataset for the benefit of future research in this area.

8.
We address the problem of using expression data and prior biological knowledge to identify differentially expressed pathways or groups of genes. Following an idea of Ideker et al. (2002), we construct a gene interaction network and search for high-scoring subnetworks. We make several improvements in terms of scoring functions and algorithms, resulting in higher speed and accuracy and easier biological interpretation. We also assign significance levels to our results, adjusted for multiple testing. Our methods are successfully applied to three human microarray data sets, related to cancer and the immune system, retrieving several known and potential pathways. The method, denoted by the acronym GXNA (Gene eXpression Network Analysis), is implemented in software that is publicly available and can be used on virtually any microarray data set. SUPPLEMENTARY INFORMATION: The source code and executable for the software, as well as certain supplemental materials, can be downloaded from http://stat.stanford.edu/~serban/gxna.

9.
10.
11.
Improving missing value estimation in microarray data with gene ontology   Total citations: 3, self-citations: 0, citations by others: 3
MOTIVATION: Gene expression microarray experiments produce datasets with frequent missing expression values. Accurate estimation of missing values is an important prerequisite for efficient data analysis as many statistical and machine learning techniques either require a complete dataset or their results are significantly dependent on the quality of such estimates. A limitation of the existing estimation methods for microarray data is that they use no external information but the estimation is based solely on the expression data. We hypothesized that utilizing a priori information on functional similarities available from public databases facilitates the missing value estimation. RESULTS: We investigated whether semantic similarity originating from gene ontology (GO) annotations could improve the selection of relevant genes for missing value estimation. The relative contribution of each information source was automatically estimated from the data using an adaptive weight selection procedure. Our experimental results in yeast cDNA microarray datasets indicated that by considering GO information in the k-nearest neighbor algorithm we can enhance its performance considerably, especially when the number of experimental conditions is small and the percentage of missing values is high. The increase of performance was less evident with a more sophisticated estimation method. We conclude that even a small proportion of annotated genes can provide improvements in data quality significant for the eventual interpretation of the microarray experiments. AVAILABILITY: Java and Matlab codes are available on request from the authors. SUPPLEMENTARY MATERIAL: Available online at http://users.utu.fi/jotatu/GOImpute.html.
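
A rough sketch of how GO similarity might enter neighbor selection in k-nearest-neighbor imputation follows. It is not the authors' implementation: the combined distance, the fixed weight `alpha` (the abstract describes an adaptive weight selection procedure instead), the `go_sim` lookup table, and all names and data are assumptions made for illustration. Missing values are represented as `None`.

```python
from math import sqrt


def expr_distance(a, b):
    """Root-mean-square distance over conditions observed in both profiles."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("inf")
    return sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def impute(target, col, profiles, go_sim, k=2, alpha=0.5):
    """Estimate profiles[target][col] from the k nearest candidate genes.

    The combined distance is alpha * expression distance plus
    (1 - alpha) * (1 - GO similarity); alpha is fixed here (an assumption).
    """
    scored = []
    for gene, prof in profiles.items():
        if gene == target or prof[col] is None:
            continue
        d_expr = expr_distance(profiles[target], prof)
        if d_expr == float("inf"):
            continue  # no shared observed conditions, cannot compare
        d_go = 1.0 - go_sim.get(frozenset((target, gene)), 0.0)
        scored.append((alpha * d_expr + (1 - alpha) * d_go, gene))
    neighbours = sorted(scored)[:k]
    if not neighbours:
        raise ValueError("no candidate genes with an observed value")
    # Inverse-distance weighted average of the neighbours' observed values.
    weights = [1.0 / (d + 1e-9) for d, _ in neighbours]
    values = [profiles[g][col] for _, g in neighbours]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical toy data: gene "a" is missing its value in condition 1.
profiles = {
    "a": [1.0, None, 3.0],
    "b": [1.0, 2.0, 3.0],
    "c": [1.0, 5.0, 3.0],
    "d": [9.0, 9.0, 9.0],
}
go_sim = {frozenset(("a", "b")): 1.0}  # assumed semantic similarity in [0, 1]

estimate = impute("a", 1, profiles, go_sim, k=1, alpha=0.5)
```

In this toy case genes "b" and "c" are equally close to "a" in expression, so the GO term breaks the tie in favor of "b", the gene annotated as functionally similar.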

12.
MOTIVATION: The result of a typical microarray experiment is a long list of genes with corresponding expression measurements. This list is only the starting point for a meaningful biological interpretation. Modern methods identify relevant biological processes or functions from gene expression data by scoring the statistical significance of predefined functional gene groups, e.g. based on Gene Ontology (GO). We develop methods that increase the explanatory power of this approach by integrating knowledge about relationships between the GO terms into the calculation of the statistical significance. RESULTS: We present two novel algorithms that improve GO group scoring using the underlying GO graph topology. The algorithms are evaluated on real and simulated gene expression data. We show that both methods eliminate local dependencies between GO terms and point to relevant areas in the GO graph that remain undetected with state-of-the-art algorithms for scoring functional terms. A simulation study demonstrates that the new methods exhibit a higher level of detecting relevant biological terms than competing methods.
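
For context, the term-by-term scoring that such topology-aware methods refine is commonly a one-sided hypergeometric (Fisher-type) test applied to each GO term independently. The sketch below shows only that baseline, not the two algorithms described in the abstract; the function names and the toy annotation table are hypothetical.

```python
from math import comb


def hypergeom_pvalue(n_universe, n_term, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_universe genes, n_term of which carry the GO term."""
    upper = min(n_term, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def score_terms(universe, selected, term_to_genes):
    """Score every GO term independently, ignoring the GO graph structure."""
    universe, selected = set(universe), set(selected)
    scores = {}
    for term, genes in term_to_genes.items():
        annotated = set(genes) & universe
        overlap = len(annotated & selected)
        scores[term] = hypergeom_pvalue(
            len(universe), len(annotated), len(selected), overlap
        )
    return scores


# Hypothetical toy annotation: 10 genes, 5 selected as significant.
universe = [f"g{i}" for i in range(10)]
selected = ["g0", "g1", "g2", "g3", "g4"]
term_to_genes = {
    "term_A": ["g0", "g1", "g2", "g3", "g4"],  # fully inside the selection
    "term_B": ["g5", "g6", "g7", "g8", "g9"],  # disjoint from the selection
}
scores = score_terms(universe, selected, term_to_genes)
```

Because each term is tested on its own, a parent term inherits significance from its children; this is the kind of local dependency between GO terms that the abstract says its algorithms eliminate.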

13.
Geometric interpretation of gene coexpression network analysis   Total citations: 1, self-citations: 0, citations by others: 1
The merging of network theory and microarray data analysis techniques has spawned a new field: gene coexpression network analysis. While network methods are increasingly used in biology, the network vocabulary of computational biologists tends to be far more limited than that of, say, social network theorists. Here we review and propose several potentially useful network concepts. We take advantage of the relationship between network theory and the field of microarray data analysis to clarify the meaning of and the relationship among network concepts in gene coexpression networks. Network theory offers a wealth of intuitive concepts for describing the pairwise relationships among genes, which are depicted in cluster trees and heat maps. Conversely, microarray data analysis techniques (singular value decomposition, tests of differential expression) can also be used to address difficult problems in network theory. We describe conditions when a close relationship exists between network analysis and microarray data analysis techniques, and provide a rough dictionary for translating between the two fields. Using the angular interpretation of correlations, we provide a geometric interpretation of network theoretic concepts and derive unexpected relationships among them. We use the singular value decomposition of module expression data to characterize approximately factorizable gene coexpression networks, i.e., adjacency matrices that factor into node specific contributions. High and low level views of coexpression networks allow us to study the relationships among modules and among module genes, respectively. We characterize coexpression networks where hub genes are significant with respect to a microarray sample trait and show that the network concept of intramodular connectivity can be interpreted as a fuzzy measure of module membership. We illustrate our results using human, mouse, and yeast microarray gene expression data.
The unification of coexpression network methods with traditional data mining methods can inform the application and development of systems biologic methods.
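
The angular interpretation of correlations mentioned above rests on a standard identity: the Pearson correlation of two profiles equals the cosine of the angle between their mean-centered vectors. A minimal sketch of that identity follows; the function names are hypothetical, and it does not reproduce the paper's derivations.

```python
from math import acos, degrees, sqrt


def center(v):
    """Subtract the mean so that the profile sums to zero."""
    m = sum(v) / len(v)
    return [x - m for x in v]


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / sqrt(sum(a * a for a in u) * sum(b * b for b in v))


def correlation_angle(x, y):
    """Angle in degrees between the mean-centered profiles x and y.

    The cosine of this angle is the Pearson correlation of x and y.
    """
    r = cosine(center(x), center(y))
    r = max(-1.0, min(1.0, r))  # guard against rounding just outside [-1, 1]
    return degrees(acos(r))
```

Under this view, perfectly correlated genes point in the same direction (0 degrees), uncorrelated genes are orthogonal (90 degrees), and perfectly anticorrelated genes point in opposite directions (180 degrees).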

14.
An efficient two-step Markov blanket method for modeling and inferring complex regulatory networks from large-scale microarray data sets is presented. The inferred gene regulatory network (GRN) is based on the time series gene expression data capturing the underlying gene interactions. For constructing a highly accurate GRN, the proposed method performs: 1) discovery of a gene's Markov Blanket (MB), 2) formulation of a flexible measure to determine the network's quality, 3) efficient searching with the aid of a guided genetic algorithm, and 4) pruning to obtain a minimal set of correct interactions. Investigations are carried out using both synthetic and yeast cell cycle gene expression data sets. The realistic synthetic data sets validate the robustness of the method by varying topology, sample size, time delay, noise, vertex in-degree, and the presence of hidden nodes. It is shown that the proposed approach has excellent inferential capabilities and high accuracy even in the presence of noise. The gene network inferred from yeast cell cycle data is investigated for its biological relevance using well-known interactions, sequence analysis, motif patterns, and GO data. Further, novel interactions are predicted for the unknown genes of the network and their influence on other genes is also discussed.

15.
SUMMARY: The Gandr (gene annotation data representation) knowledgebase is an ontological framework for laboratory-specific gene annotation. Gandr uses Protege 2000 for editing, querying and visualizing microarray data and annotations. Genes can be annotated with provided, newly created or imported ontological concepts. Annotated genes can inherit assigned concept properties and can be related to each other. The resulting knowledgebase can be visualized as an interactive network of nodes and edges representing genes and their functional relationships. This allows for immediate and associative gene context exploration. Ontological query techniques allow for powerful data access.

16.
MOTIVATION: The diverse microarray datasets that have become available over the past several years represent a rich opportunity and challenge for biological data mining. Many supervised and unsupervised methods have been developed for the analysis of individual microarray datasets. However, integrated analysis of multiple datasets can provide a broader insight into genetic regulation of specific biological pathways under a variety of conditions. RESULTS: To aid in the analysis of such large compendia of microarray experiments, we present Microarray Experiment Functional Integration Technology (MEFIT), a scalable Bayesian framework for predicting functional relationships from integrated microarray datasets. Furthermore, MEFIT predicts these functional relationships within the context of specific biological processes. All results are provided in the context of one or more specific biological functions, which can be provided by a biologist or drawn automatically from catalogs such as the Gene Ontology (GO). Using MEFIT, we integrated 40 Saccharomyces cerevisiae microarray datasets spanning 712 unique conditions. In tests based on 110 biological functions drawn from the GO biological process ontology, MEFIT provided a 5% or greater performance increase for 54 functions, with a 5% or more decrease in performance in only two functions.

17.
18.
19.
SUMMARY: Analysis of microarray data most often produces lists of genes with similar expression patterns, which are then subdivided into functional categories for biological interpretation. Such functional categorization is most commonly accomplished using Gene Ontology (GO) categories. Although there are several programs that identify and analyze functional categories for human, mouse and yeast genes, none of them accept Arabidopsis thaliana data. In order to address this need for the A. thaliana community, we have developed a program that retrieves GO annotations for A. thaliana genes and performs functional category analysis for lists of genes selected by the user. AVAILABILITY: http://www.personal.psu.edu/nhs109/Clench

20.
Circadian rhythms are governed by a highly coupled, complex network of genes. Due to feedback within the network, any modification of the system's state requires coherent changes in several nodes. A model of the underlying network is necessary to compute these modifications. We use an effective modeling approach for this task. Rather than inferred biochemical interactions, our method utilizes microarray data from a group of mutants for its construction. With simulated data, we develop an effective model for a circadian network in a peripheral tissue, subject to driving by the suprachiasmatic nucleus, the mammalian pacemaker. The effective network can predict time-dependent gene expression levels in other mutants.
