Found 20 similar articles (search time: 15 ms)
1.
Biochemical systems analysis of genome-wide expression data (total citations: 6; self-citations: 0; citations by others: 6)
MOTIVATION: Modern methods of genomics have produced an unprecedented amount of raw data. The interpretation and explanation of these data constitute a major, well-recognized challenge. RESULTS: Biochemical Systems Theory (BST) is the mathematical basis of a well-established methodological framework for analyzing networks of biochemical reactions. An existing BST model of yeast glycolysis is used here to explain and interpret the glycolytic gene expression pattern of heat-shocked yeast. Our analysis demonstrates that the observed gene expression profile satisfies the primary goals of increased ATP, trehalose, and NADPH production, while maintaining intermediate metabolites at reasonable levels. Based on a systematic exploration of alternative, hypothetical expression profiles, we show that the observed profile outperforms other profiles. CONCLUSION: BST is a useful framework for combining DNA microarray data with enzymatic process information to yield new insights into metabolic pathway regulation. AVAILABILITY: All analyses were executed with the software PLAS©, which is freely available at http://correio.cc.fc.ul.pt/~aenf/plas.html for academic use. CONTACT: VoitEO@MUSC.edu
2.
Background
Microarray gene expression data are accumulating in public databases. The expression profiles contain valuable information for understanding human gene expression patterns. However, the effective use of public microarray data requires integrating the expression profiles from heterogeneous sources.
Results
In this study, we have compiled a compendium of microarray expression profiles of various human tissue samples. The microarray raw data generated in different research laboratories have been obtained and combined into a single dataset after data normalization and transformation. To demonstrate the usefulness of the integrated microarray data for studying human gene expression patterns, we have analyzed the dataset to identify potential tissue-selective genes. A new method has been proposed for genome-wide identification of tissue-selective gene targets using both microarray intensity values and detection calls. The candidate genes for brain, liver and testis-selective expression have been examined, and the results suggest that our approach can select some interesting gene targets for further experimental studies.
Conclusion
A computational approach has been developed in this study for combining microarray expression profiles from heterogeneous sources. The integrated microarray data can be used to investigate tissue-selective expression patterns of human genes.
3.
4.
5.
6.
7.
8.
9.
Alexander V Alekseyenko, Nikita I Lytkin, Jizhou Ai, Bo Ding, Leonid Padyukov, Constantin F Aliferis, Alexander Statnikov 《Biology direct》2011,6(1):1-13
Background
GWAS owe their popularity to the expectation that they will make a major impact on diagnosis, prognosis and management of disease by uncovering genetics underlying clinical phenotypes. The dominant paradigm in GWAS data analysis so far consists of extensive reliance on methods that emphasize the contribution of individual SNPs to statistical association with phenotypes. Multivariate methods, however, can extract more information by considering associations of multiple SNPs simultaneously. Recent advances in other genomics domains pinpoint multivariate causal graph-based inference as a promising principled analysis framework for high-throughput data. Designed to discover biomarkers in the local causal pathway of the phenotype, these methods lead to accurate and highly parsimonious multivariate predictive models. In this paper, we investigate the applicability of the causal graph-based method TIE* to the analysis of GWAS data. To test the utility of TIE*, we focus on anti-CCP positive rheumatoid arthritis (RA) GWAS datasets, where there is a general consensus in the community about the major genetic determinants of the disease.
Results
Application of TIE* to the North American Rheumatoid Arthritis Cohort (NARAC) GWAS data results in six SNPs, mostly from the MHC locus. Using these SNPs we develop two predictive models that can classify cases and disease-free controls with an accuracy of 0.81 area under the ROC curve, as verified in independent testing data from the same cohort. The predictive performance of these models generalizes reasonably well to Swedish subjects from the closely related but not identical Epidemiological Investigation of Rheumatoid Arthritis (EIRA) cohort with 0.71-0.78 area under the ROC curve. Moreover, the SNPs identified by the TIE* method render many other previously known SNP associations conditionally independent of the phenotype.
Conclusions
Our experiments demonstrate that application of TIE* captures the maximum amount of genetic information about RA in the data and recapitulates the major consensus findings about the genetic factors of this disease. In addition, TIE* yields reproducible markers and signatures of RA. This suggests that a principled multivariate causal and predictive framework for GWAS analysis empowers the community with a new tool for high-quality and more efficient discovery.
Reviewers
This article was reviewed by Prof. Anthony Almudevar, Dr. Eugene V. Koonin, and Prof. Marianthi Markatou.
10.
Christopher James Langmead, Anthony K Yan, C Robertson McClung, Bruce Randall Donald 《Journal of computational biology》2003,10(3-4):521-536
We introduce a model-based analysis technique for extracting and characterizing rhythmic expression profiles from genome-wide DNA microarray hybridization data. These patterns are clues to discovering rhythmic genes implicated in cell-cycle, circadian, or other biological processes. The algorithm, implemented in a program called RAGE (Rhythmic Analysis of Gene Expression), decouples the problems of estimating a pattern's wavelength and phase. Our algorithm is linear-time in frequency and phase resolution, an improvement over previous quadratic-time approaches. Unlike previous approaches, RAGE uses a true distance metric for measuring expression profile similarity, based on the Hausdorff distance. This results in better clustering of expression profiles for rhythmic analysis. The confidence of each frequency estimate is computed using Z-scores. We demonstrate that RAGE is superior to other techniques on synthetic and actual DNA microarray hybridization data. We also show how to replace the discretized phase search in our method with an exact (combinatorially precise) phase search, resulting in a faster algorithm with no complexity dependence on phase resolution.
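The abstract above says only that the RAGE similarity measure is "based on the Hausdorff distance"; it does not give the exact construction. As a rough illustration of the underlying idea, the sketch below computes the standard (symmetric) Hausdorff distance between two expression profiles, each treated as a finite set of (time, expression) points. The function names and the toy profiles are hypothetical, and this is not the RAGE implementation.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite, non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Toy profiles (hypothetical data): points are (time, expression level).
profile_a = [(0, 0), (1, 1), (2, 0)]
profile_b = [(0, 0), (1, 3), (2, 0)]

print(hausdorff(profile_a, profile_b))
```

Taking the maximum of the two directed distances is what makes the measure symmetric; on finite point sets it also satisfies the triangle inequality, which is the property the abstract refers to as a "true distance metric".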
11.
Gene-expression profiling of endothelial cells infected with Kaposi's sarcoma-associated herpesvirus has led to a greater understanding of the histogenesis of Kaposi's sarcoma and cellular reprogramming events that occur as a result of viral infection and that may play important roles in viral pathogenesis.
12.
Background
Over the last few years, genome-wide association (GWA) studies have become a tool of choice for the identification of loci associated with complex traits. Currently, imputed single nucleotide polymorphism (SNP) data are frequently used in GWA analyses. Correct analysis of imputed data calls for the implementation of specific methods which take genotype imputation uncertainty into account.
13.
Fan JB, Chen J, April CS, Fisher JS, Klotzle B, Bibikova M, Kaper F, Ronaghi M, Linnarsson S, Ota T, Chien J, Laurent LC, Loring JF, Nisperos SV, Chen GY, Zhong JF 《PloS one》2012,7(2):e30794
Background
We have developed a high-throughput amplification method for generating robust gene expression profiles using single cell or low RNA inputs.
Methodology/Principal Findings
The method uses tagged priming and template-switching, resulting in the incorporation of universal PCR priming sites at both ends of the synthesized cDNA for global PCR amplification. Coupled with a whole-genome gene expression microarray platform, we routinely obtain expression correlation values of R² ≈ 0.76–0.80 between individual cells and R² ≈ 0.69 between 50 pg total RNA replicates. Expression profiles generated from single cells or 50 pg total RNA correlate well with those generated with higher input (1 ng total RNA) (R² ≈ 0.80). Also, the assay is sufficiently sensitive to detect, in a single cell, approximately 63% of the number of genes detected with 1 ng input, with approximately 97% of the genes detected in the single-cell input also detected in the higher input.
Conclusions/Significance
In summary, our method facilitates whole-genome gene expression profiling in contexts where starting material is extremely limiting, particularly in areas such as the study of progenitor cells in early development and tumor stem cell biology.
14.
We describe a PCA-based genome scan approach to analyze genome-wide admixture structure, and introduce wavelet transform analysis as a method for estimating the time of admixture. We test the wavelet transform method with simulations and apply it to genome-wide SNP data from eight admixed human populations. The wavelet transform method offers better resolution than existing methods for dating admixture, and can be applied to either SNP or sequence data from humans or other species.
15.
Background
Gene expression is a two-step synthesis process that ends with the necessary amount of each protein required to perform its function. Since the protein is the final product, the main focus of gene regulation should be centered on it. However, because mRNA is an intermediate step and the amounts of both mRNA and protein are controlled by their synthesis and degradation rates, the desired amount of protein can be achieved following different strategies.
16.
A multivariate approach for integrating genome-wide expression data and biological knowledge (total citations: 1; self-citations: 0; citations by others: 1)
MOTIVATION: Several statistical methods that combine analysis of differential gene expression with biological knowledge databases have been proposed for a more rapid interpretation of expression data. However, most such methods are based on a series of univariate statistical tests and do not properly account for the complex structure of gene interactions. RESULTS: We present a simple yet effective multivariate statistical procedure for assessing the correlation between a subspace defined by a group of genes and a binary phenotype. A subspace is deemed significant if the samples corresponding to different phenotypes are well separated in that subspace. The separation is measured using Hotelling's T² statistic, which captures the covariance structure of the subspace. When the dimension of the subspace is larger than that of the sample space, we project the original data to a smaller orthonormal subspace. We use this method to search through functional pathway subspaces defined by Reactome, KEGG, BioCarta and Gene Ontology. To demonstrate its performance, we apply this method to the data from two published studies, and visualize the results in the principal component space.
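To make the separation measure in the abstract above concrete, here is a minimal sketch of the textbook two-sample Hotelling's T² statistic, T² = (n1·n2 / (n1 + n2)) · dᵀ S⁻¹ d, where d is the difference of the group mean vectors and S is the pooled covariance matrix. It assumes more samples than genes, so that S is invertible; the projection onto a smaller orthonormal subspace that the abstract mentions for the opposite case is not shown. All function names and the toy data are hypothetical, and the authors' actual procedure may differ in its details.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length samples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter_matrix(rows, mean):
    """Sum of outer products of the deviations from the mean."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - m for x, m in zip(row, mean)]
        for i in range(p):
            for j in range(p):
                s[i][j] += d[i] * d[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("pooled covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def hotelling_t2(group1, group2):
    """Two-sample Hotelling's T^2 for samples given as rows of gene values."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = mean_vector(group1), mean_vector(group2)
    p = len(m1)
    s1, s2 = scatter_matrix(group1, m1), scatter_matrix(group2, m2)
    pooled = [[(s1[i][j] + s2[i][j]) / (n1 + n2 - 2) for j in range(p)]
              for i in range(p)]
    diff = [a - b for a, b in zip(m1, m2)]
    weighted = solve(pooled, diff)  # S^-1 d without forming the inverse
    return (n1 * n2 / (n1 + n2)) * sum(d * w for d, w in zip(diff, weighted))


# Toy example (hypothetical data): 4 samples per phenotype, 2 genes.
phenotype_a = [(0, 0), (1, 0), (0, 1), (1, 1)]
phenotype_b = [(2, 2), (3, 2), (2, 3), (3, 3)]
print(hotelling_t2(phenotype_a, phenotype_b))
```

A larger T² means the two phenotype groups are better separated relative to their within-group covariance. With a single gene the statistic reduces to the square of the ordinary two-sample t statistic.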
17.
18.
19.
When completed this year, the Arabidopsis genome will represent the first plant genome to be fully sequenced. This sequence information, together with the large collection of expressed sequence tags, has established the basics for new approaches to studying gene expression patterns in plants on a global scale. We can now look at biology from the perspective of the whole genome. This revolution in the study of how all genes in an organism respond to certain stimuli has encouraged us to think in new dimensions. Expression profiles can be determined over a range of experimental conditions and organized into patterns that are diagnostic for the biological state of the cell. The field of genome-wide expression in plants has yet to produce its fruit; however, the current application of microarrays in yeast and human research foreshadows the diverse applications this technology could have in plant biology and agriculture.