1.

Global gene expression analyses of bystander and alpha particle irradiated normal human lung fibroblasts: Synchronous and differential responses

Shanaz A Ghandhi Benjamin Yaghoubian Sally A Amundson 《BMC medical genomics》2008,1(1):1-14

Background

Numerous gene lists or "classifiers" have been derived from global gene expression data that assign breast cancers to good and poor prognosis groups. A remarkable feature of these molecular signatures is that they have few genes in common, prompting speculation that they may use distinct genes to measure the same pathophysiological process(es), such as proliferation. However, this supposition has not been rigorously tested. If gene-based classifiers function by measuring a minimal number of cellular processes, we hypothesized that the informative genes for these processes could be identified and the data sets could be adjusted for the predictive contributions of those genes. Such adjustment would then attenuate the predictive function of any signature measuring that same process.

Results

We tested this hypothesis directly using a novel iterative-subtractive approach. We evaluated five gene expression data sets that sample a broad range of breast cancer subtypes. In all data sets, the dominant cluster capable of predicting metastasis was heavily populated by genes that fluctuate in concert with the cell cycle. When six well-characterized classifiers were examined, all contained a higher than expected proportion of genes that correlate with this cluster. Furthermore, when the data sets were globally adjusted for the cell cycle cluster, each classifier lost its ability to assign tumors to appropriate high and low risk groups. In contrast, adjusting for other predictive gene clusters did not impact their performance.

Conclusion

These data indicate that the discriminative ability of breast cancer classifiers is dependent upon genes that correlate with cell cycle progression.
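
The global adjustment described in the Results can be pictured as regressing every gene on the mean profile of one cluster and keeping only the residuals. The following is a minimal sketch under that assumption (ordinary least squares on a single cluster mean); the abstract does not spell out the authors' actual adjustment procedure, and the names `cluster_mean` and `adjust_for_cluster`, as well as the toy data, are hypothetical.

```python
from statistics import mean

def cluster_mean(expr, cluster_genes):
    """Average expression profile (one value per sample) of the given genes."""
    n_samples = len(next(iter(expr.values())))
    return [mean(expr[g][s] for g in cluster_genes) for s in range(n_samples)]

def adjust_for_cluster(expr, cluster_genes):
    """Residuals of each gene after least-squares regression on the cluster mean.

    Assumption: a simple per-gene linear fit (intercept + slope); this is an
    illustration, not the procedure used in the paper.
    """
    x = cluster_mean(expr, cluster_genes)
    x_bar = mean(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    adjusted = {}
    for gene, y in expr.items():
        y_bar = mean(y)
        sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        slope = sxy / sxx if sxx else 0.0
        adjusted[gene] = [yi - y_bar - slope * (xi - x_bar) for xi, yi in zip(x, y)]
    return adjusted

# Toy data: two genes that track the cluster perfectly, one that does not.
expr = {
    "cc1": [1.0, 2.0, 3.0, 4.0],
    "cc2": [2.0, 4.0, 6.0, 8.0],
    "other": [5.0, 1.0, 4.0, 2.0],
}
adjusted = adjust_for_cluster(expr, ["cc1", "cc2"])
```

After adjustment, genes that move in lockstep with the cluster carry no remaining signal, so any classifier that relied on them would be expected to lose predictive power on the adjusted data.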

2.

MarVis: a tool for clustering and visualization of metabolic biomarkers

Alexander Kaever Thomas Lingner Kirstin Feussner Cornelia Göbel Ivo Feussner Peter Meinicke 《BMC bioinformatics》2009,10(1):1-8

Background

Gene set analysis based on Gene Ontology (GO) can be a promising method for the analysis of differential expression patterns. However, current studies that focus on individual GO terms have limited analytical power, because the complex structure of GO introduces strong dependencies among the terms, and some genes that are annotated to a GO term cannot be found by statistically significant enrichment.

Results

We proposed a method for enriching clustered GO terms based on semantic similarity, namely cluster enrichment analysis based on GO (CeaGO), to extend the individual term analysis method. Using an Affymetrix HGU95aV2 chip dataset with simulated gene sets, we illustrated that CeaGO was sensitive enough to detect moderate expression changes. When compared to parent-based individual term analysis methods, the results showed that CeaGO may provide more accurate differentiation of gene expression results. When used with two acute leukemia (ALL and ALL/AML) microarray expression datasets, CeaGO correctly identified specifically enriched GO groups that were overlooked by other individual test methods.

Conclusion

By applying CeaGO to both simulated and real microarray data, we showed that this approach could enhance the interpretation of microarray experiments. CeaGO is currently available at http://chgc.sh.cn/en/software/CeaGO/.

3.

Supervised harvesting of expression trees (total citations: 2; self-citations: 2; citations by others: 0)

Hastie T Tibshirani R Botstein D Brown P 《Genome biology》2001,2(1):research0003.1-research000312

Background

We propose a new method for supervised learning from gene expression data. We call it 'tree harvesting'. This technique starts with a hierarchical clustering of genes, then models the outcome variable as a sum of the average expression profiles of chosen clusters and their products. It can be applied to many different kinds of outcome measures such as censored survival times, or a response falling in two or more classes (for example, cancer classes). The method can discover genes that have strong effects on their own, and genes that interact with other genes.

Results

We illustrate the method on data from a lymphoma study, and on a dataset containing samples from eight different cancers. It identified some potentially interesting gene clusters. In simulation studies we found that the procedure may require a large number of experimental samples to successfully discover interactions.

Conclusions

Tree harvesting is a potentially useful tool for exploration of gene expression data and identification of interesting clusters of genes worthy of further investigation.
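
The core idea in the Background, building candidate predictors from cluster-average expression profiles and their products, can be sketched in a few lines. This is a simplified illustration, not the published algorithm: it assumes the gene clusters are already given (the hierarchical clustering step is skipped), and it performs only a single selection step by absolute Pearson correlation with the outcome. All function names and the toy data are hypothetical.

```python
from itertools import combinations
from statistics import mean

def average_profile(expr, genes):
    """Mean expression across the genes of one cluster, per sample."""
    n_samples = len(expr[genes[0]])
    return [mean(expr[g][s] for g in genes) for s in range(n_samples)]

def pearson(a, b):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    a_bar, b_bar = mean(a), mean(b)
    num = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    den = (sum((x - a_bar) ** 2 for x in a) * sum((y - b_bar) ** 2 for y in b)) ** 0.5
    return num / den if den else 0.0

def candidate_features(expr, clusters):
    """Cluster averages plus all pairwise products (interaction terms)."""
    feats = {name: average_profile(expr, genes) for name, genes in clusters.items()}
    for a, b in combinations(sorted(clusters), 2):
        feats[f"{a}*{b}"] = [x * y for x, y in zip(feats[a], feats[b])]
    return feats

def best_feature(expr, clusters, outcome):
    """Name of the single feature most correlated with the outcome."""
    feats = candidate_features(expr, clusters)
    return max(feats, key=lambda k: abs(pearson(feats[k], outcome)))

# Toy data: the outcome equals the product of the two cluster averages.
expr = {
    "g1": [1.0, 2.0, 1.0, 2.0], "g2": [1.0, 2.0, 1.0, 2.0],
    "g3": [1.0, 1.0, 3.0, 3.0], "g4": [1.0, 1.0, 3.0, 3.0],
}
clusters = {"A": ["g1", "g2"], "B": ["g3", "g4"]}
outcome = [1.0, 2.0, 3.0, 6.0]
chosen = best_feature(expr, clusters, outcome)
```

In this toy case the interaction term is selected over either cluster on its own, which is the kind of gene-gene interaction the abstract says the method is meant to discover.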

4.

Microarray data integration for genome-wide analysis of human tissue-selective gene expression

Wang L Srivastava AK Schwartz CE 《BMC genomics》2010,11(Z2):S15

Background

Microarray gene expression data are accumulating in public databases. The expression profiles contain valuable information for understanding human gene expression patterns. However, the effective use of public microarray data requires integrating the expression profiles from heterogeneous sources.

Results

In this study, we have compiled a compendium of microarray expression profiles of various human tissue samples. The microarray raw data generated in different research laboratories have been obtained and combined into a single dataset after data normalization and transformation. To demonstrate the usefulness of the integrated microarray data for studying human gene expression patterns, we have analyzed the dataset to identify potential tissue-selective genes. A new method has been proposed for genome-wide identification of tissue-selective gene targets using both microarray intensity values and detection calls. The candidate genes for brain, liver and testis-selective expression have been examined, and the results suggest that our approach can select some interesting gene targets for further experimental studies.

Conclusion

A computational approach has been developed in this study for combining microarray expression profiles from heterogeneous sources. The integrated microarray data can be used to investigate tissue-selective expression patterns of human genes.

5.

Fourmidable: a database for ant genomics (total citations: 1; self-citations: 0; citations by others: 1)

Yannick Wurm Paolo Uva Frédéric Ricci John Wang Stephanie Jemielity Christian Iseli Laurent Falquet Laurent Keller 《BMC genomics》2009,10(1):1-5

6.

A robust approach based on Weibull distribution for clustering gene expression data

Wang H Wang Z Li X Gong B Feng L Zhou Y 《Algorithms for molecular biology : AMB》2011,6(1):14-9

Background

Clustering is a widely used technique for analysis of gene expression data. Most clustering methods group genes based on distances, while few methods group genes according to the similarities of the distributions of the gene expression levels. Furthermore, as biological annotation resources have accumulated, an increasing number of genes have been annotated into functional categories. As a result, evaluating the performance of clustering methods in terms of the functional consistency of the resulting clusters is of great interest.

Results

In this paper, we propose the WDCM (Weibull Distribution-based Clustering Method), a robust approach for clustering gene expression data, in which the expression levels of each gene are treated as a random variable following its own Weibull distribution. The WDCM is based on the concept that genes with similar expression profiles have similar distribution parameters, and thus the genes are clustered via their Weibull distribution parameters. We used the WDCM to cluster three cancer gene expression data sets, from lung cancer, B-cell follicular lymphoma and bladder carcinoma, and obtained well-clustered results. We compared the performance of WDCM with k-means and Self Organizing Map (SOM) using functional annotation information given by the Gene Ontology (GO). The results showed that the functional annotation ratios of WDCM are higher than those of the other methods. We also used an external measure, the Adjusted Rand Index, to validate the performance of the WDCM. The comparative results demonstrate that the WDCM provides better clustering performance than the k-means and SOM algorithms. The merit of the proposed WDCM is that it can be applied to cluster incomplete gene expression data without imputing the missing values. Moreover, the robustness of WDCM is also evaluated on the incomplete data sets.

Conclusions

The results demonstrate that our WDCM produces clusters with more consistent functional annotations than the other methods. The WDCM is also verified to be robust and is capable of clustering gene expression data containing a small quantity of missing values.

7.

Meta-analysis of muscle transcriptome data using the MADMuscle database reveals biologically relevant gene patterns

Daniel Baron Emeric Dubois Audrey Bihouée Raluca Teusan Marja Steenman Philippe Jourdon Armelle Magot Yann Péréon Reiner Veitia Frédérique Savagner Gérard Ramstein Rémi Houlgatte 《BMC genomics》2011,12(1):113

8.

Accurate molecular classification of cancer using simple rules (total citations: 1; self-citations: 0; citations by others: 1)

Xiaosheng Wang Osamu Gotoh 《BMC medical genomics》2009,2(1):1-23

Background

One intractable problem with using microarray data analysis for cancer classification is how to reduce the extremely high-dimensionality gene feature data to remove the effects of noise. Feature selection is often used to address this problem by selecting informative genes from among thousands or tens of thousands of genes. However, most of the existing methods of microarray-based cancer classification utilize too many genes to achieve accurate classification, which often hampers the interpretability of the models. For a better understanding of the classification results, it is desirable to develop simpler rule-based models with as few marker genes as possible.

Methods

We screened a small number of informative single genes and gene pairs on the basis of their dependency degrees, as defined in rough set theory. Applying the decision rules induced by the selected genes or gene pairs, we constructed cancer classifiers. We tested the efficacy of the classifiers by leave-one-out cross-validation (LOOCV) of training sets and classification of independent test sets.

Results

We applied our methods to five cancer gene expression datasets: leukemia (acute lymphoblastic leukemia [ALL] vs. acute myeloid leukemia [AML]), lung cancer, prostate cancer, breast cancer, and leukemia (ALL vs. mixed-lineage leukemia [MLL] vs. AML). Accurate classification outcomes were obtained by utilizing just one or two genes. Some genes that correlated closely with the pathogenesis of relevant cancers were identified. In terms of both classification performance and algorithm simplicity, our approach outperformed or at least matched existing methods.

Conclusion

In cancer gene expression datasets, a small number of genes, even one or two if selected correctly, can achieve an ideal cancer classification result. This finding also means that very simple rules may perform well for cancer class prediction.
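
The rough-set dependency degree mentioned in the Methods has a standard definition: the fraction of samples whose condition-attribute values determine the class label unambiguously. The sketch below computes it for genes that have already been discretized. The discretization into "high"/"low", the gene names, and the function name `dependency_degree` are assumptions for illustration; the abstract does not say how expression values were discretized.

```python
from collections import defaultdict

def dependency_degree(samples, condition_attrs, decision_attr):
    """Rough-set dependency degree of the decision on the condition attributes.

    Samples sharing the same condition values form an equivalence class; the
    class counts toward the positive region only if all its members share one
    decision label.
    """
    labels_seen = defaultdict(set)   # condition values -> decision labels seen
    class_size = defaultdict(int)    # condition values -> number of samples
    for sample in samples:
        key = tuple(sample[a] for a in condition_attrs)
        labels_seen[key].add(sample[decision_attr])
        class_size[key] += 1
    consistent = sum(
        class_size[key] for key, labels in labels_seen.items() if len(labels) == 1
    )
    return consistent / len(samples)

# Hypothetical discretized expression levels for two genes.
samples = [
    {"geneA": "high", "geneB": "low",  "class": "ALL"},
    {"geneA": "high", "geneB": "high", "class": "ALL"},
    {"geneA": "low",  "geneB": "low",  "class": "AML"},
    {"geneA": "low",  "geneB": "high", "class": "AML"},
]
gamma_a = dependency_degree(samples, ["geneA"], "class")
gamma_b = dependency_degree(samples, ["geneB"], "class")
```

Here `geneA` alone separates the two classes (dependency degree 1.0) and yields the rule "high implies ALL, low implies AML", while `geneB` is uninformative (0.0). Ranking single genes or gene pairs by this value is one way to shortlist candidates for rule induction.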

9.

Model-based cluster analysis of microarray gene-expression data (total citations: 3; self-citations: 0; citations by others: 3)

Pan W Lin J Le CT 《Genome biology》2002,3(2):research0009.1-research00098

Background

Microarray technologies are emerging as a promising tool for genomic studies. The challenge now is how to analyze the resulting large amounts of data. Clustering techniques have been widely applied in analyzing microarray gene-expression data. However, normal mixture model-based cluster analysis has not been widely used for such data, although it has a solid probabilistic foundation. Here, we introduce and illustrate its use in detecting differentially expressed genes. In particular, we do not cluster gene-expression patterns but a summary statistic, the t-statistic.

Results

The method is applied to a data set containing expression levels of 1,176 genes of rats with and without pneumococcal middle-ear infection. Three clusters were found, two of which contain more than 95% of the genes, with almost no altered gene-expression levels, whereas the third one has 30 genes with more or less differential gene-expression levels.

Conclusions

Our results indicate that model-based clustering of t-statistics (and possibly other summary statistics) can be a useful statistical tool to exploit differential gene expression for microarray data.
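
Clustering a per-gene summary statistic with a normal mixture can be illustrated with a small expectation-maximization (EM) loop. The sketch below makes several assumptions not stated in the abstract: it uses Welch's form of the two-sample t-statistic, fits a one-dimensional mixture with user-supplied starting means, and uses two components on made-up values purely for demonstration (the paper reports three clusters on its own data). Function names are hypothetical.

```python
import math
from statistics import mean, variance

def t_statistic(a, b):
    """Welch two-sample t-statistic for one gene (assumed form)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fit_mixture(values, start_means, n_iter=50, min_sigma=1e-3):
    """EM for a 1-D normal mixture; returns (means, sigmas, weights, labels)."""
    k = len(start_means)
    means = list(start_means)
    sigmas = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in values:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and spread of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            sigmas[j] = max(math.sqrt(var), min_sigma)
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, sigmas, weights, labels

# One gene's t-statistic from two small groups of measurements.
t_one_gene = t_statistic([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])

# Made-up t-statistics: most genes near zero, a few clearly shifted.
t_values = [-0.3, 0.1, 0.2, -0.1, 0.0, 0.4, 5.8, 6.1, 6.4]
means, sigmas, weights, labels = fit_mixture(t_values, start_means=[0.0, 5.0])
```

Genes assigned to the component centered away from zero are the candidates for differential expression; in practice the number of components would be chosen by a model-selection criterion rather than fixed in advance.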

10.

Computable visually observed phenotype ontological framework for plants

Jaturon Harnsomburana Jason M Green Adrian S Barb Mary Schaeffer Leszek Vincent Chi-Ren Shyu 《BMC bioinformatics》2011,12(1):1-21

Background

Gene regulatory networks have an essential role in every process of life. In this regard, genome-wide time series data are becoming increasingly available, providing the opportunity to discover the time-delayed gene regulatory networks that govern the majority of these molecular processes.

Results

This paper aims at reconstructing gene regulatory networks from multiple genome-wide microarray time series datasets. In this sense, a new model-free algorithm called GRNCOP2 (Gene Regulatory Network inference by Combinatorial OPtimization 2), which is a significant evolution of the GRNCOP algorithm, was developed using combinatorial optimization of gene profile classifiers. The method is capable of inferring potential time-delay relationships with any span of time between genes from various time series datasets given as input. The proposed algorithm was applied to time series data composed of twenty yeast genes that are highly relevant for the cell-cycle study, and the results were compared against several related approaches. The outcomes have shown that GRNCOP2 outperforms the contrasted methods in terms of the proposed metrics, and that the results are consistent with previous biological knowledge. Additionally, a genome-wide study on multiple publicly available time series data was performed. In this case, the experimentation has exhibited the soundness and scalability of the new method which inferred highly-related statistically-significant gene associations.

Conclusions

A novel method for inferring time-delayed gene regulatory networks from genome-wide time series datasets is proposed in this paper. The method was carefully validated with several publicly available data sets. The results have demonstrated that the algorithm constitutes a usable model-free approach capable of predicting meaningful relationships between genes, revealing the time-trends of gene regulation.

11.

Within the fold: assessing differential expression measures and reproducibility in microarray assays (total citations: 3; self-citations: 0; citations by others: 3)

Yang IV Chen E Hasseman JP Liang W Frank BC Wang S Sharov V Saeed AI White J Li J Lee NH Yeatman TJ Quackenbush J 《Genome biology》2002,3(11):research0062.1-research006212

12.

DeBi: Discovering Differentially Expressed Biclusters using a Frequent Itemset Approach

Serin A Vingron M 《Algorithms for molecular biology : AMB》2011,6(1):18-12

13.

Generating confidence intervals on biological networks

Thomas Thorne Michael PH Stumpf 《BMC bioinformatics》2007,8(1):1-10

14.

Identifying functional relationships within sets of co-expressed genes by combining upstream regulatory motif analysis and gene expression information

Martyanov V Gross RH 《BMC genomics》2010,11(Z2):S8

Background

Existing clustering approaches for microarray data do not adequately differentiate between subsets of co-expressed genes. We devised a novel approach that integrates expression and sequence data in order to generate functionally coherent and biologically meaningful subclusters of genes. Specifically, the approach clusters co-expressed genes on the basis of similar content and distributions of predicted statistically significant sequence motifs in their upstream regions.

Results

We applied our method to several sets of co-expressed genes and were able to define subsets with enrichment in particular biological processes and specific upstream regulatory motifs.

Conclusions

These results show the potential of our technique for functional prediction and regulatory motif identification from microarray data.

15.

The NeuARt II system: a viewing tool for neuroanatomical data based on published neuroanatomical atlases

Gully APC Burns Wei-Cheng Cheng Richard H Thompson Larry W Swanson 《BMC bioinformatics》2006,7(1):1-19

Background

Activation of naïve B lymphocytes by extracellular ligands, e.g. antigen, lipopolysaccharide (LPS) and CD40 ligand, induces a combination of common and ligand-specific phenotypic changes through complex signal transduction pathways. For example, although all three of these ligands induce proliferation, only stimulation through the B cell antigen receptor (BCR) induces apoptosis in resting splenic B cells. In order to define the common and unique biological responses to ligand stimulation, we compared the gene expression changes induced in normal primary B cells by a panel of ligands using cDNA microarrays and a statistical approach, CLASSIFI (Cluster Assignment for Biological Inference), which identifies significant co-clustering of genes with similar Gene Ontology annotation.

Results

CLASSIFI analysis revealed an overrepresentation of genes involved in ion and vesicle transport, including multiple components of the proton pump, in the BCR-specific gene cluster, suggesting that activation of antigen processing and presentation pathways is a major biological response to antigen receptor stimulation. Proton pump components that were not included in the initial microarray data set were also upregulated in response to BCR stimulation in follow up experiments. MHC Class II expression was found to be maintained specifically in response to BCR stimulation. Furthermore, ligand-specific internalization of the BCR, a first step in B cell antigen processing and presentation, was demonstrated.

Conclusion

These observations provide experimental validation of the computational approach implemented in CLASSIFI, demonstrating that CLASSIFI-based gene expression cluster analysis is an effective data mining tool to identify biological processes that correlate with the experimental conditional variables. Furthermore, this analysis has identified at least thirty-eight candidate components of the B cell antigen processing and presentation pathway and sets the stage for future studies focused on a better understanding of the components involved in and unique to B cell antigen processing and presentation.
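
Whether genes sharing a Gene Ontology annotation co-cluster more often than chance would predict is commonly scored with a hypergeometric tail probability. The sketch below shows that calculation. It is offered as one plausible reading of "significant co-clustering"; the abstract does not state which test CLASSIFI uses, and the function name and the numbers in the example are hypothetical.

```python
from math import comb

def hypergeom_tail(population, annotated, cluster_size, hits):
    """P(X >= hits) when cluster_size genes are drawn without replacement
    from `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, cluster_size)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, cluster_size - k)
        for k in range(hits, upper + 1)
    )
    return favourable / comb(population, cluster_size)

# Hypothetical numbers: 20 genes on the array, 5 annotated to a term,
# a cluster of 5 genes that contains 4 of the annotated ones.
p_value = hypergeom_tail(population=20, annotated=5, cluster_size=5, hits=4)
```

A small probability indicates that the term is over-represented in the cluster. When many terms and clusters are tested, a multiple-testing correction would normally be applied before calling any of them significant.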

16.

Rapid bursts of androgen-binding protein (Abp) gene duplication occurred independently in diverse mammals

Christina M Laukaitis Andreas Heger Tyler D Blakley Pavel Munclinger Chris P Ponting Robert C Karn 《BMC evolutionary biology》2008,8(1):1-17

17.

Systematic determination of patterns of gene expression during Drosophila embryogenesis

Tomancak P Beaton A Weiszmann R Kwan E Shu S Lewis SE Richards S Ashburner M Hartenstein V Celniker SE Rubin GM 《Genome biology》2002,3(12):research0088.1-8814

18.

Cluster-Rasch models for microarray gene expression data (total citations: 1; self-citations: 0; citations by others: 1)

Li H Hong F 《Genome biology》2001,2(8):research0031.1-research003113

Background

We propose two different formulations of the Rasch statistical model for the problem of relating gene expression profiles to phenotypes. One formulation allows us to investigate whether a cluster of genes with similar expression profiles is related to the observed phenotypes; this model can also be used for future prediction. The other formulation provides an alternative way of identifying genes that are over- or underexpressed from their expression levels in tissue or cell samples of a given tissue or cell type.

Results

We illustrate the methods on available datasets of a classification of acute leukemias and of 60 cancer cell lines. For tumor classification, the results are comparable to those previously obtained. For the cancer cell lines dataset, we found four clusters of genes that are related to drug response for many of the 90 drugs that we considered. In addition, for each type of cell line, we identified genes that are over- or underexpressed relative to other genes.

Conclusions

The cluster-Rasch model provides a probabilistic model for describing gene expression patterns across samples and can be used to relate gene expression profiles to phenotypes.
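
For orientation, the classical dichotomous Rasch model is shown below in its textbook form, where $\theta_i$ is a latent trait of unit $i$ and $\beta_j$ is a difficulty parameter of item $j$. How the cluster-Rasch formulations map samples, genes, and gene clusters onto these roles is not given in the abstract, so the symbols here are generic and should not be read as the authors' notation.

```latex
P(X_{ij} = 1 \mid \theta_i, \beta_j)
  = \frac{\exp(\theta_i - \beta_j)}{1 + \exp(\theta_i - \beta_j)}
```

The probability depends only on the difference $\theta_i - \beta_j$, which is what lets the model place units and items on a common scale.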

19.

Differential gene expression in the salivary gland during development and onset of xerostomia in Sjögren's syndrome-like disease of the C57BL/6.NOD-Aec1Aec2 mouse

Cuong Q Nguyen Ashok Sharma Byung Ha Lee Jin-Xiong She Richard A McIndoe Ammon B Peck 《Arthritis research & therapy》2009,11(2):1-16

20.

Bayesian meta-analysis models for microarray data: a comparative study

Erin M Conlon Joon J Song Anna Liu 《BMC bioinformatics》2007,8(1):1-21

Background

With the growing abundance of microarray data, statistical methods are increasingly needed to integrate results across studies. Two common approaches for meta-analysis of microarrays include either combining gene expression measures across studies or combining summaries such as p-values, probabilities or ranks. Here, we compare two Bayesian meta-analysis models that are analogous to these methods.

Results

Two Bayesian meta-analysis models for microarray data have recently been introduced. The first model combines standardized gene expression measures across studies into an overall mean, accounting for inter-study variability, while the second combines probabilities of differential expression without combining expression values. Both models produce the gene-specific posterior probability of differential expression, which is the basis for inference. Since the standardized expression integration model includes inter-study variability, it may improve accuracy of results versus the probability integration model. However, due to the small number of studies typical in microarray meta-analyses, the variability between studies is challenging to estimate. The probability integration model eliminates the need to model variability between studies, and thus its implementation is more straightforward. We found in simulations of two and five studies that combining probabilities outperformed combining standardized gene expression measures for three comparison values: the percent of true discovered genes in meta-analysis versus individual studies; the percent of true genes omitted in meta-analysis versus separate studies, and the number of true discovered genes for fixed levels of Bayesian false discovery. We identified similar results when pooling two independent studies of Bacillus subtilis. We assumed that each study was produced from the same microarray platform with only two conditions: a treatment and control, and that the data sets were pre-scaled.

Conclusion

The Bayesian meta-analysis model that combines probabilities across studies does not aggregate gene expression measures, thus an inter-study variability parameter is not included in the model. This results in a simpler modeling approach than aggregating expression measures, which accounts for variability across studies. The probability integration model identified more true discovered genes and fewer true omitted genes than combining expression measures, for our data sets.