Similar literature (20 results)
1.

Background

Meta-analysis of gene expression microarray datasets presents significant challenges for statistical analysis. We developed and validated a new bioinformatic method for the identification of genes upregulated in subsets of samples of a given tumour type (‘outlier genes’), a hallmark of potential oncogenes.

Methodology

A new statistical method (the gene tissue index, GTI) was developed by modifying and adapting algorithms originally developed for statistical problems in economics. We compared the potential of the GTI to detect outlier genes in meta-datasets with four previously defined statistical methods, COPA, the OS statistic, the t-test and ORT, using simulated data. We demonstrated that the GTI performed as well as existing methods in a single study simulation. Next, we evaluated the performance of the GTI in the analysis of combined Affymetrix gene expression data from several published studies covering 392 normal samples of tissue from the central nervous system, 74 astrocytomas, and 353 glioblastomas. According to the results, the GTI was better able than most of the previous methods to identify known oncogenic outlier genes. In addition, the GTI identified 29 novel outlier genes in glioblastomas, including TYMS and CDKN2A. The over-expression of these genes was validated in vivo by immunohistochemical staining data from clinical glioblastoma samples. Immunohistochemical data were available for 65% (19 of 29) of these genes, and 17 of these 19 genes (90%) showed a typical outlier staining pattern. Furthermore, raltitrexed, a specific inhibitor of TYMS used in the therapy of tumour types other than glioblastoma, also effectively blocked cell proliferation in glioblastoma cell lines, thus highlighting this outlier gene candidate as a potential therapeutic target.
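The abstract does not give the GTI formula, so it is not reproduced here. As a rough illustration of what an outlier statistic looks like, the following is a minimal sketch of a COPA-style score, one of the comparison methods named above. It assumes the common simplified form: centre each gene on its pooled median, scale by the median absolute deviation (MAD), and take an upper percentile of the scaled tumour values. The function names `percentile` and `copa_score` and the default quantile are illustrative choices, not taken from the paper.

```python
from statistics import median


def percentile(sorted_vals, q):
    """Linear-interpolation percentile of a pre-sorted list, with q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def copa_score(normal, tumour, q=0.9):
    """COPA-style outlier score for one gene (simplified, assumed form).

    normal, tumour: expression values of the gene in each sample group.
    Returns the q-th percentile of the tumour values after median/MAD scaling.
    """
    pooled = list(normal) + list(tumour)
    med = median(pooled)
    mad = median([abs(x - med) for x in pooled])
    if mad == 0:
        return 0.0  # no spread at all, so no outlier signal can be measured
    scaled = sorted((x - med) / mad for x in tumour)
    return percentile(scaled, q)


normal = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8]
tumour_outlier = [1.0, 0.9, 1.1, 1.0, 6.0, 7.0]  # two samples strongly up
tumour_flat = [1.0, 0.9, 1.1, 1.0, 1.2, 0.8]     # no outlier samples
print(copa_score(normal, tumour_outlier) > copa_score(normal, tumour_flat))
```

A gene that is strongly up in only a few tumour samples scores high here even though its mean barely moves, which is the behaviour a t-test tends to miss.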

Conclusions/Significance

Taken together, these results support the GTI as a novel approach to identify potential oncogene outliers and drug targets. The algorithm is implemented in an R package (Text S1).

2.
We developed PathAct, a novel method for pathway analysis to investigate the biological and clinical implications of gene expression profiles. The advantage of PathAct over conventional pathway analysis methods is that it can quantitatively estimate pathway activity levels for individual patients in the form of a pathway-by-sample matrix. This matrix can be used for further analysis such as hierarchical clustering and other analysis methods. To evaluate the feasibility of PathAct, we compared it with frequently used gene-enrichment analysis methods on two public microarray datasets. Dataset #1 was from breast cancer patients; we investigated pathways associated with triple-negative breast cancer using PathAct and compared them with those obtained by gene set enrichment analysis (GSEA). Dataset #2 was another breast cancer dataset with disease-free survival (DFS) recorded for each patient. The contribution of each pathway to prognosis was investigated by our method as well as by Database for Annotation, Visualization and Integrated Discovery (DAVID) analysis. In dataset #1, four of the six pathways that satisfied p < 0.05 and FDR < 0.30 by GSEA were also among those obtained by PathAct. For dataset #2, two pathways (“Cell Cycle” and “DNA replication”) of the four identified by PathAct were also identified by DAVID analysis. Thus, we confirmed a good degree of agreement between PathAct and conventional methods. Moreover, several further statistical analyses, such as hierarchical cluster analysis by pathway activity, correlation analysis, and survival analysis between pathways, were conducted.
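To make the idea of a pathway-by-sample matrix concrete, here is a small sketch. The abstract does not specify how PathAct estimates activity, so this uses a stand-in that is an assumption, not the published method: each gene is z-scored across samples, and a pathway's activity in a sample is the median z-score of its member genes. The names `zscores` and `pathway_by_sample` and the toy data are hypothetical.

```python
from statistics import mean, median, pstdev


def zscores(values):
    """Standardise one gene's values across samples (0.0 if there is no spread)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd else 0.0 for v in values]


def pathway_by_sample(expr, pathways):
    """Build {pathway: [activity per sample]} from {gene: [value per sample]}.

    Activity is the median z-score of the pathway's genes (assumed stand-in).
    """
    z = {gene: zscores(vals) for gene, vals in expr.items()}
    n_samples = len(next(iter(expr.values())))
    matrix = {}
    for name, genes in pathways.items():
        members = [z[g] for g in genes if g in z]
        if members:
            matrix[name] = [median(col) for col in zip(*members)]
        else:
            matrix[name] = [0.0] * n_samples  # no measured genes in this pathway
    return matrix


expr = {"A": [1.0, 2.0, 3.0], "B": [2.0, 4.0, 6.0], "C": [3.0, 2.0, 1.0]}
pathways = {"up": ["A", "B"], "down": ["C"]}
activity = pathway_by_sample(expr, pathways)
```

Each row of `activity` can then be passed to clustering, correlation, or survival analysis in the same way as a single gene's expression vector.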

3.
4.
In the medical domain, it is important to develop rule-based classification models, because they produce a comprehensible model that accounts for its predictions. Moreover, it is desirable to know not only the classification decisions but also what leads to these decisions. In this paper, we propose a novel dynamic quantitative rule-based classification model, namely DQB, which integrates quantitative association rule mining and the Artificial Bee Colony (ABC) algorithm to provide users with greater understandability and interpretability via an accurate class quantitative association rule-based classifier model. As far as we know, this is the first attempt to apply the ABC algorithm in mining for quantitative rule-based classifier models. In addition, this is the first attempt to use quantitative rule-based classification models for classifying microarray gene expression profiles. In this research we also developed a new dynamic local search strategy named DLS, which improves the local search of the ABC algorithm. The performance of the proposed model has been compared with well-known quantitative-based classification methods and bio-inspired meta-heuristic classification algorithms, using six gene expression profiles for binary and multi-class cancer datasets. From the results, it can be concluded that DQB achieves a considerable increase in classification accuracy compared to other available algorithms in the literature, and that it is able to provide an interpretable model for biologists. This confirms the significance of the proposed algorithm in constructing a rule-based classifier model, and accordingly shows that these rules capture high-quality, meaningful knowledge extracted from the training set, where all subsets of quantitative rules report close to 100% classification accuracy with a minimum number of genes. Notably, to the best of our knowledge, several new genes were discovered that have not been reported in any past studies. Regarding applicability, based on the results acquired from microarray gene expression analysis, we conclude that DQB can be adopted in different real-world applications with some modifications.

5.
6.
7.
8.
Mining gene expression profiles: expression signatures as cancer phenotypes   (Cited by: 6; self-citations: 0; citations by others: 6)
Many examples highlight the power of gene expression profiles, or signatures, to inform an understanding of biological phenotypes. This is perhaps best seen in the context of cancer, where expression signatures have tremendous power to identify new subtypes and to predict clinical outcomes. Although the ability to interpret the meaning of the individual genes in these signatures remains a challenge, this does not diminish the power of the signature to characterize biological states. The use of these signatures as surrogate phenotypes has been particularly important, linking diverse experimental systems that dissect the complexity of biological systems with the in vivo setting in a way that was not previously feasible.

9.
Identifying genes associated with cancer development is typically accomplished by comparing mean expression values in normal and tumor tissues, which identifies differentially expressed (DE) genes. Interindividual variation (IV) in gene expression is indirectly included in DE gene identification because, given the same absolute differences in means, genes with lower variance tend to have lower p-values. We explored the direct use of IV in gene expression to identify candidate genes associated with cancer development. We focused on prostate (PCa) and lung (LC) cancers and compared IV in the expression level of genes shown to be cancer related with that in all other genes in the human genome. Compared with all those other genes, cancer-related genes tended to have greater IV in normal tissues and a greater increase in IV during the transition from normal to tumorous tissue. Genes without significantly different mean expression values between tumor and normal tissues but with greater IV in tumor than in normal tissue (note: the DE-based approach completely ignores those genes) had stronger associations with clinically important features, such as Gleason score in PCa or tumor histology in LC, than all other genes. Our results suggest that analyzing IV in gene expression level is useful in identifying novel candidate genes associated with cancer development.
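The contrast between a DE-based view and an IV-based view can be sketched for a single gene as follows. The abstract does not say which IV measure or significance test the authors used, so this sketch assumes sample variance as the IV measure and reports a plain tumour-to-normal variance ratio. The function name `iv_change` and the toy values are illustrative only.

```python
from statistics import mean, variance


def iv_change(normal, tumour):
    """Return (difference in means, tumour/normal ratio of sample variances).

    Sample variance is an assumed stand-in for interindividual variation (IV).
    """
    var_normal = variance(normal)
    if var_normal == 0:
        raise ValueError("normal-tissue variance is zero; ratio is undefined")
    return mean(tumour) - mean(normal), variance(tumour) / var_normal


# A gene whose mean does not shift but whose spread grows in tumour tissue.
normal = [5.0, 5.1, 4.9, 5.0]
tumour = [3.0, 7.0, 4.0, 6.0]
mean_shift, var_ratio = iv_change(normal, tumour)
```

For this toy gene the mean shift is essentially zero, so a mean-comparison test would pass over it, while the variance ratio is far above 1. Genes of this kind are the ones the abstract says a DE-based approach ignores.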

10.
Yan X, Zheng T. BMC Genomics, 2008, 9(Z2): S14

Background

Gene expression data extracted from microarray experiments have been used to study the difference between mRNA abundance of genes under different conditions. In one such experiment, thousands of genes are measured simultaneously, which provides a high-dimensional feature space for discriminating between different sample classes. However, most of these dimensions are not informative about the between-class difference, and add noise to the discriminant analysis.

Results

In this paper we propose and study feature selection methods that evaluate the "informativeness" of a set of genes. Two measures of information based on multigene expression profiles are considered for a backward information-driven screening approach for selecting important gene features. By considering multigene expression profiles, we are able to utilize interaction information among these genes. Using a breast cancer dataset, we illustrate our methods and compare their performance to that of existing methods.

Conclusion

We illustrate in this paper that methods considering gene-gene interactions have better classification power in gene expression analysis. In our results, we identify important genes with relatively large p-values from single gene tests. This indicates that these are genes with weak marginal information but strong interaction information, which will be overlooked by strategies that only examine individual genes.

11.
Most conventional feature selection algorithms have a drawback: a weakly ranked gene that could perform well in terms of classification accuracy when combined with an appropriate subset of genes will be left out of the selection. Considering this shortcoming, we propose a feature selection algorithm for sample classification in gene expression data analysis. The proposed algorithm first divides genes into subsets, the sizes of which are relatively small (roughly of size h), then selects informative smaller subsets of genes (of size r < h) from a subset and merges the chosen genes with another gene subset (of size r) to update the gene subset. We repeat this process until all subsets are merged into one informative subset. We illustrate the effectiveness of the proposed algorithm by analyzing three distinct gene expression data sets. Our method shows promising classification accuracy for all the test data sets. We also show the relevance of the selected genes in terms of their biological functions.
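The divide, select, and merge loop described above can be sketched roughly as follows. Several details are assumptions rather than the authors' procedure: genes are split into consecutive chunks of size h, the retained genes are merged with the next whole chunk, the best size-r subset is found by exhaustive search, and subset quality is supplied by the caller through a hypothetical `score` function (in practice this would be something like cross-validated classification accuracy).

```python
from itertools import combinations


def best_subset(candidates, r, score):
    """Exhaustively pick the size-r subset of candidates with the highest score."""
    size = min(r, len(candidates))
    return max(combinations(candidates, size), key=score)


def merge_select(genes, h, r, score):
    """Sequentially merge chunks of about h genes, keeping r genes after each merge."""
    chunks = [genes[i:i + h] for i in range(0, len(genes), h)]
    kept = ()
    for chunk in chunks:
        # Merge the genes kept so far with the next chunk, then re-select.
        kept = best_subset(list(kept) + chunk, r, score)
    return list(kept)


# Toy additive score: each gene has a fixed weight (illustrative only).
weights = [1, 9, 2, 8, 3, 7, 4, 6, 5, 0]
selected = merge_select(list(range(10)), h=4, r=2,
                        score=lambda subset: sum(weights[g] for g in subset))
print(sorted(selected))  # -> [1, 3]
```

Because subsets are scored jointly rather than gene by gene, a gene that ranks poorly alone can still survive if it raises the score of the subset it is evaluated with. Keeping h and r small bounds the cost of each exhaustive search.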

12.
Libraries of randomized ribozymes have considerable potential as tools for the identification of functional genes critically involved in a biological phenotype of interest in vitro. We have used a ribozyme library in an in vivo mouse model to identify genes related to metastasis. We injected weakly metastatic melanoma cells that had been treated with the library intravenously into mice. We then isolated ribozymes that accelerated metastasis from pulmonary tumors that had developed from metastasizing cells. As candidates for metastasis-related genes that were targets of the isolated ribozymes, we identified five unknown and three known genes: stromal interaction molecule 1 (STIM1), polymerase gamma2 accessory subunit (Polg2), and cytochrome P450, family 2, subfamily d, polypeptide 22 (Cyp2d22). Repression of four of these by small interfering RNAs indeed resulted in accelerated mobility of cells in an in vitro scratch-wound assay. Further characterization of these candidate genes would provide clues to the complex mechanism(s) of metastasis.

13.
14.
15.
Acetohydroxyacid synthase (AHAS) is the target enzyme for a number of herbicides. An S653N mutation in the AHAS gene results in an increased tolerance to imidazolinone herbicides. We have investigated the use of the mutated gene as a selection gene for potato transformation. This resulted in a transformation system with a very high transformation frequency and a low rate of escapes. The mutated AHAS gene was introduced into transformed potato together with a β-glucuronidase (GUS) gene. Selection on 0.5 μM Imazamox yielded GUS expression in 93–100% of regenerated shoots. Furthermore, the mutated AHAS gene was used as a selection gene for the production of high-amylopectin potato lines. The high transformation frequency was verified and potato lines with the desirable starch quality were obtained.
Abbreviations: ABA, abscisic acid; AHAS, acetohydroxyacid synthase; BAP, 6-benzylaminopurine; 2,4-D, 2,4-dichlorophenoxyacetic acid; GA3, gibberellic acid; GBSS, granule bound starch synthase; GUS, β-glucuronidase; MS medium, Murashige and Skoog medium; NAA, α-naphthaleneacetic acid; nos, nopaline synthase; OCS, octopine synthase; PCR, polymerase chain reaction; X-gluc, 5-bromo-4-chloro-3-indolyl-beta-D-glucuronic acid; YEB, yeast extract broth.
Communicated by R. Schmidt

16.
17.
Metastasis represents the ultimate target in cancer therapy as this complex biological process is the direct cause of mortality for a variety of human malignancies. The current high level of mortality from prostate cancer results in large part from the inexorable growth of overt or occult metastasis present at the time of diagnosis. Currently, there are no curative therapies for metastatic prostate cancer. To better understand the metastatic phenotype in prostate cancer, we developed a strategy to identify mRNAs that are expressed differentially in cell lines derived from primary versus metastatic mouse prostate cancer using differential display-PCR. Using this system, a number of metastasis-related sequences were identified, including a cDNA that encodes caveolin-1. Caveolin-1 was found to be overexpressed not only in metastatic mouse prostate cancer, but also in human metastatic disease. Recent studies have indicated that suppression of caveolin-1 expression induces androgen sensitivity in high caveolin-1, androgen-insensitive mouse prostate cancer cells derived from metastases. Conversely, overexpression of caveolin-1 leads to androgen insensitivity in low caveolin, androgen-sensitive mouse prostate cancer cells. Caveolin-1, therefore, is both a metastasis-related gene and a candidate androgen resistance gene for prostate cancer in man. Interestingly, recent studies also point to a potential role for caveolin-1 in the resistance of various malignancies to multiple antineoplastic agents. The linkage of caveolin-1 expression with the androgen-resistant phenotype in prostate cancer and the multidrug resistance phenotype in various solid tumors establishes a novel paradigm for understanding these clinically important and now potentially related processes in malignant progression.

18.
Xu W, Wang M, Zhang X, Wang L, Feng H. Bioinformation, 2008, 2(7): 301-303
Gene selection aims to detect the most significantly expressed genes under different conditions in expression data. The current challenge in gene selection is the comparison of a large number of genes with limited patient samples. Thus it is not a trivial task for simple statistical analysis. Various statistical measurements are adopted by filter methods applied in gene selection studies. Their ability to discriminate phenotypes is crucial in classification and selection. Here we describe the standard deviation error distribution (SDED) method for gene selection. It utilizes within-class and among-class variations in gene expression data. We tested the method using 4 leukemia datasets available in the public domain. The method was compared with the GS2 and CHO methods. The prediction accuracies of SDED are better than those of both GS2 and CHO across the datasets, by 0.8-4.2% and 1.6-8.4%, respectively. The related OMIM annotations and KEGG pathway analyses verified that SDED can pick out 4.0% and 6.1% more genes with biological significance than GS2 and CHO, respectively.

19.
We propose a general framework for prediction of predefined tumor classes using gene expression profiles from microarray experiments. The framework consists of 1) evaluating the appropriateness of class prediction for the given data set, 2) selecting the prediction method, 3) performing cross-validated class prediction, and 4) assessing the significance of prediction results by permutation testing. We describe an application of the prediction paradigm to gene expression profiles from human breast cancers, with specimens classified as positive or negative for BRCA1 mutations and also for BRCA2 mutations. In both cases, the accuracy of class prediction was statistically significant when compared to the accuracy of prediction expected by chance. The framework proposed here for the application of class prediction is designed to reduce the occurrence of spurious findings, a legitimate concern for high-dimensional microarray data. The prediction paradigm will serve as a good framework for comparing different prediction methods and may accelerate the development of molecular classifiers that are clinically useful.
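Steps 3 and 4 of the framework (cross-validated prediction, then a permutation test of its accuracy) can be illustrated with a short sketch. The classifier here is an assumption: a nearest-centroid rule with leave-one-out cross-validation stands in for whichever prediction method step 2 would select, and gene selection inside each fold is omitted for brevity. The function names, the number of permutations, and the toy data are all illustrative.

```python
import random


def loocv_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier (assumed stand-in)."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            if rows:
                centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        # Predict the class whose centroid is closest in squared distance.
        pred = min(centroids, key=lambda c: sum(
            (a - b) ** 2 for a, b in zip(X[i], centroids[c])))
        correct += pred == y[i]
    return correct / len(X)


def permutation_p_value(X, y, n_perm=200, seed=0):
    """Return (observed accuracy, permutation p-value) under shuffled labels."""
    rng = random.Random(seed)
    observed = loocv_accuracy(X, y)
    labels = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if loocv_accuracy(X, labels) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Two well-separated toy classes, six samples each, two "genes" per sample.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1],
     [5.0, 5.1], [5.2, 4.9], [4.8, 5.3], [5.3, 5.0], [4.9, 5.2], [5.1, 4.7]]
y = [0] * 6 + [1] * 6
accuracy, p_value = permutation_p_value(X, y)
```

The key design point is that the whole cross-validation procedure is rerun for every label shuffle, so the p-value reflects how often chance alone reaches the observed accuracy under the same pipeline.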

20.