Similar literature
 20 similar articles found (search time: 0 ms)
1.

Background  

The selection of genes that discriminate disease classes from microarray data is widely used for the identification of diagnostic biomarkers. Although various gene selection methods are currently available and some of them have shown excellent performance, no single method can retain the best performance for all types of microarray datasets. It is desirable to use a comparative approach to find the best gene selection result after rigorous testing of different methodological strategies for a given microarray dataset.

2.
Computational analysis of microarray data   Total citations: 1, self-citations: 0, citations by others: 1
Microarray experiments are providing unprecedented quantities of genome-wide data on gene-expression patterns. Although this technique has been enthusiastically developed and applied in many biological contexts, the management and analysis of the millions of data points that result from these experiments have received less attention. Sophisticated computational tools are available, but the methods that are used to analyse the data can have a profound influence on the interpretation of the results. A basic understanding of these computational tools is therefore required for optimal experimental design and meaningful data analysis.

3.
MOTIVATION: Most supervised classification methods are limited by the requirement for more cases than variables. In microarray data the number of variables (genes) far exceeds the number of cases (arrays), and thus filtering and pre-selection of genes is required. We describe the application of Between Group Analysis (BGA) to the analysis of microarray data. A feature of BGA is that it can be used when the number of variables (genes) exceeds the number of cases (arrays). BGA is based on carrying out an ordination of groups of samples, using a standard method such as Correspondence Analysis (COA), rather than an ordination of the individual microarray samples. As such, it can be viewed as a method of carrying out COA with grouped data. RESULTS: We illustrate the power of the method using two cancer data sets. In both cases, we can quickly and accurately classify test samples from any number of specified a priori groups and identify the genes which characterize these groups. We obtained very high rates of correct classification, as determined by jack-knife or validation experiments with training and test sets. The results are comparable to those from other methods in terms of accuracy but the power and flexibility of BGA make it an especially attractive method for the analysis of microarray cancer data.

4.
Fundamentals of cDNA microarray data analysis   Total citations: 15, self-citations: 0, citations by others: 15
Microarray technology is a powerful approach for genomics research. The multi-step, data-intensive nature of this technology has created an unprecedented informatics and analytical challenge. It is important to understand the crucial steps that can affect the outcome of the analysis. In this review, we provide an overview of current trends in the main steps of the microarray data analysis process, which include experimental design, data standardization, image acquisition and analysis, normalization, statistical significance inference, exploratory data analysis, class prediction and pathway analysis, as well as various considerations relevant to their implementation.

5.
Microarray technology is associated with many sources of experimental uncertainty. In this review we discuss a number of approaches for dealing with this uncertainty in the processing of data from microarray experiments. We focus here on the analysis of high-density oligonucleotide arrays, such as the popular Affymetrix GeneChip® array, which contain multiple probes for each target. This set of probes can be used to determine an estimate for the target concentration and can also be used to determine the experimental uncertainty associated with this measurement. This measurement uncertainty can then be propagated through the downstream analysis using probabilistic methods. We give examples showing how these credibility intervals can be used to help identify differential expression, to combine information from replicated experiments and to improve the performance of principal component analysis.

6.
Leung YF, Lam DS, Pang CP. Genome Biology 2001, 2(9): reports4021.1-reports40212
A report on the tenth Annual Bioinformatics and Genome Research meeting of the Cambridge Healthtech Institute's Beyond Genome 2001 series, San Francisco, USA, 17-19 June 2001.

7.
AMADA: analysis of microarray data   Total citations: 9, self-citations: 0, citations by others: 9
SUMMARY: AMADA is a Windows program for identifying co-expressed genes from microarray data. It performs data transformation, principal component analysis, a variety of cluster analyses and extensive graphic functions for visualizing expression profiles.

8.
MOTIVATION: Methods for analyzing cancer microarray data often face two distinct challenges: the models they infer need to perform well when classifying new tissue samples while at the same time providing an insight into the patterns and gene interactions hidden in the data. State-of-the-art supervised data mining methods often cover well only one of these aspects, motivating the development of methods where predictive models with a solid classification performance would be easily communicated to the domain expert. RESULTS: Data visualization may provide an excellent approach to knowledge discovery and analysis of class-labeled data. We have previously developed an approach called VizRank that can score and rank point-based visualizations according to the degree of separation of data instances of different classes. We here extend VizRank with techniques to uncover outliers, score features (genes) and perform classification, as well as to demonstrate that the proposed approach is well suited for cancer microarray analysis. Using VizRank and radviz visualization on a set of previously published cancer microarray data sets, we were able to find simple, interpretable data projections that include only a small subset of genes yet do clearly differentiate among different cancer types. We also report that our approach to classification through visualization achieves performance that is comparable to state-of-the-art supervised data mining techniques. AVAILABILITY: VizRank and radviz are implemented as part of the Orange data mining suite (http://www.ailab.si/orange). SUPPLEMENTARY INFORMATION: Supplementary data are available from http://www.ailab.si/supp/bi-cancer.

9.
Microchip arrays have become one of the most rapidly growing techniques for monitoring gene expression at the genomic level and thereby gaining valuable insight about various important biological mechanisms. Examples of such mechanisms are: identifying disease-causing genes, genes involved in the regulation of some aspect of the cell cycle, etc. In this article, we discuss the problem of estimating gene expression based on a proper statistical model. More precisely, we show how the model introduced by Li and Wong can be used in its full bivariate generality to provide a new measure of gene expression from high-density oligonucleotide arrays. We also present a second gene expression index based on a new way of reducing the model into a simpler univariate model. In both cases, the gene expression indices are shown to be unbiased and to have lower variance than the established ones. Moreover, we present a bootstrap method aiming at providing non-parametric confidence intervals for the expression index.
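The abstract mentions bootstrap confidence intervals for an expression index without describing the resampling scheme. The following is only a minimal sketch of a generic percentile bootstrap, assuming the index is some function `statistic` of a list of probe-level values. The function name `bootstrap_ci`, its parameters and the use of the mean as a stand-in index are hypothetical and are not taken from the article.

```python
import random


def bootstrap_ci(values, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for statistic(values).

    Assumption: the values are resampled independently with replacement;
    the article's actual resampling scheme is not given in the abstract.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    n = len(values)
    estimates = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the estimates.
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


def mean(xs):
    """Placeholder expression index (hypothetical): the plain average."""
    return sum(xs) / len(xs)
```

Calling `bootstrap_ci(probe_values, mean)` returns a `(lower, upper)` pair; any other index could be passed in place of `mean`.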

10.
11.
The development of microarray technology allows the simultaneous measurement of the expression of many thousands of genes. The information gained offers an unprecedented opportunity to fully characterize biological processes. However, this challenge will only be successful if new tools for the efficient integration and interpretation of large datasets are available. One of these tools, pathway analysis, involves looking for consistent but subtle changes in gene expression by incorporating either pathway or functional annotations. We review several methods of pathway analysis and compare the performance of three, the binomial distribution, z scores, and gene set enrichment analysis, on two microarray datasets. Pathway analysis is a promising tool to identify the mechanisms that underlie diseases, adaptive physiological compensatory responses and new avenues for investigation.
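Two of the three methods named above, the binomial distribution and z scores, can be illustrated with a short sketch. It assumes a simple over-representation setting: N genes measured, K of them called changed, and a pathway of n genes of which k are changed. The z score uses one common form with a finite-population correction; whether the review uses exactly this form is an assumption, and the function names are hypothetical.

```python
from math import comb, sqrt


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more changed genes."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def pathway_z_score(k, n, K, N):
    """Standardized excess of changed genes in a pathway.

    k: changed genes in the pathway, n: genes in the pathway,
    K: changed genes overall,        N: genes measured overall.
    Assumption: variance includes a finite-population correction.
    """
    p = K / N
    expected = n * p
    variance = n * p * (1 - p) * (1 - (n - 1) / (N - 1))
    return (k - expected) / sqrt(variance)
```

For example, with N = 1000, K = 100 and a 20-gene pathway, two changed genes are expected by chance, so `pathway_z_score(2, 20, 100, 1000)` is zero and larger counts give positive scores.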

12.
13.
Model-based cluster analysis of microarray gene-expression data   Total citations: 3, self-citations: 0, citations by others: 3
Pan W, Lin J, Le CT. Genome Biology 2002, 3(2): research0009.1-research00098

Background

Microarray technologies are emerging as a promising tool for genomic studies. The challenge now is how to analyze the resulting large amounts of data. Clustering techniques have been widely applied in analyzing microarray gene-expression data. However, normal mixture model-based cluster analysis has not been widely used for such data, although it has a solid probabilistic foundation. Here, we introduce and illustrate its use in detecting differentially expressed genes. In particular, we do not cluster gene-expression patterns but a summary statistic, the t-statistic.

Results

The method is applied to a data set containing expression levels of 1,176 genes of rats with and without pneumococcal middle-ear infection. Three clusters were found, two of which contain more than 95% genes with almost no altered gene-expression levels, whereas the third one has 30 genes with more or less differential gene-expression levels.

Conclusions

Our results indicate that model-based clustering of t-statistics (and possibly other summary statistics) can be a useful statistical tool to exploit differential gene expression for microarray data.
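The idea of clustering per-gene t-statistics with a normal mixture model can be sketched as a plain one-dimensional EM fit. This is an illustrative sketch under stated assumptions, not the authors' implementation: the t-statistics are taken as already computed, the number of components is fixed by the caller through the initial means, and the names `normal_pdf` and `fit_normal_mixture` are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_normal_mixture(values, initial_means, n_iter=200):
    """EM for a 1-D normal mixture; one component per initial mean.

    Returns (weights, means, variances, responsibilities), where
    responsibilities[i][j] is the posterior probability that values[i]
    belongs to component j.
    """
    k = len(initial_means)
    means = list(initial_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior membership probabilities for every value.
        resp = []
        for x in values:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var, 1e-6)  # floor avoids a degenerate spike
    return weights, means, variances, resp
```

Genes can then be assigned to the component with the largest responsibility; a component centred away from zero would collect the candidate differentially expressed genes.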

14.
Analysing microarray data using modular regulation analysis   Total citations: 3, self-citations: 0, citations by others: 3
MOTIVATION: Microarray experiments measure complex changes in the abundance of many mRNAs under different conditions. Current analysis methods cannot distinguish between direct and indirect effects on expression, or calculate the relative importance of mRNAs in effecting responses. RESULTS: Application of modular regulation analysis to microarray data reveals and quantifies which mRNA changes are important for cellular responses. The mRNAs are clustered, and then we calculate how perturbations alter each cluster and how strongly those clusters affect an output response. The product of these values quantifies how an input changes a response through each cluster. Two published datasets are analysed. Two mRNA clusters transmit most of the response of yeast doubling time to galactose; one contains mainly galactose metabolic genes, and the other a regulatory gene. Analysis of the response of yeast relative fitness to 2-deoxy-D-glucose reveals that control is distributed between several mRNA clusters, but experimental error limits statistical significance.
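The per-cluster product described in the abstract can be written down in a few lines. This sketch assumes the two quantities are already available per cluster as plain numbers; the function name, the dictionary layout and the example values are hypothetical and only illustrate the arithmetic.

```python
def partial_responses(cluster_changes, response_coefficients):
    """Contribution of each mRNA cluster to the output response.

    cluster_changes[name]: how the perturbation alters the cluster.
    response_coefficients[name]: how strongly the cluster affects the output.
    The product is the response transmitted through that cluster.
    """
    return {
        name: cluster_changes[name] * response_coefficients[name]
        for name in cluster_changes
    }


# Hypothetical numbers, for illustration only.
example = partial_responses(
    {"metabolic": 2.0, "regulatory": -0.5},
    {"metabolic": 0.25, "regulatory": 0.4},
)
```

Comparing the magnitudes of the resulting values shows which clusters transmit most of the response.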

15.
Genesis: cluster analysis of microarray data   Total citations: 26, self-citations: 0, citations by others: 26

16.
MOTIVATION: Microarray technology makes it possible to measure thousands of variables and to compare their values under hundreds of conditions. Once microarray data are quantified, normalized and classified, the analysis phase is essentially a manual and subjective task based on visual inspection of classes in the light of the vast amount of information available. Currently, data interpretation clearly constitutes the bottleneck of such analyses and there is an obvious need for tools able to fill the gap between data processed with mathematical methods and existing biological knowledge. RESULTS: THEA (Tools for High-throughput Experiments Analysis) is an integrated information processing system allowing convenient handling of data. It automatically annotates data produced by classification systems with selected biological information from a knowledge base, and lets the user either search and browse these annotations manually or automatically generate meaningful generalizations according to statistical criteria (data mining). AVAILABILITY: The software is available on the website http://thea.unice.fr/

17.
MOTIVATION: The statistical analysis of microarray data usually proceeds in a sequential manner, with the output of the previous step always serving as the input of the next one. However, the methods currently used in such analyses do not properly account for the fact that the intermediate results may not always be correct, which leads to accumulating error in the inferences drawn from such steps. RESULTS: Here we show that, by an application of hierarchical Bayesian methodology, this sequential procedure can be replaced by a single joint analysis, while systematically accounting for the uncertainties in this process. Moreover, we can also integrate relevant functional information available from databases into such an analysis, thereby increasing the reliability of the biological conclusions that are drawn. We illustrate these points by analysing real data and by showing that the genes can be divided into categories of interest, with the defining characteristic depending on the biological question that is considered. We contend that the proposed method has advantages at two levels. First, there are gains in the statistical and biological results from the analysis of this particular dataset. Second, it opens up new possibilities in analysing microarray data in general.

18.
Regression approaches for microarray data analysis.   Total citations: 6, self-citations: 0, citations by others: 6
A variety of new procedures have been devised to handle the two-sample comparison (e.g., tumor versus normal tissue) of gene expression values as measured with microarrays. Such new methods are required in part because of some defining characteristics of microarray-based studies: (i) the very large number of genes contributing expression measures which far exceeds the number of samples (observations) available and (ii) the fact that by virtue of pathway/network relationships, the gene expression measures tend to be highly correlated. These concerns are exacerbated in the regression setting, where the objective is to relate gene expression, simultaneously for multiple genes, to some external outcome or phenotype. Correspondingly, several methods have been recently proposed for addressing these issues. We briefly critique some of these methods prior to a detailed evaluation of gene harvesting. This reveals that gene harvesting, without additional constraints, can yield artifactual solutions. Results obtained employing such constraints motivate the use of regularized regression procedures such as the lasso, least angle regression, and support vector machines. Model selection and solution multiplicity issues are also discussed. The methods are evaluated using a microarray-based study of cardiomyopathy in transgenic mice.

19.
SVDMAN--singular value decomposition analysis of microarray data   Total citations: 1, self-citations: 0, citations by others: 1
SUMMARY: We have developed two novel methods for Singular Value Decomposition analysis (SVD) of microarray data. The first is a threshold-based method for obtaining gene groups, and the second is a method for obtaining a measure of confidence in SVD analysis. Gene groups are obtained by identifying elements of the left singular vectors, or gene coefficient vectors, that are greater in magnitude than the threshold W·N^(-1/2), where N is the number of genes, and W is a weight factor whose default value is 3. The groups are non-exclusive and may contain genes of opposite (i.e. inversely correlated) regulatory response. The confidence measure is obtained by systematically deleting assays from the data set, interpolating the SVD of the reduced data set to reconstruct the missing assay, and calculating the Pearson correlation between the reconstructed assay and the original data. This confidence measure is applicable when each experimental assay corresponds to a value of a parameter that can be interpolated, such as time, dose or concentration. Algorithms for the grouping method and the confidence measure are available in a software application called SVD Microarray ANalysis (SVDMAN). In addition to calculating the SVD for generic analysis, SVDMAN provides a new means for using microarray data to develop hypotheses for gene associations and provides a measure of confidence in the hypotheses, thus extending current SVD research in the area of global gene expression analysis.
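The thresholding rule for gene groups is simple enough to sketch directly. The example assumes one left singular vector (gene coefficient vector) of unit length has already been computed by some SVD routine, for instance `numpy.linalg.svd`; the function name `gene_group` is hypothetical and this is not SVDMAN's own code. For orientation, a unit-length vector whose N entries all have equal magnitude has entries of size N^(-1/2), so the threshold selects genes whose coefficients are more than W times that uniform value.

```python
import math


def gene_group(gene_coefficients, weight=3.0):
    """Indices of genes whose coefficient magnitude exceeds W * N**(-1/2).

    gene_coefficients: one left singular vector, one entry per gene.
    weight: the factor W (default 3, as stated in the abstract).
    The absolute value is used, so a group can mix genes with positive
    and negative coefficients (opposite regulatory response).
    """
    n_genes = len(gene_coefficients)
    threshold = weight / math.sqrt(n_genes)
    return [i for i, c in enumerate(gene_coefficients) if abs(c) > threshold]
```

Applying the function to each singular vector in turn gives one group per vector; because each vector is thresholded independently, the same gene can appear in several groups.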

20.