Similar literature
 20 similar articles found (search time: 15 ms)
1.
2.

Background  

Large discrepancies in signature composition and outcome concordance have been observed between different microarray breast cancer expression profiling studies. This is often ascribed to differences in array platform as well as biological variability. We conjecture that other reasons for the observed discrepancies are the measurement error associated with each feature and the choice of preprocessing method. Microarray data are known to be subject to technical variation and the confidence intervals around individual point estimates of expression levels can be wide. Furthermore, the estimated expression values also vary depending on the selected preprocessing scheme. In microarray breast cancer classification studies, however, these two forms of feature variability are almost always ignored and hence their exact role is unclear.

3.
4.
Microarray technologies, which can measure tens of thousands of gene expression values simultaneously in a single experiment, have become a common research method for biomedical researchers. Computational tools to analyze microarray data for biological discovery are needed. In this paper, we investigate the feasibility of using formal concept analysis (FCA) as a tool for microarray data analysis. The method of FCA builds a (concept) lattice from the experimental data together with additional biological information. For microarray data, each vertex of the lattice corresponds to a subset of genes that are grouped together according to their expression values and some biological information related to gene function. The lattice structure of these gene sets might reflect biological relationships in the dataset. Similarities and differences between experiments can then be investigated by comparing their corresponding lattices according to various graph measures. We apply our method to microarray data derived from influenza-infected mouse lung tissue and healthy controls. Our preliminary results show the promise of our method as a tool for microarray data analysis.
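
The lattice construction described in this abstract can be illustrated with a small sketch. The abstract does not say how the authors discretize expression values or which algorithm they use, so everything below is an assumption: the gene names, the binary attributes, and the brute-force enumeration are hypothetical and only show what a formal concept (a vertex of the lattice) is.

```python
from itertools import combinations


def extent(context, attrs):
    """Return the objects (genes) that carry every attribute in attrs."""
    return frozenset(g for g, has in context.items() if attrs <= has)


def intent(context, objs, all_attrs):
    """Return the attributes shared by every object in objs."""
    shared = set(all_attrs)
    for g in objs:
        shared &= context[g]
    return frozenset(shared)


def concepts(context):
    """Enumerate all formal concepts (extent, intent) by brute force.

    Exponential in the number of attributes, so only suitable for toy data.
    """
    all_attrs = frozenset().union(*context.values())
    found = set()
    for r in range(len(all_attrs) + 1):
        for attrs in combinations(sorted(all_attrs), r):
            ext = extent(context, frozenset(attrs))
            # Closing the extent gives the maximal attribute set it shares.
            found.add((ext, intent(context, ext, all_attrs)))
    return found


# Hypothetical toy context: gene -> binary attributes (expression + function).
toy_context = {
    "geneA": {"up", "immune"},
    "geneB": {"up", "immune"},
    "geneC": {"up"},
    "geneD": {"down"},
}
lattice_nodes = concepts(toy_context)
```

Each element of `lattice_nodes` pairs a gene set with the attributes its members all share; ordering these pairs by inclusion of the gene sets would give the lattice itself. Real datasets would need a dedicated FCA algorithm rather than this exhaustive search.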

5.
During recent years, microarrays have been firmly established as valuable tools for the discovery of novel biological phenomena. Especially in combination with whole genome sequences, microarray data can help unravel the dynamics of the expressed genome. For filamentous fungi, microarray studies have already been performed with more than 20 different species; these investigations have explored a variety of different aspects of fungal biology. In this review, I will give an overview of some of the key questions that have been addressed using microarray hybridizations with filamentous fungi, with particular focus on the analysis of co-regulated pathways and physically clustered genes, as well as on the use of microarray data to determine a molecular phenotype. Additionally, a number of useful, freely available software tools for the analysis of fungal microarray data will be discussed.

6.
Many statistical methods have been developed to screen for differentially expressed genes associated with specific phenotypes in microarray data. However, it remains a major challenge to synthesize the observed expression patterns with abundant biological knowledge for a more complete understanding of the biological functions among genes. Various methods, including clustering analysis on genes, neural networks, Bayesian networks and pathway analysis, have been developed toward this goal. In most of these procedures, the activation and inhibition relationships among genes have hardly been utilized in the modeling steps. We propose two novel Bayesian models to integrate the microarray data with the putative pathway structures obtained from the KEGG database and the directional gene–gene interactions in the medical literature. We define the symmetric Kullback–Leibler divergence of a pathway, and use it to identify the pathway(s) most supported by the microarray data. A Markov chain Monte Carlo sampling algorithm is given for posterior computation in the hierarchical model. The proposed method is shown to select the most supported pathway in an illustrative example. Finally, we apply the methodology to a real microarray data set to understand the gene expression profile of osteoblast lineage at defined stages of differentiation. We observe that our method correctly identifies the pathways that are reported to play essential roles in modulating bone mass.
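
For orientation, the generic symmetric Kullback–Leibler divergence between two discrete distributions can be sketched as follows. The abstract does not give the authors' pathway-level definition, so this is only the textbook form (sometimes called the Jeffreys divergence), and the example probabilities are made up.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as equal-length sequences.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute zero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Symmetrized divergence: KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Hypothetical distributions over two states.
p_example = [0.5, 0.5]
q_example = [0.9, 0.1]
divergence = symmetric_kl(p_example, q_example)
```

Unlike the plain KL divergence, the symmetrized form gives the same value regardless of argument order, which makes it usable as a ranking score when neither distribution is a natural reference.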

7.
Huang GS, Hong MY, Liu YC. Life Sciences, 2003, 72(22): 2525-2531
We incorporated gene expression information from cDNA microarray into flux analysis to simulate yeast diauxic growth. Expression ratios of both growth phases were applied to assign the split ratio at glyoxylate shunt during simulation, in which the equation was mathematically unsolvable due to the singularity and artificial split ratios, which were traditionally introduced without biological evidence. In addition, the directionality of the microarray dataset was used as a further constraint during simulation. Metabolic fluxes obtained by this modified approach are in general consistent with microarray analysis. However, discrepancies occurred when the quantity of fluxes was compared, probably due to the substantial reduction of substrates at phase II, in which the increase in the enzymatic levels was not proportional to the increase of substrate flow, as would be predicted from the microarray dataset. The modified flux analysis might offer a new approach to investigate other cellular pathways.

8.
MOTIVATION: The diverse microarray datasets that have become available over the past several years represent a rich opportunity and challenge for biological data mining. Many supervised and unsupervised methods have been developed for the analysis of individual microarray datasets. However, integrated analysis of multiple datasets can provide a broader insight into genetic regulation of specific biological pathways under a variety of conditions. RESULTS: To aid in the analysis of such large compendia of microarray experiments, we present Microarray Experiment Functional Integration Technology (MEFIT), a scalable Bayesian framework for predicting functional relationships from integrated microarray datasets. Furthermore, MEFIT predicts these functional relationships within the context of specific biological processes. All results are provided in the context of one or more specific biological functions, which can be provided by a biologist or drawn automatically from catalogs such as the Gene Ontology (GO). Using MEFIT, we integrated 40 Saccharomyces cerevisiae microarray datasets spanning 712 unique conditions. In tests based on 110 biological functions drawn from the GO biological process ontology, MEFIT provided a 5% or greater performance increase for 54 functions, with a 5% or more decrease in performance in only two functions.

9.
10.

Background  

Numerous nonparametric approaches have been proposed in the literature to detect differential gene expression in the setting of two user-defined groups. However, there is a lack of nonparametric procedures to analyze microarray data with multiple factors contributing to gene expression. Furthermore, incorporating interaction effects in the analysis of microarray data has long been of great interest to biological scientists, yet little of this has been investigated in the nonparametric framework.

11.
Quality control of a microarray experiment has become an important issue for both research and regulation. External RNA controls (ERCs), which can be either added to the total RNA level (tERCs) or introduced right before hybridization (cERCs), are designed and recommended by commercial microarray platforms for assessment of performance of a microarray experiment. However, the utility of ERCs has not been fully realized, mainly due to the lack of sufficient data resources. The US Food and Drug Administration (FDA)-led community-wide Microarray Quality Control (MAQC) study generated a large amount of microarray data with implementation of ERCs across several commercial microarray platforms. The utility of ERCs in quality control by assessing the ERCs’ concentration-response behavior was investigated in the MAQC study. In this work, an ERC-based correlation analysis was conducted to assess the quality of a microarray experiment. We found that the pairwise correlations of tERCs are sample independent, indicating that the array data obtained from different biological samples can be treated as technical replicates in analysis of tERCs. Consequently, the commonly used quality control method of applying correlation analysis on technical replicates can be adopted for assessing array performance based on different biological samples using tERCs. The proposed approach is sensitive in identifying outlying assays and is not dependent on the choice of normalization method.
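
A correlation-based check of this kind might look like the sketch below. The abstract does not specify the correlation measure, the decision rule, or any threshold, so the Pearson coefficient, the median-based rule, the 0.9 cutoff, and the control intensities are all illustrative assumptions rather than the study's procedure.

```python
import math
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def flag_outlying_arrays(arrays, threshold=0.9):
    """Flag arrays whose median correlation with all other arrays is low.

    `threshold` is a hypothetical cutoff chosen for illustration only.
    """
    flagged = []
    for name, values in arrays.items():
        others = [pearson(values, v) for n, v in arrays.items() if n != name]
        if median(others) < threshold:
            flagged.append(name)
    return flagged


# Made-up control-probe intensities for four arrays (one deliberately noisy).
erc_signal = {
    "array1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "array2": [1.1, 2.1, 2.9, 4.2, 5.0],
    "array3": [0.9, 1.8, 3.2, 3.9, 5.1],
    "array4": [3.0, 1.0, 4.0, 2.0, 3.5],
}
outliers = flag_outlying_arrays(erc_signal)
```

Using the median rather than the mean of the pairwise correlations keeps a single bad array from dragging down the scores of the good ones.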

12.
Do JH, Choi DK. Molecules and Cells, 2006, 22(3): 254-261
DNA microarray is a powerful tool for high-throughput analysis of biological systems. Various computational tools have been created to facilitate the analysis of the large volume of data produced in DNA microarray experiments. Normalization is a critical step for obtaining data that are reliable and usable for subsequent analysis such as identification of differentially expressed genes and clustering. A variety of normalization methods have been proposed over the past few years, but none is yet perfect. Various assumptions are often made in the process of normalization. Therefore, knowledge of the underlying assumptions and principles of normalization would be helpful for the correct analysis of microarray data. We present a review of normalization techniques, from single-labeled platforms such as the Affymetrix GeneChip array to dual-labeled platforms such as spotted arrays, focusing on their principles and assumptions.

13.
MOTIVATION: The identification of physiological processes underlying and generating the expression pattern observed in microarray experiments is a major challenge. Principal component analysis (PCA) is a linear multivariate statistical method that is regularly employed for that purpose as it provides a reduced-dimensional representation for subsequent study of possible biological processes responding to the particular experimental conditions. Making explicit the data assumptions underlying PCA highlights their lack of biological validity thus making biological interpretation of the principal components problematic. A microarray data representation which enables clear biological interpretation is a desirable analysis tool. RESULTS: We address this issue by employing the probabilistic interpretation of PCA and proposing alternative linear factor models which are based on refined biological assumptions. A practical study on two well-understood microarray datasets highlights the weakness of PCA and the greater biological interpretability of the linear models we have developed.

14.
MOTIVATION: Microarrays are an important research tool for the advancement of basic biological sciences. However, this technology has yet to be integrated with clinical decision making. We have implemented an information framework based on the Microarray Gene Expression Markup Language (MAGE-ML) specification. We are using this framework to develop a test-bed integrated database application to identify genomic and imaging markers for diagnosis of breast cancer. RESULTS: We developed an extensible software architecture for retrieving data from different microarray databases using MAGE-ML and for combining microarray data with breast cancer image analysis and clinical data for correlation studies. The framework we developed will provide the necessary data integration to move microarray research from basic biological sciences to clinical applications. AVAILABILITY: Open source software will be available from SourceForge (http://sourceforge.net/projects/microsoap/).

15.
16.
Pathway analysis using random forests classification and regression   (total citations: 3; self-citations: 0; citations by others: 3)
MOTIVATION: Although numerous methods have been developed to better capture biological information from microarray data, commonly used single gene-based methods neglect interactions among genes and leave room for other novel approaches. For example, most classification and regression methods for microarray data are based on the whole set of genes and have not made use of pathway information. Pathway-based analysis in microarray studies may lead to more informative and relevant knowledge for biological researchers. RESULTS: In this paper, we describe a pathway-based classification and regression method using Random Forests to analyze gene expression data. The proposed methods allow researchers to rank important pathways from externally available databases, discover important genes, find pathway-based outlying cases and make full use of a continuous outcome variable in the regression setting. We also compared Random Forests with other machine learning methods using several datasets and found that Random Forests classification error rates were either the lowest or the second-lowest. By combining pathway information and novel statistical methods, this procedure represents a promising computational strategy in dissecting pathways and can provide biological insight into the study of microarray data. AVAILABILITY: Source code written in R is available from http://bioinformatics.med.yale.edu/pathway-analysis/rf.htm.

17.
The comparison of gene expression profiles among DNA microarray experiments enables the identification of unknown relationships among experiments to uncover the underlying biological relationships. Despite the ongoing accumulation of data in public databases, detecting biological correlations among gene expression profiles from multiple laboratories on a large scale remains difficult. Here, we applied a module (sets of genes working in the same biological action)-based correlation analysis in combination with a network analysis to Arabidopsis data and developed a 'module-based correlation network' (MCN) which represents relationships among DNA microarray experiments on a large scale. We developed a Web-based data analysis tool, 'AtCAST' (Arabidopsis thaliana: DNA Microarray Correlation Analysis Tool), which enables browsing of an MCN or mining of users' microarray data by mapping the data into an MCN. AtCAST can help researchers to find novel connections among DNA microarray experiments, which in turn will help to build new hypotheses to uncover physiological mechanisms or gene functions in Arabidopsis.

18.
MOTIVATION: Two-dimensional Difference Gel Electrophoresis (DIGE) measures expression differences for thousands of proteins in parallel. In contrast to DNA microarray analysis, however, there have been few systematic studies on the validity of differential protein expression analysis, and the effects of normalization methods have not yet been investigated. To address this need, we assessed a series of same-same comparisons, evaluating how random experimental variance influenced differential expression analysis. RESULTS: The strong fluctuations observed were reflected in large discrepancies between the distributions of the spot intensities for different gels. Correct normalization for pooling of multiple gels for analysis is, therefore, essential. We show that both dye-specific background levels and the differences in scale of the spot intensity distributions must be accounted for. A variance stabilizing transform that had been developed for DNA microarray analysis combined with a robust Z-score allowed the determination of gel-independent signal thresholds based on the empirical distributions from same-same comparisons. In contrast, similar thresholds holding up to cross-validation could not be proposed for data normalized using methods established in the field of proteomics. AVAILABILITY: Software is available on request from the authors. SUPPLEMENTARY INFORMATION: There is supplementary material available online at http://www.flychip.org.uk/kreil/pub/2dgels/
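
As a rough illustration of the robust Z-score mentioned in this abstract, the sketch below uses the common median/MAD form. The abstract does not state the authors' exact formula, and the variance stabilizing transform is not reproduced here; the 1.4826 scaling constant (which makes the MAD comparable to a standard deviation under normality) and the sample values are assumptions.

```python
from statistics import median

# Scales the median absolute deviation to estimate sigma for normal data.
MAD_TO_SIGMA = 1.4826


def robust_z_scores(values):
    """Return (x - median) / (1.4826 * MAD) for each value.

    Raises ValueError when the MAD is zero, since the score is then undefined.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; robust Z-score is undefined")
    scale = MAD_TO_SIGMA * mad
    return [(v - med) / scale for v in values]


# Made-up log-ratios from a same-same comparison with one aberrant spot.
log_ratios = [0.1, -0.2, 0.0, 0.3, -0.1, 5.0]
z_scores = robust_z_scores(log_ratios)
```

Because the median and MAD are barely affected by the aberrant spot, its score stands out clearly, whereas a mean/standard-deviation Z-score would be inflated by the very outlier it is meant to detect.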

19.
In DNA microarray studies, gene-set analysis (GSA) has become the focus of gene expression data analysis. GSA utilizes the gene expression profiles of functionally related gene sets in Gene Ontology (GO) categories or a priori defined biological classes to assess the significance of gene sets associated with clinical outcomes or phenotypes. Many statistical approaches have been proposed to determine whether such functionally related gene sets are differentially expressed (enrichment and/or deletion) across phenotypes. However, little attention has been given to the discriminatory power of gene sets and the classification of patients.

20.