Similar articles
 20 similar articles found.
1.
2.
3.
4.
MADGene is a software environment comprising a web-based database and a Java application. This platform aims at unifying gene identifiers (ids) and performing gene set analysis. MADGene allows the user to perform inter-conversion of clone and gene ids over a large range of nomenclatures across 17 species. We propose a set of 23 functions to facilitate the analysis of gene sets and we give two microarray applications to show how MADGene can be used to conduct meta-analyses. AVAILABILITY: The MADGene resources are freely available online from http://www.madtools.org, a website dedicated to the analysis and annotation of DNA microarray data.

5.

Background  

The incorporation of statistical models that account for experimental variability provides a necessary framework for the interpretation of microarray data. A robust experimental design coupled with an analysis of variance (ANOVA) incorporating a model that accounts for known sources of experimental variability can significantly improve the determination of differences in gene expression and estimations of their significance.

6.

Background

Meta-analysis of gene expression microarray datasets presents significant challenges for statistical analysis. We developed and validated a new bioinformatic method for the identification of genes upregulated in subsets of samples of a given tumour type (‘outlier genes’), a hallmark of potential oncogenes.

Methodology

A new statistical method (the gene tissue index, GTI) was developed by modifying and adapting algorithms originally developed for statistical problems in economics. We compared the potential of the GTI to detect outlier genes in meta-datasets with four previously defined statistical methods, COPA, the OS statistic, the t-test and ORT, using simulated data. We demonstrated that the GTI performed as well as existing methods in a single study simulation. Next, we evaluated the performance of the GTI in the analysis of combined Affymetrix gene expression data from several published studies covering 392 normal samples of tissue from the central nervous system, 74 astrocytomas, and 353 glioblastomas. According to the results, the GTI was better able than most of the previous methods to identify known oncogenic outlier genes. In addition, the GTI identified 29 novel outlier genes in glioblastomas, including TYMS and CDKN2A. The over-expression of these genes was validated in vivo by immunohistochemical staining data from clinical glioblastoma samples. Immunohistochemical data were available for 65% (19 of 29) of these genes, and 17 of these 19 genes (90%) showed a typical outlier staining pattern. Furthermore, raltitrexed, a specific inhibitor of TYMS used in the therapy of tumour types other than glioblastoma, also effectively blocked cell proliferation in glioblastoma cell lines, thus highlighting this outlier gene candidate as a potential therapeutic target.

Conclusions/Significance

Taken together, these results support the GTI as a novel approach to identify potential oncogene outliers and drug targets. The algorithm is implemented in an R package (Text S1).
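
The abstract does not describe how the GTI is computed, so it is not reproduced here. As a point of reference for the "outlier gene" idea, the following is a minimal sketch of a COPA-style score, one of the comparison methods named above: expression values are median-centred, scaled by the median absolute deviation (MAD), and an upper quantile of the scaled values is reported. The function names, the nearest-rank quantile rule, and the default quantile of 0.9 are assumptions made for illustration, not details taken from the abstract.

```python
import math
from statistics import median


def copa_scores(values):
    """Median-centre one gene's expression values and scale them by the MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; this gene cannot be scaled")
    return [(v - med) / mad for v in values]


def copa_statistic(values, quantile=0.9):
    """Return an upper quantile (nearest-rank rule) of the scaled values.

    A large value means a subset of samples lies far above the median,
    which is the pattern expected of an outlier gene.
    """
    scores = sorted(copa_scores(values))
    rank = max(1, math.ceil(quantile * len(scores)))
    return scores[rank - 1]


# Hypothetical toy data: four typical samples and one strong outlier.
expression = [1, 2, 3, 4, 100]
print(copa_statistic(expression))
```

A gene whose samples are all close to the median yields a small statistic, whereas a gene over-expressed in only a few samples yields a large one even though its mean shifts little.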

7.
8.
9.
MOTIVATION: The diverse microarray datasets that have become available over the past several years represent a rich opportunity and challenge for biological data mining. Many supervised and unsupervised methods have been developed for the analysis of individual microarray datasets. However, integrated analysis of multiple datasets can provide a broader insight into genetic regulation of specific biological pathways under a variety of conditions. RESULTS: To aid in the analysis of such large compendia of microarray experiments, we present Microarray Experiment Functional Integration Technology (MEFIT), a scalable Bayesian framework for predicting functional relationships from integrated microarray datasets. Furthermore, MEFIT predicts these functional relationships within the context of specific biological processes. All results are provided in the context of one or more specific biological functions, which can be provided by a biologist or drawn automatically from catalogs such as the Gene Ontology (GO). Using MEFIT, we integrated 40 Saccharomyces cerevisiae microarray datasets spanning 712 unique conditions. In tests based on 110 biological functions drawn from the GO biological process ontology, MEFIT provided a 5% or greater performance increase for 54 functions, with a 5% or more decrease in performance in only two functions.

10.
A large number of common disorders, including cancer, have complex genetic traits, with multiple genetic and environmental components contributing to susceptibility. A literature search revealed that even among several meta-analyses, there were ambiguous results and conclusions. In the current study, we conducted a thorough meta-analysis gathering the published meta-analysis studies previously reported to correlate any random effect or predictive value of genome variations in certain genes for various types of cancer. The overall analysis was initially aimed to result in associations (1) among genes which when mutated lead to different types of cancer (e.g. common metabolic pathways) and (2) between groups of genes and types of cancer. We have meta-analysed 150 meta-analysis articles which included 4,474 studies, 2,452,510 cases and 3,091,626 controls (5,544,136 individuals in total) including various racial groups and other population groups (Native Americans, Latinos, Aborigines, etc.). Our results were not only consistent with previously published literature but also depicted novel correlations of genes with new cancer types. Our analysis revealed a total of 17 gene-disease pairs that are affected and generated gene/disease clusters, many of which proved to be independent of the criteria used, which suggests that these clusters are biologically meaningful.

11.
12.
13.

Background

With next-generation sequencing technologies, experiments that were considered prohibitive only a few years ago are now possible. However, while these technologies have the ability to produce enormous volumes of data, the sequence reads are prone to error. This poses fundamental hurdles when genetic diversity is investigated.

Results

We developed ShoRAH, a computational method for quantifying genetic diversity in a mixed sample and for identifying the individual clones in the population, while accounting for sequencing errors. The software was run on simulated data and on real data obtained in wet lab experiments to assess its reliability.

Conclusions

ShoRAH is implemented in C++, Python, and Perl and has been tested under Linux and Mac OS X. Source code is available under the GNU General Public License at http://www.cbg.ethz.ch/software/shorah.

14.
MOTIVATION: Microarray techniques provide a valuable way of characterizing the molecular nature of disease. Unfortunately, expense and limited specimen availability often lead to studies with small sample sizes. This makes accurate estimation of variability difficult, since variance estimates made on a gene-by-gene basis will have few degrees of freedom, and the assumption that all genes share equal variance is unlikely to be true. RESULTS: We propose a model by which the within-gene variances are drawn from an inverse gamma distribution, whose parameters are estimated across all genes. This results in a test statistic that is a minor variation of those used in standard linear models. We demonstrate that the model assumptions are valid on experimental data, and that the model has more power than standard tests to pick up large changes in expression, while not increasing the rate of false positives. AVAILABILITY: This method is incorporated into BRB-ArrayTools version 3.0 (http://linus.nci.nih.gov/BRB-ArrayTools.html). SUPPLEMENTARY MATERIAL: ftp://linus.nci.nih.gov/pub/techreport/RVM_supplement.pdf
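
The general effect of placing a prior on the within-gene variances can be illustrated with a small sketch. It assumes the prior has already been summarised as a prior variance and a prior number of degrees of freedom (`prior_s2` and `prior_df`, both hypothetical names); how these are estimated across genes is not given in the abstract and is left out. The moderated variance is then a degrees-of-freedom-weighted average of the prior variance and the gene's own sample variance, and the t-statistic gains the prior degrees of freedom. This is a generic shrinkage form and may differ in detail from the authors' statistic.

```python
import math


def moderated_variance(s2, df, prior_s2, prior_df):
    """Shrink a gene's sample variance s2 (with df degrees of freedom)
    towards a prior variance, weighting each by its degrees of freedom."""
    return (prior_df * prior_s2 + df * s2) / (prior_df + df)


def moderated_t(mean_diff, s2, df, n1, n2, prior_s2, prior_df):
    """Two-sample t-statistic using the moderated variance.

    Returns the statistic and its degrees of freedom (prior_df + df).
    """
    var = moderated_variance(s2, df, prior_s2, prior_df)
    se = math.sqrt(var * (1.0 / n1 + 1.0 / n2))
    return mean_diff / se, prior_df + df


# Hypothetical numbers: two samples per group, so df = 2 for the pooled variance.
t_stat, t_df = moderated_t(mean_diff=2.0, s2=4.0, df=2, n1=2, n2=2,
                           prior_s2=1.0, prior_df=2)
print(t_stat, t_df)
```

With few samples the gene's own variance estimate is noisy, so borrowing degrees of freedom from the prior stabilises the denominator; as `df` grows, the moderated variance approaches the ordinary sample variance.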

15.

Background  

Microarray technology is commonly used as a simple screening tool with a focus on selecting genes that exhibit extremely large differential expressions between different phenotypes. It lacks the ability to select genes that change their relationships with other genes in different biological conditions (differentially correlated genes). We intend to enrich the above procedure by proposing a nonparametric selection procedure that selects differentially correlated genes.
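
To make "differentially correlated" concrete, the sketch below computes the Pearson correlation of one gene pair separately in two conditions and reports the absolute difference. It only illustrates the quantity of interest; the nonparametric selection procedure proposed by the authors is not described in the abstract and is not reproduced here. All function and variable names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def differential_correlation(x_a, y_a, x_b, y_b):
    """Absolute change in the correlation of genes x and y between
    condition A and condition B."""
    return abs(pearson(x_a, y_a) - pearson(x_b, y_b))


# Hypothetical toy data: the pair is positively correlated in condition A
# and negatively correlated in condition B.
gene_x = [1, 2, 3, 4]
print(differential_correlation(gene_x, [2, 4, 6, 8], gene_x, [8, 6, 4, 2]))
```

In this toy example neither gene changes its mean expression between conditions, so a screen based only on differential expression would miss the pair.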

16.
MOTIVATION: Microarray experiments generate a high data volume. However, often due to financial or experimental considerations, e.g. lack of sample, there is little or no replication of the experiments or hybridizations. These factors combined with the intrinsic variability associated with the measurement of gene expression can result in an unsatisfactory detection rate of differential gene expression (DGE). Our motivation was to provide an easy-to-use measure of the success rate of DGE detection that could find routine use in the design of microarray experiments or in post-experiment assessment. RESULTS: In this study, we address the problem of both random errors and systematic biases in microarray experimentation. We propose a mathematical model for the measured data in microarray experiments and on the basis of this model present a t-based statistical procedure to determine DGE. We have derived a formula to determine the success rate of DGE detection that takes into account the number of microarrays, the number of genes, the magnitude of DGE, and the variance from biological and technical sources. The formula and look-up tables based on the formula can be used to assist in the design of microarray experiments. We also propose an ad hoc method for estimating the fraction of non-differentially expressed genes within a set of genes being tested. This will help to increase the power of DGE detection. AVAILABILITY: The functions to calculate the success rate of DGE detection have been implemented as a Java application, which is accessible at http://www.le.ac.uk/mrctox/microarray_lab/Microarray_Softwares/Microarray_Softwares.htm

17.
Functional genomic technologies such as high density DNA microarrays allow biologists to study the structure and behavior of thousands of genes in a single experiment. One of the fields in which microarrays have had an increasingly important impact is host-pathogen interactions. Early investigations in this area over the past two years not only emphasize the utility of this approach, but also highlight the stereotyped gene expression responses of different host cells to diverse infectious stimuli, and the potential value of broad dataset comparisons in revealing fundamental features of innate immunity. The comparative analysis of recently published datasets involving human gene expression responses to two bacterial respiratory pathogens illustrates many of these points. Comparisons between these large, highly parallel sets of experimental observations also emphasize important technical and experimental design issues as future challenges.

18.
MiST is a novel approach to variant calling from deep sequencing data, using the inverted mapping approach developed for Geoseq. Reads that can map to a targeted exonic region are identified using exact matches to tiles from the region. The reads are then aligned to the targets to discover variants. MiST carefully handles paralogous reads that map ambiguously to the genome and clonal reads arising from PCR bias, which are the two major sources of errors in variant calling. The reduced computational complexity of mapping selected reads to targeted regions of the genome improves speed, specificity and sensitivity of variant detection. Compared with variant calls from the GATK platform, MiST showed better concordance with SNPs from dbSNP and genotypes determined by an exonic-SNP array. Variant calls made only by MiST were confirmed at a high rate (>90%) by Sanger sequencing. Thus, MiST is a valuable alternative tool to analyse variants in deep sequencing data.
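
The read-selection step described above (keeping reads that share an exact tile with the targeted region) can be sketched as follows. This is a simplified illustration, not MiST's implementation: the tile length, the use of every overlapping substring as a tile, and the function names are assumptions, and reverse-complement matching, paralog handling, and the subsequent alignment step are omitted.

```python
def build_tiles(target, tile_len):
    """Collect every overlapping substring of length tile_len from the target."""
    return {target[i:i + tile_len] for i in range(len(target) - tile_len + 1)}


def select_reads(reads, target, tile_len=4):
    """Keep the reads that contain at least one exact tile from the target.

    Only the selected reads would be passed on to the alignment step.
    """
    tiles = build_tiles(target, tile_len)
    selected = []
    for read in reads:
        windows = (read[i:i + tile_len] for i in range(len(read) - tile_len + 1))
        if any(window in tiles for window in windows):
            selected.append(read)
    return selected


# Hypothetical toy sequences; real tiles would be much longer than 4 bases.
target_region = "ACGTACGGTC"
reads = ["TTACGTAA", "GGGGGGGG", "CGGT"]
print(select_reads(reads, target_region))
```

Because tile lookup is a set membership test, reads unrelated to the target are discarded cheaply before any alignment is attempted, which is the source of the reduced computational cost mentioned in the abstract.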

19.
MOTIVATION: Chromosomal copy number changes (aneuploidies) are common in cell populations that undergo multiple cell divisions including yeast strains, cell lines and tumor cells. Identification of aneuploidies is critical in evolutionary studies, where changes in copy number serve an adaptive purpose, as well as in cancer studies, where amplifications and deletions of chromosomal regions have been identified as a major pathogenetic mechanism. Aneuploidies can be studied on whole-genome level using array CGH (a microarray-based method that measures the DNA content), but their presence also affects gene expression. In gene expression microarray analysis, identification of copy number changes is especially important in preventing aberrant biological conclusions based on spurious gene expression correlation or masked phenotypes that arise due to aneuploidies. Previously suggested approaches for aneuploidy detection from microarray data mostly focus on array CGH, address only whole-chromosome or whole-arm copy number changes, and rely on thresholds or other heuristics, making them unsuitable for fully automated general application to gene expression datasets. There is a need for a general and robust method for identification of aneuploidies of any size from both array CGH and gene expression microarray data. RESULTS: We present ChARM (Chromosomal Aberration Region Miner), a robust and accurate expectation-maximization based method for identification of segmental aneuploidies (partial chromosome changes) from gene expression and array CGH microarray data. Systematic evaluation of the algorithm on synthetic and biological data shows that the method is robust to noise, aneuploidal segment size and P-value cutoff. Using our approach, we identify known chromosomal changes and predict novel potential segmental aneuploidies in commonly used yeast deletion strains and in breast cancer. 
ChARM can be routinely used to identify aneuploidies in array CGH datasets and to screen gene expression data for aneuploidies or array biases. Our methodology is sensitive enough to detect statistically significant and biologically relevant aneuploidies even when expression or DNA content changes are subtle as in mixed populations of cells. AVAILABILITY: Code available by request from the authors and on Web supplement at http://function.cs.princeton.edu/ChARM/

20.
Well-defined relationships between oligonucleotide properties and hybridization signal intensities (HSI) can aid chip design, data normalization and true biological knowledge discovery. We clarify these relationships using the data from two microarray experiments containing over three million probes from 48 high-density chips. We find that melting temperature (Tm) has the most significant effect on HSI while length for the long oligonucleotides studied has very little effect. Analysis of positional effect using a linear model provides evidence that the protruding ends of probes contribute more than tethered ends to HSI, which is further validated by specifically designed match fragment sliding and extension experiments. The impact of sequence similarity (SeqS) on HSI is not significant in comparison with other oligonucleotide properties. Using regression and regression tree analysis, we prioritize these oligonucleotide properties based on their effects on HSI. The implications of our discoveries for the design of unbiased oligonucleotides are discussed. We propose that designing isothermal probes by varying their length is a viable strategy to reduce sequence bias, though imposing selection constraints on other oligonucleotide properties is also essential.
