共查询到20条相似文献,搜索用时 15 毫秒
1.
Florian Hahne Alexander Mehrle Dorit Arlt Annemarie Poustka Stefan Wiemann Tim Beissbarth 《BMC bioinformatics》2008,9(1):1-7
Background
High-throughput technologies like functional screens and gene expression analysis produce extended lists of candidate genes. Gene-Set Enrichment Analysis is a commonly used and well established technique to test for the statistically significant over-representation of particular pathways. A shortcoming of this method is however, that most genes that are investigated in the experiments have very sparse functional or pathway annotation and therefore cannot be the target of such an analysis. The approach presented here aims to assign lists of genes with limited annotation to previously described functional gene collections or pathways. This works by comparing InterPro domain signatures of the candidate gene lists with domain signatures of gene sets derived from known classifications, e.g. KEGG pathways.Results
In order to validate our approach, we designed a simulation study. Based on all pathways available in the KEGG database, we create test gene lists by randomly selecting pathway genes, removing these genes from the known pathways and adding variable amounts of noise in the form of genes not annotated to the pathway. We show that we can recover pathway memberships based on the simulated gene lists with high accuracy. We further demonstrate the applicability of our approach on a biological example.Conclusion
Results based on simulation and data analysis show that domain based pathway enrichment analysis is a very sensitive method to test for enrichment of pathways in sparsely annotated lists of genes. An R based software package domainsignatures, to routinely perform this analysis on the results of high-throughput screening, is available via Bioconductor. 相似文献2.
Background
The Gene Ontology (GO) is used to describe genes and gene products from many organisms. When used for functional annotation of microarray data, GO is often slimmed by editing so that only higher level terms remain. This practice is designed to improve the summarizing of experimental results by grouping high level terms and the statistical power of GO term enrichment analysis. 相似文献3.
Oliver Laule Matthias Hirsch-Hoffmann Tomas Hruz Wilhelm Gruissem Philip Zimmermann 《BMC bioinformatics》2006,7(1):311-8
Background
Gene function analysis often requires a complex and laborious sequence of laboratory and computer-based experiments. Choosing an effective experimental design generally results from hypotheses derived from prior knowledge or experimentation. Knowledge obtained from meta-analyzing compendia of expression data with annotation libraries can provide significant clues in understanding gene and network function, resulting in better hypotheses that can be tested in the laboratory. 相似文献4.
5.
Background
Numerous feature selection methods have been applied to the identification of differentially expressed genes in microarray data. These include simple fold change, classical t-statistic and moderated t-statistics. Even though these methods return gene lists that are often dissimilar, few direct comparisons of these exist. We present an empirical study in which we compare some of the most commonly used feature selection methods. We apply these to 9 publicly available datasets, and compare, both the gene lists produced and how these perform in class prediction of test datasets. 相似文献6.
7.
Background
Genes that are determined to be significantly differentially regulated in microarray analyses often appear to have functional commonalities, such as being components of the same biochemical pathway. This results in certain words being under- or overrepresented in the list of genes. Distinguishing between biologically meaningful trends and artifacts of annotation and analysis procedures is of the utmost importance, as only true biological trends are of interest for further experimentation. A number of sophisticated methods for identification of significant lexical trends are currently available, but these methods are generally too cumbersome for practical use by most microarray users. 相似文献8.
Background
Since the inception of the GO annotation project, a variety of tools have been developed that support exploring and searching the GO database. In particular, a variety of tools that perform GO enrichment analysis are currently available. Most of these tools require as input a target set of genes and a background set and seek enrichment in the target set compared to the background set. A few tools also exist that support analyzing ranked lists. The latter typically rely on simulations or on union-bound correction for assigning statistical significance to the results. 相似文献9.
Background
Abundant information about gene products is stored in online searchable databases such as annotation or literature. To efficiently obtain and digest such information, there is a pressing need for automated information-summarization and functional-similarity clustering of genes. 相似文献10.
Brad T Sherman Da Wei Huang Qina Tan Yongjian Guo Stephan Bour David Liu Robert Stephens Michael W Baseler H Clifford Lane Richard A Lempicki 《BMC bioinformatics》2007,8(1):426
Background
Due to the complex and distributed nature of biological research, our current biological knowledge is spread over many redundant annotation databases maintained by many independent groups. Analysts usually need to visit many of these bioinformatics databases in order to integrate comprehensive annotation information for their genes, which becomes one of the bottlenecks, particularly for the analytic task associated with a large gene list. Thus, a highly centralized and ready-to-use gene-annotation knowledgebase is in demand for high throughput gene functional analysis. 相似文献11.
Background
One of the challenges in the analysis of microarray data is to integrate and compare the selected (e.g., differential) gene lists from multiple experiments for common or unique underlying biological themes. A common way to approach this problem is to extract common genes from these gene lists and then subject these genes to enrichment analysis to reveal the underlying biology. However, the capacity of this approach is largely restricted by the limited number of common genes shared by datasets from multiple experiments, which could be caused by the complexity of the biological system itself. 相似文献12.
GW Patton R Stephens IA Sidorov X Xiao RA Lempicki DS Dimitrov RH Shoemaker G Tudor 《BMC bioinformatics》2006,7(1):81
Background
Microarrays used for gene expression studies yield large amounts of data. The processing of such data typically leads to lists of differentially-regulated genes. A common terminal data analysis step is to map pathways of potentially interrelated genes. 相似文献13.
Background
Systematic genome comparisons are an important tool to reveal gene functions, pathogenic features, metabolic pathways and genome evolution in the era of post-genomics. Furthermore, such comparisons provide important clues for vaccines and drug development. Existing genome comparison software often lacks accurate information on orthologs, the function of similar genes identified and genome-wide reports and lists on specific functions. All these features and further analyses are provided here in the context of a modular software tool "inGeno" written in Java with Biojava subroutines. 相似文献14.
Oliver Keller Florian Odronitz Mario Stanke Martin Kollmar Stephan Waack 《BMC bioinformatics》2008,9(1):278
Background
For many types of analyses, data about gene structure and locations of non-coding regions of genes are required. Although a vast amount of genomic sequence data is available, precise annotation of genes is lacking behind. Finding the corresponding gene of a given protein sequence by means of conventional tools is error prone, and cannot be completed without manual inspection, which is time consuming and requires considerable experience. 相似文献15.
Jose C Jimenez-Lopez Emma W Gachomo Manfredo J Seufferheld Simeon O Kotchoni 《BMC structural biology》2010,10(1):43
Background
The completion of maize genome sequencing has resulted in the identification of a large number of uncharacterized genes. Gene annotation and functional characterization of gene products are important to uncover novel protein functionality. 相似文献16.
17.
Background
Ranked gene lists from microarray experiments are usually analysed by assigning significance to predefined gene categories, e.g., based on functional annotations. Tools performing such analyses are often restricted to a category score based on a cutoff in the ranked list and a significance calculation based on random gene permutations as null hypothesis. 相似文献18.
19.
Background
Different microarray studies have compiled gene lists for predicting outcomes of a range of treatments and diseases. These have produced gene lists that have little overlap, indicating that the results from any one study are unstable. It has been suggested that the underlying pathways are essentially identical, and that the expression of gene sets, rather than that of individual genes, may be more informative with respect to prognosis and understanding of the underlying biological process. 相似文献20.