
1.

BiNChE: A web tool and library for chemical enrichment analysis based on the ChEBI ontology

Pablo Moreno Stephan Beisken Bhavana Harsha Venkatesh Muthukrishnan Ilinca Tudose Adriano Dekker Stefanie Dornfeldt Franziska Taruttis Ivo Grosse Janna Hastings Steffen Neumann Christoph Steinbeck 《BMC bioinformatics》2015,16(1)

Background

Ontology-based enrichment analysis aids in the interpretation and understanding of large-scale biological data. Ontologies are hierarchies of biologically relevant groupings. Using ontology annotations, which link ontology classes to biological entities, enrichment analysis methods assess whether entities are significantly over- or under-represented in ontology classes. While many tools exist that run enrichment analysis for protein sets annotated with the Gene Ontology, only a few can be used for small-molecule enrichment analysis.

Results

We describe BiNChE, an enrichment analysis tool for small molecules based on the ChEBI Ontology. BiNChE displays an interactive graph that can be exported as a high-resolution image or in network formats. The tool provides plain, weighted and fragment analysis based on either the ChEBI Role Ontology or the ChEBI Structural Ontology.

Conclusions

BiNChE aids in the exploration of large sets of small molecules produced within Metabolomics or other Systems Biology research contexts. The open-source tool provides easy and highly interactive web access to enrichment analysis with the ChEBI ontology tool and is additionally available as a standalone library.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0486-3) contains supplementary material, which is available to authorized users.
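The over-representation question described in the Background of this entry can be illustrated with a one-sided hypergeometric test, which is a common choice for enrichment analysis. The abstract does not say which statistical test BiNChE uses, so this is only a minimal sketch under that assumption; the function name and the numbers in the usage line are hypothetical.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """Upper-tail hypergeometric probability P(X >= hits).

    population: all entities in the background set
    annotated:  background entities annotated with the ontology class
    sample:     entities in the submitted set
    hits:       submitted entities annotated with the ontology class
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of seeing `hits` or more annotated entities by chance.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 1000 molecules, 50 in the class, 20 submitted, 6 hits.
    print(enrichment_p_value(1000, 50, 20, 6))
```

A small p-value suggests the class is over-represented in the submitted set; in practice the p-values across all tested classes would also need a multiple-testing correction.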

2.

Hidden localization motifs: naturally occurring peroxisomal targeting signals in non-peroxisomal proteins

Neuberger G Kunze M Eisenhaber F Berger J Hartig A Brocard C 《Genome biology》2004,5(12):R97

Background

Can sequence segments coding for subcellular targeting or for posttranslational modifications occur in proteins that are not substrates in either of these processes? Although considerable effort has been invested in achieving low false-positive prediction rates, even accurate sequence-analysis tools for the recognition of these motifs generate a small but noticeable number of protein hits that lack the appropriate biological context but cannot be rationalized as false positives.

Results

We show that the carboxyl termini of a set of definitely non-peroxisomal proteins with predicted peroxisomal targeting signals interact with the peroxisomal matrix protein receptor peroxin 5 (PEX5) in a yeast two-hybrid test. Moreover, we show that examples of these proteins - chicken lysozyme, human tyrosinase and the yeast mitochondrial ribosomal protein L2 (encoded by MRP7) - are imported into peroxisomes in vivo if their original sorting signals are disguised. We also show that even prokaryotic proteins can contain peroxisomal targeting sequences.

Conclusions

Thus, functional localization signals can evolve in unrelated protein sequences as a result of neutral mutations, and subcellular targeting is hierarchically organized, with signal accessibility playing a decisive role. The occurrence of silent functional motifs in unrelated proteins is important for the development of sequence-based function prediction tools and the interpretation of their results. Silent functional signals have the potential to acquire importance in future evolutionary scenarios and in pathological conditions.

3.

IPred - integrating ab initio and evidence based gene predictions to improve prediction accuracy

Franziska Zickmann Bernhard Y Renard 《BMC genomics》2015,16(1)

Background

Gene prediction is a challenging but crucial part in most genome analysis pipelines. Various methods have evolved that predict genes ab initio on reference sequences or evidence based with the help of additional information, such as RNA-Seq reads or EST libraries. However, none of these strategies is bias-free and one method alone does not necessarily provide a complete set of accurate predictions.

Results

We present IPred (Integrative gene Prediction), a method to integrate ab initio and evidence based gene identifications to complement the advantages of different prediction strategies. IPred builds on the output of gene finders and generates a new combined set of gene identifications, representing the integrated evidence of the single method predictions.

Conclusion

We evaluate IPred in simulations and real data experiments on Escherichia coli and human data. We show that IPred improves the prediction accuracy in comparison to single method predictions and to existing methods for prediction combination.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1315-9) contains supplementary material, which is available to authorized users.

4.

QServer: a biclustering server for prediction and assessment of co-expressed gene clusters

Zhou F Ma Q Li G Xu Y 《PloS one》2012,7(3):e32660


5.

Categorizer: a tool to categorize genes into user-defined biological groups based on semantic similarity

Dokyun Na Hyungbin Son Jörg Gsponer 《BMC genomics》2014,15(1)

Background

Commonalities between large sets of genes obtained from high-throughput experiments are often identified by searching for enrichments of genes with the same Gene Ontology (GO) annotations. The GO analysis tools used for these enrichment analyses assume that GO terms are independent and that the semantic distances between all parent–child terms are identical, which is not true in a biological sense. In addition, these tools output lists of GO terms that are often redundant or too specific, which makes them difficult to interpret in the context of the biological question investigated by the user. Therefore, there is a demand for a robust and reliable method for gene categorization and enrichment analysis.

Results

We have developed Categorizer, a tool that classifies genes into user-defined groups (categories) and calculates p-values for the enrichment of the categories. Categorizer identifies the biologically best-fit category for each gene by taking advantage of a specialized semantic similarity measure for GO terms. We demonstrate that Categorizer provides improved categorization and enrichment results of genetic modifiers of Huntington’s disease compared to a classical GO Slim-based approach or categorizations using other semantic similarity measures.

Conclusion

Categorizer enables more accurate categorizations of genes than currently available methods. This new tool will help experimental and computational biologists analyze genomic and proteomic data according to their specific needs in a more reliable manner.

6.

Consensus pathways implicated in prognosis of colorectal cancer identified through systematic enrichment analysis of gene expression profiling studies

Lascorz J Chen B Hemminki K Försti A 《PloS one》2011,6(4):e18867

Background

A large number of gene expression profiling (GEP) studies on prognosis of colorectal cancer (CRC) have been performed, but no reliable gene signature for prediction of CRC prognosis has been found. Bioinformatic enrichment tools are a powerful approach to identify biological processes in high-throughput data analysis.

Principal Findings

We have for the first time collected the results from the 23 independent GEP studies on CRC prognosis published so far. In these 23 studies, 1475 unique, mapped genes were identified, of which 124 (8.4%) were reported in at least two studies, with 54 of them showing a consistent direction of expression change across the individual studies. Using these data, we attempted to overcome the lack of reproducibility observed in the genes reported in individual GEP studies by carrying out a pathway-based enrichment analysis. We used up to ten tools for overrepresentation analysis of Gene Ontology (GO) categories or Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways in each of the three gene lists (1475, 124 and 54 genes). This strategy, based on testing multiple tools, allowed us to identify the oxidative phosphorylation chain and the extracellular matrix receptor interaction categories, as well as a general category related to cell proliferation and apoptosis, as the only significantly and consistently overrepresented pathways in the three gene lists, which were reported by several enrichment tools.

Conclusions

Our pathway-based enrichment analysis of 23 independent gene expression profiling studies on prognosis of CRC identified significantly and consistently overrepresented prognostic categories for CRC. These overrepresented categories are clearly functionally related to cancer progression and deserve further investigation.
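The step of finding genes "reported in at least two studies" can be sketched as a simple count over per-study gene lists. This is an illustration only, not the authors' pipeline; the function name and the placeholder gene labels are hypothetical, and gene identifiers are assumed to be already mapped to a common namespace.

```python
from collections import Counter
from typing import Iterable, Set


def recurrent_genes(studies: Iterable[Iterable[str]], min_studies: int = 2) -> Set[str]:
    """Return genes that appear in at least `min_studies` of the given gene lists."""
    # A gene is counted once per study, even if a study lists it several times.
    counts = Counter(gene for genes in studies for gene in set(genes))
    return {gene for gene, n in counts.items() if n >= min_studies}


if __name__ == "__main__":
    # Placeholder labels, not real study data.
    studies = [["gene_a", "gene_b"], ["gene_b", "gene_c"], ["gene_c", "gene_b", "gene_b"]]
    print(sorted(recurrent_genes(studies)))
```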

7.

A shortcut for multiple testing on the directed acyclic graph of gene ontology

Garrett Saunders John R Stevens S Clay Isom 《BMC bioinformatics》2014,15(1)

Background

Gene set testing has become an important analysis technique in high throughput microarray and next generation sequencing studies for uncovering patterns of differential expression of various biological processes. Often, the large number of gene sets that are tested simultaneously requires some sort of multiplicity correction. This work provides a substantial computational improvement to an existing familywise error rate controlling multiplicity approach (the Focus Level method) for gene set testing in high throughput microarray and next generation sequencing studies using Gene Ontology graphs, which we call the Short Focus Level.

Results

The Short Focus Level procedure, which performs a shortcut of the full Focus Level procedure, is achieved by extending the reach of graphical weighted Bonferroni testing to closed testing situations where restricted hypotheses are present, such as in the Gene Ontology graphs. The Short Focus Level multiplicity adjustment can perform the full top-down approach of the original Focus Level procedure, overcoming a significant disadvantage of the otherwise powerful Focus Level multiplicity adjustment. The computational and power differences of the Short Focus Level procedure as compared to the original Focus Level procedure are demonstrated both through simulation and using real data.

Conclusions

The Short Focus Level procedure shows a significant increase in computation speed over the original Focus Level procedure (as much as ∼15,000 times faster). The Short Focus Level should be used in place of the Focus Level procedure whenever the logical assumptions of the Gene Ontology graph structure are appropriate for the study objectives and when either no a priori focus level of interest can be specified or the focus level is selected at a higher level of the graph, where the Focus Level procedure is computationally intractable.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0349-3) contains supplementary material, which is available to authorized users.

8.

Analysis of Altered MicroRNA Expression Profiles in Proximal Renal Tubular Cells in Response to Calcium Oxalate Monohydrate Crystal Adhesion: Implications for Kidney Stone Disease

Bohan Wang Bolin Wu Jun Liu Weimin Yao Ding Xia Lu Li Zhiqiang Chen Zhangqun Ye Xiao Yu 《PloS one》2014,9(7)


9.

OrthoList: a compendium of C. elegans genes with human orthologs

Shaye DD Greenwald I 《PloS one》2011,6(5):e20085


10.

De novo assembly of the Carcinus maenas transcriptome and characterization of innate immune system pathways

Bas Verbruggen Lisa K. Bickley Eduarda M. Santos Charles R. Tyler Grant D. Stentiford Kelly S. Bateman Ronny van Aerle 《BMC genomics》2015,16(1)


11.

PPI Finder: A Mining Tool for Human Protein-Protein Interactions

Min He Yi Wang Wei Li 《PloS one》2009,4(2)

Background

The exponential increase of published biomedical literature prompts the use of text mining tools to manage the information overload automatically. One of the most common applications is to mine protein-protein interactions (PPIs) from PubMed abstracts. Currently, most tools for mining PPIs from the literature use co-occurrence-based or rule-based approaches. Hybrid methods (frame-based approaches) that combine these two approaches may perform better in predicting PPIs. However, the PPIs predicted by these methods are rarely evaluated against known PPI databases and co-occurring terms in the Gene Ontology (GO) database.

Methodology/Principal Findings

Here we developed a web-based tool, PPI Finder, to mine human PPIs from PubMed abstracts based on their co-occurrences and interaction words, followed by evidence in human PPI databases and shared terms in the GO database. Only 28% of the co-occurring pairs in PubMed abstracts appeared in any of the commonly used human PPI databases (HPRD, BioGRID and BIND). On the other hand, of the known PPIs in HPRD, 69% showed co-occurrences in the literature, and 65% shared GO terms.

Conclusions

PPI Finder provides a useful tool for biologists to uncover potential novel PPIs. It is freely accessible at http://liweilab.genetics.ac.cn/tm/.
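The co-occurrence idea in this entry, followed by a check against a known-interaction set, can be sketched as below. This is not PPI Finder's implementation: whitespace tokenization, the function names and the example inputs are all simplifying assumptions, and real text mining would need proper entity recognition and synonym handling.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]


def cooccurring_pairs(abstracts: Iterable[str], protein_names: Set[str]) -> Counter:
    """Count how many abstracts mention both members of each protein pair."""
    counts: Counter = Counter()
    for text in abstracts:
        tokens = set(text.split())  # naive tokenization (assumption)
        present = sorted(name for name in protein_names if name in tokens)
        counts.update(combinations(present, 2))  # pairs are stored in sorted order
    return counts


def fraction_known(pairs: Iterable[Pair], known: Set[Pair]) -> float:
    """Fraction of candidate pairs that also appear in a known-interaction set."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(1 for pair in pairs if pair in known) / len(pairs)


if __name__ == "__main__":
    abstracts = ["TP53 binds MDM2", "MDM2 and TP53 interact", "BRCA1 alone"]
    counts = cooccurring_pairs(abstracts, {"TP53", "MDM2", "BRCA1"})
    print(counts, fraction_known(counts, {("MDM2", "TP53")}))
```

The second function corresponds to the kind of comparison the abstract reports (the share of co-occurring pairs found in existing PPI databases), here reduced to a set lookup.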

12.

Using Gene Ontology to describe the role of the neurexin-neuroligin-SHANK complex in human, mouse and rat and its relevance to autism

Sejal Patel Paola Roncaglia Ruth C. Lovering 《BMC bioinformatics》2015,16(1)


13.

Predicted protein-protein interactions in the moss Physcomitrella patens: a new bioinformatic resource

Scott Schuette Brian Piatkowski Aaron Corley Daniel Lang Matt Geisler 《BMC bioinformatics》2015,16(1)

Background

Physcomitrella patens, a haploid-dominant plant, is fast becoming a useful molecular genetics and bioinformatics tool due to its key phylogenetic position as a bryophyte in the post-genomic era. Genome sequences from select reference species were compared bioinformatically to Physcomitrella patens using reciprocal BLAST searches with the InParanoid software package. A reference protein interaction database, assembled in MySQL by compiling the BioGRID, BIND, DIP, and IntAct databases, was queried for interactions in which both partners have moss orthologs. This method has been used to successfully predict interactions for a number of angiosperm plants.

Results

The first predicted protein-protein interactome for a bryophyte based on the interolog method contains 67,740 unique interactions from 5,695 different Physcomitrella patens proteins. Most conserved interactions among proteins were those associated with metabolic processes. Over-represented Gene Ontology categories are reported here.

Conclusion

Addition of moss, a plant representative 200 million years diverged from angiosperms, to interactomic research greatly expands the possibility of conducting comparative analyses, giving tremendous insight into network evolution of land plants. This work helps demonstrate the utility of “guilt-by-association” models for predicting protein interactions, providing provisional roadmaps that can be explored using experimental approaches. Included with this dataset is a method for characterizing subnetworks and investigating specific processes, such as the Calvin-Benson-Bassham cycle.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0524-1) contains supplementary material, which is available to authorized users.
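The interolog approach described in this entry (an interaction is predicted when both partners of a known reference interaction have orthologs in the target species) can be sketched as follows. The data structures, function name and identifiers are hypothetical; the authors' actual workflow used InParanoid and a MySQL database, which this sketch does not reproduce.

```python
from typing import Dict, Iterable, List, Set, Tuple

Pair = Tuple[str, str]


def predict_interologs(
    reference_interactions: Iterable[Pair],
    orthologs: Dict[str, List[str]],
) -> Set[Pair]:
    """Transfer reference interactions to a target species via an ortholog map.

    orthologs maps a reference protein ID to its target-species orthologs.
    An interaction is transferred only if BOTH partners have an ortholog.
    """
    predicted: Set[Pair] = set()
    for a, b in reference_interactions:
        for x in orthologs.get(a, ()):
            for y in orthologs.get(b, ()):
                # Sort so that (x, y) and (y, x) count as the same interaction.
                predicted.add(tuple(sorted((x, y))))
    return predicted


if __name__ == "__main__":
    # Placeholder identifiers, not real database entries.
    reference = [("ref_A", "ref_B"), ("ref_A", "ref_C")]
    ortholog_map = {"ref_A": ["moss_1"], "ref_B": ["moss_2", "moss_3"]}
    print(sorted(predict_interologs(reference, ortholog_map)))
```

In the usage example, the second reference interaction yields no prediction because `ref_C` has no ortholog in the map.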

14.

Sequence signatures extracted from proximal promoters can be used to predict distal enhancers

Leila Taher Robin P Smith Mee J Kim Nadav Ahituv Ivan Ovcharenko 《Genome biology》2013,14(10):R117


15.

The transcriptome of Lutzomyia longipalpis (Diptera: Psychodidae) male reproductive organs

Azevedo RV Dias DB Bretãs JA Mazzoni CJ Souza NA Albano RM Wagner G Davila AM Peixoto AA 《PloS one》2012,7(4):e34495


16.

Partitioning of Minimotifs Based on Function with Improved Prediction Accuracy

Sanguthevar Rajasekaran Tian Mi Jerlin Camilus Merlin Aaron Oommen Patrick Gradie Martin R. Schiller 《PloS one》2010,5(8)

Background

Minimotifs are short contiguous peptide sequences in proteins that are known to have a function in at least one other protein. One of the principal limitations in minimotif prediction is that false positives limit the usefulness of this approach. As a step toward resolving this problem we have built, implemented, and tested a new data-driven algorithm that reduces false-positive predictions.

Methodology/Principal Findings

Certain domains and minimotifs are known to be strongly associated with a known cellular process or molecular function. Therefore, we hypothesized that by restricting minimotif predictions to those where the minimotif-containing protein and target protein have a related cellular or molecular function, the prediction is more likely to be accurate. This filter was implemented in Minimotif Miner using function annotations from the Gene Ontology. We have also combined two filters that are based on entirely different principles, and this combined filter has better predictive performance than the individual components.

Conclusions/Significance

Testing these functional filters on known and random minimotifs has revealed that they are capable of separating true motifs from false positives. In particular, for the cellular function filter, the percentage of known minimotifs that are not removed by the filter is ∼4.6 times that of random minimotifs. For the molecular function filter this ratio is ∼2.9. These results, together with the comparison with the published frequency score filter, strongly suggest that the new filters differentiate true motifs from random background with good confidence. A combination of the function filters and the frequency score filter performs better than these two individual filters.
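The function filter and the retention ratio reported in this entry can be sketched as below. Treating "related function" as "shares at least one annotation term" is a simplifying assumption, not the rule used in Minimotif Miner, and the function names, term labels and counts are hypothetical.

```python
from typing import Iterable


def passes_function_filter(source_terms: Iterable[str], target_terms: Iterable[str]) -> bool:
    """Keep a prediction only if both proteins share at least one annotation term."""
    return bool(set(source_terms) & set(target_terms))


def retention_ratio(known_kept: int, known_total: int, random_kept: int, random_total: int) -> float:
    """Fraction of known minimotifs kept, divided by the fraction of random ones kept."""
    return (known_kept / known_total) / (random_kept / random_total)


if __name__ == "__main__":
    # Placeholder annotation labels and counts, not data from the paper.
    print(passes_function_filter({"term_a", "term_b"}, {"term_b"}))
    print(retention_ratio(46, 100, 10, 100))
```

A ratio well above 1 indicates that the filter removes random minimotifs more often than known ones, which is the sense in which the abstract reports values of about 4.6 and 2.9.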

17.

Reproducibility enhancement and differential expression of non predefined functional gene sets in human genome

Samoel RM da Silva Gabriel C Perrone João M Dinis Rita MC de Almeida 《BMC genomics》2014,15(1)


18.

An Integrative Approach to Inferring Gene Regulatory Module Networks

Michael Baitaluk Sergey Kozhenkov Julia Ponomarenko 《PloS one》2012,7(12)


19.

Molecular tools to support metabolic and immune function research in the Guinea Fowl (Numida meleagris)

Carl E Darris James E Tyus Gary Kelley Alexander J Ropelewski Hugh B Nicholas Jr Xiaofei Wang Samuel Nahashon 《BMC genomics》2015,16(1)


20.

FastBLAST: homology relationships for millions of proteins

Price MN Dehal PS Arkin AP 《PloS one》2008,3(10):e3589

Background

All-versus-all BLAST, which searches for homologous pairs of sequences in a database of proteins, is used to identify potential orthologs, to find new protein families, and to provide rapid access to these homology relationships. As DNA sequencing accelerates and data sets grow, all-versus-all BLAST has become computationally demanding.

Methodology/Principal Findings

We present FastBLAST, a heuristic replacement for all-versus-all BLAST that relies on alignments of proteins to known families, obtained from tools such as PSI-BLAST and HMMer. FastBLAST avoids most of the work of all-versus-all BLAST by taking advantage of these alignments and by clustering similar sequences. FastBLAST runs in two stages: the first stage identifies additional families and aligns them, and the second stage quickly identifies the homologs of a query sequence, based on the alignments of the families, before generating pairwise alignments. On 6.53 million proteins from the non-redundant Genbank database (“NR”), FastBLAST identifies new families 25 times faster than all-versus-all BLAST. Once the first stage is completed, FastBLAST identifies homologs for the average query in less than 5 seconds (8.6 times faster than BLAST) and gives nearly identical results. For hits above 70 bits, FastBLAST identifies 98% of the top 3,250 hits per query.

Conclusions/Significance

FastBLAST enables research groups that do not have supercomputers to analyze large protein sequence data sets. FastBLAST is open source software and is available at http://microbesonline.org/fastblast.