
1.

Computational evaluation of TIS annotation for prokaryotic genomes

Gang-Qing Hu Xiaobin Zheng Li-Ning Ju Huaiqiu Zhu Zhen-Su She 《BMC bioinformatics》2008,9(1):160

Background

Accurate annotation of translation initiation sites (TISs) is essential for understanding the translation initiation mechanism. However, the reliability of TIS annotation in widely used databases such as RefSeq is uncertain due to the lack of experimental benchmarks.

2.

IntelliGO: a new vector-based semantic similarity measure including annotation origin

Sidahmed Benabderrahmane Malika Smail-Tabbone Olivier Poch Amedeo Napoli Marie-Dominique Devignes 《BMC bioinformatics》2010,11(1):588

Background

The Gene Ontology (GO) is a well-known controlled vocabulary describing the biological process, molecular function and cellular component aspects of gene annotation. It has become a widely used knowledge source in bioinformatics for annotating genes and measuring their semantic similarity. These measures generally involve the GO graph structure, the information content of GO aspects, or a combination of both. However, only a few of the semantic similarity measures described so far can handle GO annotations differently according to their origin (i.e. their evidence codes).
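To make the idea of weighting annotations by their origin concrete, here is a minimal sketch, not the IntelliGO measure itself. It assumes each gene is a vector over GO terms, that each annotation is weighted by its evidence code, and that two genes are compared with a plain cosine. The weight values and the helper names `annotation_vector` and `cosine_similarity` are hypothetical, and the sketch ignores the GO graph structure by treating terms as independent dimensions.

```python
import math

# Hypothetical evidence-code weights, chosen only for illustration;
# they are not taken from the paper.
EVIDENCE_WEIGHTS = {"EXP": 1.0, "IDA": 1.0, "ISS": 0.6, "IEA": 0.3}
DEFAULT_WEIGHT = 0.5  # assumed fallback for codes not listed above


def annotation_vector(annotations):
    """Turn {GO term: evidence code} into {GO term: weight}."""
    return {
        term: EVIDENCE_WEIGHTS.get(code, DEFAULT_WEIGHT)
        for term, code in annotations.items()
    }


def cosine_similarity(u, v):
    """Cosine of two sparse vectors stored as dicts; 0.0 if either is empty."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Two genes sharing one term, annotated with different evidence codes.
gene_a = annotation_vector({"GO:0006412": "IDA", "GO:0003735": "IEA"})
gene_b = annotation_vector({"GO:0006412": "ISS", "GO:0005840": "IEA"})
similarity = cosine_similarity(gene_a, gene_b)
```

Because electronically inferred annotations (IEA) receive a low weight here, the shared, better-supported term dominates the score. A measure that also used the GO graph would additionally give credit to related but non-identical terms.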

3.

BioSAVE: Display of scored annotation within a sequence context

Richard F Pollock Boris Adryan 《BMC bioinformatics》2008,9(1):157

Background

Visualization of sequence annotation is a common feature in many bioinformatics tools. For many applications it is desirable to restrict the display of such annotation according to a score cutoff, as biological interpretation can be difficult in the presence of the entire data. Unfortunately, many visualisation solutions are somewhat static in the way they handle such score cutoffs.
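Restricting displayed annotation by a score cutoff can be sketched as a simple filter that is re-applied whenever the cutoff changes. This is an illustrative sketch only, not BioSAVE's implementation; the `Feature` record, its fields, and `visible_features` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """A scored annotation on a sequence (hypothetical record layout)."""
    name: str
    start: int
    end: int
    score: float


def visible_features(features, cutoff):
    """Return the features whose score is at or above the cutoff."""
    return [f for f in features if f.score >= cutoff]


features = [
    Feature("site_a", 10, 18, 0.92),
    Feature("site_b", 40, 48, 0.55),
    Feature("site_c", 75, 83, 0.71),
]

# Changing the cutoff changes what is shown; the data stay the same.
strict = visible_features(features, cutoff=0.9)
relaxed = visible_features(features, cutoff=0.6)
```

Keeping the full feature list and filtering at display time is what lets a viewer adjust the cutoff interactively rather than fixing it when the data are loaded.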

4.

PoGO: Prediction of Gene Ontology terms for fungal proteins

Jaehee Jung Gangman Yi Serenella A Sukno Michael R Thon 《BMC bioinformatics》2010,11(1):215

Background

Automated protein function prediction methods are the only practical approach for assigning functions to genes obtained from model organisms. Many of the previously reported function annotation methods are of limited utility for fungal protein annotation. They are often trained only to one species, are not available for high-volume data processing, or require the use of data derived by experiments such as microarray analysis. To meet the increasing need for high throughput, automated annotation of fungal genomes, we have developed a tool for annotating fungal protein sequences with terms from the Gene Ontology.

5.

Exploring inconsistencies in genome-wide protein function annotations: a machine learning approach

Carson Andorf Drena Dobbs Vasant Honavar 《BMC bioinformatics》2007,8(1):284

Background

Incorrectly annotated sequence data are becoming more commonplace as databases increasingly rely on automated techniques for annotation. Hence, there is an urgent need for computational methods for checking consistency of such annotations against independent sources of evidence and detecting potential annotation errors. We show how a machine learning approach designed to automatically predict a protein's Gene Ontology (GO) functional class can be employed to identify potential gene annotation errors.

6.

Applying negative rule mining to improve genome annotation

Irena I Artamonova Goar Frishman Dmitrij Frishman 《BMC bioinformatics》2007,8(1):261

Background

Unsupervised annotation of proteins by software pipelines suffers from very high error rates. Spurious functional assignments are usually caused by unwarranted homology-based transfer of information from existing database entries to the new target sequences. We have previously demonstrated that data mining in large sequence annotation databanks can help identify annotation items that are strongly associated with each other, and that exceptions from strong positive association rules often point to potential annotation errors. Here we investigate the applicability of negative association rule mining to revealing erroneously assigned annotation items.

7.

GeneLibrarian: an effective gene-information summarization and visualization system

Jung-Hsien Chiang Jyh-Wei Shin Heng-Hui Liu Chong-Liang Chin 《BMC bioinformatics》2006,7(1):392

Background

Abundant information about gene products is stored in online searchable databases such as annotation or literature. To efficiently obtain and digest such information, there is a pressing need for automated information-summarization and functional-similarity clustering of genes.

8.

Transcript-based redefinition of grouped oligonucleotide probe sets using AceView: High-resolution annotation for microarrays

Jun Lu Joseph C Lee Marc L Salit Margaret C Cam 《BMC bioinformatics》2007,8(1):108

Background

Extracting biological information from high-density Affymetrix arrays is a multi-step process that begins with the accurate annotation of microarray probes. Shortfalls in the original Affymetrix probe annotation have been described; however, few studies have provided rigorous solutions for routine data analysis.

9.

A fast structural multiple alignment method for long RNA sequences

Yasuo Tabei Hisanori Kiryu Taishin Kin Kiyoshi Asai 《BMC bioinformatics》2008,9(1):1-17

Background

Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data.

Results

In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products from article abstracts in PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns" and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision of 78% at a recall of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general.
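For readers who want a single figure combining the reported precision and recall, the F1 score (their harmonic mean) can be computed as below. The F1 value is derived here from the two reported numbers and is not stated in the abstract; `f1_score` is a name introduced for this sketch.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for GEANN (78% and 61%).
geann_f1 = f1_score(0.78, 0.61)
print(f"{geann_f1:.3f}")  # prints 0.685
```

The harmonic mean sits closer to the lower of the two values, so the relatively low recall pulls the combined score below the simple average of 0.695.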

Conclusion

GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme than "exact matching", with the advantage of locating approximate pattern occurrences with similar semantics. The relatively low recall of our pattern-based approach may be improved either by employing a probabilistic annotation framework based on the annotation neighbourhoods in textual data or, alternatively, by lowering the statistical enrichment threshold for applications that place more value on recall.

10.

IDconverter and IDClight: Conversion and annotation of gene and protein IDs

Andreu Alibés Patricio Yankilevich Andrés Cañada Ramón Díaz-Uriarte 《BMC bioinformatics》2007,8(1):9

Background

Researchers involved in the annotation of large numbers of gene, clone or protein identifiers are usually required to perform a one-by-one conversion for each identifier. When the field of research is one such as microarray experiments, this number may be around 30,000.

11.

Accessing the SEED Genome Databases via Web Services API: Tools for Programmers

Terry Disz Sajia Akhter Daniel Cuevas Robert Olson Ross Overbeek Veronika Vonstein Rick Stevens Robert A Edwards 《BMC bioinformatics》2010,11(1):319

Background

The SEED integrates many publicly available genome sequences into a single resource. The database contains accurate and up-to-date annotations based on the subsystems concept that leverages clustering between genomes and other clues to accurately and efficiently annotate microbial genomes. The backend is used as the foundation for many genome annotation tools, such as the Rapid Annotation using Subsystems Technology (RAST) server for whole genome annotation, the metagenomics RAST server for random community genome annotations, and the annotation clearinghouse for exchanging annotations from different resources. In addition to a web user interface, the SEED also provides a Web services-based API for programmatic access to the data in the SEED, allowing the development of third-party tools and mash-ups.

12.

EST Express: PHP/MySQL based automated annotation of ESTs from expression libraries

Robin P Smith William J Buchser Marcus B Lemmon Jose R Pardinas John L Bixby Vance P Lemmon 《BMC bioinformatics》2008,9(1):186

Background

Several biological techniques result in the acquisition of functional sets of cDNAs that must be sequenced and analyzed. The emergence of redundant databases such as UniGene and centralized annotation engines such as Entrez Gene has allowed the development of software that can analyze a great number of sequences in a matter of seconds.

13.

Integration of metabolic databases for the reconstruction of genome-scale metabolic networks

Karin Radrich Yoshimasa Tsuruoka Paul Dobson Albert Gevorgyan Neil Swainston Gino Baart Jean-Marc Schwartz 《BMC systems biology》2010,4(1):114

Background

Genome-scale metabolic reconstructions have been recognised as a valuable tool for a variety of applications ranging from metabolic engineering to evolutionary studies. However, the reconstruction of such networks remains an arduous process requiring a high level of human intervention. This process is further complicated by occurrences of missing or conflicting information and the absence of common annotation standards between different data sources.

14.

GOtcha: a new method for prediction of protein function assessed by the annotation of seven genomes

David MA Martin Matthew Berriman Geoffrey J Barton 《BMC bioinformatics》2004,5(1):178

Background

The function of a novel gene product is typically predicted by transitive assignment of annotation from similar sequences. We describe a novel method, GOtcha, for predicting gene product function by annotation with Gene Ontology (GO) terms. GOtcha predicts GO term associations with term-specific probability (P-score) measures of confidence. Term-specific probabilities are a novel feature of GOtcha and allow the identification of conflicts or uncertainty in annotation.

15.

Splign: algorithms for computing spliced alignments with identification of paralogs

Yuri Kapustin Alexander Souvorov Tatiana Tatusova David Lipman 《Biology direct》2008,3(1):20

Background

The computation of accurate alignments of cDNA sequences against a genome is at the foundation of modern genome annotation pipelines. Several factors such as presence of paralogs, small exons, non-consensus splice signals, sequencing errors and polymorphic sites pose recognized difficulties to existing spliced alignment algorithms.

16.

Functional discrimination of membrane proteins using machine learning techniques

M Michael Gromiha Yukimitsu Yabuki 《BMC bioinformatics》2008,9(1):135

Background

Discriminating membrane proteins based on their functions is an important task in genome annotation. In this work, we have analyzed the characteristic features of amino acid residues in membrane proteins that perform major functions, such as channels/pores, electrochemical potential-driven transporters and primary active transporters.

17.

BibGlimpse: The case for a light-weight reprint manager in distributed literature research

Thomas Tüchler Golda Velez Alexandra Graf David P Kreil 《BMC bioinformatics》2008,9(1):406

Background

While text-mining and distributed annotation systems both aim at capturing knowledge and presenting it in a standardized form, there have been few attempts to investigate potential synergies between these two fields. For instance, distributed annotation would be very well suited for providing topic focussed, expert knowledge enriched text corpora. A key limitation for this approach is the availability of literature annotation systems that can be routinely used by groups of collaborating researchers on a day to day basis, not distracting from the main focus of their work.

18.

The comparative analysis of statistics, based on the likelihood ratio criterion, in the automated annotation problem

Andrey M Leontovich Konstantin Y Tokmachev Hans C van Houwelingen 《BMC bioinformatics》2008,9(1):31

Background

This paper discusses the problem of automated annotation. It is a continuation of the previous work on the A⁴-algorithm (Adaptive algorithm of automated annotation) developed by Leontovich and others.

19.

Genepi: a blackboard framework for genome annotation

Stéphane Descorps-Declère Danielle Ziébelin François Rechenmann Alain Viari 《BMC bioinformatics》2006,7(1):450-13

Background

Genome annotation can be viewed as an incremental, cooperative, data-driven, knowledge-based process that involves multiple methods to predict gene locations and structures. This process might have to be executed more than once and might be subjected to several revisions as the biological (new data) or methodological (new methods) knowledge evolves. In this context, although many annotation platforms already exist, there is still a strong need for computer systems that handle not only the primary annotation, but also the update and advance of the associated knowledge. In this paper, we propose to adopt a blackboard architecture for designing such a system.

20.

An improved method for scoring protein-protein interactions using semantic similarity within the gene ontology

Shobhit Jain Gary D Bader 《BMC bioinformatics》2010,11(1):562

Background

Semantic similarity measures are useful to assess the physiological relevance of protein-protein interactions (PPIs). They quantify similarity between proteins based on their function using annotation systems like the Gene Ontology (GO). Proteins that interact in the cell are likely to be in similar locations or involved in similar biological processes compared to proteins that do not interact. Thus, the more semantically similar the gene function annotations are among the interacting proteins, the more likely the interaction is physiologically relevant. However, most semantic similarity measures used for PPI confidence assessment do not consider the unequal depth of term hierarchies in different classes of cellular location, molecular function, and biological process ontologies of GO and thus may over- or under-estimate similarity.