1.

GOtcha: a new method for prediction of protein function assessed by the annotation of seven genomes

David MA Martin Matthew Berriman Geoffrey J Barton 《BMC bioinformatics》2004,5(1):178

Background

The function of a novel gene product is typically predicted by transitive assignment of annotation from similar sequences. We describe a novel method, GOtcha, for predicting gene product function by annotation with Gene Ontology (GO) terms. GOtcha predicts GO term associations with term-specific probability (P-score) measures of confidence. Term-specific probabilities are a novel feature of GOtcha and allow the identification of conflicts or uncertainty in annotation.

2.

A new measure for functional similarity of gene products based on Gene Ontology

Andreas Schlicker Francisco S Domingues Jörg Rahnenführer Thomas Lengauer 《BMC bioinformatics》2006,7(1):302-16

Background

Gene Ontology (GO) is a standard vocabulary of functional terms and allows for coherent annotation of gene products. These annotations provide a basis for new methods that compare gene products regarding their molecular function and biological role.

3.

Exploring inconsistencies in genome-wide protein function annotations: a machine learning approach

Carson Andorf Drena Dobbs Vasant Honavar 《BMC bioinformatics》2007,8(1):284

Background

Incorrectly annotated sequence data are becoming more commonplace as databases increasingly rely on automated techniques for annotation. Hence, there is an urgent need for computational methods for checking consistency of such annotations against independent sources of evidence and detecting potential annotation errors. We show how a machine learning approach designed to automatically predict a protein's Gene Ontology (GO) functional class can be employed to identify potential gene annotation errors.

4.

Identifying overrepresented concepts in gene lists from literature: a statistical approach based on Poisson mixture model

Xin He Moushumi Sen Sarma Xu Ling Brant Chee Chengxiang Zhai Bruce Schatz 《BMC bioinformatics》2010,11(1):272

Background

Large-scale genomic studies often identify large gene lists, for example, the genes sharing the same expression patterns. The interpretation of these gene lists is generally achieved by extracting concepts overrepresented in the gene lists. This analysis often depends on manual annotation of genes based on controlled vocabularies, in particular, Gene Ontology (GO). However, the annotation of genes is a labor-intensive process; and the vocabularies are generally incomplete, leaving some important biological domains inadequately covered.

5.

The maize ALDH protein superfamily: linking structural features to functional specificities

Jose C Jimenez-Lopez Emma W Gachomo Manfredo J Seufferheld Simeon O Kotchoni 《BMC structural biology》2010,10(1):43

Background

The completion of maize genome sequencing has resulted in the identification of a large number of uncharacterized genes. Gene annotation and functional characterization of gene products are important to uncover novel protein functionality.

6.

The Neural/Immune Gene Ontology: clipping the Gene Ontology for neurological and immunological systems

Nophar Geifman Alon Monsonego Eitan Rubin 《BMC bioinformatics》2010,11(1):458

Background

The Gene Ontology (GO) is used to describe genes and gene products from many organisms. When used for functional annotation of microarray data, GO is often slimmed by editing so that only higher level terms remain. This practice is designed to improve the summarizing of experimental results by grouping high level terms and the statistical power of GO term enrichment analysis.

7.

Context-driven discovery of gene cassettes in mobile integrons using a computational grammar

Guy Tsafnat Enrico Coiera Sally R Partridge Jaron Schaeffer Jon R Iredell 《BMC bioinformatics》2009,10(1):281

8.

Web-based analysis of the mouse transcriptome using Genevestigator

Oliver Laule Matthias Hirsch-Hoffmann Tomas Hruz Wilhelm Gruissem Philip Zimmermann 《BMC bioinformatics》2006,7(1):311-8

Background

Gene function analysis often requires a complex and laborious sequence of laboratory and computer-based experiments. Choosing an effective experimental design generally results from hypotheses derived from prior knowledge or experimentation. Knowledge obtained from meta-analyzing compendia of expression data with annotation libraries can provide significant clues in understanding gene and network function, resulting in better hypotheses that can be tested in the laboratory.

9.

Eval: A software package for analysis of genome annotations

Evan Keibler Michael R Brent 《BMC bioinformatics》2003,4(1):50

10.

Complete reannotation of the Arabidopsis genome: methods, tools, protocols and the final release

Brian J Haas Jennifer R Wortman Catherine M Ronning Linda I Hannick Roger K Smith Jr Rama Maiti Agnes P Chan Chunhui Yu Maryam Farzad Dongying Wu Owen White Christopher D Town 《BMC biology》2005,3(1):1-19

Background

Since the initial publication of its complete genome sequence, Arabidopsis thaliana has become more important than ever as a model for plant research. However, the initial genome annotation was submitted by multiple centers using inconsistent methods, making the data difficult to use for many applications.

Results

Over the course of three years, TIGR has completed its effort to standardize the structural and functional annotation of the Arabidopsis genome. Using both manual and automated methods, Arabidopsis gene structures were refined and gene products were renamed and assigned to Gene Ontology categories. We present an overview of the methods employed, tools developed, and protocols followed, summarizing the contents of each data release with special emphasis on our final annotation release (version 5).

Conclusion

Over the entire period, several thousand new genes and pseudogenes were added to the annotation. Approximately one third of the originally annotated gene models were significantly refined yielding improved gene structure annotations, and every protein-coding gene was manually inspected and classified using Gene Ontology terms.

11.

SynBlast: Assisting the analysis of conserved synteny information

Jörg Lehmann Peter F Stadler 《BMC bioinformatics》2008,9(1):351

Motivation

In the last years more than 20 vertebrate genomes have been sequenced, and the rate at which genomic DNA information becomes available is rapidly accelerating. Gene duplication and gene loss events inherently limit the accuracy of orthology detection based on sequence similarity alone. Fully automated methods for orthology annotation do exist but often fail to identify individual members in cases of large gene families, or to distinguish missing data from traceable gene losses. This situation can be improved in many cases by including conserved synteny information.

12.

ProLoc-GO: Utilizing informative Gene Ontology terms for sequence-based prediction of protein subcellular localization

Wen-Lin Huang Chun-Wei Tung Shih-Wen Ho Shiow-Fen Hwang Shinn-Ying Ho 《BMC bioinformatics》2008,9(1):80

Background

Gene Ontology (GO) annotation, which describes the function of genes and gene products across species, has recently been used to predict protein subcellular and subnuclear localization. Existing GO-based prediction methods for protein subcellular localization use the known accession numbers of query proteins to obtain their annotated GO terms. An accurate prediction method for predicting subcellular localization of novel proteins without known accession numbers, using only the input sequence, is worth developing.

13.

Combining evidence, biomedical literature and statistical dependence: new insights for functional annotation of gene sets

Marc Aubry Annabelle Monnier Celine Chicault Marie de Tayrac Marie-Dominique Galibert Anita Burgun Jean Mosser 《BMC bioinformatics》2006,7(1):241

14.

The comparative analysis of statistics, based on the likelihood ratio criterion, in the automated annotation problem

Andrey M Leontovich Konstantin Y Tokmachev Hans C van Houwelingen 《BMC bioinformatics》2008,9(1):31

Background

This paper discusses the problem of automated annotation. It is a continuation of the previous work on the A⁴-algorithm (Adaptive algorithm of automated annotation) developed by Leontovich and others.

15.

Formalization of taxon-based constraints to detect inconsistencies in annotation and ontology development

Jennifer I Deegan Emily C Dimmer Christopher J Mungall 《BMC bioinformatics》2010,11(1):530

Background

The Gene Ontology project supports categorization of gene products according to their location of action, the molecular functions that they carry out, and the processes that they are involved in. Although the ontologies are intentionally developed to be taxon neutral, and to cover all species, there are inherent taxon specificities in some branches. For example, the process 'lactation' is specific to mammals and the location 'mitochondrion' is specific to eukaryotes. The lack of an explicit formalization of these constraints can lead to errors and inconsistencies in automated and manual annotation.

16.

A fast structural multiple alignment method for long RNA sequences

Yasuo Tabei Hisanori Kiryu Taishin Kin Kiyoshi Asai 《BMC bioinformatics》2008,9(1):1-17

Background

Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data.

Results

In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns" and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision of 78% at a recall of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general.

Conclusion

GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme than "exact matching", with the advantage of locating approximate pattern occurrences with similar semantics. The relatively low recall of our pattern-based approach may be improved either by employing a probabilistic annotation framework based on the annotation neighbourhoods in textual data, or, alternatively, by lowering the statistical enrichment threshold for applications that put more value on achieving higher recall.

17.

Optimizing gene set annotations combining GO structure and gene expression data

Dong Wang Jie Li Rui Liu Yadong Wang 《BMC systems biology》2018,12(9):133

Background

With the rapid accumulation of genomic data, annotating and interpreting these data has become a challenging issue. As a representative approach, gene set enrichment analysis has been widely used to interpret large molecular datasets generated by biological experiments. The result of gene set enrichment analysis relies heavily on the quality and integrity of gene set annotations. Although several methods have been developed to annotate gene sets, there is still a lack of high quality annotation methods. Here, we propose a novel method to improve annotation accuracy by combining the GO structure and gene expression data.

Results

We propose a novel approach for optimizing gene set annotations to get more accurate annotation results. The proposed method filters inconsistent annotations using GO structure information and probabilistic gene set clusters calculated by a range of cluster sizes over multiple bootstrap resampled datasets. The proposed method is employed to analyze p53 cell line, colon cancer and breast cancer gene expression data. The experimental results show that the proposed method can filter a number of annotations unrelated to the experimental data, increase gene set enrichment power, and decrease the inconsistency of annotations.

Conclusions

A novel gene set annotation optimization approach is proposed to improve the quality of gene annotations. Experimental results indicate that the proposed method effectively improves gene set annotation quality based on the GO structure and gene expression data.

18.

IntelliGO: a new vector-based semantic similarity measure including annotation origin

Sidahmed Benabderrahmane Malika Smail-Tabbone Olivier Poch Amedeo Napoli Marie-Dominique Devignes 《BMC bioinformatics》2010,11(1):588

Background

The Gene Ontology (GO) is a well known controlled vocabulary describing the biological process, molecular function and cellular component aspects of gene annotation. It has become a widely used knowledge source in bioinformatics for annotating genes and measuring their semantic similarity. These measures generally involve the GO graph structure, the information content of GO aspects, or a combination of both. However, only a few of the semantic similarity measures described so far can handle GO annotations differently according to their origin (i.e. their evidence codes).

19.

Transcript-level annotation of Affymetrix probesets improves the interpretation of gene expression data

Hui Yu Feng Wang Kang Tu Lu Xie Yuan-Yuan Li Yi-Xue Li 《BMC bioinformatics》2007,8(1):194

20.

EST Express: PHP/MySQL based automated annotation of ESTs from expression libraries

Robin P Smith William J Buchser Marcus B Lemmon Jose R Pardinas John L Bixby Vance P Lemmon 《BMC bioinformatics》2008,9(1):186

Background

Several biological techniques result in the acquisition of functional sets of cDNAs that must be sequenced and analyzed. The emergence of redundant databases such as UniGene and centralized annotation engines such as Entrez Gene has allowed the development of software that can analyze a great number of sequences in a matter of seconds.