Similar articles
 20 similar articles found.
1.
2.
3.
4.
5.
More than 300 bacterial genome sequences are publicly available, and many more are scheduled to be completed and released in the near future. Converting this raw sequence information into a better understanding of the biology of bacteria involves the identification and annotation of genes, proteins and pathways. This processing is typically done using sequence annotation pipelines composed of a variety of software modules and, in some cases, human experts. The reference databases, computational methods and knowledge that form the basis of these pipelines are constantly evolving, and thus there is a need to reprocess genome annotations on a regular basis. The combined challenge of revising existing annotations and extracting useful information from the flood of new genome sequences will necessitate more reliance on completely automated systems.

6.

Background  

Genome annotation can be viewed as an incremental, cooperative, data-driven, knowledge-based process that involves multiple methods to predict gene locations and structures. This process might have to be executed more than once and might be subjected to several revisions as the biological (new data) or methodological (new methods) knowledge evolves. In this context, although many annotation platforms already exist, there is still a strong need for computer systems that handle not only the primary annotation but also the updating and advancement of the associated knowledge. In this paper, we propose to adopt a blackboard architecture for designing such a system.

7.
Structural proteomics: a tool for genome annotation   Cited by: 1 (self-citations: 0, citations by others: 1)
In any newly sequenced genome, 30% to 50% of genes encode proteins with unknown molecular or cellular function. Fortunately, structural genomics is emerging as a powerful approach to functional annotation. Because of recent developments in high-throughput technologies, ongoing structural genomics projects are generating new structures at an unprecedented rate. In the past year, structural studies have identified many new structural motifs involved in enzymatic catalysis or in binding ligands or other macromolecules (DNA, RNA, protein). The efficiency with which function is deduced from structure can be further improved by the integration of structure with bioinformatics and other experimental approaches, such as screening for enzymatic activity or ligand binding.

8.
9.
10.
In view of the recent explosion in genome sequence data, and the 200 or more complete genome sequences currently available, the importance of genome-scale bioinformatics analysis is increasing rapidly. However, computational genome informatics analyses often lack a statistical assessment of their sensitivity to the completeness of the functional annotation. A pre-analysis method that automatically validates the sensitivity of computational genome analyses with regard to genome annotation completeness is therefore useful. In this report we developed the Gene Prediction Accuracy Classification (GPAC) test, which provides statistical evidence of sensitivity by repeating the same analysis for five different gene groups (classified according to annotation accuracy level), and for randomly sampled gene groups, with the same number of genes as each of the five classified groups. Variability in these results is then assessed, and if the results vary significantly with different data subsets, the analysis is considered "sensitive" to annotation completeness, and careful selection of data is advised prior to the actual in silico analysis. The GPAC test has been applied to the analyses of Sakai et al., 2001, and Ohno et al., 2001, and it revealed that the analysis of Ohno et al. was more sensitive to annotation completeness. This shows that GPAC can be employed to ascertain the sensitivity of an analysis. The GPAC benchmarking software is freely available in the latest G-language Genome Analysis Environment package, at http://www.g-language.org/.

11.
BACKGROUND: The annotation of genomes from next-generation sequencing platforms needs to be rapid, high-throughput, and fully integrated and automated. Although a few Web-based annotation services have recently become available, they may not be the best solution for researchers who need to annotate a large number of genomes, possibly including proprietary data, and store them locally for further analysis. To address this need, we developed a standalone software application, the Annotation of microbial Genome Sequences (AGeS) system, which incorporates publicly available and in-house-developed bioinformatics tools and databases, many of which are parallelized for high-throughput performance. METHODOLOGY: The AGeS system supports three main capabilities. The first is the storage of input contig sequences and the resulting annotation data in a central, customized database. The second is the annotation of microbial genomes using an integrated software pipeline, which first analyzes contigs from high-throughput sequencing by locating genomic regions that code for proteins, RNA, and other genomic elements through the Do-It-Yourself Annotation (DIYA) framework. The identified protein-coding regions are then functionally annotated using the in-house-developed Pipeline for Protein Annotation (PIPA). The third capability is the visualization of annotated sequences using GBrowse. To date, we have implemented these capabilities for bacterial genomes. AGeS was evaluated by comparing its genome annotations with those provided by three other methods. Our results indicate that the software tools integrated into AGeS provide annotations that are in general agreement with those provided by the compared methods. This is demonstrated by a >94% overlap in the number of identified genes, a significant number of identical annotated features, and a >90% agreement in enzyme function predictions.

12.
13.
14.
Now that the human and mouse genomes have been sequenced, the annotation of these sequences with biological functions is an important challenge in genomic research. A major tool to analyse gene function at the organismal level is the analysis of mutant phenotypes. Because of its genetic and physiological similarity to man, the mouse has become the model organism of choice for the study of genetic diseases. In addition, there is at the moment no other vertebrate for which versatile techniques to manipulate the genome are as well developed. Several mouse mutagenesis projects have provided the proof-of-principle that a systematic and comprehensive mutagenesis of every gene in the mammalian genome will be feasible. An exhaustive functional annotation of the mammalian genome can only be achieved through a combination of phenotype- and gene-driven approaches in large- and small-scale academic and private projects. Major challenges will be to develop standardised phenotyping protocols for the clinical and pathological characterisation of mouse mutants, to improve mutation detection methods and to disseminate resources and data. Beyond gene annotation, it will be necessary to understand how gene functions are integrated into the complex network of regulatory interactions in the cell.

15.
It is widely recognized that, with the advent of very high throughput, short read, and highly parallelized sequencing technologies, the generation of new DNA sequences from microbes, plants, and metagenomes is outpacing the ability to assign functions to ("annotate") all this data. To begin addressing this, on May 18 and 19, 2010, a team of roughly fifty people met in Crystal City, Virginia, to define and scope the possibility of a first Critical Assessment of Functional Annotation Experiment (CAFAE) for bacterial genome annotation. Due to the fundamental importance of genomic data to its mission, the Department of Energy (DOE) BER program hosted this workshop, funding the attendance of all invitees. The workshop was co-organized by Dan Drell and Susan Gregurick (DOE), Owen White and Nikos Kyrpides.

16.
Applications of InterPro in protein annotation and genome analysis   Cited by: 2 (self-citations: 0, citations by others: 2)
The applications of InterPro span a range of biologically important areas that include automatic annotation of protein sequences and genome analysis. In automatic annotation of protein sequences, InterPro has been utilised to provide reliable characterisation of sequences, identifying them as candidates for functional annotation. Rules based on the InterPro characterisation are stored and operated through a database called RuleBase. RuleBase is used as the main tool in the sequence database group at the EBI to apply automatic annotation to unknown sequences. The annotated sequences are stored and distributed in the TrEMBL protein sequence database. InterPro also provides a means to carry out statistical and comparative analyses of whole genomes. In the Proteome Analysis Database, InterPro analyses have been combined with other analyses based on CluSTr, the Gene Ontology (GO) and structural information on the proteins.

17.
18.
19.
20.