Found 20 similar articles (search time: 0 ms)
1.
RiceGAAS: an automated annotation system and database for rice genome sequence (total citations: 27, self-citations: 0, citations by others: 27)
Katsumi Sakata Yoshiaki Nagamura Hisataka Numa Baltazar A. Antonio Hideki Nagasaki Atsuko Idonuma Wakako Watanabe Yuji Shimizu Ikuo Horiuchi Takashi Matsumoto Takuji Sasaki Kenichi Higo 《Nucleic acids research》2002,30(1):98-102
An extensive effort of the International Rice Genome Sequencing Project (IRGSP) has resulted in rapid accumulation of genome sequence, and >137 Mb has already been made available to the public domain as of August 2001. This requires a high-throughput annotation scheme to extract biologically useful and timely information from the sequence data on a regular basis. A new automated annotation system and database called Rice Genome Automated Annotation System (RiceGAAS) has been developed to execute a reliable and up-to-date analysis of the genome sequence as well as to store and retrieve the results of annotation. The system has the following functional features: (i) collection of rice genome sequences from GenBank; (ii) execution of gene prediction and homology search programs; (iii) integration of results from various analyses and automatic interpretation of coding regions; (iv) re-execution of analysis, integration and automatic interpretation with the latest entries in reference databases; (v) integrated visualization of the stored data using a web-based graphical view. RiceGAAS also has a data submission mechanism that allows public users to perform fully automated annotation of their own sequences. The system can be accessed at http://RiceGAAS.dna.affrc.go.jp/.
2.
A wide range of web-based prediction and annotation tools are frequently used for determining protein function from sequence. However, parallel processing of sequences for annotation through web tools is not possible due to several constraints in functional programming for multiple queries. Here, we propose the development of APAF as an automated protein annotation filter to overcome some of these difficulties through an integrated approach.
3.
An integrated computational pipeline and database to support whole-genome sequence annotation
Mungall CJ Misra S Berman BP Carlson J Frise E Harris N Marshall B Shu S Kaminker JS Prochnik SE Smith CD Smith E Tupy JL Wiel C Rubin GM Lewis SE 《Genome biology》2002,3(12):research0081.1-0081.11
We describe here our experience in annotating the Drosophila melanogaster genome sequence, in the course of which we developed several new open-source software tools and a database schema to support large-scale genome annotation. We have developed these into an integrated and reusable software system for whole-genome annotation. The key contributions to overall annotation quality are the marshalling of high-quality sequences for alignments and the design of a system with an adaptable and expandable flexible architecture.
4.
5.
In the recent past, there has been a resurgence of interest in Chikungunya virus (CHIKV), attributed to massive outbreaks of Chikungunya fever in the South-East Asia Region. This is reflected in a substantial increase in submissions of CHIKV genome sequences to the NCBI (National Center for Biotechnology Information) database. Here we present a database, "CHIKVPRO", containing structural and functional annotation of Chikungunya virus proteins (25 strains) submitted to the NCBI repository. The CHIKV genome encodes 9 proteins: 4 non-structural and 5 structural. The CHIKVPRO database aims to provide the virology community with a single authoritative resource for the CHIKV proteome, with reference to physicochemical and molecular properties, proteolytic cleavage sites, hydrophobicity, transmembrane prediction, and classification into functional families using SVMProt and other Expasy tools. AVAILABILITY: The database is freely available at http://www.chikvpro.info/
6.
Durham AM Kashiwabara AY Matsunaga FT Ahagon PH Rainone F Varuzza L Gruber A 《Bioinformatics (Oxford, England)》2005,21(12):2812-2813
SUMMARY: EGene is a generic, flexible and modular pipeline generation system that makes pipeline construction a modular job. EGene allows for third-party programs to be used and integrated according to the needs of distinct projects and without any previous programming or formal language experience being required. EGene comes with CoEd, a visual tool to facilitate pipeline construction and documentation. A series of components to build pipelines for sequence processing is provided. AVAILABILITY: http://www.lbm.fmvz.usp.br/egene/ CONTACT: alan@ime.usp.br; argruber@usp.br SUPPLEMENTARY INFORMATION: http://www.lbm.fmvz.usp.br/egene/
7.
8.
Manas Ranjan Dikhit Sindhu Prava Rana Pradeep Das Ganesh Chandra Sahoo 《Bioinformation》2009,3(7):299-302
Databases containing proteomic information have become indispensable for virology studies. As the gap between the amount of sequence information and functional characterization widens, increasing efforts are being directed to the development of databases. For virologists, it is therefore desirable to have a single data collection point that integrates research-related data from different domains. CHPVDB is our effort to provide virologists with such a one-step information center. We describe herein the creation of CHPVDB, a new database that integrates information on different proteins into a single resource. For basic curation of protein information, the database relies on features from other selected databases, servers and published reports. This database relates molecular analysis, cleavage sites, and possible protein functional families assigned to the different proteins of Chandipura virus (CHPV) by SVMProt and related tools.
9.
UniSave: the UniProtKB sequence/annotation version database (total citations: 1, self-citations: 0, citations by others: 1)
SUMMARY: The UniProtKB Sequence/Annotation Version database (UniSave) is a comprehensive archive of UniProtKB/Swiss-Prot and UniProtKB/TrEMBL entry versions. All changed Swiss-Prot and TrEMBL entries are loaded into the UniSave as part of the public bi-weekly UniProtKB releases. Unlike the UniProtKB, which contains only the latest Swiss-Prot and TrEMBL entry versions, the UniSave provides access to previous versions of these entries. AVAILABILITY: http://www.ebi.ac.uk/uniprot/unisave
10.
11.
ASAP, a systematic annotation package for community analysis of genomes (total citations: 10, self-citations: 0, citations by others: 10)
Glasner JD Liss P Plunkett G Darling A Prasad T Rusch M Byrnes A Gilson M Biehl B Blattner FR Perna NT 《Nucleic acids research》2003,31(1):147-151
12.
Chenggang Yu Nela Zavaljevski Valmik Desai Seth Johnson Fred J Stevens Jaques Reifman 《BMC bioinformatics》2008,9(1):52
Background
Automated protein function prediction methods are needed to keep pace with high-throughput sequencing. With the existence of many programs and databases for inferring different protein functions, a pipeline that properly integrates these resources will benefit from the advantages of each method. However, integrated systems usually do not provide mechanisms to generate customized databases to predict particular protein functions. Here, we describe a tool termed PIPA (Pipeline for Protein Annotation) that has these capabilities.
13.
Eduardo Lee Gregg A Helt Justin T Reese Monica C Munoz-Torres Chris P Childers Robert M Buels Lincoln Stein Ian H Holmes Christine G Elsik Suzanna E Lewis 《Genome biology》2013,14(8):R93
Web Apollo is the first instantaneous, collaborative genomic annotation editor available on the web. One of the natural consequences following from current advances in sequencing technology is that there are more and more researchers sequencing new genomes. These researchers require tools to describe the functional features of their newly sequenced genomes. With Web Apollo researchers can use any of the common browsers (for example, Chrome or Firefox) to jointly analyze and precisely describe the features of a genome in real time, whether they are in the same room or working from opposite sides of the world.
14.
Background
Next-generation sequencing (NGS) has yielded an unprecedented amount of data for genetics research. It is a daunting task to process the data from raw sequence reads to variant calls, and manually processing this data can significantly delay downstream analysis and increase the possibility of human error. The research community has produced tools to properly prepare sequence data for analysis and established guidelines on how to apply those tools to achieve the best results; however, existing pipeline programs to automate the process through its entirety are either inaccessible to investigators, or web-based and require a certain amount of administrative expertise to set up.
Findings
Advanced Sequence Automated Pipeline (ASAP) was developed to provide a framework for automating the translation of sequencing data into annotated variant calls with the goal of minimizing user involvement without the need for dedicated hardware or administrative rights. ASAP works both on computer clusters and on standalone machines with minimal human involvement and maintains high data integrity, while allowing complete control over the configuration of its component programs. It offers an easy-to-use interface for submitting and tracking jobs as well as resuming failed jobs. It also provides tools for quality checking and for dividing jobs into pieces for maximum throughput.
Conclusions
ASAP provides an environment for building an automated pipeline for NGS data preprocessing. This environment is flexible for use and future development. It is freely available at http://biostat.mc.vanderbilt.edu/ASAP.
15.
Background
The functional annotation of proteins relies on published information concerning their close and remote homologues in sequence databases. Evidence for remote sequence similarity can be further strengthened by a similar biological background of the query sequence and identified database sequences. However, few tools exist so far that provide a means to include functional information in sequence database searches.
16.
Lewis SE Searle SM Harris N Gibson M Iyer V Richter J Wiel C Bayraktaroglu L Birney E Crosby MA Kaminker JS Matthews BB Prochnik SE Smith CD Tupy JL Rubin GM Misra S Mungall CJ Clamp ME 《Genome biology》2002,3(12):research0082.1-0082.14
The well-established inaccuracy of purely computational methods for annotating genome sequences necessitates an interactive tool to allow biological experts to refine these approximations by viewing and independently evaluating the data supporting each annotation. Apollo was developed to meet this need, enabling curators to inspect genome annotations closely and edit them. FlyBase biologists successfully used Apollo to annotate the Drosophila melanogaster genome and it is increasingly being used as a starting point for the development of customized annotation editing tools for other genome projects.
17.
Ferrer-Costa C Gelpí JL Zamakola L Parraga I de la Cruz X Orozco M 《Bioinformatics (Oxford, England)》2005,21(14):3176-3178
PMUT allows the fast and accurate prediction (approximately 80% success rate in humans) of the pathological character of single-point amino acid mutations based on the use of neural networks. The program also allows the fast scanning of mutational hot spots, which are obtained by three procedures: (1) alanine scanning, (2) massive mutation and (3) genetically accessible mutations. A graphical interface for Protein Data Bank (PDB) structures, when available, and a database containing hot spot profiles for all non-redundant PDB structures are also accessible from the PMUT server.
18.
Laboratories working with draft phase genomes have specific software needs, such as the unattended processing of hundreds of single scaffolds and subsequent sequence annotation. In addition, it is critical to follow the "movement" and the manual annotation of single open reading frames (ORFs) within the successive sequence updates. Even with finished genomes, regular database updates can lead to significant changes in the annotation of single ORFs. In functional genomics it is important to mine data and identify new genetic targets rapidly and easily. Often there is no need for sophisticated relational databases (RDB) that greatly reduce the system-independent access of the results. Another aspect is the internet dependency of most software packages. If users are working with confidential data, this dependency poses a security issue. GAMOLA was designed to handle the numerous scaffolds and changing contents of draft phase genomes in an automated process, and it stores the results for each predicted ORF in flatfile databases. In addition, annotation transfers, ORF designation tracking, Blast comparisons, and primer design for whole genome microarrays have been implemented. The software is available under the license of North Carolina State University. A website and a downloadable example are accessible at http://fsweb2.schaub.ncsu.edu/TRKwebsite/index.htm.
19.
Malmström L Malmström J Marko-Varga G Westergren-Thorsson G 《Journal of proteome research》2002,1(2):135-138
We present a software solution that enables faster and more accurate data analysis of 2DE/MALDI TOF MS data. The software supports data analysis through a number of automated data selection functions and advanced graphical tools. Once protein identities are determined using MALDI TOF MS, automated data retrieval from online databases provides biological information. The software, called 2DDB, reduces analysis time to a fraction without losing any quality compared to more manual data analysis. The database contains over 100,000 data entries, and selected parts can be reached at http://2ddb.org.