20 similar documents found (search time: 17 ms)
1.
RiceGAAS: an automated annotation system and database for rice genome sequence (total citations: 27; self-citations: 0; citations by others: 27)
Katsumi Sakata Yoshiaki Nagamura Hisataka Numa Baltazar A. Antonio Hideki Nagasaki Atsuko Idonuma Wakako Watanabe Yuji Shimizu Ikuo Horiuchi Takashi Matsumoto Takuji Sasaki Kenichi Higo 《Nucleic acids research》2002,30(1):98-102
An extensive effort of the International Rice Genome Sequencing Project (IRGSP) has resulted in rapid accumulation of genome sequence, and >137 Mb has already been made available to the public domain as of August 2001. This requires a high-throughput annotation scheme to extract biologically useful and timely information from the sequence data on a regular basis. A new automated annotation system and database called Rice Genome Automated Annotation System (RiceGAAS) has been developed to execute a reliable and up-to-date analysis of the genome sequence as well as to store and retrieve the results of annotation. The system has the following functional features: (i) collection of rice genome sequences from GenBank; (ii) execution of gene prediction and homology search programs; (iii) integration of results from various analyses and automatic interpretation of coding regions; (iv) re-execution of analysis, integration and automatic interpretation with the latest entries in reference databases; (v) integrated visualization of the stored data using a web-based graphical view. RiceGAAS also has a data submission mechanism that allows public users to perform fully automated annotation of their own sequences. The system can be accessed at http://RiceGAAS.dna.affrc.go.jp/.
2.
The TIGR rice genome annotation resource: annotating the rice genome and creating resources for plant biologists (total citations: 1; self-citations: 0; citations by others: 1)
Yuan Q Ouyang S Liu J Suh B Cheung F Sultana R Lee D Quackenbush J Buell CR 《Nucleic acids research》2003,31(1):229-233
Rice is not only a major food staple for the world's population but also a model species for a major group of flowering plants, the monocotyledonous plants. Draft genomic sequences of two subspecies of rice, Oryza sativa ssp. japonica and ssp. indica, are publicly available. To provide the community with a resource to data-mine the rice genome, we have constructed an annotation resource for rice (http://www.tigr.org/tdb/e2k1/osa1/). In this resource, we have annotated the rice genome for gene content, identified motifs/domains within the predicted genes, constructed a rice repeat database, identified related sequences in other plant species, and identified syntenic sequences between rice and maize. All of the data is available through web-based interfaces, FTP downloads, and a Distributed Annotation System.
3.
4.
5.
Several institutions provide genomic annotation data, and therefore these data show significant segmentation and redundancy. Public databases allow access, through their own methods, to genomic and proteomic sequences and related annotation. Although some cross-reference tables are available, they don't cover the complete datasets provided by these databases. The Genomic Annotation Gathering project intends to unify annotation data provided by GenBank and Ensembl. We introduce an intra-species, cross-bank method. Generated results provide an enriched set of cross-references. This method identifies an average of 30% new cross-references that can be integrated into other utilities dedicated to analyzing related annotation data. By using only sequence comparison, we are able to unify two datasets that previously didn't share any stable cross-bank accession method. The whole process is hosted by the GenOuest platform to provide public access to newly generated cross-references and to allow for regular updates (http://gag.genouest.org).
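The idea of deriving cross-references "by using only sequence comparison" can be illustrated with a minimal Python sketch. The abstract does not say how sequences are compared, so this sketch assumes the simplest case (exact identity after normalizing case and whitespace). The function names, accession labels, and toy sequences are all hypothetical, and the actual project may well use alignment-based comparison instead.

```python
import hashlib
from collections import defaultdict


def sequence_key(seq: str) -> str:
    """Return a digest of the sequence, ignoring case and whitespace."""
    normalized = "".join(seq.split()).upper()
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


def cross_reference(bank_a: dict, bank_b: dict) -> list:
    """Pair accessions from two banks whose sequences are identical.

    Assumption: identity of normalized sequences stands in for the
    (unspecified) sequence comparison described in the abstract.
    """
    # Index the second bank by sequence digest for constant-time lookup.
    index = defaultdict(list)
    for accession, seq in bank_b.items():
        index[sequence_key(seq)].append(accession)

    pairs = []
    for accession, seq in bank_a.items():
        for other in index.get(sequence_key(seq), []):
            pairs.append((accession, other))
    return sorted(pairs)


# Hypothetical accessions and toy sequences, for illustration only.
genbank_like = {"GB_0001": "ATGGCCAAGT", "GB_0002": "ATGTTTGGAC"}
ensembl_like = {"ENS_A": "atggccaagt", "ENS_B": "ATGCCCCCCC"}

print(cross_reference(genbank_like, ensembl_like))
```

Hashing the normalized sequence keeps the lookup linear in the number of records, which matters when two full databanks are compared; it cannot detect near-identical sequences, which is where an alignment step would be needed.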
6.
7.
8.
Zea mays DataBase (ZmDB) seeks to provide a comprehensive view of maize (corn) genetics by linking genomic sequence data with gene expression analysis and phenotypes of mutant plants. ZmDB originated in 1999 as the Web portal for a large project of maize gene discovery, sequencing and phenotypic analysis using a transposon tagging strategy and expressed sequence tag (EST) sequencing. Recently, ZmDB has broadened its scope to include all public maize ESTs, genome survey sequences (GSSs), and protein sequences. More than 170 000 ESTs are currently clustered into approximately 20 000 contigs and about an equal number of apparent singlets. These clusters are continuously updated and annotated with respect to potential encoded protein products. More than 100 000 GSSs are similarly assembled and annotated by spliced alignment with EST and protein sequences. The ZmDB interface provides quick access to analytical tools for further sequence analysis. Every sequence record is linked to several display options and similarity search tools, including services for multiple sequence alignment, protein domain determination and spliced alignment. Furthermore, ZmDB provides web-based ordering of materials generated in the project, including ESTs, ordered collections of genomic sequences tagged with the RescueMu transposon and microarrays of amplified ESTs. ZmDB can be accessed at http://zmdb.iastate.edu/.
9.
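Genome sequences provide a rich data resource for insect molecular biology research and are driving the rapid growth of systems biology within the long-established field of entomology. Insect genomics has become a current research hotspot. At present, 494 insect genome sequencing projects are registered at NCBI; raw sequencing data have been submitted for 225 insect species, genome assemblies have been completed for 215, gene annotations are available for 65, and 43 insect genome papers have been published. This paper reviews the history of sequencing technology and its role in advancing insect genome research; the assembly and annotation of insect genomes and the associated problems; progress in insect genome sequencing; the development of insect genome databases and the basic approaches and strategies for mining gene data; and the prospects for applying large-scale insect gene data to pest control and the utilization of resource insects.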
10.
Chi-Ching Lee Yi-Ping Phoebe Chen Tzu-Jung Yao Cheng-Yu Ma Wei-Cheng Lo Ping-Chiang Lyu Chuan Yi Tang 《Gene》2013
Sequencing of microbial genomes is important because microbes carry antibiotic and pathogenetic activities. However, even with the help of new assembly software, finishing a whole genome is a time-consuming task. In most bacteria, pathogenetic or antibiotic genes are carried in genomic islands. Therefore, a quick genomic island (GI) prediction method is useful for genomes whose sequencing is still ongoing. In this work, we built a Web server called GI-POP (http://gipop.life.nthu.edu.tw) which integrates a sequence assembling tool, a functional annotation pipeline, and a high-performance GI predicting module, in a support vector machine (SVM)-based method called genomic island genomic profile scanning (GI-GPS). The draft genomes of ongoing genome projects, in contigs or scaffolds, can be submitted to our Web server, which provides functional annotation and highly probable GI predictions. GI-POP is a comprehensive annotation Web server designed for ongoing genome project analysis. Researchers can perform annotation and obtain pre-analytic information, including possible GIs, coding/non-coding sequences and functional analysis, from their draft genomes. This pre-analytic system can provide useful information for finishing a genome sequencing project.
11.
12.
The Genomic Threading Database currently contains structural annotations for the genomes of over 100 recently sequenced organisms. Annotations are carried out by using our modified GenTHREADER software and through implementing grid technology. AVAILABILITY: http://bioinf.cs.ucl.ac.uk/GTD
13.
Danchin EG Levasseur A Rascol VL Gouret P Pontarotti P 《Journal of experimental zoology. Part B. Molecular and developmental evolution》2007,308(1):26-36
The past decade has seen the completion of numerous whole-genome sequencing projects, beginning with bacterial genomes and continuing with eukaryotic species from different phyla: fungi, plants and animals. In addition, more biological information is produced and shared thanks to information exchange systems, and more biological concepts, as well as more bioinformatics tools, are available. In this article, we describe how evolutionary biology concepts, as well as computer science, are useful for a better understanding of biology in general and genome annotation in particular. The genome annotation process consists of taking the raw DNA produced, for example, by the genome sequencing projects, adding the layers of analysis and interpretation necessary to extract its biological significance and placing it in the context of our understanding of biological processes. Genome annotation is a multistep process falling into two broad categories: structural and functional annotation.
14.
The rice (Oryza sativa) genome contains 1,429 protein kinases, the vast majority of which have unknown functions. We created a phylogenomic database (http://rkd.ucdavis.edu) to facilitate functional analysis of this large gene family. Sequence and genomic data, including gene expression data and protein-protein interaction maps, can be displayed for each selected kinase in the context of a phylogenetic tree allowing for comparative analysis both within and between large kinase subfamilies. Interaction maps are easily accessed through links and displayed using Cytoscape, an open source software platform. Chromosomal distribution of all rice kinases can also be explored via an interactive interface.
15.
The phytophthora genome initiative database: informatics and analysis for distributed pathogenomic research (total citations: 4; self-citations: 0; citations by others: 4)
Waugh M Hraber P Weller J Wu Y Chen G Inman J Kiphart D Sobral B 《Nucleic acids research》2000,28(1):87-90
The Phytophthora Genome Initiative (PGI) is a distributed collaboration to study the genome and evolution of a particularly destructive group of plant pathogenic oomycetes, with the goal of understanding the mechanisms of infection and resistance. NCGR provides informatics support for the collaboration as well as a centralized data repository. In the pilot phase of the project, several investigators prepared Phytophthora infestans and Phytophthora sojae EST and Phytophthora sojae BAC libraries and sent them to another laboratory for sequencing. Data from sequencing reactions were transferred to NCGR for analysis and curation. An analysis pipeline transforms raw data by performing simple analyses (i.e., vector removal and similarity searching) that are stored and can be retrieved by investigators using a web browser. Here we describe the database and access tools, provide an overview of the data therein and outline future plans. This resource has provided a unique opportunity for the distributed, collaborative study of a genus from which relatively little sequence data are available. Results may lead to insight into how better to control these pathogens. The homepage of PGI can be accessed at http://www.ncgr.org/pgi, with database access through the database access hyperlink.
16.
HOWDY: an integrated database system for human genome research (total citations: 1; self-citations: 0; citations by others: 1)
Mika Hirakawa 《Nucleic acids research》2002,30(1):152-157
HOWDY is an integrated database system for accessing and analyzing human genomic information (http://www-alis.tokyo.jst.go.jp/HOWDY/). HOWDY stores information about relationships between genetic objects and the data extracted from a number of databases. HOWDY consists of an Internet accessible user interface that allows thorough searching of the human genomic databases using the gene symbols and their aliases. It also permits flexible editing of the sequence data. The database can be searched using simple words and the search can be restricted to a specific cytogenetic location. Linear maps displaying markers and genes on contig sequences are available, from which an object can be chosen. Any search starting point identifies all the information matching the query. HOWDY provides a convenient search environment of human genomic data for scientists unsure which database is most appropriate for their search.
17.
Frishman D Mokrejs M Kosykh D Kastenmüller G Kolesov G Zubrzycki I Gruber C Geier B Kaps A Albermann K Volz A Wagner C Fellenberg M Heumann K Mewes HW 《Nucleic acids research》2003,31(1):207-211
The PEDANT genome database (http://pedant.gsf.de) provides exhaustive automatic analysis of genomic sequences by a large variety of established bioinformatics tools through a comprehensive Web-based user interface. One hundred and seventy seven completely sequenced and unfinished genomes have been processed so far, including large eukaryotic genomes (mouse, human) published recently. In this contribution, we describe the current status of the PEDANT database and novel analytical features added to the PEDANT server in 2002. Those include: (i) integration with the BioRS data retrieval system which allows fast text queries, (ii) pre-computed sequence clusters in each complete genome, (iii) a comprehensive set of tools for genome comparison, including genome comparison tables and protein function prediction based on genomic context, and (iv) computation and visualization of protein-protein interaction (PPI) networks based on experimental data. The availability of functional and structural predictions for 650 000 genomic proteins in well organized form makes PEDANT a useful resource for both functional and structural genomics.
18.
The advent of fully sequenced genomes opens the ground for the reconstruction of metabolic pathways on the basis of the identification of enzyme-coding genes. Here we describe PRIAM, a method for automated enzyme detection in a fully sequenced genome, based on the classification of enzymes in the ENZYME database. PRIAM relies on sets of position-specific scoring matrices ('profiles') automatically tailored for each ENZYME entry. Automatically generated logical rules define which of these profiles is required in order to infer the presence of the corresponding enzyme in an organism. As an example, PRIAM was applied to identify potential metabolic pathways from the complete genome of the nitrogen-fixing bacterium Sinorhizobium meliloti. The results of this automated method were compared with the original genome annotation and visualised on KEGG graphs in order to facilitate the interpretation of metabolic pathways and to highlight potentially missing enzymes.
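The notion of logical rules over profile hits can be sketched in a few lines of Python. This is only an illustration, not PRIAM's actual rule format. It assumes each rule is a list of alternative profile sets, and an enzyme is inferred when every profile in at least one alternative has a hit in the genome. The enzyme labels, profile identifiers, and rules are hypothetical.

```python
def enzyme_present(rule, hits):
    """Return True if any alternative set of required profiles is fully hit.

    rule: list of sets of profile identifiers (alternatives, OR-ed together;
          profiles inside one set are AND-ed).
    hits: set of profile identifiers detected in the genome.
    """
    return any(required <= hits for required in rule)


# Hypothetical rules: the enzyme labels and profile IDs are made up.
rules = {
    "enzyme_A": [{"P001"}],                   # one profile suffices
    "enzyme_B": [{"P010", "P011"}, {"P012"}], # two profiles together, or one alternative
    "enzyme_C": [{"P020", "P021"}],           # both profiles are required
}

# Hypothetical set of profiles that matched proteins in a genome.
hits = {"P001", "P012", "P020"}

predicted = sorted(name for name, rule in rules.items() if enzyme_present(rule, hits))
print(predicted)
```

Here `enzyme_C` is not inferred because only one of its two required profiles was detected, which is the kind of case a rule-based approach could flag as a potentially missing enzyme.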
19.
T. Hubbard D. Barker E. Birney G. Cameron Y. Chen L. Clark T. Cox J. Cuff V. Curwen T. Down R. Durbin E. Eyras J. Gilbert M. Hammond L. Huminiecki A. Kasprzyk H. Lehvaslaiho P. Lijnzaad C. Melsopp E. Mongin R. Pettett M. Pocock S. Potter A. Rust E. Schmidt S. Searle G. Slater J. Smith W. Spooner A. Stabenau J. Stalker E. Stupka A. Ureta-Vidal I. Vastrik M. Clamp 《Nucleic acids research》2002,30(1):38-41
The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.