Found 20 similar articles (search time: 0 ms)
1.
SUMMARY: The Gandr (gene annotation data representation) knowledgebase is an ontological framework for laboratory-specific gene annotation. Gandr uses Protege 2000 for editing, querying and visualizing microarray data and annotations. Genes can be annotated with provided, newly created or imported ontological concepts. Annotated genes can inherit assigned concept properties and can be related to each other. The resulting knowledgebase can be visualized as an interactive network of nodes and edges representing genes and their functional relationships. This allows for immediate and associative gene context exploration. Ontological query techniques allow for powerful data access.
2.
Artemis is a widely used software tool for annotating and viewing sequence data. No database is required to use Artemis. Instead, individual sequence data files can be analysed with little or no formatting, making it particularly suited to the study of small genomes and chromosomes, and straightforward for a novice user to get started. Since its release in 1999, Artemis has been used to annotate a diverse collection of prokaryotic and eukaryotic genomes, ranging from Streptomyces coelicolor to, more recently, a large proportion of the Plasmodium falciparum genome. Artemis allows annotated genomes to be easily browsed and makes it simple to add useful biological information to raw sequence data. This paper gives an overview of some of the features of Artemis and includes how it facilitates manual gene prediction and can provide an overview of entire chromosomes or small compact genomes--useful for uncovering unusual features such as pathogenicity islands.
3.
MOTIVATION: A number of free-standing programs have been developed in order to help researchers find potential coding regions and deduce gene structure for long stretches of what is essentially 'anonymous DNA'. As these programs apply inherently different criteria to the question of what is and is not a coding region, multiple algorithms should be used in the course of positional cloning and positional candidate projects to ensure that all potential coding regions within a previously identified critical region are identified. RESULTS: We have developed a gene identification tool called GeneMachine which allows users to query multiple exon and gene prediction programs in an automated fashion. BLAST searches are also performed in order to see whether a previously characterized coding region corresponds to a region in the query sequence. A suite of Perl programs and modules is used to run MZEF, GENSCAN, GRAIL 2, FGENES, RepeatMasker, Sputnik, and BLAST. The results of these runs are then parsed and written into ASN.1 format. Output files can be opened using NCBI Sequin, in essence using Sequin as both a workbench and as a graphical viewer. The main feature of GeneMachine is that the process is fully automated; the user is only required to launch GeneMachine and then open the resulting file with Sequin. Annotations can then be made to these results prior to submission to GenBank, thereby increasing the intrinsic value of these data. AVAILABILITY: GeneMachine is freely available for download at http://genome.nhgri.nih.gov/genemachine. A public Web interface to the GeneMachine server for academic and not-for-profit users is available at http://genemachine.nhgri.nih.gov. The Web supplement to this paper may be found at http://genome.nhgri.nih.gov/genemachine/supplement/.
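The motivation for running several predictors over the same region can be illustrated with a minimal sketch. It is not GeneMachine's own code (the abstract describes Perl programs writing ASN.1 output, with no details of how results are combined); the function name `merge_predictions`, the tool labels and the coordinates are all hypothetical. The sketch simply merges overlapping predicted intervals and records which tools support each merged region, so that regions found by only one program remain visible.

```python
from typing import Dict, List, Set, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive coordinates on the query sequence


def merge_predictions(
    predictions: Dict[str, List[Interval]]
) -> List[Tuple[int, int, List[str]]]:
    """Merge overlapping intervals from several tools (hypothetical helper).

    Returns (start, end, supporting_tools) for each merged region.
    """
    # Flatten to (start, end, tool) and sort by start coordinate.
    tagged = sorted(
        (start, end, tool)
        for tool, intervals in predictions.items()
        for start, end in intervals
    )
    merged: List[Tuple[int, int, Set[str]]] = []
    for start, end, tool in tagged:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous region: extend it and add this tool.
            last_start, last_end, tools = merged[-1]
            merged[-1] = (last_start, max(last_end, end), tools | {tool})
        else:
            merged.append((start, end, {tool}))
    return [(s, e, sorted(t)) for s, e, t in merged]


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    example = {
        "toolA": [(100, 200), (500, 600)],
        "toolB": [(150, 250)],
        "toolC": [(900, 950)],
    }
    for region in merge_predictions(example):
        print(region)
```

A region supported by a single tool is kept rather than discarded, which matches the abstract's stated goal of identifying all potential coding regions; ranking by number of supporting tools would be a separate design choice.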
4.
5.
Lewis SE Searle SM Harris N Gibson M Lyer V Richter J Wiel C Bayraktaroglir L Birney E Crosby MA Kaminker JS Matthews BB Prochnik SE Smithy CD Tupy JL Rubin GM Misra S Mungall CJ Clamp ME 《Genome biology》2002,3(12):research0082.1-8214
The well-established inaccuracy of purely computational methods for annotating genome sequences necessitates an interactive tool to allow biological experts to refine these approximations by viewing and independently evaluating the data supporting each annotation. Apollo was developed to meet this need, enabling curators to inspect genome annotations closely and edit them. FlyBase biologists successfully used Apollo to annotate the Drosophila melanogaster genome and it is increasingly being used as a starting point for the development of customized annotation editing tools for other genome projects.
6.
7.
TEnest: automated chronological annotation and visualization of nested plant transposable elements (total citations: 2; self-citations: 0; citations by others: 2)
Organisms with a high density of transposable elements (TEs) exhibit nesting, with subsequent repeats found inside previously inserted elements. Nesting splits the sequence structure of TEs and makes annotation of repetitive areas challenging. We present TEnest, a repeat identification and display tool made specifically for highly repetitive genomes. TEnest identifies repetitive sequences and reconstructs separated sections to provide full-length repeats and, for long-terminal repeat (LTR) retrotransposons, calculates age since insertion based on LTR divergence. TEnest provides a chronological insertion display to give an accurate visual representation of TE integration history showing timeline, location, and families of each TE identified, thus creating a framework from which evolutionary comparisons can be made among various regions of the genome. A database of repeats has been developed for maize (Zea mays), rice (Oryza sativa), wheat (Triticum aestivum), and barley (Hordeum vulgare) to illustrate the potential of TEnest software. All currently finished maize bacterial artificial chromosomes totaling 29.3 Mb were analyzed with TEnest to provide a characterization of the repeat insertions. Sixty-seven percent of the maize genome was found to be made up of TEs; of these, 95% are LTR retrotransposons. The rate of solo LTR formation is shown to be dissimilar across retrotransposon families. Phylogenetic analysis of TE families reveals specific events of extreme TE proliferation, which may explain the high quantities of certain TE families found throughout the maize genome. The TEnest software package is available for use on PlantGDB under the tools section (http://www.plantgdb.org/prj/TE_nest/TE_nest.html); the source code is available from (http://wiselab.org).
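The abstract states that insertion age is calculated from LTR divergence but does not give the formula. A commonly used approach, sketched here as an assumption rather than as TEnest's actual implementation, relies on the two LTRs of an element being identical at insertion, so that age is roughly T = K / (2r), where K is the corrected divergence between the 5' and 3' LTRs and r is the substitution rate per site per year. The Jukes-Cantor correction, the function names and the default rate of 1.3e-8 are illustrative choices that do not come from the abstract.

```python
import math


def p_distance(ltr5: str, ltr3: str) -> float:
    """Proportion of differing sites between two pre-aligned LTRs.

    Columns containing a gap ('-') in either sequence are skipped.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    diffs = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance K = -(3/4) * ln(1 - 4p/3)."""
    if p >= 0.75:
        raise ValueError("divergence too high for Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_age_years(ltr5: str, ltr3: str, rate: float = 1.3e-8) -> float:
    """Estimate age as T = K / (2 * rate).

    `rate` (substitutions per site per year) is an assumed, illustrative value.
    """
    k = jukes_cantor(p_distance(ltr5, ltr3))
    return k / (2.0 * rate)


if __name__ == "__main__":
    # Toy aligned LTR pair with a single mismatch (made-up sequences).
    five_prime = "TGTTGGGGACCTAACAGATC"
    three_prime = "TGTTGGGGACCTAACAGCTC"
    print(f"{insertion_age_years(five_prime, three_prime):.0f} years")
```

The factor of 2 reflects that both LTRs accumulate substitutions independently after insertion; a different substitution model or rate would change the estimate proportionally.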
8.
9.
10.
Using a previously developed automated method for enzyme annotation, we report the re-annotation of the ENZYME database and the analysis of local error rates per class. In control experiments, we demonstrate that the method is able to correctly re-annotate 91% of all Enzyme Classification (EC) classes with high coverage (755 out of 827). Only 44 enzyme classes are found to contain false positives, while the remaining 28 enzyme classes are not represented. We also show cases where the re-annotation procedure results in partial overlaps for those few enzyme classes where a certain inconsistency might appear between homologous proteins, mostly due to function specificity. Our results allow the interactive exploration of the EC hierarchy for known enzyme families as well as putative enzyme sequences that may need to be classified within the EC hierarchy. These aspects of our framework have been incorporated into a web-server, called CORRIE, which stands for Correspondence Indicator Estimation and allows the interactive prediction of a functional class for putative enzymes from sequence alone, supported by probabilistic measures in the context of the pre-calculated Correspondence Indicators of known enzymes with the functional classes of the EC hierarchy. The CORRIE server is available at: http://www.genomes.org/services/corrie/.
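The class counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text (827 EC classes in total, 755 correctly re-annotated, 44 with false positives, 28 not represented); the variable names are illustrative.

```python
# Counts taken directly from the abstract.
total_classes = 827
correct = 755
with_false_positives = 44
not_represented = 28

# The three categories should partition the full set of EC classes.
categories_sum = correct + with_false_positives + not_represented

# Fraction of classes correctly re-annotated, reported as 91% in the abstract.
fraction_correct = correct / total_classes

if __name__ == "__main__":
    print(f"categories sum to {categories_sum} of {total_classes}")
    print(f"correctly re-annotated: {fraction_correct:.1%}")
```

The three categories add up to the stated total, and the ratio rounds to the 91% reported.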
11.
Dávila AM Lorenzini DM Mendes PN Satake TS Sousa GR Campos LM Mazzoni CJ Wagner G Pires PF Grisard EC Cavalcanti MC Campos ML 《Bioinformatics (Oxford, England)》2005,21(23):4302-4303
SUMMARY: Growth of genome data and analysis possibilities have brought new levels of difficulty for scientists to understand, integrate and deal with all this ever-increasing information. In this scenario, GARSA has been conceived aiming to facilitate the tasks of integrating, analyzing and presenting genomic information from several bioinformatics tools and genomic databases, in a flexible way. GARSA is a user-friendly web-based system designed to analyze genomic data in the context of a pipeline. EST and GSS data can be analyzed using the system since it accepts (1) chromatograms, (2) download of sequences from GenBank, (3) Fasta files stored locally or (4) a combination of all three. Quality evaluation of chromatograms, vector removal and clustering are easily performed as part of the pipeline. A number of local and customizable Blast and CDD analyses can be performed as well as Interpro, complemented with phylogeny analyses. GARSA is being used for the analyses of Trypanosoma vivax (GSS and EST), Trypanosoma rangeli (GSS, EST and ORESTES), Bothrops jararaca (EST), Piaractus mesopotamicus (EST) and Lutzomyia longipalpis (EST). AVAILABILITY: The GARSA system is freely available under GPL license (http://www.biowebdb.org/garsa/). For download requests visit http://www.biowebdb.org/garsa/ or contact Dr Alberto Dávila.
12.
Automated genome sequence analysis and annotation. (total citations: 5; self-citations: 0; citations by others: 5)
M A Andrade N P Brown C Leroy S Hoersch A de Daruvar C Reich A Franchini J Tamames A Valencia C Ouzounis C Sander 《Bioinformatics (Oxford, England)》1999,15(5):391-412
MOTIVATION: Large-scale genome projects generate a rapidly increasing number of sequences, most of them biochemically uncharacterized. Research in bioinformatics contributes to the development of methods for the computational characterization of these sequences. However, the installation and application of these methods require experience and are time consuming. RESULTS: We present here an automatic system for preliminary functional annotation of protein sequences that has been applied to the analysis of sets of sequences from complete genomes, both to refine overall performance and to make new discoveries comparable to those made by human experts. The GeneQuiz system includes a Web-based browser that allows examination of the evidence leading to an automatic annotation and offers additional information, views of the results, and links to biological databases that complement the automatic analysis. System structure and operating principles concerning the use of multiple sequence databases, underlying sequence analysis tools, lexical analyses of database annotations and decision criteria for functional assignments are detailed. The system makes automatic quality assessments of results based on prior experience with the underlying sequence analysis tools; overall error rates in functional assignment are estimated at 2.5-5% for cases annotated with highest reliability ('clear' cases). Sources of over-interpretation of results are discussed with proposals for improvement. A conservative definition for reporting 'new findings' that takes account of database maturity is presented along with examples of possible kinds of discoveries (new function, family and superfamily) made by the system. System performance in relation to sequence database coverage, database dynamics and database search methods is analysed, demonstrating the inherent advantages of an integrated automatic approach using multiple databases and search methods applied in an objective and repeatable manner. 
AVAILABILITY: The GeneQuiz system is publicly available for analysis of protein sequences through a Web server at http://www.sander.ebi.ac.uk/gqsrv/submit
13.
UniSave: the UniProtKB sequence/annotation version database (total citations: 1; self-citations: 0; citations by others: 1)
SUMMARY: The UniProtKB Sequence/Annotation Version database (UniSave) is a comprehensive archive of UniProtKB/Swiss-Prot and UniProtKB/TrEMBL entry versions. All changed Swiss-Prot and TrEMBL entries are loaded into the UniSave as part of the public bi-weekly UniProtKB releases. Unlike the UniProtKB, which contains only the latest Swiss-Prot and TrEMBL entry versions, the UniSave provides access to previous versions of these entries. AVAILABILITY: http://www.ebi.ac.uk/uniprot/unisave
14.
Gu S Anderson I Kunin V Cipriano M Minovitsky S Weber G Amenta N Hamann B Dubchak I 《Bioinformatics (Oxford, England)》2007,23(6):764-766
We describe a general multiplatform exploratory tool called TreeQ-Vista, designed for presenting functional annotations in a phylogenetic context. Traits, such as phenotypic and genomic properties, are interactively queried from a user-provided relational database with a user-friendly interface which provides a set of tools for users with or without SQL knowledge. The query results are projected onto a phylogenetic tree and can be displayed in multiple color groups. A rich set of browsing, grouping and query tools is provided to facilitate trait exploration, comparison and analysis. AVAILABILITY: The program, detailed tutorial and examples are available online (http://genome.lbl.gov/vista/TreeQVista).
15.
MOTIVATION: To be fully and efficiently exploited, data coming from sequencing projects together with specific sequence analysis tools need to be integrated within reliable data management systems. Systems designed to manage genome data and analysis tend to give greater importance either to data storage or to the methodological aspect, but lack a complete integration of both components. RESULTS: This paper presents a co-operative computer environment (called Imagene™) dedicated to genomic sequence analysis and annotation. Imagene has been developed by using an object-based model. Thanks to this representation, the user can directly manipulate familiar data objects through icons or lists. Imagene also incorporates a solving engine in order to manage analysis tasks. A global task is solved by successive divisions into smaller sub-tasks. During program execution, these sub-tasks are graphically displayed to the user and may be further restarted at any point after task completion. In this sense, Imagene is more transparent to the user than a traditional menu-driven package. Imagene also provides a user interface to display, on the same screen, the results produced by several tasks, together with the capability to annotate these results easily. In its current form, Imagene has been designed particularly for use in microbial sequencing projects. AVAILABILITY: Imagene runs best on SGI (Irix 6.3 or higher) workstations. It is distributed free of charge on a CD-ROM, but requires some Ilog licensed software to run. Some modules also require separate license agreements. Please contact the authors for specific academic conditions and other Unix platforms. CONTACT: Imagene home page: http://wwwabi.snv.jussieu.fr/imagene
16.
Mount SM 《American journal of human genetics》2000,67(4):788-792
17.
18.
Background
Visualization of sequence annotation is a common feature in many bioinformatics tools. For many applications it is desirable to restrict the display of such annotation according to a score cutoff, as biological interpretation can be difficult in the presence of the entire data. Unfortunately, many visualisation solutions are somewhat static in the way they handle such score cutoffs.
19.
Kumar K Desai V Cheng L Khitrov M Grover D Satya RV Yu C Zavaljevski N Reifman J 《PloS one》2011,6(3):e17469
BACKGROUND: The annotation of genomes from next-generation sequencing platforms needs to be rapid, high-throughput, and fully integrated and automated. Although a few Web-based annotation services have recently become available, they may not be the best solution for researchers that need to annotate a large number of genomes, possibly including proprietary data, and store them locally for further analysis. To address this need, we developed a standalone software application, the Annotation of microbial Genome Sequences (AGeS) system, which incorporates publicly available and in-house-developed bioinformatics tools and databases, many of which are parallelized for high-throughput performance. METHODOLOGY: The AGeS system supports three main capabilities. The first is the storage of input contig sequences and the resulting annotation data in a central, customized database. The second is the annotation of microbial genomes using an integrated software pipeline, which first analyzes contigs from high-throughput sequencing by locating genomic regions that code for proteins, RNA, and other genomic elements through the Do-It-Yourself Annotation (DIYA) framework. The identified protein-coding regions are then functionally annotated using the in-house-developed Pipeline for Protein Annotation (PIPA). The third capability is the visualization of annotated sequences using GBrowse. To date, we have implemented these capabilities for bacterial genomes. AGeS was evaluated by comparing its genome annotations with those provided by three other methods. Our results indicate that the software tools integrated into AGeS provide annotations that are in general agreement with those provided by the compared methods. This is demonstrated by a >94% overlap in the number of identified genes, a significant number of identical annotated features, and a >90% agreement in enzyme function predictions.
20.
Multi-species comparisons of DNA sequences are more powerful for discovering functional sequences than pairwise DNA sequence comparisons. Most current computational tools have been designed for pairwise comparisons, and efficient extension of these tools to multiple species will require knowledge of the ideal evolutionary distance to choose and the development of new algorithms for alignment, analysis of conservation, and visualization of results.