首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 262 毫秒
1.
WebACT--an online companion for the Artemis Comparison Tool   总被引:4,自引:0,他引:4  
SUMMARY: WebACT is an online resource which enables the rapid provision of simultaneous BLAST comparisons between up to five genomic sequences in a format amenable for visualization with the well-known Artemis Comparison Tool (ACT). Comparisons can be generated on-the-fly using sequences directly retrieved via EMBL database queries, or by entering or uploading user sequences. Furthermore, pre-computed comparisons are available between all publicly available, completed prokaryotic genomes and plasmids currently contained within the Genome Reviews database (372 sequences, representing 175 different species). The system is designed to minimize the volume of downloaded data and maximize performance. Genome sequences, annotation and pre-computed comparisons are stored in a relational database allowing flexible querying based on user-defined sequence regions, from whole genome to a defined region flanking a specified gene. Comparison and sequence files, whether computed online or retrieved from the database of pre-computed genome comparisons, can be viewed online using ACT and are available for download. AVAILABILITY: Freely accessible at http://www.webact.org. SUPPLEMENTARY INFORMATION: User guide and worked examples are available at http://www.webact.org/WebACT/docs.  相似文献   

2.
The ANDVisio tool is designed to reconstruct and analyze associative gene networks in the earlier developed Associative Network Discovery System (ANDSystem) software package. The ANDSystem incorporates utilities for automated extraction of knowledge from Pubmed published scientific texts, analysis of factographic databases, also the ANDCell database containing information on molecular-genetic events retrieved from texts and databases. ANDVisio is a new user's interface to the ANDCell database stored in a remote server. ANDVisio provides graphic visualization, editing, search, also saving of associative gene networks in different formats resulting from user's request. The associative gene networks describe semantic relationships between molecular-genetic objects (proteins, genes, metabolites and others), biological processes, and diseases. ANDVisio is provided with various tools to support filtering by object types, relationships between objects and information sources; graph layout; search of the shortest pathway; cycles in graphs.  相似文献   

3.
4.
The genome sequence database (GSDB) is a complete, publicly available relational database of DNA sequences and annotation maintained by the National Center for Genome Resources (NCGR) under a Cooperative Agreement with the US Department of Energy (DOE). GSDB provides direct, client- server access to the database for data contributions, community annotation and SQL queries. The GSDB Annotator, a multi-platform graphic user interface, is freely available. Automatically updated relational replicates of GSDB are also freely available.  相似文献   

5.
Functional context for biological sequence is provided in the form of annotations. However, within a group of similar sequences there can be annotation heterogeneity in terms of coverage and specificity. This in turn can introduce issues regarding the interpretation of actual functional similarity and overall functional coherence of such a group. One way to mitigate such issues is through the use of visualization and statistical techniques. Therefore, in order to help interpret this annotation heterogeneity we created a web application that generates Gene Ontology annotation graphs for protein sets and their associated statistics from simple frequencies to enrichment values and Information Content based metrics. The publicly accessible website http://xldb.di.fc.ul.pt/gryfun/ currently accepts lists of UniProt accession numbers in order to create user-defined protein sets for subsequent annotation visualization and statistical assessment. GRYFUN is a freely available web application that allows GO annotation visualization of protein sets and which can be used for annotation coherence and cohesiveness analysis and annotation extension assessments within under-annotated protein sets.  相似文献   

6.
The annotation of noncoding RNA genes remains a major bottleneck in genome sequencing projects. Most genome sequences released today still come with sets of tRNAs and rRNAs as the only annotated RNA elements, ignoring hundreds of other RNA families. We have developed a web environment that is dedicated to noncoding RNA (ncRNA) prediction, annotation, and analysis and allows users to run a variety of tools in an integrated and flexible manner. This environment offers complementary ncRNA gene finders and a set of tools for the comparison, visualization, editing, and export of ncRNA candidates. Predictions can be filtered according to a large set of characteristics. Based on this environment, we created a public website located at http://RNAspace.org. It accepts genomic sequences up to 5 Mb, which permits for an online annotation of a complete bacterial genome or a small eukaryotic chromosome. The project is hosted as a Source Forge project (http://rnaspace.sourceforge.net/).  相似文献   

7.
Although considerable progress has been made in dissecting the signaling pathways involved in the innate immune response, it is now apparent that this response can no longer be productively thought of in terms of simple linear pathways. InnateDB ( www.innatedb.ca ) has been developed to facilitate systems‐level analyses that will provide better insight into the complex networks of pathways and interactions that govern the innate immune response. InnateDB is a publicly available, manually curated, integrative biology database of the human and mouse molecules, experimentally verified interactions and pathways involved in innate immunity, along with centralized annotation on the broader human and mouse interactomes. To date, more than 3500 innate immunity‐relevant interactions have been contextually annotated through the review of 1000 plus publications. Integrated into InnateDB are novel bioinformatics resources, including network visualization software, pathway analysis, orthologous interaction network construction and the ability to overlay user‐supplied gene expression data in an intuitively displayed molecular interaction network and pathway context, which will enable biologists without a computational background to explore their data in a more systems‐oriented manner.  相似文献   

8.
SUMMARY: MuSeqBox is a program to parse BLAST output and store attributes of BLAST hits in tabular form. The user can apply a number of selection criteria to filter out hits with particular attributes. MuSeqBox provides a powerful annotation tool for large sets of query sequences that are simultaneously compared against a database with any of the standard stand-alone or network-client BLAST programs. We discuss such application to the problem of annotation and analysis of EST collections. AVAILABILITY: The program was written in standard C++ and is freely available to noncommercial users by request from the authors. The program is also available over the web at http://bioinformatics.iastate.edu/bioinformatics2go/mb/MuSeqBox.html.  相似文献   

9.
Many problems in analytical biology, such as the classification of organisms, the modelling of macromolecules, or the structural analysis of metabolic or neural networks, involve complex relational data. Here, we describe a software environment, the portable UNIX programming system (PUPS), which has been developed to allow efficient computational representation and analysis of such data. The system can also be used as a general development tool for database and classification applications. As the complexity of analytical biology problems may lead to computation times of several days or weeks even on powerful computer hardware, the PUPS environment gives support for persistent computations by providing mechanisms for dynamic interaction and homeostatic protection of processes. Biological objects and their interrelations are also represented in a homeostatic way in PUPS. Object relationships are maintained and updated by the objects themselves, thus providing a flexible, scalable and current data representation. Based on the PUPS environment, we have developed an optimization package, CANTOR, which can be applied to a wide range of relational data and which has been employed in different analyses of neuroanatomical connectivity. The CANTOR package makes use of the PUPS system features by modifying candidate arrangements of objects within the system's database. This restructuring is carried out via optimization algorithms that are based on user-defined cost functions, thus providing flexible and powerful tools for the structural analysis of the database content. The use of stochastic optimization also enables the CANTOR system to deal effectively with incomplete and inconsistent data. Prototypical forms of PUPS and CANTOR have been coded and used successfully in the analysis of anatomical and functional mammalian brain connectivity, involving complex and inconsistent experimental data. In addition, PUPS has been used for solving multivariate engineering optimization problems and to implement the digital identification system (DAISY), a system for the automated classification of biological objects. PUPS is implemented in ANSI-C under the POSIX.1 standard and is to a great extent architecture- and operating-system independent. The software is supported by systems libraries that allow multi-threading (the concurrent processing of several database operations), as well as the distribution of the dynamic data objects and library operations over clusters of computers. These attributes make the system easily scalable, and in principle allow the representation and analysis of arbitrarily large sets of relational data. PUPS and CANTOR are freely distributed (http://www.pups.org.uk) as open-source software under the GNU license agreement.  相似文献   

10.
MOTIVATION: The Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway database is a very valuable information resource for researchers in the fields of life sciences. It contains metabolic and regulatory processes in the form of wiring diagrams, which can be used for browsing and information retrieval as well as a base for modeling and simulation. Thus it helps in understanding biological processes and higher-order functions of biological systems. Currently the KEGG website uses semi-static visualizations for the presentation and navigation of its pathway information. While this visualization style offers a good pathway presentation and navigation, it does not provide some of the possibilities related to dynamic visualizations, most importantly, the creation and visualization of user-specific pathways. RESULTS: This paper presents methods for the dynamic visualization, interactive navigation and editing of KEGG pathway diagrams. These diagrams, given as KEGG Markup Language (KGML) files, can be visually explored using novel approaches combining semi-static and dynamic visualization, but also edited or even newly created and then exported into KGML files. AVAILABILITY: KGML-ED, a program implementing the presented methods, is available free of charge to the scientific community at http://kgml-ed.ipk-gatersleben.de.  相似文献   

11.
Scansite identifies short protein sequence motifs that are recognized by modular signaling domains, phosphorylated by protein Ser/Thr- or Tyr-kinases or mediate specific interactions with protein or phospholipid ligands. Each sequence motif is represented as a position-specific scoring matrix (PSSM) based on results from oriented peptide library and phage display experiments. Predicted domain-motif interactions from Scansite can be sequentially combined, allowing segments of biological pathways to be constructed in silico. The current release of Scansite, version 2.0, includes 62 motifs characterizing the binding and/or substrate specificities of many families of Ser/Thr- or Tyr-kinases, SH2, SH3, PDZ, 14-3-3 and PTB domains, together with signature motifs for PtdIns(3,4,5)P(3)-specific PH domains. Scansite 2.0 contains significant improvements to its original interface, including a number of new generalized user features and significantly enhanced performance. Searches of all SWISS-PROT, TrEMBL, Genpept and Ensembl protein database entries are now possible with run times reduced by approximately 60% when compared with Scansite version 1.0. Scansite 2.0 allows restricted searching of species-specific proteins, as well as isoelectric point and molecular weight sorting to facilitate comparison of predictions with results from two-dimensional gel electrophoresis experiments. Support for user-defined motifs has been increased, allowing easier input of user-defined matrices and permitting user-defined motifs to be combined with pre-compiled Scansite motifs for dual motif searching. In addition, a new series of Sequence Match programs for non-quantitative user-defined motifs has been implemented. Scansite is available via the World Wide Web at http://scansite.mit.edu.  相似文献   

12.
Mayday is a workbench for visualization, analysis and storage of microarray data. It features a graphical user interface and supports the development and integration of existing and new analysis methods. Besides the infrastructural core functionality, Mayday offers a variety of plug-ins, such as various interactive viewers, a connection to the R statistical environment, a connection to SQL-based databases and different data mining methods, including WEKA-library based methods for classification and various clustering methods. In addition, so-called meta information objects are provided for annotation of the microarray data allowing integration of data from different sources, which is a feature that, for instance, is employed in the enhanced heatmap visualization. Supplementary information: The software and more detailed information including screenshots and a user guide as well as test data can be found on the Mayday home page http://www.zbit.uni-tuebingen.de/pas/mayday. The core is published under the GPL (GNU Public License) and the associated plug-ins under the LGPL (Lesser GNU Public License).  相似文献   

13.
The program package ‘ClustScan’ (Cluster Scanner) is designed for rapid, semi-automatic, annotation of DNA sequences encoding modular biosynthetic enzymes including polyketide synthases (PKS), non-ribosomal peptide synthetases (NRPS) and hybrid (PKS/NRPS) enzymes. The program displays the predicted chemical structures of products as well as allowing export of the structures in a standard format for analyses with other programs. Recent advances in understanding of enzyme function are incorporated to make knowledge-based predictions about the stereochemistry of products. The program structure allows easy incorporation of additional knowledge about domain specificities and function. The results of analyses are presented to the user in a graphical interface, which also allows easy editing of the predictions to incorporate user experience. The versatility of this program package has been demonstrated by annotating biochemical pathways in microbial, invertebrate animal and metagenomic datasets. The speed and convenience of the package allows the annotation of all PKS and NRPS clusters in a complete Actinobacteria genome in 2–3 man hours. The open architecture of ClustScan allows easy integration with other programs, facilitating further analyses of results, which is useful for a broad range of researchers in the chemical and biological sciences.  相似文献   

14.

Background

The expressed sequence tag (EST) methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets poses a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO-terms, EC-numbers and KEGG-pathways.

Results

annot8r automatically downloads all files relevant for the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) annotation and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and also in a relational postgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools.

Conclusion

annot8r is a tool that assigns GO, EC and KEGG annotations for data sets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non-model species EST-sequencing projects.  相似文献   

15.
SUMMARY: With increasing numbers of eukaryotic genome sequences, phylogenetic profiles of eukaryotic genes are becoming increasingly informative. Here, we introduce a new web-tool Phylopro (http://compsysbio.org/phylopro/), which uses the 120 available eukaryotic genome sequences to visualize the evolutionary trajectories of user-defined subsets of model organism genes. Applied to pathways or complexes, PhyloPro allows the user to rapidly identify core conserved elements of biological processes together with those that may represent lineage-specific innovations. PhyloPro thus provides a valuable resource for the evolutionary and comparative studies of biological systems.  相似文献   

16.
A common biological pathway reconstruction approach—as implemented by many automatic biological pathway services (such as the KAAS and RAST servers) and the functional annotation of metagenomic sequences—starts with the identification of protein functions or families (e.g., KO families for the KEGG database and the FIG families for the SEED database) in the query sequences, followed by a direct mapping of the identified protein families onto pathways. Given a predicted patchwork of individual biochemical steps, some metric must be applied in deciding what pathways actually exist in the genome or metagenome represented by the sequences. Commonly, and straightforwardly, a complete biological pathway can be identified in a dataset if at least one of the steps associated with the pathway is found. We report, however, that this naïve mapping approach leads to an inflated estimate of biological pathways, and thus overestimates the functional diversity of the sample from which the DNA sequences are derived. We developed a parsimony approach, called MinPath (Minimal set of Pathways), for biological pathway reconstructions using protein family predictions, which yields a more conservative, yet more faithful, estimation of the biological pathways for a query dataset. MinPath identified far fewer pathways for the genomes collected in the KEGG database—as compared to the naïve mapping approach—eliminating some obviously spurious pathway annotations. Results from applying MinPath to several metagenomes indicate that the common methods used for metagenome annotation may significantly overestimate the biological pathways encoded by microbial communities.  相似文献   

17.
Abuse of drugs can elicit compulsive drug seeking behaviors upon repeated administration, and ultimately leads to the phenomenon of addiction. We developed a procedure for the standardization of microarray gene expression data of rat brain in drug addiction and stored them in a single integrated database system, focusing on more effective data processing and interpretation. Another characteristic of the present database is that it has a systematic flexibility for statistical analysis and linking with other databases. Basically, we adopt an intelligent SQL querying system, as the foundation of our DB, in order to set up an interactive module which can automatically read the raw gene expression data in the standardized format. We maximize the usability of this DB, helping users study significant gene expression and identify biological function of the genes through integrated up-to-date gene information such as GO annotation and metabolic pathway. For collecting the latest information of selected gene from the database, we also set up the local BLAST search engine and nonredundant sequence database updated by NCBI server on a daily basis. We find that the present database is a useful query interface and data-mining tool, specifically for finding out the genes related to drug addiction. We apply this system to the identification and characterization of methamphetamine-induced genes' behavior in rat brain.  相似文献   

18.
19.
Associations among biological objects such as genes, proteins, and drugs can be discovered automatically from the scientific literature. TransMiner is a system for finding associations among objects by mining the Medline database of the scientific literature. The direct associations among the objects are discovered based on the principle of co-occurrence in the form of an association graph. The principle of transitive closure is applied to the association graph to find potential transitive associations. The potential transitive associations that are indeed direct are discovered by iterative retrieval and mining of the Medline documents. Those associations that are not found explicitly in the entire Medline database are transitive associations and are the candidates for hypothesis generation. The transitive associations were ranked based on the sum of weight of terms that cooccur with both the objects. The direct and transitive associations are visualized using a graph visualization applet. TransMiner was tested by finding associations among 56 breast cancer genes and among 24 objects in the calpain signal transduction pathway. TransMiner was also used to rediscover associations between magnesium and migraine.  相似文献   

20.
Though biomedical research often draws on knowledge from a wide variety of fields, few visualization methods for biomedical data incorporate meaningful cross-database exploration. A new approach is offered for visualizing and exploring a query-based subset of multiple heterogeneous biomedical databases. Databases are modeled as an entity-relation graph containing nodes (database records) and links (relationships between records). Users specify a keyword search string to retrieve an initial set of nodes, and then explore intra- and interdatabase links. Results are visualized with user-defined semantic substrates to take advantage of the rich set of attributes usually present in biomedical data. Comments from domain experts indicate that this visualization method is potentially advantageous for biomedical knowledge exploration.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号