Similar articles
 Found 20 similar articles
1.

Background  

The BioMoby project aims to identify and deploy standards and conventions that aid in the discovery, execution, and pipelining of distributed bioinformatics Web Services. As of August 2006, approximately 680 bioinformatics resources were available through the BioMoby interoperability platform. There are a variety of clients that can interact with BioMoby-style services. Here we describe a Web-based browser-style client – Gbrowse Moby – that allows users to discover and "surf" from one bioinformatics service to the next using a semantically-aided browsing interface.

2.

Background

Over the past decade the workflow system paradigm has evolved as an efficient and user-friendly approach for developing complex bioinformatics applications. Two popular workflow systems that have gained acceptance by the bioinformatics community are Taverna and Galaxy. Each system has a large user-base and supports an ever-growing repository of application workflows. However, workflows developed for one system cannot be imported and executed easily on the other. The lack of interoperability is due to differences in the models of computation, workflow languages, and architectures of both systems. This lack of interoperability limits sharing of workflows between the user communities and leads to duplication of development efforts.

Results

In this paper, we present Tavaxy, a stand-alone system for creating and executing workflows based on using an extensible set of re-usable workflow patterns. Tavaxy offers a set of new features that simplify and enhance the development of sequence analysis applications: It allows the integration of existing Taverna and Galaxy workflows in a single environment, and supports the use of cloud computing capabilities. The integration of existing Taverna and Galaxy workflows is supported seamlessly at both run-time and design-time levels, based on the concepts of hierarchical workflows and workflow patterns. The use of cloud computing in Tavaxy is flexible, where the users can either instantiate the whole system on the cloud, or delegate the execution of certain sub-workflows to the cloud infrastructure.

Conclusions

Tavaxy reduces the workflow development cycle by introducing the use of workflow patterns to simplify workflow creation. It enables the re-use and integration of existing (sub-) workflows from Taverna and Galaxy, and allows the creation of hybrid workflows. Its additional features exploit recent advances in high performance cloud computing to cope with the increasing data size and complexity of analysis. The system can be accessed either through a cloud-enabled web-interface or downloaded and installed to run within the user's local environment. All resources related to Tavaxy are available at http://www.tavaxy.org.

3.

Background  

The use of clustering methods for the discovery of cancer subtypes has drawn a great deal of attention in the scientific community. While bioinformaticians have proposed new clustering methods that take advantage of characteristics of the gene expression data, the medical community has a preference for using "classic" clustering methods. There have been no studies thus far performing a large-scale evaluation of different clustering methods in this context.

4.
DataBiNS is a custom-designed BioMoby Web Service workflow that integrates non-synonymous coding single nucleotide polymorphism (nsSNP) data with structure/function and pathway data for the relevant protein. A KEGG Pathway Identifier representing a specific human biological pathway initializes the DataBiNS workflow. The workflow retrieves a list of publications, gene ontology annotations and nsSNP information for each gene involved in the biological pathway. Manual inspection of output data from several trial runs confirms that all expected information is appropriately retrieved by the workflow services. The use of an automated BioMoby workflow, rather than manual 'surfing', to retrieve the necessary data significantly reduces the effort required for functional interpretation of SNP data, and thus encourages more speculative investigation. Moreover, the modular nature of the individual BioMoby Services enables fine-grained reuse of each service in other workflows, thus reducing the effort required to carry out similar investigations in the future. AVAILABILITY: The workflow is freely available as a Taverna SCUFL XML document at the iCAPTURE Centre web site, http://www.mrl.ubc.ca/who/who_bios_scott_tebbutt.shtml.

5.

Background  

R is the preferred statistical analysis tool for many bioinformaticians, due in part to the increasing number of freely available analytical methods. Such methods can be quickly reused and adapted to each particular experiment. However, in experiments where large amounts of data are generated, for example using high-throughput screening devices, the processing time required to analyze data is often quite long. One way to reduce the processing time is to use parallel computing technologies. Because R does not support parallel computations, several tools have been developed to enable such technologies. However, these tools require multiple modifications to the way R programs are usually written or run. Although these tools can ultimately speed up the calculations, the time, skills and additional resources required to use them are an obstacle for most bioinformaticians.

6.
7.

Background  

Over the last few years a number of methods have been proposed for the phenotype simulation of microorganisms under different environmental and genetic conditions. These have been used as the basis to support the discovery of successful genetic modifications of the microbial metabolism to address industrial goals. However, the use of these methods has been restricted to bioinformaticians or other expert researchers. The main aim of this work is, therefore, to provide a user-friendly computational tool for Metabolic Engineering applications.

8.
Although there is great promise in the benefits to be obtained by analyzing cancer genomes, numerous challenges hinder different stages of the process, from the problem of sample preparation and the validation of the experimental techniques, to the interpretation of the results. This chapter specifically focuses on the technical issues associated with the bioinformatics analysis of cancer genome data. The main issues addressed are the use of database and software resources, the use of analysis workflows and the presentation of clinically relevant action items. We attempt to aid new developers in the field by describing the different stages of analysis and discussing current approaches, as well as by providing practical advice on how to access and use resources, and how to implement recommendations. Real cases from cancer genome projects are used as examples.

What to Learn in This Chapter

This chapter presents an overview of how cancer genomes can be analyzed, discussing some of the challenges involved and providing practical advice on how to address them. As the primary analysis of experimental data is described elsewhere (sequencing, alignment and variant calling), we will focus on the secondary analysis of the data, i.e., the selection of candidate driver genes, functional interpretation and the presentation of the results. Emphasis is placed on how to build applications that meet the needs of researchers, academics and clinicians. The general features of such applications are laid out, along with advice on their design and implementation. This document should serve as a starter guide for bioinformaticians interested in the analysis of cancer genomes, although we also hope that more experienced bioinformaticians will find interesting solutions to some key technical issues.
This article is part of the “Translational Bioinformatics” collection for PLOS Computational Biology.

9.

Background  

The reliable dissection of large proteins into structural domains represents an important issue for structural genomics/proteomics projects. To provide a practical approach to this issue, we tested the ability of a neural network to identify domain linkers from the SWISSPROT database (101602 sequences).

10.

Background  

Processing raw DNA sequence data is an especially challenging task for relatively small laboratories and core facilities that produce as many as 5000 or more DNA sequences per week from multiple projects in widely differing species. To meet this challenge, we have developed the flexible, scalable, and automated sequence processing package described here.

11.

Background  

Determining the position and order of contigs and scaffolds from a genome assembly within an organism's genome remains a technical challenge in a majority of sequencing projects. In order to exploit contemporary technologies for DNA sequencing, we developed a strategy for whole genome single nucleotide polymorphism sequencing allowing the positioning of sequence contigs onto a linkage map using the bin mapping method.

12.
Summary: Taverna is an application that eases the integration of tools and databases for life science research by the construction of workflows. The Taverna Interaction Service extends the functionality of Taverna by defining human interaction within a workflow and acting as a mediation layer between the automated workflow engine and one or more users. Availability: Taverna, the Interaction Service plug-in and web application are available as open source and can be downloaded from http://taverna.sourceforge.net/ Contact: taverna-users{at}lists.sourceforge.net Associate Editor: John Quackenbush

13.
14.

Background  

The MAQC project demonstrated that microarrays with comparable content show inter- and intra-platform reproducibility. However, since the content of gene databases continues to grow, the development of new generations of microarrays covering new content is mandatory. To better understand the potential challenges that updated microarray content might pose for clinical and biological projects, we developed a methodology consisting of in silico analyses combined with performance analysis using real biological samples.

15.

Background  

High-throughput genotyping and phenotyping projects of large epidemiological study populations require sophisticated laboratory information management systems. Most epidemiological studies include subject-related personal information, which needs to be handled with care by following data privacy protection guidelines. In addition, genotyping core facilities handling cooperative projects require a straightforward solution to monitor the status and financial resources of the different projects.

16.

Background

Next-generation sequencing is making it critical to robustly and rapidly handle genomic ranges within standard pipelines. Standard use-cases include annotating sequence ranges with gene or other genomic annotation, merging multiple experiments together and subsequently quantifying and visualizing the overlap. The most widely-used tools for these tasks work at the command-line (e.g. BEDTools) and the small number of available R packages are either slow or have distinct semantics and features from command-line interfaces.

Results

To provide a robust R-based interface to standard command-line tools for genomic coordinate manipulation, we created bedr. This open-source R package can use either BEDTools or BEDOPS as a back-end and performs data-manipulation extremely quickly, creating R data structures that can be readily interfaced with existing computational pipelines. It includes data-visualization capabilities and a number of data-access functions that interface with standard databases like UCSC and COSMIC.

Conclusions

The bedr package provides an open-source solution for manipulating and restructuring genomic interval data in the R programming language, which is commonly used in bioinformatics, and should therefore be useful to bioinformaticians and genomic researchers.
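
To make the interval operations mentioned in this abstract concrete (merging ranges and quantifying their overlap), here is a minimal Python sketch. It is not the API of bedr, BEDTools or BEDOPS: the function names `merge_intervals` and `overlap_bp` and the tuple layout are assumptions made purely for illustration, and coordinates are assumed to be 0-based and half-open, as in the BED format.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical layout: (chromosome, start, end), 0-based half-open as in BED.
Interval = Tuple[str, int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent intervals per chromosome."""
    by_chrom: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    merged: List[Interval] = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                cur_start, cur_end = start, end
            elif start <= cur_end:
                # Overlaps (or touches) the current run: extend it.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


def overlap_bp(a: List[Interval], b: List[Interval]) -> int:
    """Count base pairs covered by both interval sets (simple pairwise scan)."""
    total = 0
    for chrom_a, start_a, end_a in merge_intervals(a):
        for chrom_b, start_b, end_b in merge_intervals(b):
            if chrom_a == chrom_b:
                total += max(0, min(end_a, end_b) - max(start_a, start_b))
    return total


if __name__ == "__main__":
    experiment_1 = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 10, 20)]
    experiment_2 = [("chr1", 250, 450), ("chr2", 0, 50)]
    print(merge_intervals(experiment_1))
    print(overlap_bp(experiment_1, experiment_2))
```

The pairwise scan in `overlap_bp` is quadratic in the number of merged intervals, which is acceptable for a small illustration; dedicated command-line tools of the kind the abstract describes rely on sorted input and more efficient algorithms.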

17.

Background  

With the rapid growth of genome sequencing projects, genome browsers are becoming indispensable, not only as visualization systems but also as interactive platforms to support open data access and collaborative work. Thus a customizable genome browser framework with rich functions and flexible configuration is needed to facilitate various genome research projects.

18.

Background  

OMA is a project that aims to identify orthologs within publicly available, complete genomes. With 657 genomes analyzed to date, OMA is one of the largest projects of its kind.

19.

Background  

Large-scale genetic mapping projects require data management systems that can handle complex phenotypes and detect and correct high-throughput genotyping errors, yet are easy to use.

20.
The BioMoby project was initiated in 2001 from within the model organism database community. It aimed to standardize methodologies to facilitate information exchange and access to analytical resources, using a consensus driven approach. Six years later, the BioMoby development community is pleased to announce the release of the 1.0 version of the interoperability framework, registry Application Programming Interface and supporting Perl and Java code-bases. Together, these provide interoperable access to over 1400 bioinformatics resources worldwide through the BioMoby platform, and this number continues to grow. Here we highlight and discuss the features of BioMoby that make it distinct from other Semantic Web Service and interoperability initiatives, and that have been instrumental to its deployment and use by a wide community of bioinformatics service providers. The standard, client software, and supporting code libraries are all freely available at http://www.biomoby.org/.
