Similar literature
1.
The Dundee Resource for Sequence Analysis and Structure Prediction (DRSASP; http://www.compbio.dundee.ac.uk/drsasp.html ) is a collection of web services provided by the Barton Group at the University of Dundee. DRSASP's flagship services are the JPred4 webserver for secondary structure and solvent accessibility prediction and the JABAWS 2.2 webserver for multiple sequence alignment, disorder prediction, amino acid conservation calculations, and specificity‐determining site prediction. DRSASP resources are available through conventional web interfaces and APIs but are also integrated into the Jalview sequence analysis workbench, which enables the composition of multitool interactive workflows. Other existing Barton Group tools are being brought under the banner of DRSASP, including NoD (Nucleolar localization sequence detector) and 14‐3‐3‐Pred. New resources are being developed that enable the analysis of population genetic data in evolutionary and 3D structural contexts. Existing resources are actively developed to exploit new technologies and maintain parity with evolving web standards. DRSASP provides substantial computational resources for public use, and since 2016 DRSASP services have completed over 1.5 million jobs.

2.
3.
SUMMARY: TreeMos is a novel high-throughput graphical analysis application that allows the user to search for phylogenetic mosaicism among one or more DNA or protein sequence multiple alignments and additional unaligned sequences. TreeMos uses a sliding window and local alignment algorithm to identify the nearest neighbour of each sequence segment, and visualizes instances of sequence segments whose nearest neighbour is anomalous to that identified using the global alignment. Data sets can include whole genome sequences allowing phylogenomic analyses in which mosaicism may be attributed to recombination between any two points in the genome. TreeMos can be run from the command line, or within a web browser allowing the relationships between taxa to be explored by drill-through. AVAILABILITY: http://www2.warwick.ac.uk/fac/sci/whri/research/archaeobotany.
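
The sliding-window idea in the abstract above can be sketched in a few lines. This is only an illustration, not TreeMos itself: it assumes the sequences are already aligned to equal length and swaps the local alignment step for a plain per-position identity score. The function names, the window size and the step size are all hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length strings agree."""
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def nearest_neighbour(query: str, references: dict) -> str:
    """Name of the reference most similar to the query (first wins on ties)."""
    return max(references, key=lambda name: identity(query, references[name]))


def mosaic_windows(query: str, references: dict, window: int = 4, step: int = 4) -> list:
    """Return (start, local_neighbour) for every window whose nearest
    neighbour differs from the nearest neighbour of the full sequence."""
    global_nn = nearest_neighbour(query, references)
    anomalies = []
    for start in range(0, len(query) - window + 1, step):
        segment = query[start:start + window]
        # Compare the query segment with the same columns of each reference.
        local_refs = {n: s[start:start + window] for n, s in references.items()}
        local_nn = nearest_neighbour(segment, local_refs)
        if local_nn != global_nn:
            anomalies.append((start, local_nn))
    return anomalies


# Toy data (made up): the last third of the query resembles B, the rest A.
references = {"A": "AAAAAAAAAAAA", "B": "CCCCCCCCCCCC"}
query = "AAAAAAAACCCC"
print(mosaic_windows(query, references))
```

A window is reported only when its local neighbour disagrees with the global one, which mirrors the notion of an "anomalous" nearest neighbour described in the abstract.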

4.

Background  

Remote homology detection is a hard computational problem. Most approaches have trained computational models by using either full protein sequences or multiple sequence alignments (MSA), including all positions. However, when we deal with proteins in the "twilight zone" we can observe that only some segments of sequences (motifs) are conserved. We introduce a novel logical representation that allows us to represent physico-chemical properties of sequences, conserved amino acid positions and conserved physico-chemical positions in the MSA. From this, Inductive Logic Programming (ILP) finds the most frequent patterns (motifs) and uses them to train propositional models, such as decision trees and support vector machines (SVM).

5.
6.
Visually examining RNA structures can greatly aid in understanding their potential functional roles and in evaluating the performance of structure prediction algorithms. As many functional roles of RNA structures can already be studied given the secondary structure of the RNA, various methods have been devised for visualizing RNA secondary structures. Most of these methods depict a given RNA secondary structure as a planar graph consisting of base-paired stems interconnected by roundish loops. In this article, we present an alternative method of depicting RNA secondary structure as arc diagrams. This is well suited for structures that are difficult or impossible to represent as planar stem-loop diagrams. Arc diagrams can intuitively display pseudo-knotted structures, as well as transient and alternative structural features. In addition, they facilitate the comparison of known and predicted RNA secondary structures. An added benefit is that structure information can be displayed in conjunction with a corresponding multiple sequence alignment, thereby highlighting structure and primary sequence conservation and variation. We have implemented the visualization algorithm as a web server, R-chie, as well as a corresponding R package called R4RNA, which allows users to run the software locally and across a range of common operating systems.
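
To make the arc representation concrete, the sketch below turns a secondary structure into a list of arcs, one (i, j) pair per base pair, and tests whether two arcs cross, which is what a pseudoknot looks like in an arc diagram. It assumes the structure is given in extended dot-bracket notation, an input format the abstract does not mention, and it is written in Python rather than taken from R-chie or R4RNA. All function names are hypothetical.

```python
def arcs_from_dot_bracket(structure: str) -> list:
    """Convert a dot-bracket string into sorted (i, j) base-pair arcs.

    Each bracket type keeps its own stack, so pairs written with a second
    bracket type (commonly used for pseudoknots) are recovered as well.
    """
    closers = {")": "(", "]": "[", "}": "{"}
    stacks = {opener: [] for opener in closers.values()}
    arcs = []
    for i, ch in enumerate(structure):
        if ch in stacks:
            stacks[ch].append(i)
        elif ch in closers:
            stack = stacks[closers[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {i}")
            arcs.append((stack.pop(), i))
    for opener, stack in stacks.items():
        if stack:
            raise ValueError(f"unmatched {opener!r} at position {stack[-1]}")
    return sorted(arcs)


def arcs_cross(a: tuple, b: tuple) -> bool:
    """True if two arcs interleave (i < k < j < l), i.e. form a pseudoknot."""
    (i, j), (k, l) = sorted([a, b])
    return i < k < j < l


# Toy structure (made up) with two stems that cross each other.
arcs = arcs_from_dot_bracket("((..[[..))..]]")
print(arcs)
print(arcs_cross(arcs[0], arcs[2]))
```

Nested arcs never interleave, so a planar stem-loop drawing is possible exactly when no pair of arcs crosses; the crossing test is a cheap way to see why such structures suit arc diagrams.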

7.
TMpro is a transmembrane (TM) helix prediction algorithm that uses language processing methodology for TM segment identification. It is primarily based on the analysis of statistical distributions of properties of amino acids in transmembrane segments. This article describes the availability of TMpro on the internet via a web interface. The key features of the interface are: (i) output is generated in multiple formats, including a user-interactive graphical chart that allows comparison of TMpro-predicted segment locations with other labeled segments input by the user, such as predictions from other methods; (ii) up to 5000 sequences can be submitted at a time for prediction; and (iii) TMpro is available as a web server and is published as a web service, so that the method can be accessed by users as well as by other services, depending on the need for data integration. Availability: http://linzer.blm.cs.cmu.edu/tmpro/ (web server and help), http://blm.sis.pitt.edu:8080/axis/services/TMProFetcherService (web service).

8.
Workflow Information Storage Toolkit (WIST) is a set of application programming interfaces and web applications that allow for the rapid development of customized laboratory information management systems (LIMS). WIST provides common LIMS input components, and allows them to be arranged and configured using a flexible language that specifies each component's visual and semantic characteristics. WIST includes a complete set of web applications for adding, editing and viewing data, as well as a powerful setup tool that can build new LIMS modules by analyzing existing database schema. Availability and implementation: WIST is implemented in Perl and may be obtained from http://vimss.sf.net under the BSD license.

9.
EMBnet is a consortium of collaborating bioinformatics groups located mainly within Europe (http://www.embnet.org). Each member country is represented by a 'node', a group responsible for the maintenance of local services for their users (e.g. education, training, software, database distribution, technical support, helpdesk). Among these services a web portal with links and access to locally developed and maintained software is essential and different for each node. Our web portal targets biomedical scientists in Switzerland and elsewhere, offering them access to a collection of important sequence analysis tools mirrored from other sites or developed locally. We describe here the Swiss EMBnet node web site (http://www.ch.embnet.org), which presents a number of original services not available anywhere else.

10.
11.
INCLUSive is a suite of algorithms and tools for the analysis of gene expression data and the discovery of cis-regulatory sequence elements. The tools allow normalization, filtering and clustering of microarray data, functional scoring of gene clusters, sequence retrieval, and detection of known and unknown regulatory elements using probabilistic sequence models and Gibbs sampling. All tools are available via different web pages and as web services. The web pages are connected and integrated to reflect a methodology and facilitate complex analysis using different tools. The web services can be invoked using standard SOAP messaging. Example clients are available for download to invoke the services from a remote computer or to be integrated with other applications. All services are catalogued and described in a web service registry. The INCLUSive web portal is available for academic purposes at http://www.esat.kuleuven.ac.be/inclusive.

12.
Until now the most efficient solution to align nucleotide sequences containing open reading frames was to use indirect procedures that align amino acid translations before reporting the inferred gap positions at the codon level. There are two important pitfalls with this approach. Firstly, any premature stop codon impedes using such a strategy. Secondly, each sequence is translated with the same reading frame from beginning to end, so that the presence of a single additional nucleotide leads to both aberrant translation and alignment. We present an algorithm that has the same space and time complexity as the classical Needleman-Wunsch algorithm while accommodating sequencing errors and other biological deviations from the coding frame. The resulting pairwise coding sequence alignment method was extended to a multiple sequence alignment (MSA) algorithm implemented in a program called MACSE (Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons). MACSE is the first automatic solution to align protein-coding gene datasets containing non-functional sequences (pseudogenes) without disrupting the underlying codon structure. It has also proved useful in detecting undocumented frameshifts in public database sequences and in aligning next-generation sequencing reads/contigs against a reference coding sequence. MACSE is distributed as an open-source Java executable file with freely available source code and can be used via a web interface at: http://mbb.univ-montp2.fr/macse.
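
For reference, the classical Needleman-Wunsch recurrence that the abstract uses as its complexity baseline can be sketched as follows. This is the textbook global alignment score with a linear gap penalty, not the MACSE algorithm: it knows nothing about codons, frameshifts or stop codons. The scoring values and the function name are illustrative assumptions.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Optimal global alignment score of a and b with a linear gap penalty.

    Runs in O(len(a) * len(b)) time. Only two rows of the dynamic
    programming table are kept, because the score alone is returned;
    recovering the alignment itself would need the full table.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # Column 0: prefix of a against the empty string.
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]


print(needleman_wunsch_score("GATTACA", "GATTACA"))
print(needleman_wunsch_score("ACGT", "AGT"))
```

With these scores, a single inserted nucleotide costs one gap penalty at the nucleotide level; the pitfall described in the abstract arises only when the alignment is done on translations, where one extra nucleotide shifts every downstream codon.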

13.
The Jackson Laboratory Colony Management System (JCMS) is a software application for managing data and information related to research mouse colonies, associated biospecimens, and experimental protocols. JCMS runs directly on computers that run one of the PC Windows® operating systems, but can be accessed via web browser interfaces from any computer running a Windows, Macintosh®, or Linux® operating system. JCMS can be configured for a single user or multiple users in small- to medium-size work groups. The target audience for JCMS includes laboratory technicians, animal colony managers, and principal investigators. The application provides operational support for colony management and experimental workflows, sample and data tracking through transaction-based data entry forms, and date-driven work reports. Flexible query forms allow researchers to retrieve database records based on user-defined criteria. Recent advances in handheld computers with integrated barcode readers, middleware technologies, web browsers, and wireless networks add to the utility of JCMS by allowing real-time access to the database from any networked computer.

14.
An important task of computational biology is to identify those parts of a polypeptide chain that are involved in interactions with other proteins. For this purpose, we have developed the program PresCont, which predicts in a robust manner amino acids that constitute protein-protein interfaces (PPIs). PresCont reaches state-of-the-art classification quality on the basis of only four residue properties that can be readily deduced from the 3D structure of an individual protein and a multiple sequence alignment (MSA) composed of homologs. The core of PresCont is a support vector machine, which assesses solvent-accessible surface area, hydrophobicity, conservation, and the local environment of each amino acid on the protein surface. For training and performance testing, we compiled three nonoverlapping datasets consisting of permanently formed or transient complexes, respectively. A comparison with SPPIDER, ProMate, and meta-PPISP showed that PresCont compares favorably with these highly sophisticated programs, and that its prediction quality is less dependent on the type of protein complex being considered. This balance is due to a mutual compensation of classification weaknesses observed for individual properties: for PPIs of permanent complexes, solvent-accessible surface and hydrophobicity contribute most to classification quality; for PPIs of transient complexes, the assessment of the local environment is most significant. Moreover, we show that for permanent complexes a segmentation of PPIs into core and rim residues has only a moderate influence on prediction quality. PresCont is available as a web service at http://www-bioinf.uni-regensburg.de/.

15.
Surface proteins, such as those located in the cell wall of fungi, play an important role in the interaction with the surrounding environment. For instance, they mediate primary host-pathogen interactions and are crucial to the establishment of biofilms and fungal infections. Surface localization of proteins is determined by specific sequence features and can be predicted by combining different freely available web servers. However, user-friendly tools that allow rapid analysis of large datasets (whole proteomes or larger) in subsequent analyses were not yet available. Here, we present the web tool ProFASTA, which integrates multiple tools for rapid scanning of protein sequence properties in large datasets and returns sequences in FASTA format. ProFASTA also allows for pipeline filtering of proteins with cell surface characteristics by analysis of the output created with SignalP, TMHMM and big-PI. In addition, it provides keyword, iso-electric point, composition and pattern scanning. Furthermore, ProFASTA contains all fungal protein sequences present in the NCBI Protein database. As the full fungal NCBI Taxonomy is included, sequence subsets can be selected by supplying a taxon name. The usefulness of ProFASTA is demonstrated here with a few examples; in the recent past, ProFASTA has already been applied successfully to the annotation of covalently-bound fungal wall proteins as part of community-wide genome annotation programs. ProFASTA is available at: http://www.bioinformatics.nl/tools/profasta/.
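
The keyword and pattern scanning described above, with results returned in FASTA format, can be illustrated with a short standalone sketch. It is not ProFASTA code and does not call SignalP, TMHMM or big-PI; the function names, the example records and the regular expression are all made up for illustration.

```python
import re


def parse_fasta(text: str) -> list:
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def scan(records: list, pattern: str, keyword: str = "") -> list:
    """Keep records whose header contains the keyword (case-insensitive)
    and whose sequence contains a match for the regular expression."""
    regex = re.compile(pattern)
    return [(h, s) for h, s in records
            if keyword.lower() in h.lower() and regex.search(s)]


def to_fasta(records: list, width: int = 60) -> str:
    """Format records as FASTA text, wrapping sequences at the given width."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines)


# Toy input (made up); the pattern is an arbitrary three-residue motif.
text = ">p1 wall protein\nMKTSSAA\nGGNSTT\n>p2 kinase\nMKLLPP\n"
hits = scan(parse_fasta(text), r"N[^P][ST]", keyword="wall")
print(to_fasta(hits))
```

Keeping parsing, filtering and formatting as separate steps makes it easy to chain several filters, in the spirit of the pipeline filtering the abstract mentions.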

16.
Multiple sequence alignment using partial order graphs (total citations: 14; self-citations: 0; citations by others: 14)
MOTIVATION: Progressive Multiple Sequence Alignment (MSA) methods depend on reducing an MSA to a linear profile for each alignment step. However, this leads to loss of information needed for accurate alignment, and gap scoring artifacts. RESULTS: We present a graph representation of an MSA that can itself be aligned directly by pairwise dynamic programming, eliminating the need to reduce the MSA to a profile. This enables our algorithm (Partial Order Alignment (POA)) to guarantee that the optimal alignment of each new sequence versus each sequence in the MSA will be considered. Moreover, this algorithm introduces a new edit operator, homologous recombination, important for multidomain sequences. The algorithm has improved speed (linear time complexity) over existing MSA algorithms, enabling construction of massive and complex alignments (e.g. an alignment of 5000 sequences in 4 h on a Pentium II). We demonstrate the utility of this algorithm on a family of multidomain SH2 proteins, and on EST assemblies containing alternative splicing and polymorphism. AVAILABILITY: The partial order alignment program POA is available at http://www.bioinformatics.ucla.edu/poa.

17.
Ten years of experience with molecular class–specific information systems (MCSIS), such as the hand‐curated G protein–coupled receptor database (GPCRDB) or the semiautomatically generated nuclear receptor database, has made clear that a wide variety of questions can be answered when protein‐related data from many different origins can be flexibly combined. MCSISes revolve around a multiple sequence alignment (MSA) that includes “all” available sequences from the entire superfamily, and it has been shown on many occasions that the quality of these alignments is the most crucial aspect of the MCSIS approach. We describe here a system called 3DM that can automatically build an entire MCSIS. 3DM bases the MSA on a multiple structure alignment, which implies that the availability of a large number of superfamily members with a known three‐dimensional structure is a requirement for 3DM to succeed. Thirteen MCSISes were constructed and placed on the Internet for examination. These systems have been instrumental in a large series of research projects related to enzyme activity or the understanding and engineering of specificity, protein stability engineering, DNA‐diagnostics, drug design, and so forth. Proteins 2010. © 2010 Wiley‐Liss, Inc.

18.
The European Bioinformatics Institute (EBI) provides numerous free-of-charge, publicly available bioinformatics services that can be divided into the following categories: ftp downloads; data submissions processing and biological database production; access to query, analysis and retrieval systems and tools; user support; training and education; and industry support through EBI's SME program. These services are all available at the website. It is imperative that EBI's data, as well as the tools to analyse it efficiently, are made available in a free and unambiguous way to the scientific community. An important part of the EBI's mission is to make this happen in a fast, reliable and efficient manner. This paper serves as a brief introduction to each of these services.

19.
We have implemented in a single package all the features required for extracting, visualizing and manipulating fully conserved positions as well as those with a family-dependent conservation pattern in multiple sequence alignments. The program allows users, among other things, to run different methods for extracting these positions, combine the results and visualize them in protein 3D structures and sequence spaces. Availability and implementation: JDet is a multiplatform application written in Java. It is freely available, including the source code, at http://csbg.cnb.csic.es/JDet. The package includes two of our recently developed programs for detecting functional positions in protein alignments (Xdet and S3Det), and support for other methods can be added as plug-ins. A help file and a guided tutorial for JDet are also available.
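
The two kinds of position named in the abstract can be made concrete with a deliberately naive sketch. It is not Xdet, S3Det or any JDet method: it simply assumes the family assignment of each sequence is already known, calls a column "fully conserved" when every row carries the same non-gap residue, and calls it "family-dependent" when each family is conserved internally but the families disagree. Function names and the toy alignment are hypothetical.

```python
def fully_conserved_columns(alignment: list, gap: str = "-") -> list:
    """Indices of columns where every row has the same non-gap residue."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for col in range(length):
        residues = {row[col] for row in alignment}
        if len(residues) == 1 and gap not in residues:
            conserved.append(col)
    return conserved


def family_dependent_columns(families: dict, gap: str = "-") -> list:
    """Columns conserved within every family but not identical across them."""
    groups = [rows for rows in families.values() if rows]
    if not groups:
        return []
    # Columns that are fully conserved inside each family separately.
    shared = set.intersection(
        *(set(fully_conserved_columns(rows, gap)) for rows in groups))
    # Keep those where at least two families use different residues.
    return [col for col in sorted(shared)
            if len({rows[0][col] for rows in groups}) > 1]


# Toy alignment (made up): column 2 separates the two families.
families = {"fam1": ["ACDK", "ACDR"], "fam2": ["ACEK", "ACEK"]}
all_rows = [row for rows in families.values() for row in rows]
print(fully_conserved_columns(all_rows))
print(family_dependent_columns(families))
```

Real methods score partial conservation and can infer the families themselves, so this sketch only fixes the vocabulary, not the statistics.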

20.
Multiple sequence alignments (MSAs) have become one of the most studied approaches in bioinformatics, supporting other important tasks such as structure prediction, biological function analysis and next-generation sequencing. However, current MSA algorithms do not always provide consistent solutions, since alignments become increasingly difficult when dealing with low-similarity sequences. As is widely known, these algorithms depend directly on specific features of the sequences, which significantly influence alignment accuracy. Many MSA tools have been designed recently, but it is not possible to know in advance which one is the most suitable for a particular set of sequences. In this work, we analyze some of the most used algorithms in the literature and their dependence on several features. A novel intelligent algorithm based on least squares support vector machines is then developed to predict how accurate each alignment could be, depending on its analyzed features. This algorithm is applied to a dataset of 2180 MSAs. The proposed system first estimates the accuracy of possible alignments. The most promising methodologies are then selected in order to align each set of sequences. Since only one selected algorithm is run, the computational time is not excessively increased.
