Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
2.
SUMMARY: The Pfaat protein family alignment annotation tool is a Java-based multiple sequence alignment editor and viewer designed for protein family analysis. The application merges display features such as dendrograms, secondary and tertiary protein structure with SRS retrieval, subgroup comparison, and extensive user-annotation capabilities. AVAILABILITY: The program and source code are freely available from the authors under the GNU General Public License at http://www.pfizerdtc.com

3.
SUMMARY: SQUINT is a sequence alignment tool that combines automated progressive sequence alignment with facilities for manual editing. The program imports nucleotide or amino acid sequence multiple alignment files in standard formats, and permits users to view two translations of the same multiple alignment simultaneously. Edits in one view are instantaneously reflected in the other, and the scoring cost of the changes is shown in real time. Progressive multiple alignments, using a variety of alignment parameters, can be performed on any block of sequences, including blocks embedded in the existing alignment. AVAILABILITY: The software is freely available for download at http://www.cebl.auckland.ac.nz

4.
Present proteomic studies increasingly address experimental strategies focused on multiple comparisons of proteomic profiles. To accomplish semiautomatic protein separations based on 2-D LC, the Beckman Coulter PF2D has been developed. Here, we present a novel general-purpose tool called MPA (multiple peak alignment) able to perform multiple comparisons of proteomic profiles both in a pairwise guided fashion and in a fully automatic mode using a strategy based on dynamic programming and progressive alignment of time series. The tool is available at http://grup.cribi.unipd.it/people/stefano/mpa/.

5.
MOTIVATION: A tool that simultaneously aligns multiple protein sequences, automatically utilizes information about protein domains, and has a good compromise between speed and accuracy will have practical advantages over current tools. RESULTS: We describe COBALT, a constraint-based alignment tool that implements a general framework for multiple alignment of protein sequences. COBALT finds a collection of pairwise constraints derived from database searches, sequence similarity and user input, combines these pairwise constraints, and then incorporates them into a progressive multiple alignment. We show that using constraints derived from the conserved domain database (CDD) and PROSITE protein-motif database improves COBALT's alignment quality. We also show that COBALT has reasonable runtime performance and alignment accuracy comparable to or exceeding that of other tools for a broad range of problems. AVAILABILITY: COBALT is included in the NCBI C++ toolkit. A Linux executable for COBALT, and the CDD and PROSITE data used, are available at: ftp://ftp.ncbi.nlm.nih.gov/pub/agarwala/cobalt

6.
MOTIVATION: The power of multi-sequence comparison for biological discovery is well established. The need for new capabilities to visualize and compare cross-species alignment data is intensified by the growing number of genomic sequence datasets being generated for an ever-increasing number of organisms. To be efficient, these visualization algorithms must consistently accommodate a wide range of evolutionary distances in a comparison framework based upon phylogenetic relationships. RESULTS: We have developed Phylo-VISTA, an interactive tool for analyzing multiple alignments by visualizing a similarity measure for multiple DNA sequences. The complexity of visual presentation is effectively organized using a framework based upon interspecies phylogenetic relationships. The phylogenetic organization supports rapid, user-guided interspecies comparison. To aid in navigation through large sequence datasets, Phylo-VISTA leverages concepts from VISTA that provide a user with the ability to select and view data at varying resolutions. Multiresolution data visualization and analysis, combined with the phylogenetic framework for interspecies comparison, produce a highly flexible and powerful tool for visual data analysis of multiple sequence alignments. AVAILABILITY: Phylo-VISTA is available at http://www-gsd.lbl.gov/phylovista. It requires an Internet browser with Java Plug-in 1.4.2 and it is integrated into the global alignment program LAGAN at http://lagan.stanford.edu

7.
《Genomics》2020,112(6):4561-4566
BACKGROUND: Bioinformatics tools are of great significance and are used in different spheres of life sciences. There is a wide variety of tools available to perform primary analysis of DNA and protein, but most of them are available on different platforms and many remain undetected. Accessing these tools separately to perform individual tasks is uneconomical and inefficient. OBJECTIVE: Our aim is to bring different bioinformatics models onto a single platform to ameliorate scientific research. Hence, our objective is to make a tool for comprehensive DNA and protein analysis. METHODS: To develop a reliable, straightforward and standalone desktop application, we used state-of-the-art Python packages and libraries. Bioinformatics Mini Toolbox (BMT) is a combination of seven tools: FastqTrimmer, Gene Prediction, DNA Analysis, Translation, Protein Analysis, and Pairwise and Multiple Alignment. RESULTS: FastqTrimmer assists in quality assurance of NGS data. Gene Prediction predicts genes by homology from a novel genome on the basis of a reference sequence. DNA Analysis and Protein Analysis calculate physicochemical properties of nucleotide and protein sequences, respectively. Translation translates the DNA sequence in six reading frames. Pairwise Alignment performs pairwise global and local alignment of DNA and protein sequences on the basis of multiple matrices. Multiple Alignment aligns multiple sequences and generates a phylogenetic tree. CONCLUSION: We developed a tool for comprehensive DNA and protein analysis. The link to download BMT is https://github.com/nasiriqbal012/BMT_SETUP.git
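
The pairwise global alignment mentioned in this abstract can be illustrated with a minimal Needleman-Wunsch sketch. This is not BMT's code: the function name `global_alignment_score` and the match, mismatch and gap values are assumptions chosen for illustration, and a real tool would typically use a substitution matrix rather than a flat match/mismatch score.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not BMT defaults.
    """
    # At the start of iteration i, prev[j] is the best score for
    # aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Aligning a[:i] against an empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned to a gap
            left = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "AGT"))
```

Only the score is computed here, using two rows of the dynamic programming table; recovering the alignment itself would require keeping the full table and tracing back from the last cell.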

8.
MOTIVATION: To facilitate the process of structure prediction by both comparative modeling and fold recognition, we describe DINAMO, an interactive protein alignment building and model evaluation tool that dynamically couples a multiple sequence alignment editor to a molecular graphics display. DINAMO allows the user to optimize the alignment and model to satisfy the known heuristics of protein structure by means of a set of analysis tools. The analysis tools return information to both the alignment editor and graphics model in the form of visual cues (color, shape), allowing for rapid evaluation. Several analysis tools may be employed, including residue conservation, residue properties (charge, hydrophobicity, volume), residue environmental preference, and secondary structure propensity. RESULTS: We demonstrate DINAMO by building a model for submission in the 3rd annual Critical Assessment of Techniques for Protein Structure Prediction (CASP3) contest. AVAILABILITY: DINAMO is freely available as a local application or Web-based Java applet at http://tito.ucsc.edu/dinamo

9.
RALEE--RNA ALignment editor in Emacs   Total citations: 5, self-citations: 0, citations by others: 5
SUMMARY: Production of high quality multiple sequence alignments of structured RNAs relies on an iterative combination of manual editing and structure prediction. An essential feature of an RNA alignment editor is the facility to mark-up the alignment based on how it matches a given secondary structure prediction, but few available alignment editors offer such a feature. The RALEE (RNA ALignment Editor in Emacs) tool provides a simple environment for RNA multiple sequence alignment editing, including structure-specific colour schemes, utilizing helper applications for structure prediction and many more conventional editing functions. This is accomplished by extending the commonly used text editor, Emacs, which is available for Linux, most UNIX systems, Windows and Mac OS. AVAILABILITY: The ELISP source code for RALEE is freely available from http://www.sanger.ac.uk/Users/sgj/ralee/ along with documentation and examples. CONTACT: sgj@sanger.ac.uk

10.
MSAT     
This article describes the development of a new method for multiple sequence alignment based on fold-level protein structure alignments, which provides an improvement in accuracy compared with the most commonly used sequence-only-based techniques. This method integrates the widely used, progressive multiple sequence alignment approach ClustalW with the Topology of Protein Structure (TOPS) topology-based alignment algorithm. The TOPS approach produces a structural alignment for the input protein set by using a topology-based pattern discovery program, providing a set of matched sequence regions that can be used to guide a sequence alignment using ClustalW. The resulting alignments are more reliable than a sequence-only alignment, as determined by 20-fold cross-validation with a set of 106 protein examples from the CATH database, distributed in seven superfold families. The method is particularly effective for sets of proteins that have similar structures at the fold level but low sequence identity. The aim of this research is to contribute towards bridging the gap between protein sequence and structure analysis, in the hope that this can be used to assist the understanding of the relationship between sequence, structure and function. The tool is available at http://balabio.dcs.gla.ac.uk/msat/.

12.
Dong E  Smith J  Heinze S  Alexander N  Meiler J 《Gene》2008,422(1-2):41-46
BCL::Align is a multiple sequence alignment tool that utilizes the dynamic programming method in combination with a customizable scoring function for sequence alignment and fold recognition. The scoring function is a weighted sum of the traditional PAM and BLOSUM scoring matrices, position-specific scoring matrices output by PSI-BLAST, secondary structure predicted by a variety of methods, chemical properties, and gap penalties. By adjusting the weights, the method can be tailored for fold recognition or sequence alignment tasks at different levels of sequence identity. A Monte Carlo algorithm was used to determine optimized weight sets for sequence alignment and fold recognition that most accurately reproduced the SABmark reference alignment test set. In an evaluation of sequence alignment performance, BCL::Align ranked best in alignment accuracy (Cline score of 22.90 for sequences in the Twilight Zone) when compared with Align-m, ClustalW, T-Coffee, and MUSCLE. ROC curve analysis indicates BCL::Align's ability to correctly recognize protein folds with over 80% accuracy. The flexibility of the program allows it to be optimized for specific classes of proteins (e.g. membrane proteins) or fold families (e.g. TIM-barrel proteins). BCL::Align is free for academic use and available online at http://www.meilerlab.org/.
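
The weighted-sum scoring idea described in this abstract can be sketched as follows. This is only an illustration of the general technique, not BCL::Align's implementation: the function name `combined_score`, the term names, and every numeric value below are hypothetical.

```python
def combined_score(term_scores, weights):
    """Weighted sum of per-term scores for one pair of aligned positions.

    Both arguments map a term name to a number. Terms without a weight
    are ignored; a weighted term missing from term_scores raises KeyError.
    """
    return sum(weights[name] * term_scores[name] for name in weights)


if __name__ == "__main__":
    # Hypothetical scores for a single residue pair (not from BCL::Align).
    term_scores = {"blosum": 4.0, "pam": 2.0, "pssm": 1.5, "sec_struct": 0.5}
    # Hypothetical weight set; the abstract describes tuning such weights
    # with a Monte Carlo search, which is not reproduced here.
    weights = {"blosum": 0.4, "pam": 0.1, "pssm": 0.3, "sec_struct": 0.2}
    print(combined_score(term_scores, weights))
```

Changing the weight set changes which evidence dominates the dynamic programming step, which is how one scoring framework could be tuned toward either sequence alignment or fold recognition.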

13.
MOTIVATION: Multiple sequence alignment is a fundamental task in bioinformatics. Current tools typically form an initial alignment by merging subalignments, and then polish this alignment by repeated splitting and merging of subalignments to obtain an improved final alignment. In general this form-and-polish strategy consists of several stages, and a profusion of methods have been tried at every stage. We carefully investigate: (1) how to utilize a new algorithm for aligning alignments that optimally solves the common subproblem of merging subalignments, and (2) what is the best choice of method for each stage to obtain the highest quality alignment. RESULTS: We study six stages in the form-and-polish strategy for multiple alignment: parameter choice, distance estimation, merge-tree construction, sequence-pair weighting, alignment merging, and polishing. For each stage, we consider novel approaches as well as standard ones. Interestingly, the greatest gains in alignment quality come from (i) estimating distances by a new approach using normalized alignment costs, and (ii) polishing by a new approach using 3-cuts. Experiments with a parameter-value oracle suggest large gains in quality may be possible through an input-dependent choice of alignment parameters, and we present a promising approach for building such an oracle. Combining the best approaches to each stage yields a new tool we call Opal that on benchmark alignments matches the quality of the top tools, without employing alignment consistency or hydrophobic gap penalties. AVAILABILITY: Opal, a multiple alignment tool that implements the best methods in our study, is freely available at http://opal.cs.arizona.edu.

14.
PipeAlign is a protein family analysis tool integrating a five step process ranging from the search for sequence homologues in protein and 3D structure databases to the definition of the hierarchical relationships within and between subfamilies. The complete, automatic pipeline takes a single sequence or a set of sequences as input and constructs a high-quality, validated MACS (multiple alignment of complete sequences) in which sequences are clustered into potential functional subgroups. For the more experienced user, the PipeAlign server also provides numerous options to run only a part of the analysis, with the possibility to modify the default parameters of each software module. For example, the user can choose to enter an existing multiple sequence alignment for refinement, validation and subsequent clustering of the sequences. The aim is to provide an interactive workbench for the validation, integration and presentation of a protein family, not only at the sequence level, but also at the structural and functional levels. PipeAlign is available at http://igbmc.u-strasbg.fr/PipeAlign/.

15.
Con-Struct Map is a graphical tool for the comparative study of protein structures. The tool detects potential conserved residue contacts shared by multiple protein structures by superimposing their contact maps according to a multiple structure alignment. In general, Con-Struct Map allows the study of structural changes resulting from, e.g. sequence substitutions, or alternatively, the study of conserved components of a structure framework across structurally aligned proteins. Specific applications include the study of sequence-structure relationship in distantly related proteins and the comparisons of wild type and mutant proteins. AVAILABILITY: http://pdbrs3.sdsc.edu/ConStructMap/viewer_argument_generator/singleArguments. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

16.
Biomolecule sequences and structures of land, air and water species are being determined rapidly, and the data entries are unevenly distributed across organisms. As a result, BLAST homology searches frequently return undesirable entries from organisms living in different environments. To reduce irrelevant search results, a separate database for comparative genomics is urgently required. A comprehensive bioinformatics tool set and an integrated database, named Bioinformatics tools for Marine and Freshwater Genomics (BiMFG), are constructed for comparative analyses among model species and underwater species. Novel matching techniques based on conserved motifs and/or secondary structure elements are designed for efficiently and effectively retrieving and aligning remote sequences through cross-species comparisons. This is especially helpful when the sequences under analysis have low similarity and unresolved structural information. In addition, the system provides core techniques of multiple sequence alignment, multiple secondary structure profile alignment and iteratively refined multiple structural alignment for biodiversity analysis and verification in marine and freshwater biology. The BiMFG web server is freely available for use at http://bimfg.cs.ntou.edu.tw/.

17.
We present a software tool, CTX-BLAST, that incorporates a contextual alignment model into the popular protein BLAST program. Our alignment tool allows us to investigate the effect of context-dependency in protein alignment much more efficiently than previous dynamic programming algorithms. The software makes use of non-symmetric contextual substitution tables and calculates the statistical significance of a given alignment according to the contextual statistical model. AVAILABILITY: CTX-BLAST is open-source software freely available from www.sourceforge.net/projects/CTX-BLAST. A program for statistical estimation of E-value parameters and the contextual substitution table CTX-BLOSUM62 are also provided. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

18.
Secondary structures of RNA sequences are increasingly being used as additional information in reconstructing phylogenies and/or in distinguishing species by compensatory base change (CBC) analyses. However, in most cases just one secondary structure is used in manually correcting an automatically generated multiple sequence alignment and/or just one secondary structure is used in guiding a sequence alignment still completely generated by hand. With the advent of databases and tools offering individual RNA secondary structures, here we re-introduce a twelve-letter code already implemented in 4SALE – a tool for synchronous sequence and secondary structure alignment and editing – that enables one to align RNA sequences and their individual secondary structures synchronously and fully automatically, while dramatically increasing the phylogenetic information content. We further introduce a scaled-down non-GUI version of 4SALE particularly designed for big data analysis, available at: http://4sale.bioapps.biozentrum.uni-wuerzburg.de.

19.
We present a method, called BlockMatch, for aligning two blocks, where a block is an RNA multiple sequence alignment with the consensus secondary structure of the alignment in Stockholm format. The method employs a quadratic-time dynamic programming algorithm for aligning columns and column pairs of the multiple alignments in the blocks. Unlike many other tools that can perform pairwise alignment of either single sequences or structures only, BlockMatch takes into account the characteristics of all the sequences in the blocks along with their consensus structures during the alignment process, and is thus able to achieve a high-quality alignment result. We apply BlockMatch to phylogeny reconstruction on a set of 5S rRNA sequences taken from fifteen bacterial species. Experimental results showed that the phylogenetic tree generated by our method is more accurate than the tree constructed using the widely used ClustalW tool. The BlockMatch algorithm is implemented in a web server, accessible at http://bioinformatics.njit.edu/blockmatch. A jar file of the program is also available for download from the web server.

20.
MOTIVATION: Amino acid sequence alignments are widely used in the analysis of protein structure, function and evolutionary relationships. Proteins within a superfamily usually share the same fold and possess related functions. These structural and functional constraints are reflected in the alignment conservation patterns. Positions of functional and/or structural importance tend to be more conserved. Conserved positions are usually clustered in distinct motifs surrounded by sequence segments of low conservation. Poorly conserved regions might also arise from the imperfections in multiple alignment algorithms and thus indicate possible alignment errors. Quantification of conservation by attributing a conservation index to each aligned position makes motif detection more convenient. Mapping these conservation indices onto a protein spatial structure helps to visualize spatial conservation features of the molecule and to predict functionally and/or structurally important sites. Analysis of conservation indices could be a useful tool in detection of potentially misaligned regions and will aid in improvement of multiple alignments. RESULTS: We developed a program to calculate a conservation index at each position in a multiple sequence alignment using several methods. Namely, amino acid frequencies at each position are estimated and the conservation index is calculated from these frequencies. We utilize both unweighted frequencies and frequencies weighted using two different strategies. Three conceptually different approaches (entropy-based, variance-based and matrix score-based) are implemented in the algorithm to define the conservation index. Calculating conservation indices for 35522 positions in 284 alignments from SMART database we demonstrate that different methods result in highly correlated (correlation coefficient more than 0.85) conservation indices. 
Conservation indices show statistically significant correlation between sequentially adjacent positions i and i + j, where j < 13, and averaging of the indices over a window of three positions is optimal for motif detection. Positions with gaps display substantially lower conservation properties. We compare conservation properties of the SMART alignments or FSSP structural alignments to those of the ClustalW alignments. The results suggest that conservation indices should be a valuable tool for alignment quality assessment and might be used as an objective function for refinement of multiple alignments. AVAILABILITY: The C code of the AL2CO program and its pre-compiled versions for several platforms as well as the details of the analysis are freely available at ftp://iole.swmed.edu/pub/al2co/.
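
An entropy-based conservation index with three-position window averaging, as described in this abstract, might look roughly like the sketch below. It is not the AL2CO code or its exact formula: it assumes unweighted frequencies, ignores gap characters, and rescales Shannon entropy to 1 - H/log(20), all of which are choices made here for illustration. The function names are hypothetical.

```python
import math
from collections import Counter


def entropy_conservation(column, alphabet_size=20):
    """Return 1 - H/log(alphabet_size) for one alignment column.

    Assumption: gaps ('-') are ignored and residues are unweighted.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0  # an all-gap column is treated as unconserved
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return 1.0 - entropy / math.log(alphabet_size)


def column_scores(alignment):
    """Score every column of an alignment given as equal-length strings."""
    return [entropy_conservation(col) for col in zip(*alignment)]


def smooth(scores, window=3):
    """Moving average over `window` positions, truncated at the ends."""
    half = window // 2
    out = []
    for i in range(len(scores)):
        chunk = scores[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


if __name__ == "__main__":
    toy_alignment = ["ACDE", "ACDF", "AC-G"]  # hypothetical toy input
    print(smooth(column_scores(toy_alignment)))
```

A fully conserved column scores 1.0 under this scaling, and the score falls as residues in the column become more evenly mixed.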
