期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Segment-based multiple sequence alignment

Rausch T Emde AK Weese D Döring A Notredame C Reinert K 《Bioinformatics (Oxford, England)》2008,24(16):i187-i192

相似文献

2.

Strategies for multiple sequence alignment

Nicholas HB Ropelewski AJ Deerfield DW 《BioTechniques》2002,32(3):572-4, 576, 578 passim

We present an overview of multiple sequence alignments to outline the practical consequences for the choices among different techniques and parameters. We begin with a discussion of the scoring methods for quantifying the quality of a multiple sequence alignment, followed by a discussion of the algorithms implemented within a variety of multiple sequence alignment programs. We also discuss additional alignment details such as gap penalty and distance metrics. The paper concludes with a discussion on how to improve alignment quality and the limitations of the techniques described in this paper 相似文献

3.

Getting the sequence world: How to use multiple alignment software

Ikeo K 《Tanpakushitsu kakusan koso. Protein, nucleic acid, enzyme》2001,46(9):1299-1305

相似文献

4.

基于Progressive多序列比对方法的求解多序列比对的启发式算法

张津郭茂祖王亚东《生物信息学》2005,3(4):171-174

在生物信息学研究中,生物序列比对问题占有重要的地位。多序列比对问题是一个NPC问题,由于时间和空间的限制不能够求出精确解。文中简要介绍了Feng和Doolittle提出的多序列比对算法的基本思想,并改进了该算法使之具有更好的比对精度。实验结果表明,新算法对解决一般的progressive多序列比对方法中遇到的局部最优问题有较好的效果。相似文献

5.

Grammar-based distance in progressive multiple sequence alignment

David J Russell Hasan H Otu Khalid Sayood 《BMC bioinformatics》2008,9(1):306

Background

We propose a multiple sequence alignment (MSA) algorithm and compare the alignment-quality and execution-time of the proposed algorithm with that of existing algorithms. The proposed progressive alignment algorithm uses a grammar-based distance metric to determine the order in which biological sequences are to be pairwise aligned. The progressive alignment occurs via pairwise aligning new sequences with an ensemble of the sequences previously aligned. 相似文献

6.

Gap costs for multiple sequence alignment 总被引：6，自引：0，他引：6

S F Altschul 《Journal of theoretical biology》1989,138(3):297-309

Standard methods for aligning pairs of biological sequences charge for the most common mutations, which are substitutions, deletions and insertions. Because a single mutation may insert or delete several nucleotides, gap costs that are not directly proportional to gap length are usually the most effective. How to extend such gap costs to alignments of three or more sequences is not immediately obvious, and a variety of approaches have been taken. This paper argues that, since gap and substitution costs together specify optimal alignments, they should be defined using a common rationale. Specifically, a new definition of gap costs for multiple alignments is proposed and compared with previous ones. Since the new definition links a multiple alignment's cost to that of its pairwise projections, it allows knowledge gained about two-sequence alignments to bear on the multiple alignment problem. Also, such linkage is a key element of recent algorithms that have rendered practical the simultaneous alignment of as many as six sequences. 相似文献

7.

A flexible multiple sequence alignment program 总被引：12，自引：3，他引：12

下载免费PDF全文

H M Martinez 《Nucleic acids research》1988,16(5):1683-1691

The 'regions' method for multisequence alignment used in the previously reported program MALIGN has been generalized to include recursive refinement so that unaligned portions between two regions at the current level of resolution can be handled with increased resolution. Additionally, there is incorporated a limiting of the number of regions to be used at any level of resolution from which to abstract an alignment. This provides a significant increase in speed over the unlimited version. The program GENALIGN uses this improved regions method to execute fast pairwise alignments in the framework of Taylor's multisequence alignment procedure using clustered pairwise alignments. Pairwise alignments by dynamic programming are also provided in the program. 相似文献

8.

Heuristics for multiobjective multiple sequence alignment

Maryam Abbasi Luís Paquete Francisco B. Pereira 《Biomedical engineering online》2016,15(1):70

Background

Aligning multiple sequences arises in many tasks in Bioinformatics. However, the alignments produced by the current software packages are highly dependent on the parameters setting, such as the relative importance of opening gaps with respect to the increase of similarity. Choosing only one parameter setting may provide an undesirable bias in further steps of the analysis and give too simplistic interpretations. In this work, we reformulate multiple sequence alignment from a multiobjective point of view. The goal is to generate several sequence alignments that represent a trade-off between maximizing the substitution score and minimizing the number of indels/gaps in the sum-of-pairs score function. This trade-off gives to the practitioner further information about the similarity of the sequences, from which she could analyse and choose the most plausible alignment.

Methods

We introduce several heuristic approaches, based on local search procedures, that compute a set of sequence alignments, which are representative of the trade-off between the two objectives (substitution score and indels). Several algorithm design options are discussed and analysed, with particular emphasis on the influence of the starting alignment and neighborhood search definitions on the overall performance. A perturbation technique is proposed to improve the local search, which provides a wide range of high-quality alignments.

Results and conclusions

The proposed approach is tested experimentally on a wide range of instances. We performed several experiments with sequences obtained from the benchmark database BAliBASE 3.0. To evaluate the quality of the results, we calculate the hypervolume indicator of the set of score vectors returned by the algorithms. The results obtained allow us to identify reasonably good choices of parameters for our approach. Further, we compared our method in terms of correctly aligned pairs ratio and columns correctly aligned ratio with respect to reference alignments. Experimental results show that our approaches can obtain better results than TCoffee and Clustal Omega in terms of the first ratio.

相似文献

9.

A multiple sequence alignment program. 总被引：16，自引：7，他引：16

下载免费PDF全文

E Sobel H M Martinez 《Nucleic acids research》1986,14(1):363-374

A program is described for simultaneously aligning two or more molecular sequences which is based on first finding common segments above a specified length and then piecing these together to maximize an alignment scoring function. Optimal as well as near-optimal alignments are found, and there is also provided a means for randomizing the given sequences for testing the statistical significance of an alignment. Alignments may be made in the original alphabets of the sequences or in user-specified alternate ones to take advantage of chemical similarities (such as hydrophobic-hydrophilic). 相似文献

10.

Recent evolutions of multiple sequence alignment algorithms 总被引：1，自引：0，他引：1

下载免费PDF全文

Notredame C 《PLoS computational biology》2007,3(8):e123

相似文献

11.

Local multiple sequence alignment using dead-end elimination 总被引：2，自引：0，他引：2

Lukashin AV Rosa JJ 《Bioinformatics (Oxford, England)》1999,15(11):947-953

MOTIVATION: Local multiple sequence alignment is a basic tool for extracting functionally important regions shared by a family of protein sequences. We present an effectively polynomial-time algorithm for rigorously solving the local multiple alignment problem. RESULTS: The algorithm is based on the dead-end elimination procedure that makes it possible to avoid an exhaustive search. In the framework of the sum-of-pairs scoring system, certain rejection criteria are derived in order to eliminate those sequence segments and segment pairs that can be mathematically shown to be inconsistent (dead-ending) with the globally optimal alignment. Iterative application of the elimination criteria results in a rapid reduction of combinatorial possibilities without considering them explicitly. In the vast majority of cases, the procedure converges to a unique globally optimal solution. In contrast to the exhaustive search, whose computational complexity is combinatorial, the algorithm is computationally feasible because the number of operations required to eliminate the dead-ending segments and segment pairs grows quadratically and cubically, respectively, with the total number of sequence elements. The method is illustrated on a set of protein families for which the globally optimal alignments are well recognized. AVAILABILITY: The source code of the program implementing the algorithm is available upon request from the authors. CONTACT: alex_lukashin@biogen.com. 相似文献

12.

MALIGNED: a multiple sequence alignment editor 总被引：3，自引：0，他引：3

Clark Stephen P. 《Bioinformatics (Oxford, England)》1992,8(6):535-538

A multiple sequence alignment editor is described which runson a VAX/VMS system and can exchange data with a number of otherprograms, including those of the Genetics Computer Group (GCG).Up to 199 sequences can be aligned. The quality of the alignmentcan be easily judged during its development because the displayattributes to each character are determined by the way it matchesthe other sequences. Four methods are available for calculatingthe highlighting to emphasize different aspects of the relationshipsof the sequences and up to four styles of highlighting can beused at the same time. Laser printer output is suitable forpublication without modification. 相似文献

13.

Reticular alignment: A progressive corner-cutting method for multiple sequence alignment

Adrienn Szabó Ádám Novák István Miklós Jotun Hein 《BMC bioinformatics》2010,11(1):570

Background

In this paper, we introduce a progressive corner cutting method called Reticular Alignment for multiple sequence alignment. Unlike previous corner-cutting methods, our approach does not define a compact part of the dynamic programming table. Instead, it defines a set of optimal and suboptimal alignments at each step during the progressive alignment. The set of alignments are represented with a network to store them and use them during the progressive alignment in an efficient way. The program contains a threshold parameter on which the size of the network depends. The larger the threshold parameter and thus the network, the deeper the search in the alignment space for better scored alignments. 相似文献

14.

The practical use of the A* algorithm for exact multiple sequence alignment.

M Lermen K Reinert 《Journal of computational biology》2000,7(5):655-671

Multiple alignment is an important problem in computational biology. It is well known that it can be solved exactly by a dynamic programming algorithm which in turn can be interpreted as a shortest path computation in a directed acyclic graph. The A* algorithm (or goal-directed unidirectional search) is a technique that speeds up the computation of a shortest path by transforming the edge lengths without losing the optimality of the shortest path. We implemented the A* algorithm in a computer program similar to MSA (Gupta et al., 1995) and FMA (Shibuya and Imai, 1997). We incorporated in this program new bounding strategies for both lower and upper bounds and show that the A* algorithm, together with our improvements, can speed up computations considerably. Additionally, we show that the A* algorithm together with a standard bounding technique is superior to the well-known Carrillo-Lipman bounding since it excludes more nodes from consideration. 相似文献

15.

Characterization of pairwise and multiple sequence alignment errors

Landan G Graur D 《Gene》2009,441(1-2):141-147

We characterize pairwise and multiple sequence alignment (MSA) errors by comparing true alignments from simulations of sequence evolution with reconstructed alignments. The vast majority of reconstructed alignments contain many errors. Error rates rapidly increase with sequence divergence, thus, for even intermediate degrees of sequence divergence, more than half of the columns of a reconstructed alignment may be expected to be erroneous. In closely related sequences, most errors consist of the erroneous positioning of a single indel event and their effect is local. As sequences diverge, errors become more complex as a result of the simultaneous mis-reconstruction of many indel events, and the lengths of the affected MSA segments increase dramatically. We found a systematic bias towards underestimation of the number of gaps, which leads to the reconstructed MSA being on average shorter than the true one. Alignment errors are unavoidable even when the evolutionary parameters are known in advance. Correct reconstruction can only be guaranteed when the likelihood of true alignment is uniquely optimal. However, true alignment features are very frequently sub-optimal or co-optimal, with the result that optimal albeit erroneous features are incorporated into the reconstructed MSA. Progressive MSA utilizes a guide-tree in the reconstruction of MSAs. The quality of the guide-tree was found to affect MSA error levels only marginally. 相似文献

16.

Recent developments in the MAFFT multiple sequence alignment program 总被引：4，自引：0，他引：4

Katoh K Toh H 《Briefings in bioinformatics》2008,9(4):286-298

The accuracy and scalability of multiple sequence alignment (MSA) of DNAs and proteins have long been and are still important issues in bioinformatics. To rapidly construct a reasonable MSA, we developed the initial version of the MAFFT program in 2002. MSA software is now facing greater challenges in both scalability and accuracy than those of 5 years ago. As increasing amounts of sequence data are being generated by large-scale sequencing projects, scalability is now critical in many situations. The requirement of accuracy has also entered a new stage since the discovery of functional noncoding RNAs (ncRNAs); the secondary structure should be considered for constructing a high-quality alignment of distantly related ncRNAs. To deal with these problems, in 2007, we updated MAFFT to Version 6 with two new techniques: the PartTree algorithm and the Four-way consistency objective function. The former improved the scalability of progressive alignment and the latter improved the accuracy of ncRNA alignment. We review these and other techniques that MAFFT uses and suggest possible future directions of MSA software as a basis of comparative analyses. MAFFT is available at http://align.bmr.kyushu-u.ac.jp/mafft/software/. 相似文献

17.

Sequence alignment of citrate synthase proteins using a multiple sequence alignment algorithm and multiple scoring matrices 总被引：1，自引：0，他引：1

C M Henneke M J Danson D W Hough D J Osguthorpe 《Protein engineering》1989,2(8):597-604

The alignment of Escherichia coli citrate synthase to pig heart citrate synthase and the multiple alignment of the known sequences of the citrate synthase family of enzymes have been performed using six different amino acid similarity scoring matrices and a large range of gap penalty ratios for insertions and deletions of amino acids. The alignment studies have been performed as the first step in a project aimed at homology modelling E. coli citrate synthase (a hexamer) from pig heart citrate synthase (a dimer) in a molecular modelling approach to the study of multi-subunit enzymes. The effects of several important variables in producing realistic alignments have been investigated. The difference between multiple alignment of the family of enzymes versus simple pairwise alignment of the pig heart and E. coli proteins was explored. The effects of initial separate multiple alignments of the most highly related or most homologous species of the family of enzymes upon a subsequent pairwise alignment between species was evaluated. The value of 'fingerprinting' certain residues to bias the alignment in favour of matching those residues, as well as the worth of the computerized approach compared to an intuitive alignment technique, were assessed. 相似文献

18.

Protein multiple sequence alignment by hybrid bio-inspired algorithms

Cutello V Nicosia G Pavone M Prizzi I 《Nucleic acids research》2011,39(6):1980-1992

This article presents an immune inspired algorithm to tackle the Multiple Sequence Alignment (MSA) problem. MSA is one of the most important tasks in biological sequence analysis. Although this paper focuses on protein alignments, most of the discussion and methodology may also be applied to DNA alignments. The problem of finding the multiple alignment was investigated in the study by Bonizzoni and Vedova and Wang and Jiang, and proved to be a NP-hard (non-deterministic polynomial-time hard) problem. The presented algorithm, called Immunological Multiple Sequence Alignment Algorithm (IMSA), incorporates two new strategies to create the initial population and specific ad hoc mutation operators. It is based on the 'weighted sum of pairs' as objective function, to evaluate a given candidate alignment. IMSA was tested using both classical benchmarks of BAliBASE (versions 1.0, 2.0 and 3.0), and experimental results indicate that it is comparable with state-of-the-art multiple alignment algorithms, in terms of quality of alignments, weighted Sums-of-Pairs (SP) and Column Score (CS) values. The main novelty of IMSA is its ability to generate more than a single suboptimal alignment, for every MSA instance; this behaviour is due to the stochastic nature of the algorithm and of the populations evolved during the convergence process. This feature will help the decision maker to assess and select a biologically relevant multiple sequence alignment. Finally, the designed algorithm can be used as a local search procedure to properly explore promising alignments of the search space. 相似文献

19.

A fast and sensitive multiple sequence alignment algorithm 总被引：4，自引：0，他引：4

Vingron Martin; Argos Patrick 《Bioinformatics (Oxford, England)》1989,5(2):115-121

A two-step multiple alignment strategy is presented that allowsrapid alignment of a set of homologous sequences and comparisonof pre-aligned groups of sequences. Examples are given demonstratingthe improvement in the quality of alignments when comparingentire groups instead of single sequences. The modular designof computer programs based on this algorithm allows for storageof aligned sequences and successive alignment of any numberof sequences. Received on August 23, 1988; accepted on December 6, 1988 相似文献

20.

Improvement in accuracy of multiple sequence alignment using novel group-to-group sequence alignment algorithm with piecewise linear gap cost

Shinsuke Yamada Osamu Gotoh Hayato Yamana 《BMC bioinformatics》2006,7(1):524-17

Background

Multiple sequence alignment (MSA) is a useful tool in bioinformatics. Although many MSA algorithms have been developed, there is still room for improvement in accuracy and speed. In the alignment of a family of protein sequences, global MSA algorithms perform better than local ones in many cases, while local ones perform better than global ones when some sequences have long insertions or deletions (indels) relative to others. Many recent leading MSA algorithms have incorporated pairwise alignment information obtained from a mixture of sources into their scoring system to improve accuracy of alignment containing long indels. 相似文献