
1.

The many faces of sequence alignment (total citations: 9; self-citations: 0; citations by others: 9)

Batzoglou S 《Briefings in bioinformatics》2005,6(1):6-22

Starting with the sequencing of the mouse genome in 2002, we have entered a period where the main focus of genomics will be to compare multiple genomes in order to learn about human biology and evolution at the DNA level. Alignment methods are the main computational component of this endeavour. This short review aims to summarise the current status of research in alignments, emphasising large-scale genomic comparisons and suggesting possible directions that will be explored in the near future.

2.

A workbench for multiple alignment construction and analysis (total citations: 126; self-citations: 0; citations by others: 126)

G D Schuler S F Altschul D J Lipman 《Proteins》1991,9(3):180-190

Multiple sequence alignment can be a useful technique for studying molecular evolution, as well as for analyzing relationships between structure or function and primary sequence. We have developed for this purpose an interactive program, MACAW (Multiple Alignment Construction and Analysis Workbench), that allows the user to construct multiple alignments by locating, analyzing, editing, and combining "blocks" of aligned sequence segments. MACAW incorporates several novel features. (1) Regions of local similarity are located by a new search algorithm that avoids many of the limitations of previous techniques. (2) The statistical significance of blocks of similarity is evaluated using a recently developed mathematical theory. (3) Candidate blocks may be evaluated for potential inclusion in a multiple alignment using a variety of visualization tools. (4) A user interface permits each block to be edited by moving its boundaries or by eliminating particular segments, and blocks may be linked to form a composite multiple alignment. No completely automatic program is likely to deal effectively with all the complexities of the multiple alignment problem; by combining a powerful similarity search algorithm with flexible editing, analysis and display tools, MACAW allows the alignment strategy to be tailored to the problem at hand.

3.

A "Long Indel" model for evolutionary sequence alignment (total citations: 7; self-citations: 0; citations by others: 7)

Miklós I Lunter GA Holmes I 《Molecular biology and evolution》2004,21(3):529-540

We present a new probabilistic model of sequence evolution, allowing indels of arbitrary length, and give sequence alignment algorithms for our model. Previously implemented evolutionary models have allowed (at most) single-residue indels or have introduced artifacts such as the existence of indivisible "fragments." We compare our algorithm to these previous methods by applying it to the structural homology dataset HOMSTRAD, evaluating the accuracy of (1) alignments and (2) evolutionary time estimates. With our method, it is possible (for the first time) to integrate probabilistic sequence alignment, with reliability indicators and arbitrary gap penalties, in the same framework as phylogenetic reconstruction. Our alignment algorithm requires that we evaluate the likelihood of any specific path of mutation events in a continuous-time Markov model, with the event times integrated out. To this effect, we introduce a "trajectory likelihood" algorithm (Appendix A). We anticipate that this algorithm will be useful in more general contexts, such as Markov Chain Monte Carlo simulations.

4.

fingerprint: visual depiction of variation in multiple sequence alignments

MELANIE LOU G. BRIAN GOLDING 《Molecular ecology resources》2007,7(6):908-914

There is a lack of programs available that focus on providing an overview of an aligned set of sequences such that the comparison of homologous sites becomes comprehensible and intuitive. Being able to identify similarities, differences, and patterns within a multiple sequence alignment is biologically valuable because it permits visualization of the distribution of a particular feature and inferences about the structure, function, and evolution of the sequences in question. We have therefore created a web server, fingerprint, which combines the characteristics of existing programs that represent identity, variability, charge, hydrophobicity, solvent accessibility, and structure along with new visualizations based on composition, heterogeneity, heterozygosity, d_N/d_S and nucleotide diversity. fingerprint is easy to use and globally accessible through any computer using any major browser. fingerprint is available at http://evol.mcmaster.ca/fingerprint/ .

5.

Clustal W: software for protein and nucleic acid sequence analysis (total citations: 2; self-citations: 1; citations by others: 2)

Guo Chongzhi (郭崇志) Sun Manji (孙曼霁) 《生物技术通讯》2000,11(2):146-149

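Protein and nucleic acid sequence analysis plays an important role in modern biology and bioinformatics, and new algorithms and software appear constantly. This paper introduces ClustalW, a completely free multiple sequence comparison program that runs on a PC. It can perform multiple sequence comparison of proteins and nucleic acids, analyze similarity relationships among sequences, and draw evolutionary trees. Thanks to its flexible input and output formats, convenient parameter setting and selection, detailed online help, and good portability, ClustalW has been widely used in protein and nucleic acid sequence analysis.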

6.

Comparison of diverse protein sequences of the nuclear-encoded subunits of cytochrome C oxidase suggests conservation of structure underlies evolving functional sites

Das J Miller ST Stern DL 《Molecular biology and evolution》2004,21(8):1572-1582

Interspecific comparisons of protein sequences can reveal regions of evolutionary conservation that are under purifying selection because of functional constraints. Interpreting these constraints requires combining evolutionary information with structural, biochemical, and physiological data to understand the biological function of conserved regions. We take this integrative approach to investigate the evolution and function of the nuclear-encoded subunits of cytochrome c oxidase (COX). We find that the nuclear-encoded subunits evolved subsequent to the origin of mitochondria and the subunit composition of the holoenzyme varies across diverse taxa that include animals, yeasts, and plants. By mapping conserved amino acids onto the crystal structure of bovine COX, we show that conserved residues are structurally organized into functional domains. These domains correspond to some known functional sites as well as to other uncharacterized regions. We find that amino acids that are important for structural stability are conserved at frequencies higher than expected within each taxon, and groups of conserved residues cluster together at distances of less than 5 Å more frequently than do randomly selected residues. We, therefore, suggest that selection is acting to maintain the structural foundation of COX across taxa, whereas active sites vary or coevolve within lineages.

7.

An evolutionary model for maximum likelihood alignment of DNA sequences (total citations: 16; self-citations: 0; citations by others: 16)

Jeffrey L. Thorne Hirohisa Kishino Joseph Felsenstein 《Journal of molecular evolution》1991,33(2):114-124

Summary: Most algorithms for the alignment of biological sequences are not derived from an evolutionary model. Consequently, these alignment algorithms lack a strong statistical basis. A maximum likelihood method for the alignment of two DNA sequences is presented. This method is based upon a statistical model of DNA sequence evolution for which we have obtained explicit transition probabilities. The evolutionary model can also be used as the basis of procedures that estimate the evolutionary parameters relevant to a pair of unaligned DNA sequences. A parameter-estimation approach which takes into account all possible alignments between two sequences is introduced; the danger of estimating evolutionary parameters from a single alignment is discussed.

8.

MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment (total citations: 431; self-citations: 0; citations by others: 431)

Kumar S Tamura K Nei M 《Briefings in bioinformatics》2004,5(2):150-163

With its theoretical basis firmly established in molecular evolutionary and population genetics, the comparative DNA and protein sequence analysis plays a central role in reconstructing the evolutionary histories of species and multigene families, estimating rates of molecular evolution, and inferring the nature and extent of selective forces shaping the evolution of genes and genomes. The scope of these investigations has now expanded greatly owing to the development of high-throughput sequencing techniques and novel statistical and computational methods. These methods require easy-to-use computer programs. One such effort has been to produce Molecular Evolutionary Genetics Analysis (MEGA) software, with its focus on facilitating the exploration and analysis of the DNA and protein sequence variation from an evolutionary perspective. Currently in its third major release, MEGA3 contains facilities for automatic and manual sequence alignment, web-based mining of databases, inference of the phylogenetic trees, estimation of evolutionary distances and testing evolutionary hypotheses. This paper provides an overview of the statistical methods, computational tools, and visual exploration modules for data input and the results obtainable in MEGA.

9.

cDNA and deduced amino acid sequences of cytochrome c from Chlamydomonas reinhardtii: Unexpected functional and phylogenetic implications

Bruno B. Amati Michel Goldschmidt-Clermont Carmichael J. A. Wallace Jean-David Rochaix 《Journal of molecular evolution》1988,28(1-2):151-160

Summary: We have isolated complementary DNA (cDNA) clones for apocytochrome c from the green alga Chlamydomonas reinhardtii and shown that they are encoded by a single nuclear gene termed cyc. Cyc mRNA levels are found to depend primarily on the presence of acetate as a reduced carbon source in the culture medium. The deduced amino acid sequence shows that, apart from the probable removal of the initiating methionine, C. reinhardtii apocytochrome c is synthesized in its mature form. Its structure is generally similar to that of cytochromes c from higher plants. Several punctual deviations from the general pattern of cytochrome c sequences that is found in other organisms have interesting structural and functional implications. These include, in particular, valines 19 and 39, asparagine 78, and alanine 83. A phylogenetic tree was constructed by the matrix method from cytochrome c data for a representative range of species. The results suggest that C. reinhardtii diverged from higher plants approximately 700–750 million years ago; they also are not easy to reconcile with the current attribution of Chlamydomonas reinhardtii and Enteromorpha intestinalis to a unique phylum, because these two species probably diverged from one another at about the same time as they diverged from the line leading to higher plants.

10.

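A new nucleic acid sequence alignment algorithm and its application to global sequence alignment (total citations: 1; self-citations: 0; citations by others: 1)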

Li Jing (李静) Zhang Hong (张宏) Xue Yi (薛毅) Geng Meiying (耿美英) Zhang Chenggang (张成岗) 《生物信息学》2003,1(1):37-41

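The dynamic programming algorithm now widely used in sequence alignment achieves the optimal alignment, but its high computational complexity, O(N²), greatly reduces computational efficiency. A multi-stage dynamic programming decision algorithm was applied to pairwise sequence alignment and implemented in Visual BASIC. The new algorithm reduces the computational complexity to O(N) while still achieving fairly good accuracy, and it is expected to play an important role in global sequence alignment.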
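For reference, the quadratic-time baseline mentioned in this abstract can be sketched as the standard dynamic programming recurrence for global alignment (Needleman-Wunsch). This is only a minimal sketch of that baseline, not the authors' multi-stage algorithm, whose details the abstract does not give. The function name `global_alignment_score` and the match, mismatch, and gap values are assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of strings a and b.

    Every cell of a (len(a)+1) x (len(b)+1) table is visited once,
    so the running time is O(N*M). Only two rows are kept in memory.
    The scoring values are illustrative assumptions.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b costs gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Best of: align a[i-1] with b[j-1], gap in b, gap in a.
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]
```

Calling `global_alignment_score("ACGT", "AGT")` fills a 5 x 4 table; the cost of that table is what grows quadratically as both sequences lengthen.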

11.

Qian B Goldstein RA 《Proteins》2002,48(4):605-610

The accuracy of the alignments of protein sequences depends on the score matrix and gap penalties used in performing the alignment. Most score functions are designed to find homologs in the various databases rather than to generate accurate alignments between known homologs. We describe the optimization of a score function for the purpose of generating accurate alignments, as evaluated by using a coordinate root-mean-square deviation (RMSD)-based merit function. We show that the resulting score matrix, which we call STROMA, generates more accurate alignments than other commonly used score matrices, and this difference is not due to differences in the gap penalties. In fact, in contrast to most of the other matrices, the alignment accuracies with STROMA are relatively insensitive to the choice of gap penalty parameters.

12.

Information on the secondary structure improves the quality of protein sequence alignment

I. I. Litvinov M. Yu. Lobanov A. A. Mironov A. V. Finkelshtein M. A. Roytberg 《Molecular Biology》2006,40(3):474-480

The most popular algorithms employed in the pairwise alignment of protein primary structures (Smith-Waterman (SW) algorithm, FASTA, BLAST, etc.) only analyze the amino acid sequence. The SW algorithm is the most accurate, yielding alignments that agree best with superimpositions of the corresponding spatial structures of proteins. However, even the SW algorithm fails to reproduce the spatial structure alignment when the sequence identity is lower than 30%. The objective of this work was to develop a new and more accurate algorithm taking the secondary structure of proteins into account. The alignments generated by this algorithm and having the maximal weight with the secondary structure considered proved to be more accurate than SW alignments. With sequences having less than 30% identity, the accuracy (i.e., the portion of reproduced positions of a reference alignment obtained by superimposing the protein spatial structures) of the new algorithm is 58% vs. 35% for the SW algorithm. The accuracy of the new algorithm is much the same with secondary structures established experimentally or predicted theoretically. Hence, the algorithm is applicable to proteins with unknown spatial structures. The program is available at ftp://194.149.64.196/STRUSWER/.
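The accuracy measure defined in this abstract (the portion of reference-alignment positions that a test alignment reproduces) can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes each pairwise alignment is given as two equal-length gapped strings with `-` as the gap character, and the function names `aligned_pairs` and `alignment_accuracy` are hypothetical.

```python
def aligned_pairs(row_a, row_b, gap="-"):
    """Return the set of (i, j) residue index pairs placed in the same column."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    pairs = set()
    i = j = 0  # running residue indices in the ungapped sequences
    for x, y in zip(row_a, row_b):
        if x != gap and y != gap:
            pairs.add((i, j))
        if x != gap:
            i += 1
        if y != gap:
            j += 1
    return pairs


def alignment_accuracy(test, reference):
    """Fraction of reference residue pairs that the test alignment reproduces."""
    ref = aligned_pairs(*reference)
    if not ref:
        return 0.0
    return len(ref & aligned_pairs(*test)) / len(ref)
```

Here `test` and `reference` are each a tuple of two gapped rows; in the setting of the abstract the reference would come from a structural superposition.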

13.

Marmot phylogeny revisited: molecular evidence for a diphyletic origin of sociality (total citations: 4; self-citations: 0; citations by others: 4)

L. Kruckenhauser W. Pinsker E. Haring W. Arnold 《Journal of Zoological Systematics and Evolutionary Research》1999,37(1):49-56

We established the phylogeny of 11 species of the genus Marmota based on the entire sequence of the mitochondrial cytochrome b (cyt-b) gene (1.1 kb) and a partial sequence of the NADH dehydrogenase subunit 4 (ND4) gene (1.2 kb). In three species (Marmota caligata, Marmota olympus, and Marmota bobac) full-sized nuclear pseudogenes of the mitochondrial cyt-b were identified. The mitochondrial cyt-b genes and the three pseudogenes form separate clusters in the maximum parsimony dendrogram. This finding suggests that the pseudogenes originated from a single transfer to the nucleus that may have occurred prior to the radiation of the genus Marmota. Notably, compared with their functional mitochondrial equivalents the pseudogenes show a much lower substitution rate. In the dendrograms deduced from the mitochondrial sequences two distinct clusters become apparent: one cluster consists of the North-west American species, the other contains the Eurasian species together with the North American species Marmota monax. The position of M. monax as a member of the Eurasian clade is in accordance with the evolution of chromosome numbers. The results are of special interest with respect to the evolution of social systems in the genus that vary from solitary species (M. monax) to highly social species living in family groups (e.g. Marmota marmota). The molecular phylogeny suggests a diphyletic origin of high sociality in the genus Marmota.

14.

Topological characterization of neuronal arbor morphology via sequence representation: II - global alignment

Todd A Gillette Parsa Hosseini Giorgio A Ascoli 《BMC bioinformatics》2015,16(1)

Background

The increasing abundance of neuromorphological data provides both the opportunity and the challenge to compare massive numbers of neurons from a wide diversity of sources efficiently and effectively. We implemented a modified global alignment algorithm representing axonal and dendritic bifurcations as strings of characters. Sequence alignment quantifies neuronal similarity by identifying branch-level correspondences between trees.

Results

The space generated from pairwise similarities is capable of classifying neuronal arbor types as well as, or better than, traditional topological metrics. Unsupervised cluster analysis produces groups that significantly correspond with known cell classes for axons, dendrites, and pyramidal apical dendrites. Furthermore, the distinguishing consensus topology generated by multiple sequence alignment of a group of neurons reveals their shared branching blueprint. Interestingly, the axons of dendritic-targeting interneurons in the rodent cortex associate with pyramidal axons but apart from the (more topologically symmetric) axons of perisomatic-targeting interneurons.

Conclusions

Global pairwise and multiple sequence alignment of neurite topologies enables detailed comparison of neurites and identification of conserved topological features in alignment-defined clusters. The methods presented also provide a framework for incorporation of additional branch-level morphological features. Moreover, comparison of multiple alignment with motif analysis shows that the two techniques provide complementary information respectively revealing global and local features.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0605-1) contains supplementary material, which is available to authorized users.

15.

Quality assessment of multiple alignment programs (total citations: 7; self-citations: 0; citations by others: 7)

Lassmann T Sonnhammer EL 《FEBS letters》2002,529(1):126-130

A renewed interest in the multiple sequence alignment problem has given rise to several new algorithms. In contrast to traditional progressive methods, computationally expensive score optimization strategies are now predominantly employed. We systematically tested four methods (Poa, Dialign, T-Coffee and ClustalW) for the speed and quality of their alignments. As test sequences we used structurally derived alignments from BAliBASE and synthetic alignments generated by Rose. The tests included alignments of variable numbers of domains embedded in random spacer sequences. Overall, Dialign was the most accurate in cases with low sequence identity, while T-Coffee won in cases with high sequence identity. The fast Poa algorithm was almost as accurate, while ClustalW could compete only in strictly global cases with high sequence similarity.

16.

M. Gerstein M. Levitt 《Protein science : a publication of the Protein Society》1998,7(2):445-456

We apply a simple method for aligning protein sequences on the basis of a 3D structure, on a large scale, to the proteins in the scop classification of fold families. This allows us to assess, understand, and improve our automatic method against an objective, manually derived standard, a type of comprehensive evaluation that has not yet been possible for other structural alignment algorithms. Our basic approach directly matches the backbones of two structures, using repeated cycles of dynamic programming and least-squares fitting to determine an alignment minimizing coordinate difference. Because of simplicity, our method can be readily modified to take into account additional features of protein structure such as the orientation of side chains or the location-dependent cost of opening a gap. Our basic method, augmented by such modifications, can find reasonable alignments for all but 1.5% of the known structural similarities in scop, i.e., all but 32 of the 2,107 superfamily pairs. We discuss the specific protein structural features that make these 32 pairs so difficult to align and show how our procedure effectively partitions the relationships in scop into different categories, depending on what aspects of protein structure are involved (e.g., depending on whether or not consideration of side-chain orientation is necessary for proper alignment). We also show how our pairwise alignment procedure can be extended to generate a multiple alignment for a group of related structures. We have compared these alignments in detail with corresponding manual ones culled from the literature. We find good agreement (to within 95% for the core regions), and detailed comparison highlights how particular protein structural features (such as certain strands) are problematical to align, giving somewhat ambiguous results. With these improvements and systematic tests, our procedure should be useful for the development of scop and the future classification of protein folds.

17.

Multiple protein sequence alignment from tertiary structure comparison: assignment of global and residue confidence levels.

R B Russell G J Barton 《Proteins》1992,14(2):309-323

An algorithm is presented for the accurate and rapid generation of multiple protein sequence alignments from tertiary structure comparisons. A preliminary multiple sequence alignment is performed using sequence information, which then determines an initial superposition of the structures. A structure comparison algorithm is applied to all pairs of proteins in the superimposed set and a similarity tree calculated. Multiple sequence alignments are then generated by following the tree from the branches to the root. At each branchpoint of the tree, a structure-based sequence alignment and coordinate transformations are output, with the multiple alignment of all structures output at the root. The algorithm encoded in STAMP (STructural Alignment of Multiple Proteins) is shown to give alignments in good agreement with published structural accounts within the dehydrogenase fold domains, globins, and serine proteinases. In order to reduce the need for visual verification, two similarity indices are introduced to determine the quality of each generated structural alignment. Sc quantifies the global structural similarity between pairs or groups of proteins, whereas Pij' provides a normalized measure of the confidence in the alignment of each residue. STAMP alignments have the quality of each alignment characterized by Sc and Pij' values and thus provide a reproducible resource for studies of residue conservation within structural motifs.

18.

Induced alignment and measurement of dipolar couplings of an SH2 domain through direct binding with filamentous phage

Dahlke Ojennus D Mitton-Fry RM Wuttke DS 《Journal of biomolecular NMR》1999,14(2):175-179

Large residual ¹⁵N-¹H dipolar couplings have been measured in a Src homology II domain aligned at Pf1 bacteriophage concentrations an order of magnitude lower than used for induction of a similar degree of alignment of nucleic acids and highly acidic proteins. An increase in ¹H and ¹⁵N protein linewidths and a decrease in T₂ and T₁ relaxation time constants implicate a binding interaction between the protein and phage as the mechanism of alignment. However, the associated increased linewidth does not preclude the accurate measurement of large dipolar couplings in the aligned protein. A good correlation is observed between measured dipolar couplings and predicted values based on the high resolution NMR structure of the SH2 domain. The observation of binding-induced protein alignment promises to broaden the scope of alignment techniques by extending their applicability to proteins that are able to interact weakly with the alignment medium.

19.

BCL::Align—Sequence alignment and fold recognition with a custom scoring function online

Elizabeth Dong Jarrod Smith Sten Heinze Nathan Alexander Jens Meiler 《Gene》2008,422(1-2):41

BCL::Align is a multiple sequence alignment tool that utilizes the dynamic programming method in combination with a customizable scoring function for sequence alignment and fold recognition. The scoring function is a weighted sum of the traditional PAM and BLOSUM scoring matrices, position-specific scoring matrices output by PSI-BLAST, secondary structure predicted by a variety of methods, chemical properties, and gap penalties. By adjusting the weights, the method can be tailored for fold recognition or sequence alignment tasks at different levels of sequence identity. A Monte Carlo algorithm was used to determine optimized weight sets for sequence alignment and fold recognition that most accurately reproduced the SABmark reference alignment test set. In an evaluation of sequence alignment performance, BCL::Align ranked best in alignment accuracy (Cline score of 22.90 for sequences in the Twilight Zone) when compared with Align-m, ClustalW, T-Coffee, and MUSCLE. ROC curve analysis indicates BCL::Align's ability to correctly recognize protein folds with over 80% accuracy. The flexibility of the program allows it to be optimized for specific classes of proteins (e.g. membrane proteins) or fold families (e.g. TIM-barrel proteins). BCL::Align is free for academic use and available online at http://www.meilerlab.org/.
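The idea of a scoring function built as a weighted sum of several per-residue-pair terms can be illustrated with a small sketch. Everything named here is an assumption for illustration: `weighted_pair_score`, the two toy components, the simplified hydrophobic grouping, and the weights are placeholders, not the matrices, predictors, or optimized weight sets used by BCL::Align.

```python
def weighted_pair_score(a, b, components, weights):
    """Weighted sum of per-component scores for aligning residues a and b."""
    return sum(weights[name] * fn(a, b) for name, fn in components.items())


# Toy component 1: exact identity of the two residues.
def identity(a, b):
    return 1.0 if a == b else 0.0


# Toy component 2: a crude "chemical property" term.
# The grouping below is simplified and for illustration only.
HYDROPHOBIC = set("AVLIMFW")


def same_hydrophobic_class(a, b):
    return 1.0 if (a in HYDROPHOBIC) == (b in HYDROPHOBIC) else 0.0


components = {"identity": identity, "chemistry": same_hydrophobic_class}
weights = {"identity": 2.0, "chemistry": 0.5}  # hypothetical weights
```

Changing the entries of `weights` shifts the balance between terms without touching the alignment routine that consumes the score, which is the kind of tuning the abstract describes.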

20.

Molecular Systematics and Phylogeny of the Reduncini (Artiodactyla: Bovidae) Inferred from the Analysis of Mitochondrial Cytochrome b Gene Sequences (total citations: 6; self-citations: 0; citations by others: 6)

J. Birungi P. Arctander 《Journal of Mammalian Evolution》2001,8(2):125-147
