
1.

MACSIMS : multiple alignment of complete sequences information management system

Julie D Thompson Arnaud Muller Andrew Waterhouse Jim Procter Geoffrey J Barton Frédéric Plewniak Olivier Poch 《BMC bioinformatics》2006,7(1):318-13


2.

Generating consensus sequences from partial order multiple sequence alignment graphs

Lee C 《Bioinformatics (Oxford, England)》2003,19(8):999-1008

MOTIVATION: Consensus sequence generation is important in many kinds of sequence analysis ranging from sequence assembly to profile-based iterative search methods. However, how can a consensus be constructed when its inherent assumption (that the aligned sequences form a single linear consensus) is not true? RESULTS: Partial Order Alignment (POA) enables construction and analysis of multiple sequence alignments as directed acyclic graphs containing complex branching structure. Here we present a dynamic programming algorithm (heaviest_bundle) for generating multiple consensus sequences from such complex alignments. The number and relationships of these consensus sequences reveal the degree of structural complexity of the source alignment. This is a powerful and general approach for analyzing and visualizing complex alignment structures, and can be applied to any alignment. We illustrate its value for analyzing expressed sequence alignments to detect alternative splicing, reconstruct full length mRNA isoform sequences from EST fragments, and separate paralog mixtures that can cause incorrect SNP predictions. AVAILABILITY: The heaviest_bundle source code is available at http://www.bioinformatics.ucla.edu/poa
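The abstract does not spell out how heaviest_bundle works, so the following is only a minimal sketch of the general idea it builds on: dynamic programming over a directed acyclic graph to find the path with the largest total edge weight. The function name `heaviest_path`, the toy graph, and the use of edge weights as "number of sequences traversing the edge" are illustrative assumptions, not the published implementation.

```python
from collections import defaultdict


def heaviest_path(nodes, edges):
    """Return (total_weight, path) for the maximum-weight path in a DAG.

    nodes: node ids listed in topological order (assumed by this sketch).
    edges: dict mapping (u, v) -> weight, e.g. how many aligned sequences
           pass through that edge (a hypothetical weighting).
    """
    preds = defaultdict(list)
    for (u, v), w in edges.items():
        preds[v].append((u, w))

    best = {n: 0 for n in nodes}     # best weight of any path ending at n
    back = {n: None for n in nodes}  # predecessor of n on that best path
    for v in nodes:
        for u, w in preds[v]:
            if best[u] + w > best[v]:
                best[v] = best[u] + w
                back[v] = u

    # Trace back from the node with the highest accumulated weight.
    end = max(nodes, key=lambda n: best[n])
    total = best[end]
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = back[node]
    return total, path[::-1]


# Toy alignment graph: three sequences follow A-C-T, one follows A-G-T.
nodes = ["A", "C", "G", "T", "T2"]
edges = {("A", "C"): 3, ("A", "G"): 1, ("C", "T"): 3, ("G", "T"): 1, ("T", "T2"): 4}
score, path = heaviest_path(nodes, edges)
print(score, path)
```

Generating several consensus sequences, as the abstract describes, would require repeating such a traversal after setting aside the sequences already explained by the first path; that step is not shown here.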

3.

A multiple alignment program for protein sequences (total citations: 1; self-citations: 0; citations by others: 1)

Santibanez M.; Rohde K. 《Bioinformatics (Oxford, England)》1987,3(2):111-114

A program for the multiple alignment of protein sequences is presented. The program is an extension of the fast alignment program by Wilbur et al. (1984) into higher dimensions. The use of hash procedures on fragments of the protein sequences increases the speed of calculation. Thereby we also take into account fragments which are present in some, but not in all, sequences considered. The results of some multiple alignments are given. Received on September 11, 1986; accepted on March 18, 1987
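To illustrate the hashing idea mentioned in this abstract, here is a small sketch that indexes every fixed-length fragment of each sequence in a hash table and keeps the fragments that occur in at least a chosen number of sequences (so fragments present in some, but not all, sequences are retained). The function name `shared_fragments`, the fragment length `k`, and the threshold `min_seqs` are assumptions made for this example; the original program's data structures are not described in the abstract.

```python
from collections import defaultdict


def shared_fragments(seqs, k=3, min_seqs=2):
    """Hash all length-k fragments and keep those seen in >= min_seqs sequences.

    Returns a dict: fragment -> {sequence index: [start positions]}.
    """
    index = defaultdict(lambda: defaultdict(list))
    for i, seq in enumerate(seqs):
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]][i].append(pos)
    return {
        frag: dict(hits)
        for frag, hits in index.items()
        if len(hits) >= min_seqs
    }


seqs = ["MKVLAT", "KVLGG", "TTMKV"]
hits = shared_fragments(seqs, k=3)
for frag, where in sorted(hits.items()):
    print(frag, where)
```

Because lookups in the hash table take constant time on average, building the index is linear in the total sequence length, which is the kind of speed-up the abstract alludes to.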

4.

On the information content of protein sequences

Rackovsky S Scheraga HA 《Journal of biomolecular structure & dynamics》2011,28(4):593-4; discussion 669-674


5.

QOMA: quasi-optimal multiple alignment of protein sequences

Zhang X Kahveci T 《Bioinformatics (Oxford, England)》2007,23(2):162-168

MOTIVATION: We consider the problem of multiple alignment of protein sequences with the goal of achieving a large SP (Sum-of-Pairs) score. RESULTS: We introduce a new graph-based method. We name our method QOMA (Quasi-Optimal Multiple Alignment). QOMA starts with an initial alignment. It represents this alignment using a K-partite graph. It then improves the SP score of the initial alignment through local optimizations within a window that moves greedily on the alignment. QOMA uses two parameters to permit flexibility in the time/accuracy trade-off: (1) The size of the window for local optimization. (2) The sparsity of the K-partite graph. Unlike traditional progressive methods, QOMA is independent of the order of sequences. The experimental results on BAliBASE benchmarks show that QOMA produces a higher SP score than existing tools including ClustalW, Probcons, Muscle, T-Coffee and DCA. The difference is more significant for distant proteins. AVAILABILITY: The software is available from the authors upon request.
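The SP (Sum-of-Pairs) score that QOMA optimizes can be sketched as follows: every column of the alignment is scored by summing a pairwise score over all pairs of rows. The match, mismatch, and gap values below are placeholder assumptions; practical tools typically use a substitution matrix and more elaborate gap penalties.

```python
from itertools import combinations


def sp_score(alignment, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score of a multiple alignment given as equal-length rows."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows must have the same length")
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue              # gap-gap pairs contribute nothing here
            elif a == "-" or b == "-":
                total += gap          # residue aligned against a gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


alignment = ["AC-T", "ACGT", "A-GA"]
print(sp_score(alignment))
```

A window-based local optimization such as the one the abstract describes would recompute this score only over the columns inside the current window.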

6.

Improved spliced alignment from an information theoretic approach (total citations: 4; self-citations: 0; citations by others: 4)

Zhang M Gish W 《Bioinformatics (Oxford, England)》2006,22(1):13-20


7.

A novel method of multiple alignment of biopolymer sequences

L.I. Brodsky A.L. Drachev A.M. Leontovich S.I. Feranchuk 《Bio Systems》1993,30(1-3):65-79

A novel algorithm for multiple alignment of biological sequences is suggested. At the first step the DotHelix procedure is employed for construction of motifs, i.e. continuous fragments of local similarity of various “thickness” and strength, and then these motifs are concatenated into chains consistent with the order of letters in the sequences. The algorithm is implemented in the MA-Tools program of the GeneBee package. An example illustrating the effectiveness of the algorithm is presented.

8.

Optimal alignment between groups of sequences and its application to multiple sequence alignment (total citations: 11; self-citations: 2; citations by others: 11)

Gotoh Osamu 《Bioinformatics (Oxford, England)》1993,9(3):361-370

Four algorithms, A–D, were developed to align two groups of biological sequences. Algorithm A is equivalent to the conventional dynamic programming method widely used for aligning ordinary sequences, whereas algorithms B–D are designed to evaluate the cost for a deletion/insertion more accurately when internal gaps are present in either or both groups of sequences. Rigorous optimization of the ‘sum of pairs’ (SP) score is achieved by algorithm D, whose average performance is close to O(MNL²) where M and N are numbers of sequences included in the two groups and L is the mean length of the sequences. Algorithm B uses some approximations to cope with profile-based operations, whereas algorithm C is a simpler variant of algorithm D. These group-to-group alignment algorithms were applied to multiple sequence alignment with two iterative strategies: a progressive method based on a given binary tree and a randomized grouping-realignment method. The advantages and disadvantages of the four algorithms are discussed on the basis of the results of examinations of several protein families.

9.

Fast and sensitive multiple alignment of large genomic sequences

Michael Brudno Michael Chapman Berthold Göttgens Serafim Batzoglou Burkhard Morgenstern 《BMC bioinformatics》2003,4(1):66

Background

Genomic sequence alignment is a powerful method for genome analysis and annotation, as alignments are routinely used to identify functional sites such as genes or regulatory elements. With a growing number of partially or completely sequenced genomes, multiple alignment is playing an increasingly important role in these studies. In recent years, various tools for pair-wise and multiple genomic alignment have been proposed. Some of them are extremely fast, but often efficiency is achieved at the expense of sensitivity. One way of combining speed and sensitivity is to use an anchored-alignment approach. In a first step, a fast search program identifies a chain of strong local sequence similarities. In a second step, regions between these anchor points are aligned using a slower but more accurate method.

Results

Herein, we present CHAOS, a novel algorithm for rapid identification of chains of local pair-wise sequence similarities. Local alignments calculated by CHAOS are used as anchor points to improve the running time of DIALIGN, a slow but sensitive multiple-alignment tool. We show that this way, the running time of DIALIGN can be reduced by more than 95% for BAC-sized and longer sequences, without affecting the quality of the resulting alignments. We apply our approach to a set of five genomic sequences around the stem-cell-leukemia (SCL) gene and demonstrate that exons and small regulatory elements can be identified by our multiple-alignment procedure.

Conclusion

We conclude that the novel CHAOS local alignment tool is an effective way to significantly speed up global alignment tools such as DIALIGN without reducing the alignment quality. We likewise demonstrate that the DIALIGN/CHAOS combination is able to accurately align short regulatory sequences in distant orthologues.
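The anchored-alignment strategy described in this entry relies on selecting a chain of local similarities that appear in the same order in both sequences. The sketch below shows a generic quadratic-time chaining dynamic program; it is not the CHAOS algorithm itself, and the anchor format `(start_a, start_b, length, score)` and the function name `best_chain` are assumptions made for illustration.

```python
def best_chain(anchors):
    """Pick the highest-scoring collinear, non-overlapping chain of anchors.

    anchors: list of (start_a, start_b, length, score) tuples. An anchor may
    follow another only if it starts after the other ends in both sequences.
    Returns (total_score, chain), with the chain ordered along the sequences.
    """
    if not anchors:
        return 0, []
    order = sorted(range(len(anchors)), key=lambda i: anchors[i][:2])
    best = {}  # best chain score ending at anchor i
    prev = {}  # previous anchor in that chain
    for idx, j in enumerate(order):
        start_aj, start_bj, _, score_j = anchors[j]
        best[j], prev[j] = score_j, None
        for i in order[:idx]:
            start_ai, start_bi, len_i, _ = anchors[i]
            collinear = start_ai + len_i <= start_aj and start_bi + len_i <= start_bj
            if collinear and best[i] + score_j > best[j]:
                best[j], prev[j] = best[i] + score_j, i

    end = max(best, key=best.get)
    total = best[end]
    chain = []
    while end is not None:
        chain.append(anchors[end])
        end = prev[end]
    return total, chain[::-1]


anchors = [(0, 0, 5, 5), (3, 10, 4, 4), (6, 7, 5, 5), (12, 13, 3, 3), (8, 2, 6, 6)]
total, chain = best_chain(anchors)
print(total, chain)
```

In an anchored pipeline, only the regions lying between consecutive anchors of the returned chain would be handed to the slower, more sensitive aligner, which is where the reported running-time savings come from.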


10.

MAP2: multiple alignment of syntenic genomic sequences

Ye L Huang X 《Nucleic acids research》2005,33(1):162-170

We describe a multiple alignment program named MAP2 based on a generalized pairwise global alignment algorithm for handling long, different intergenic and intragenic regions in genomic sequences. The MAP2 program produces an ordered list of local multiple alignments of similar regions among sequences, where different regions between local alignments are indicated by reporting only similar regions. We propose two similarity measures for the evaluation of the performance of MAP2 and existing multiple alignment programs. Experimental results produced by MAP2 on four real sets of orthologous genomic sequences show that MAP2 rarely missed a block of transitively similar regions and that MAP2 never produced a block of regions that are not transitively similar. Experimental results by MAP2 on six simulated data sets show that MAP2 found the boundaries between similar and different regions precisely. This feature is useful for finding conserved functional elements in genomic sequences. The MAP2 program is freely available in source code form at http://bioinformatics.iastate.edu/aat/sas.html for academic use.

11.

COBALT: constraint-based alignment tool for multiple protein sequences

Papadopoulos JS Agarwala R 《Bioinformatics (Oxford, England)》2007,23(9):1073-1079

MOTIVATION: A tool that simultaneously aligns multiple protein sequences, automatically utilizes information about protein domains, and has a good compromise between speed and accuracy will have practical advantages over current tools. RESULTS: We describe COBALT, a constraint based alignment tool that implements a general framework for multiple alignment of protein sequences. COBALT finds a collection of pairwise constraints derived from database searches, sequence similarity and user input, combines these pairwise constraints, and then incorporates them into a progressive multiple alignment. We show that using constraints derived from the conserved domain database (CDD) and PROSITE protein-motif database improves COBALT's alignment quality. We also show that COBALT has reasonable runtime performance and alignment accuracy comparable to or exceeding that of other tools for a broad range of problems. AVAILABILITY: COBALT is included in the NCBI C++ toolkit. A Linux executable for COBALT, and CDD and PROSITE data used is available at: ftp://ftp.ncbi.nlm.nih.gov/pub/agarwala/cobalt

12.

Mutual information content of homologous DNA sequences

Leitão HC Pessôa LS Stolfi J 《Genetics and molecular research : GMR》2005,4(3):553-562

The information necessary to reproduce and maintain an organism is encoded in nucleic acid molecules. Deepening our knowledge of how the information is stored in these bio-sequences can lead to more efficient methods of comparing genomic sequences. In the present study, we analyzed the quantity of information contained in a DNA sequence that can be useful for identifying sequences homologous to it. To this end, we used signal processing techniques, especially spectral analysis, and information theory.
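The abstract does not give the formulas used in the study. As a point of reference only, the textbook mutual information between two position-paired sequences can be estimated from symbol counts as in the sketch below; treating the sequences as already paired position by position and the function name `mutual_information` are assumptions of this example.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Empirical mutual information (in bits) between paired symbols of x and y."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    joint = Counter(zip(x, y))   # counts of symbol pairs at the same position
    count_x, count_y = Counter(x), Counter(y)
    # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written in terms of raw counts.
    return sum(
        (c / n) * log2(c * n / (count_x[a] * count_y[b]))
        for (a, b), c in joint.items()
    )


print(mutual_information("ACGTACGT", "ACGTACGT"))  # a sequence paired with itself
print(mutual_information("AACC", "ACAC"))          # statistically independent pairing
```

A sequence paired with itself yields its own entropy, while an independent pairing yields zero, which gives a feel for how much one sequence can reveal about a homolog.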

13.

Align-m--a new algorithm for multiple alignment of highly divergent sequences (total citations: 4; self-citations: 0; citations by others: 4)

Van Walle I Lasters I Wyns L 《Bioinformatics (Oxford, England)》2004,20(9):1428-1435

MOTIVATION: Multiple alignment of highly divergent sequences is a challenging problem for which available programs tend to show poor performance. Generally, this is due to a scoring function that does not describe biological reality accurately enough or a heuristic that cannot explore solution space efficiently enough. In this respect, we present a new program, Align-m, that uses a non-progressive local approach to guide a global alignment. RESULTS: Two large test sets were used that represent the entire SCOP classification and cover sequence similarities between 0 and 50% identity. Performance was compared with the publicly available algorithms ClustalW, T-Coffee and DiAlign. In general, Align-m has comparable or slightly higher accuracy in terms of correctly aligned residues, especially for distantly related sequences. Importantly, it aligns far fewer residues incorrectly, with average differences of over 15% compared with some of the other algorithms. AVAILABILITY: Align-m and the test sets are available at http://bioinformatics.vub.ac.be

14.

Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization

Markus Bauer Gunnar W Klau Knut Reinert 《BMC bioinformatics》2007,8(1):271

Background

The discovery of functional non-coding RNA sequences has led to an increasing interest in algorithms related to RNA analysis. Traditional sequence alignment algorithms, however, fail at computing reliable alignments of low-homology RNA sequences. The spatial conformation of RNA sequences largely determines their function, and therefore RNA alignment algorithms have to take structural information into account.

15.

MASH: an interactive program for multiple alignment and consensus sequence construction for biological sequences

Chappey C.; Danckaert A.; Dessen P.; Hazout S. 《Bioinformatics (Oxford, England)》1991,7(2):195-202

This paper presents a method for the multiple alignment of a sequence set. The MASH algorithm uses a non-redundant database of common motifs and an ‘alignment priority’ criterion that depends on the length and the occurrence frequency of the patterns in the set of sequences. This user-defined criterion allows the determination of the series of the patterns to be aligned. This program is applied to a fragment of envelope gene env gp120 for 20 isolates of the immunodeficiency virus. The multiplicity of alignments obtained by modifying the criterion parameters reveals different aspects of similarity between the sequences. Received on June 4, 1990; accepted on December 14, 1990

16.

A fast structural multiple alignment method for long RNA sequences

Yasuo Tabei Hisanori Kiryu Taishin Kin Kiyoshi Asai 《BMC bioinformatics》2008,9(1):1-17

Background

Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data.

Results

In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns", and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision of 78% at a recall of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general.

Conclusion

GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme than "exact matching", with the advantage of locating approximate pattern occurrences with similar semantics. The relatively low recall of our pattern-based approach may be improved either by employing a probabilistic annotation framework based on the annotation neighbourhoods in textual data, or, alternatively, by lowering the statistical enrichment threshold for applications that put more value on achieving higher recall.

17.

Computing recombination networks from binary sequences

Huson DH Kloepper TH 《Bioinformatics (Oxford, England)》2005,21(Z2):ii159-ii165


18.

Bayesian estimation of substitution rates from ancient DNA sequences with low information content

Ho SY Lanfear R Phillips MJ Barnes I Thomas JA Kolokotronis SO Shapiro B 《Systematic biology》2011,60(3):366-375


19.

SOMAP: a novel interactive approach to multiple protein sequences alignment

Parry-Smith D.J.; Attwood T.K. 《Bioinformatics (Oxford, England)》1991,7(2):233-235

A novel interactive method for generating multiple protein sequence alignments is described. The program has no internal limit to the number or length of sequences it can handle and is designed for use with DEC VAX processors running the VMS operating system. The approach used is essentially one of manual sequence manipulation, aided by built-in symbolic displays of identities and similarities, and strict and ‘fuzzy’ (ambiguous) pattern-matching facilities. Additional flexibility is provided by means of an interface to a publicly available automatic alignment system and to a comprehensive sequence analysis package. Received on August 28, 1990; accepted on November 20, 1990

20.

Membrane probability profile construction based on amino acids sequences multiple alignment

Sutormin RA Mironov AA 《Molekuliarnaia biologiia》2006,40(3):541-545

Prediction of membrane segments in sequences of membrane proteins is a well-known and important problem. The accuracy of methods that do not use a homology search against an additional data bank can be improved. There is a lack of test data in this area because of the small number of known structures of membrane proteins. In this work, we create a test set of structural alignments of membrane proteins, in which the positioning of the membrane segments reflects the agreement of known 3D structures of proteins in the alignment. We propose a method for predicting the positions of membrane segments in a multiple alignment based on the forward-backward algorithm from HMM theory. This method not only predicts the positions of membrane segments but also produces a membrane probability profile, which can be used in multiple alignment methods that take secondary structure information about sequences into account. The method is implemented in a computer program available at http://bioinf.fbb.msu.ru/fwdbck/. The proposed method gives better results than the MEMSAT method, which is nearly the only tool for predicting membrane segments in multiple alignments without an additional homology search.
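To make the forward-backward step concrete, here is a minimal sketch for a discrete hidden Markov model that returns, for each position, the posterior probability of each state. It works on a single sequence of symbols rather than on alignment columns, and the two-state model (membrane `M`, loop `L`), the hydrophobic/polar alphabet (`h`, `p`), and all probabilities are toy assumptions, not the parameters of the published method.

```python
def forward_backward(obs, states, start, trans, emit):
    """Posterior P(state at position t | all observations) for a discrete HMM."""
    if not obs:
        return []
    n = len(obs)

    # Forward pass, rescaled at every position to avoid numerical underflow.
    fwd, scales = [], []
    for t, symbol in enumerate(obs):
        row = {}
        for s in states:
            if t == 0:
                prior = start[s]
            else:
                prior = sum(fwd[t - 1][r] * trans[r][s] for r in states)
            row[s] = prior * emit[s][symbol]
        scale = sum(row.values())
        fwd.append({s: row[s] / scale for s in states})
        scales.append(scale)

    # Backward pass, reusing the forward scaling factors.
    bwd = [None] * n
    bwd[n - 1] = {s: 1.0 for s in states}
    for t in range(n - 2, -1, -1):
        bwd[t] = {
            s: sum(
                trans[s][r] * emit[r][obs[t + 1]] * bwd[t + 1][r] for r in states
            ) / scales[t + 1]
            for s in states
        }

    # With this scaling, fwd * bwd is already a normalized posterior.
    return [{s: fwd[t][s] * bwd[t][s] for s in states} for t in range(n)]


# Toy model: 'h' = hydrophobic residue, 'p' = polar residue (assumed alphabet).
states = ("M", "L")
start = {"M": 0.2, "L": 0.8}
trans = {"M": {"M": 0.9, "L": 0.1}, "L": {"M": 0.1, "L": 0.9}}
emit = {"M": {"h": 0.9, "p": 0.1}, "L": {"h": 0.2, "p": 0.8}}

profile = forward_backward("pphhhhpp", states, start, trans, emit)
membrane_profile = [round(p["M"], 3) for p in profile]
print(membrane_profile)
```

The list of per-position membrane probabilities plays the role of the "membrane probability profile" mentioned in the abstract; extending the emission model to whole alignment columns is left out of this sketch.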