
1.

FASMA: a service to format and analyze sequences in multiple alignments

Costantini S Colonna G Facchiano AM 《Genomics, Proteomics & Bioinformatics》2007,5(3-4):253-255

Multiple sequence alignments are successfully applied in many studies for understanding the structural and functional relations among single nucleic acids and protein sequences as well as whole families. Because of the rapid growth of sequence databases, multiple sequence alignments can often be very large and difficult to visualize and analyze. We offer a new service aimed to visualize and analyze the multiple alignments obtained with different external algorithms, with new features useful for the comparison of the aligned sequences as well as for the creation of a final image of the alignment. The service is named FASMA and is available at http://bioinformatica.isa.cnr.it/FASMA/.

2.

FASMA: A Service to Format and Analyze Sequences in Multiple Alignments

Susan Costantini Giovanni Colonna Angelo M. Facchiano 《Genomics, Proteomics & Bioinformatics》2007,2(3):253-255

Multiple sequence alignments are successfully applied in many studies for understanding the structural and functional relations among single nucleic acids and protein sequences as well as whole families. Because of the rapid growth of sequence databases, multiple sequence alignments can often be very large and difficult to visualize and analyze. We offer a new service aimed to visualize and analyze the multiple alignments obtained with different external algorithms, with new features useful for the comparison of the aligned sequences as well as for the creation of a final image of the alignment. The service is named FASMA and is available at http://bioinformatica.isa.cnr.it/FASMA/.

3.

Phylo-mLogo: an interactive and hierarchical multiple-logo visualization tool for alignment of many sequences

Arthur Chun-Chieh Shih DT Lee Chin-Lin Peng Yu-Wei Wu 《BMC bioinformatics》2007,8(1):63

Background

When aligning several hundreds or thousands of sequences, such as epidemic virus sequences or homologous/orthologous sequences of some big gene families, to reconstruct the epidemiological history or their phylogenies, how to analyze and visualize the alignment results of many sequences has become a new challenge for computational biologists. Although there are several tools available for visualization of very long sequence alignments, few of them are applicable to the alignments of many sequences.

4.

ProtEST: protein multiple sequence alignments from expressed sequence tags

Cuff JA Birney E Clamp ME Barton GJ 《Bioinformatics (Oxford, England)》2000,16(2):111-116


5.

Towards a reliable objective function for multiple sequence alignments.

J D Thompson F Plewniak R Ripp J C Thierry O Poch 《Journal of molecular biology》2001,314(4):937-951

Multiple sequence alignment is a fundamental tool in a number of different domains in modern molecular biology, including functional and evolutionary studies of a protein family. Multiple alignments also play an essential role in the new integrated systems for genome annotation and analysis. Thus, the development of new multiple alignment scores and statistics is essential, in the spirit of the work dedicated to the evaluation of pairwise sequence alignments for database searching techniques. We present here norMD, a new objective scoring function for multiple sequence alignments. NorMD combines the advantages of the column-scoring techniques with the sensitivity of methods incorporating residue similarity scores. In addition, norMD incorporates ab initio sequence information, such as the number, length and similarity of the sequences to be aligned. The sensitivity and reliability of the norMD objective function is demonstrated using structural alignments in the SCOP and BAliBASE databases. The norMD scores are then applied to the multiple alignments of the complete sequences (MACS) detected by BlastP with E-value<10, for a set of 734 hypothetical proteins encoded by the Vibrio cholerae genome. Unrelated or badly aligned sequences were automatically removed from the MACS, leaving a high-quality multiple alignment which could be reliably exploited in a subsequent functional and/or structural annotation process. After removal of unreliable sequences, 176 (24%) of the alignments contained at least one sequence with a functional annotation. 103 of these new matches were supported by significant hits to the Interpro domain and motif database.

6.

Color and graphic display (CGD): programs for multiple sequence alignment analysis in spreadsheet software

Delamarche C 《BioTechniques》2000,29(1):100-4, 106-7

Interpretation of multiple sequence alignments is of major interest for the prediction of functional and structural domains in proteins or for the organization of related sequences in families and subfamilies. However, a necessity for the bench scientist is the use of outstanding programs in a friendly computing environment. This paper describes Color and Graphic Display (CGD), a set of modules that runs as part of the Microsoft Excel spreadsheet to color and analyze multiple sequence alignments. Discussed here are the main functions of CGD and the use of the program to highlight residues of importance in a water channel family. Although CGD was created for protein sequences, most of the modules are compatible with DNA sequences.

7.

Phylo-VISTA: interactive visualization of multiple DNA sequence alignments

Shah N Couronne O Pennacchio LA Brudno M Batzoglou S Bethel EW Rubin EM Hamann B Dubchak I 《Bioinformatics (Oxford, England)》2004,20(5):636-643

MOTIVATION: The power of multi-sequence comparison for biological discovery is well established. The need for new capabilities to visualize and compare cross-species alignment data is intensified by the growing number of genomic sequence datasets being generated for an ever-increasing number of organisms. To be efficient these visualization algorithms must support the ability to accommodate consistently a wide range of evolutionary distances in a comparison framework based upon phylogenetic relationships. RESULTS: We have developed Phylo-VISTA, an interactive tool for analyzing multiple alignments by visualizing a similarity measure for multiple DNA sequences. The complexity of visual presentation is effectively organized using a framework based upon interspecies phylogenetic relationships. The phylogenetic organization supports rapid, user-guided interspecies comparison. To aid in navigation through large sequence datasets, Phylo-VISTA leverages concepts from VISTA that provide a user with the ability to select and view data at varying resolutions. The combination of multiresolution data visualization and analysis, combined with the phylogenetic framework for interspecies comparison, produces a highly flexible and powerful tool for visual data analysis of multiple sequence alignments. AVAILABILITY: Phylo-VISTA is available at http://www-gsd.lbl.gov/phylovista. It requires an Internet browser with Java Plug-in 1.4.2 and it is integrated into the global alignment program LAGAN at http://lagan.stanford.edu

8.

CLANS: a Java application for visualizing protein families based on pairwise similarity (cited 2 times: 0 self-citations, 2 by others)

Frickey T Lupas A 《Bioinformatics (Oxford, England)》2004,20(18):3702-3704

SUMMARY: The main source of hypotheses on the structure and function of new proteins is their homology to proteins with known properties. Homologous relationships are typically established through sequence similarity searches, multiple alignments and phylogenetic reconstruction. In cases where the number of potential relationships is large, for example in P-loop NTPases with many thousands of members, alignments and phylogenies become computationally demanding, accumulate errors and lose resolution. In search of a better way to analyze relationships in large sequence datasets we have developed a Java application, CLANS (CLuster ANalysis of Sequences), which uses a version of the Fruchterman-Reingold graph layout algorithm to visualize pairwise sequence similarities in either two-dimensional or three-dimensional space. AVAILABILITY: CLANS can be downloaded at http://protevo.eb.tuebingen.mpg.de/download.

9.

No so HoT – heads or tails is not able to reliably compare multiple sequence alignments

Michael J. Wise 《Cladistics : the international journal of the Willi Hennig Society》2010,26(4):438-443

Most phylogenetic‐tree building applications use multiple sequence alignments as a starting point. A recent meta‐level methodology, called Heads or Tails, aims to reveal the quality of multiple sequence alignments by comparing alignments taken in the forward direction with the alignments of the same sequences when the sequences are reversed. Through an examination of a special case for multiple sequence alignment – pair‐wise alignments, where an optimal algorithm exists – and the use of a modified global‐alignment application, it is shown that the forward and reverse alignments, even when they are the same, do not capture all the possible variations in the alignments and when the forward and reverse alignments differ there may be other alignments that remain unaccounted for. The implication is that comparing just the forward and (biologically irrelevant) reverse alignments is not sufficient to capture the variability in multiple sequence alignments, and the Heads or Tails methodology is therefore not suitable as a method for investigating multiple sequence alignment accuracy. Part of the reason is the inability of individual multiple sequence alignment applications to adequately sample the space of possible alignments. A further implication is that the Hall [Hall, B.G., 2008. Mol. Biol. Evol. 25, 1576–1580] methodology may create optimal synthetic multiple sequence alignments that extant aligners will be unable to completely recover ab initio due to alternative alignments being possible at particular sites. In general, it is shown that more divergent sequences will give rise to an increased number of alternative alignments, so sequence sets with a higher degree of similarity are preferable to sets with lower similarity as the starting point for phylogenetic tree building. © The Willi Hennig Society 2009.

10.

Histone Sequence Database: new histone fold family members. (cited 2 times: 0 self-citations, 2 by others)

A D Baxevanis D Landsman 《Nucleic acids research》1998,26(1):372-375

Searches of the major public protein databases with core and linker chicken and human histone sequences have resulted in the compilation of an annotated set of histone protein sequences. In addition, new database searches with two distinct motif search algorithms have identified several members of the histone fold family, including human DRAP1 and yeast CSE4. Database resources include information on conflicts between similar sequence entries in different source databases, multiple sequence alignments, links to the Entrez integrated information retrieval system, structures for histone and histone fold proteins, and the ability to visualize structural data through Cn3D. The database currently contains >1000 protein sequences, which are searchable by protein type, accession number, organism name, or any other free text appearing in the definition line of the entry. All sequences and alignments in this database are available through the World Wide Web at http://www.nhgri.nih.gov/DIR/GTB/HISTONES or http://www.ncbi.nlm.nih.gov/Baxevani/HISTONES

11.

ProDom: automated clustering of homologous domains

Servant F Bru C Carrère S Courcelle E Gouzy J Peyruc D Kahn D 《Briefings in bioinformatics》2002,3(3):246-251

The ProDom database is a comprehensive set of protein domain families automatically generated from the SWISS-PROT and TrEMBL sequence databases. An associated database, ProDom-CG, has been derived as a restriction of ProDom to completely sequenced genomes. The ProDom construction method is based on iterative PSI-BLAST searches and multiple alignments are generated for each domain family. The ProDom web server provides the user with a set of tools to visualise multiple alignments, phylogenetic trees and domain architectures of proteins, as well as a BLAST-based server to analyse new sequences for homologous domains. The comprehensive nature of ProDom makes it particularly useful to help sustain the growth of InterPro.

12.

PROMALS3D: a tool for multiple protein sequence and structure alignments (cited 4 times: 1 self-citation, 3 by others)

Pei J Kim BH Grishin NV 《Nucleic acids research》2008,36(7):2295-2300

Although multiple sequence alignments (MSAs) are essential for a wide range of applications from structure modeling to prediction of functional sites, construction of accurate MSAs for distantly related proteins remains a largely unsolved problem. The rapidly increasing database of spatial structures is a valuable source to improve alignment quality. We explore the use of 3D structural information to guide sequence alignments constructed by our MSA program PROMALS. The resulting tool, PROMALS3D, automatically identifies homologs with known 3D structures for the input sequences, derives structural constraints through structure-based alignments and combines them with sequence constraints to construct consistency-based multiple sequence alignments. The output is a consensus alignment that brings together sequence and structural information about input proteins and their homologs. PROMALS3D can also align sequences of multiple input structures, with the output representing a multiple structure-based alignment refined in combination with sequence constraints. The advantage of PROMALS3D is that it gives researchers an easy way to produce high-quality alignments consistent with both sequences and structures of proteins. PROMALS3D outperforms a number of existing methods for constructing multiple sequence or structural alignments using both reference-dependent and reference-independent evaluation methods.

13.

Alignment of protein sequences by their profiles (cited 7 times: 0 self-citations, 7 by others)

Marti-Renom MA Madhusudhan MS Sali A 《Protein science : a publication of the Protein Society》2004,13(4):1071-1087

The accuracy of an alignment between two protein sequences can be improved by including other detectably related sequences in the comparison. We optimize and benchmark such an approach that relies on aligning two multiple sequence alignments, each one including one of the two protein sequences. Thirteen different protocols for creating and comparing profiles corresponding to the multiple sequence alignments are implemented in the SALIGN command of MODELLER. A test set of 200 pairwise, structure-based alignments with sequence identities below 40% is used to benchmark the 13 protocols as well as a number of previously described sequence alignment methods, including heuristic pairwise sequence alignment by BLAST, pairwise sequence alignment by global dynamic programming with an affine gap penalty function by the ALIGN command of MODELLER, sequence-profile alignment by PSI-BLAST, Hidden Markov Model methods implemented in SAM and LOBSTER, pairwise sequence alignment relying on predicted local structure by SEA, and multiple sequence alignment by CLUSTALW and COMPASS. The alignment accuracies of the best new protocols were significantly better than those of the other tested methods. For example, the fraction of the correctly aligned residues relative to the structure-based alignment by the best protocol is 56%, which can be compared with the accuracies of 26%, 42%, 43%, 48%, 50%, 49%, 43%, and 43% for the other methods, respectively. The new method is currently applied to large-scale comparative protein structure modeling of all known sequences.
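
The accuracy measure quoted in this abstract, the fraction of correctly aligned residues relative to a reference alignment, can be illustrated with a minimal Python sketch. This is not the benchmark code from the paper: the function names `aligned_pairs` and `fraction_correct`, the `-` gap symbol, and the toy sequences are all assumptions made for illustration.

```python
def aligned_pairs(row_a, row_b, gap="-"):
    """Return the set of (i, j) residue index pairs placed in the same column."""
    pairs = set()
    i = j = 0  # running residue indices in the ungapped sequences
    for ca, cb in zip(row_a, row_b):
        if ca != gap and cb != gap:
            pairs.add((i, j))
        if ca != gap:
            i += 1
        if cb != gap:
            j += 1
    return pairs


def fraction_correct(test, reference):
    """Fraction of reference residue pairs that the test alignment reproduces."""
    ref_pairs = aligned_pairs(*reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(*test)) / len(ref_pairs)


# Toy data (hypothetical, not taken from the paper).
reference = ("ACDEFG", "AC-EFG")
test = ("ACDEFG", "ACE-FG")
print(fraction_correct(test, reference))
```

Here the test alignment reproduces four of the five residue pairs in the reference, so the score is 0.8. Scoring against the reference pairs only, rather than all pairs in the test alignment, is one common convention; other definitions exist.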

14.

Statistical significance in biological sequence analysis

Mitrophanov AY Borodovsky M 《Briefings in bioinformatics》2006,7(1):2-24


15.

SynBrowse: a synteny browser for comparative sequence analysis

Pan X Stein L Brendel V 《Bioinformatics (Oxford, England)》2005,21(17):3461-3468

MOTIVATION: The recent efforts of various sequence projects to sequence deeply into various phylogenies provide great resources for comparative sequence analysis. A generic and portable tool is essential for scientists to visualize and analyze sequence comparisons. RESULTS: We have developed SynBrowse, a synteny browser for visualizing and analyzing genome alignments both within and between species. It is intended to help scientists study macrosynteny, microsynteny and homologous genes between sequences. It can also aid with the identification of uncharacterized genes, putative regulatory elements and novel structural features of a species. SynBrowse is a GBrowse (the Generic Genome Browser) family software tool that runs on top of the open source BioPerl modules. It consists of two components: a web-based front end and a set of relational database back ends. Each database stores pre-computed alignments from a focus sequence to reference sequences in addition to the genome annotations of the focus sequence. The user interface lets end users select a key comparative alignment type and search for syntenic blocks between two sequences and zoom in to view the relationships among the corresponding genome annotations in detail. SynBrowse is portable with simple installation, flexible configuration, convenient data input and easy integration with other components of a model organism system. AVAILABILITY: The software is available at http://www.gmod.org CONTACT: vbrendel@iastate.edu

16.

A reality check for alignments and trees

Martin W Roettger M Lockhart PJ 《Trends in genetics : TIG》2007,23(10):478-480

Making multiple sequence alignments is one of the more commonplace procedures in modern biology. Multiple alignments are typically generated by feeding sequences into the alignment program from the N-terminus to the C-terminus. Recent results show that if the same sequences are processed from the C- to the N-terminus, a different alignment is often obtained. Because phylogenetic trees are built from alignments, the resulting trees can also differ. The new findings highlight sequence alignment as a crucial step in molecular evolutionary studies and provide straightforward measures to assess alignment reliability.
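
The forward-versus-reversed comparison described in this abstract can be sketched for the pairwise case. The Python below is an illustration under stated assumptions, not the procedure used in the paper: it uses a plain Needleman-Wunsch global alignment with a linear gap penalty, a fixed tie-breaking order in the traceback, hypothetical function names, and short toy strings rather than real protein sequences.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global pairwise alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Traceback with a fixed tie-breaking order: diagonal, then up, then left.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


def heads_or_tails(a, b):
    """Align the inputs forward and reversed; report both in forward orientation."""
    score_f, fa, fb = needleman_wunsch(a, b)
    score_r, ra, rb = needleman_wunsch(a[::-1], b[::-1])
    return (score_f, fa, fb), (score_r, ra[::-1], rb[::-1])


forward, reverse = heads_or_tails("GATTACA", "GATACA")  # toy strings
print(forward)
print(reverse)
```

With these toy inputs both runs reach the same optimal score, but the gap falls against a different T in each: the two alignments are equally good under the scoring scheme, and which one is reported depends only on the direction of processing and the tie-breaking rule.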

17.

DNA reference alignment benchmarks based on tertiary structure of encoded proteins (cited 1 time: 0 self-citations, 1 by others)

Carroll H Beckstead W O'Connor T Ebbert M Clement M Snell Q McClellan D 《Bioinformatics (Oxford, England)》2007,23(19):2648-2649

MOTIVATION: Multiple sequence alignments (MSAs) are at the heart of bioinformatics analysis. Recently, a number of multiple protein sequence alignment benchmarks (i.e. BAliBASE, OXBench, PREFAB and SMART) have been released to evaluate new and existing MSA applications. These databases have been well received by researchers and help to quantitatively evaluate MSA programs on protein sequences. Unfortunately, analogous DNA benchmarks are not available, making evaluation of MSA programs difficult for DNA sequences. RESULTS: This work presents the first known multiple DNA sequence alignment benchmarks that are (1) comprised of protein-coding portions of DNA and (2) based on biological features such as the tertiary structure of encoded proteins. These reference DNA databases contain a total of 3545 alignments, comprising 68 581 sequences. Two versions of the database are available: mdsa_100s and mdsa_all. The mdsa_100s version contains the alignments of the data sets that TBLASTN found 100% sequence identity for each sequence. The mdsa_all version includes all hits with an E-value score above the threshold of 0.001. A primary use of these databases is to benchmark the performance of MSA applications on DNA data sets. The first such case study is included in the Supplementary Material.

18.

3DCoffee: combining protein sequences and structures within multiple sequence alignments

O'Sullivan O Suhre K Abergel C Higgins DG Notredame C 《Journal of molecular biology》2004,340(2):385-395

Most bioinformatics analyses require the assembly of a multiple sequence alignment. It has long been suspected that structural information can help to improve the quality of these alignments, yet the effect of combining sequences and structures has not been evaluated systematically. We developed 3DCoffee, a novel method for combining protein sequences and structures in order to generate high-quality multiple sequence alignments. 3DCoffee is based on TCoffee version 2.00, and uses a mixture of pairwise sequence alignments and pairwise structure comparison methods to generate multiple sequence alignments. We benchmarked 3DCoffee using a subset of HOMSTRAD, the collection of reference structural alignments. We found that combining TCoffee with the threading program Fugue makes it possible to improve the accuracy of our HOMSTRAD dataset by four percentage points when using one structure only per dataset. Using two structures yields an improvement of ten percentage points. The measures carried out on HOM39, a HOMSTRAD subset composed of distantly related sequences, show a linear correlation between multiple sequence alignment accuracy and the ratio of number of provided structure to total number of sequences. Our results suggest that in the case of distantly related sequences, a single structure may not be enough for computing an accurate multiple sequence alignment.

19.

The RNA structure alignment ontology

James W. Brown Amanda Birmingham Paul E. Griffiths Fabrice Jossinet Rym Kachouri-Lafond Rob Knight B. Franz Lang Neocles Leontis Gerhard Steger Jesse Stombaugh Eric Westhof 《RNA (New York, N.Y.)》2009,15(9):1623-1631

Multiple sequence alignments are powerful tools for understanding the structures, functions, and evolutionary histories of linear biological macromolecules (DNA, RNA, and proteins), and for finding homologs in sequence databases. We address several ontological issues related to RNA sequence alignments that are informed by structure. Multiple sequence alignments are usually shown as two-dimensional (2D) matrices, with rows representing individual sequences, and columns identifying nucleotides from different sequences that correspond structurally, functionally, and/or evolutionarily. However, the requirement that sequences and structures correspond nucleotide-by-nucleotide is unrealistic and hinders representation of important biological relationships. High-throughput sequencing efforts are also rapidly making 2D alignments unmanageable because of vertical and horizontal expansion as more sequences are added. Solving the shortcomings of traditional RNA sequence alignments requires explicit annotation of the meaning of each relationship within the alignment. We introduce the notion of “correspondence,” which is an equivalence relation between RNA elements in sets of sequences as the basis of an RNA alignment ontology. The purpose of this ontology is twofold: first, to enable the development of new representations of RNA data and of software tools that resolve the expansion problems with current RNA sequence alignments, and second, to facilitate the integration of sequence data with secondary and three-dimensional structural information, as well as other experimental information, to create simultaneously more accurate and more exploitable RNA alignments.
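
The conventional 2D matrix view mentioned in this abstract (rows as sequences, columns as corresponding nucleotides) can be made concrete with a small Python sketch. It illustrates only the traditional representation, not the ontology the paper proposes; the sequence names, the toy RNA rows, and the helper names `columns` and `column_identity` are hypothetical.

```python
from collections import Counter

# Toy RNA alignment: each row is one sequence, "-" marks a gap.
alignment = {
    "seq1": "GGCAU-CC",
    "seq2": "GGCAUACC",
    "seq3": "GACAU-CC",
}


def columns(rows):
    """Transpose the row-wise alignment into a list of column strings."""
    rows = list(rows)
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all rows of an alignment must have equal length")
    return ["".join(col) for col in zip(*rows)]


def column_identity(col, gap="-"):
    """Fraction of rows that carry the most common non-gap symbol in a column."""
    residues = [c for c in col if c != gap]
    if not residues:
        return 0.0
    return Counter(residues).most_common(1)[0][1] / len(col)


for index, col in enumerate(columns(alignment.values())):
    print(index, col, round(column_identity(col), 2))
```

Every added sequence lengthens each column and every insertion adds a column for all rows, which is the vertical and horizontal expansion the abstract refers to.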

20.

Quality assessment of multiple alignment programs (cited 7 times: 0 self-citations, 7 by others)

Lassmann T Sonnhammer EL 《FEBS letters》2002,529(1):126-130

A renewed interest in the multiple sequence alignment problem has given rise to several new algorithms. In contrast to traditional progressive methods, computationally expensive score optimization strategies are now predominantly employed. We systematically tested four methods (Poa, Dialign, T-Coffee and ClustalW) for the speed and quality of their alignments. As test sequences we used structurally derived alignments from BAliBASE and synthetic alignments generated by Rose. The tests included alignments of variable numbers of domains embedded in random spacer sequences. Overall, Dialign was the most accurate in cases with low sequence identity, while T-Coffee won in cases with high sequence identity. The fast Poa algorithm was almost as accurate, while ClustalW could compete only in strictly global cases with high sequence similarity.