Found 20 similar articles (search time: 31 ms)
1.
Hollich V Milchert L Arvestad L Sonnhammer EL 《Molecular biology and evolution》2005,22(11):2257-2264
Distance-based methods are popular for reconstructing evolutionary trees of protein sequences, mainly because of their speed and generality. A number of variants of the classical neighbor-joining (NJ) algorithm have been proposed, as well as a number of methods to estimate protein distances. Here we present a large-scale assessment of performance in reconstructing the correct tree topology for the most popular algorithms. The programs BIONJ, FastME, Weighbor, and standard NJ were run using 12 distance estimators, producing 48 tree-building/distance estimation method combinations. These were evaluated on a test set based on real trees taken from 100 Pfam families. Each tree was used to generate multiple sequence alignments with the ROSE program using three evolutionary models. The accuracy of each method was analyzed as a function of both sequence divergence and location in the tree. We found that BIONJ produced the overall best results, although the average accuracy differed little between the tree-building methods (normally less than 1%). A noticeable trend was that FastME performed worse than the rest on long branches. Weighbor was several orders of magnitude slower than the other programs. Larger differences were observed when using different distance estimators. Protein-adapted Jukes-Cantor and Kimura distance correction produced clearly poorer results than the other methods, even worse than uncorrected distances. We also assessed the recently developed Scoredist measure, which performed as well as more complex methods.
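To make the distance estimators named in this abstract concrete, here is a minimal sketch of three of them: the uncorrected p-distance, a Jukes-Cantor correction adapted to 20 amino-acid states, and Kimura's empirical protein correction. The function names and the gap handling are assumptions made for illustration; the exact variants evaluated in the study may differ.

```python
import math

def p_distance(a, b):
    """Uncorrected distance: fraction of differing sites.

    Assumption: positions with a gap ('-') in either sequence are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def jukes_cantor_protein(p, states=20):
    """Jukes-Cantor correction generalized to `states` equally likely states."""
    b = (states - 1) / states          # 19/20 for proteins
    if p >= b:
        return math.inf                # saturated: correction undefined
    return -b * math.log(1 - p / b)

def kimura_protein(p):
    """Kimura's empirical protein correction: d = -ln(1 - p - 0.2 p^2)."""
    arg = 1 - p - 0.2 * p * p
    if arg <= 0:
        return math.inf
    return -math.log(arg)

if __name__ == "__main__":
    p = p_distance("ACDEFGHIKL", "ACDEFGHIKV")
    print(p, jukes_cantor_protein(p), kimura_protein(p))
```

Both corrections return a value larger than the raw p-distance, because they try to account for multiple substitutions at the same site.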
2.
Shaw G 《BioTechniques》2000,28(6):1198-1201
Biologists today make extensive use of word processing programs for the production of research reports, literature reviews and grant proposals. Frequently, such programs become the default platform for viewing and later publishing protein and nucleic acid sequence data. Thus, researchers often switch between their word processor and more specialized programs designed to analyze protein and nucleic acid sequences. It would be more convenient to perform these simple sequence analyses using the word processor without switching to another program. The focus here is on the use of the Visual Basic programming language, which is built into all recent versions of Microsoft Word, to generate surprisingly complex and useful macros that can conveniently analyze several important features of protein and nucleic acid sequences. The standard Word interface can also be easily modified to display and run these macros from a pull-down menu. Several examples of this approach are provided.
3.
Multiple sequence alignment with the Clustal series of programs (total citations: 2; self-citations: 0; citations by others: 2)
Chenna R Sugawara H Koike T Lopez R Gibson TJ Higgins DG Thompson JD 《Nucleic acids research》2003,31(13):3497-3500
The Clustal series of programs are widely used in molecular biology for the multiple alignment of both nucleic acid and protein sequences and for preparing phylogenetic trees. The popularity of the programs depends on a number of factors, including not only the accuracy of the results, but also the robustness, portability and user-friendliness of the programs. New features include NEXUS and FASTA format output, printing range numbers and faster tree calculation. Although Clustal was originally developed to run on a local computer, numerous Web servers have been set up, notably at the EBI (European Bioinformatics Institute) (http://www.ebi.ac.uk/clustalw/).
4.
Computer programs are described that aid in the design of synthetic genes coding for proteins that are targets of a research program in site-directed mutagenesis. These programs "reverse-translate" protein sequences into general nucleic acid sequences (those where codons have not yet been selected), map restriction sites into general DNA sequences, identify points in the synthetic gene where unique restriction sites can be introduced, and assist in the design of genes coding for hybrids and evolutionary intermediates between homologous proteins. Application of these programs therefore facilitates the use of modular mutagenesis to create variants of proteins, and the implementation of evolutionary guidance as a strategy for selecting mutants.
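The "reverse-translation" step described in this abstract can be sketched as follows. This is an illustrative reconstruction under the standard genetic code, not the authors' program: each amino acid is mapped to one degenerate codon written in IUPAC ambiguity codes, derived position by position from all of its codons. The names `degenerate_codon` and `reverse_translate` are hypothetical. Note that for the six-codon amino acids (L, R, S) and for stop, a single degenerate codon is over-inclusive (for example, YTN for leucine also matches the phenylalanine codons TTT and TTC).

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
_IUPAC_TABLE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}
IUPAC = {frozenset(k): v for k, v in _IUPAC_TABLE.items()}

# Collect every codon for each amino acid ('*' marks stop).
CODONS = {}
for i, aa in enumerate(AMINO):
    codon = BASES[i // 16] + BASES[(i // 4) % 4] + BASES[i % 4]
    CODONS.setdefault(aa, []).append(codon)

def degenerate_codon(aa):
    """One IUPAC codon covering all codons of `aa` (may be over-inclusive)."""
    codons = CODONS[aa]
    return "".join(IUPAC[frozenset(c[pos] for c in codons)] for pos in range(3))

def reverse_translate(protein):
    """Protein string -> 'general' DNA sequence with unselected codons."""
    return "".join(degenerate_codon(aa) for aa in protein.upper())

if __name__ == "__main__":
    print(reverse_translate("MAF"))
```

A restriction-site mapper for such general sequences would then match recognition sites against the ambiguity codes rather than against concrete bases.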
5.
The adequacy of various phenetic and phylogenetic estimation methods was evaluated using simulated data sets. Two parsimony programs were used to construct maximum parsimony trees (WAGNER 78 and HENNIG 86). The CAFCA program was used to perform group-compatibility analysis. Four UPGMA clustering strategies were employed. The simulation model GENESIS was used to generate data sets under different evolutionary conditions. The effects of input parameters and tree properties on the accuracy of the estimated trees were evaluated. UPGMA based on product moment correlations of unstandardized characters appeared to perform best, under all evolutionary conditions tested. The effect of input parameters on the accuracy was not very significant. Among the tree statistics the stemminess of the true tree appeared to be the most important estimator of accuracy.
6.
This study describes novel algorithms for searching for most parsimonious trees. These algorithms are implemented as a parsimony computer program, PARSIGAL, which performs well even with difficult data sets. For high level search, PARSIGAL uses an evolutionary optimization algorithm, which feeds good tree candidates to a branch-swapping local search procedure. This study also describes an extremely fast method of recomputing state sets for binary characters (additive or nonadditive characters with two states), based on packing 32 characters into a single memory word and recomputing the tree simultaneously for all 32 characters using fast bitwise logical operations. The operational principles of PARSIGAL are quite different from those previously published for other parsimony computer programs. Hence it is conceivable that PARSIGAL may be able to locate islands of trees that are different from those that are easily located with existing parsimony computer programs.
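The bit-packing idea in this abstract can be illustrated with a short sketch. This is not PARSIGAL's code: it is a generic Fitch-style state-set update for nonadditive binary characters, assuming each node stores two 32-bit words (one bit plane per state), so that one pair of bitwise operations processes 32 characters at once. All names are hypothetical.

```python
MASK = 0xFFFFFFFF  # 32 characters per word

def leaf(states):
    """Encode a leaf: bit k of `states` is the state (0 or 1) of character k.

    Returns (plane0, plane1): plane0 has bit k set if state 0 is in the
    state set of character k, plane1 likewise for state 1.
    """
    return (~states & MASK, states & MASK)

def fitch_packed(left, right):
    """Fitch update for 32 binary characters at once.

    Returns the parent's state sets and the number of extra steps.
    """
    l0, l1 = left
    r0, r1 = right
    i0, i1 = l0 & r0, l1 & r1            # per-character intersection
    empty = ~(i0 | i1) & MASK            # characters with empty intersection
    p0 = i0 | (empty & (l0 | r0))        # fall back to the union there
    p1 = i1 | (empty & (l1 | r1))
    steps = bin(empty).count("1")        # one step per empty intersection
    return (p0, p1), steps

if __name__ == "__main__":
    a, b, c = leaf(0b0101), leaf(0b0110), leaf(0b0000)
    ab, s1 = fitch_packed(a, b)
    _, s2 = fitch_packed(ab, c)
    print("tree length for ((A,B),C):", s1 + s2)
```

In a compiled language the step count would typically use a hardware population-count instruction rather than string conversion.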
7.
Museums play a vitally important role in supporting both informal and formal education and are important venues for fostering
public understanding of evolution. The Yale Peabody Museum has implemented significant education programs on evolution for
many decades, mostly focused on the museum’s extensive collections that represent the past and present tree of life. Twelve
years ago, the Peabody began a series of new programs that explored biodiversity and evolution as it relates to human health.
Modern evolutionary theory contributes significantly to our understanding of health and disease, and medical topics provide
many excellent and relevant examples to explore evolutionary concepts. The Peabody developed a program on vector-borne diseases,
specifically Lyme disease and West Nile virus, which have become endemic in the United States. Both of these diseases have
complex transmission cycles involving an intricate interplay among the pathogen, host, and vector, each of which is subject
to differing evolutionary pressures. Using these stories, the museum explored evolutionary concepts of adaptation (e.g., the
evolution of blood feeding), coevolution (e.g., the “arms race” between host and vector), and variation and selection (e.g.,
antibiotic resistance) among others. The project included a temporary exhibition and the development of curriculum materials
for middle and high school teachers and students. The popularity of the exhibit and some formal evaluation of student participants
suggested that this educational approach has significant potential to engage wide audiences in evolutionary issues. In addition,
it demonstrated how natural history museums can incorporate evolution into a broad array of programs.
8.
9.
TOPALi v2: a rich graphical interface for evolutionary analyses of multiple alignments on HPC clusters and multi-core desktops (total citations: 1; self-citations: 0; citations by others: 1)
Milne I Lindner D Bayer M Husmeier D McGuire G Marshall DF Wright F 《Bioinformatics (Oxford, England)》2009,25(1):126-127
Summary: TOPALi v2 simplifies and automates the use of several methods for the evolutionary analysis of multiple sequence alignments. Jobs are submitted from a Java graphical user interface as TOPALi web services to either run remotely on high-performance computing clusters or locally (with multiple cores supported). Methods available include model selection and phylogenetic tree estimation using the Bayesian inference and maximum likelihood (ML) approaches, in addition to recombination detection methods. The optimal substitution model can be selected for protein or nucleic acid (standard, or protein-coding using a codon position model) data using accurate statistical criteria derived from ML co-estimation of the tree and the substitution model. Phylogenetic software available includes PhyML, RAxML and MrBayes. Availability: Freely downloadable from http://www.topali.org for Windows, Mac OS X, Linux and Solaris. Contact: iain.milne{at}scri.ac.uk
Associate Editor: Martin Bishop
10.
DNAssist: the integrated editing and analysis of molecular biology sequences in windows (total citations: 2; self-citations: 0; citations by others: 2)
MOTIVATION: The programs currently available for the analysis of nucleic acid and protein sequences suffer from a variety of problems: Web-based programs often require inconvenient reformatting of sequences when proceeding from one analysis to the next, and commercial-console-based programs are cost prohibitive. Here, we report the development of DNAssist, an inexpensive multiple-document interface program for the fully integrated editing and analysis of nucleic acid and protein sequences in the familiar environment of Microsoft Windows.
11.
A rate-independent technique for analysis of nucleic acid sequences: evolutionary parsimony (total citations: 24; self-citations: 1; citations by others: 23)
The method of evolutionary parsimony--or operator invariants--is a
technique of nucleic acid sequence analysis related to parsimony analysis
and explicitly designed for determining evolutionary relationships among
four distantly related taxa. The method is independent of substitution
rates because it is derived from consideration of the group properties of
substitution operators rather than from an analysis of the probabilities of
substitution in branches of a tree. In both parsimony and evolutionary
parsimony, three patterns of nucleotide substitution are associated
one-to-one with the three topologically linked trees for four taxa. In
evolutionary parsimony, the three quantities are operator invariants. These
invariants are the remnants of substitutions that have occurred in the
interior branch of the tree and are analogous to the substitutions assigned
to the central branch by parsimony. The two invariants associated with the
incorrect trees must equal zero (statistically), whereas only the correct
tree can have a nonzero invariant. The χ² test is used to ascertain the
nonzero invariant and the statistically favored tree. Examples, obtained
using data calculated with evolutionary rates and branchings designed to
camouflage the true tree, show that the method accurately predicts the
tree, even when substitution rates differ greatly in neighboring peripheral
branches (conditions under which parsimony will consistently fail). As the
number of substitutions in peripheral branches becomes fewer, the parsimony
and the evolutionary-parsimony solutions converge. The method is robust and
easy to use.
12.
Daniel H Huson Daniel C Richter Christian Rausch Tobias Dezulian Markus Franz Regula Rupp 《BMC bioinformatics》2007,8(1):460
Background
Research in evolution requires software for visualizing and editing phylogenetic trees for increasingly large datasets, such as those arising in expression analysis or metagenomics. It would be desirable to have a program that provides these services in an efficient and user-friendly way, and that can be easily installed and run on all major operating systems. Although a large number of tree visualization tools are freely available, some as a part of more comprehensive analysis packages, all have drawbacks in one or more domains. They either lack some of the standard tree visualization techniques or basic graphics and editing features, or they are restricted to small trees containing only tens of thousands of taxa. Moreover, many programs are difficult to install or are not available for all common operating systems.
13.
14.
Efficient primer design algorithms (total citations: 5; self-citations: 0; citations by others: 5)
MOTIVATION: Primer design involves various parameters such as string-based alignment scores, melting temperature, primer length and GC content. This entails a design approach from multicriteria decision making. Values of some of the criteria are easy to compute while others require intense calculations. RESULTS: The reference point method was found to be tractable for trading-off between deviations from ideal values of all the criteria. Some criteria computations are based on dynamic programs with value iteration whose run time can be bounded by a low-degree polynomial. For designing standard PCR primers, the scheme offers a relative gain in computing speed of up to 50:1 over ad-hoc computational methods. Single PCR primer pairs have been used as model systems in order to simplify the quantization of the computational acceleration factors. The program has been structured so as to facilitate the analysis of large numbers of primer pairs with minor modifications. The scheme significantly increases primer design throughput which in turn facilitates the use of oligonucleotides in a wide range of applications including: multiplex PCR and other nucleic acid-based amplification systems, as well as in zip code targeting, oligonucleotide microarrays and nucleic acid-based nanoengineering.
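As a rough illustration of trading off deviations from ideal values across several criteria, the sketch below scores candidate primers by their largest scaled deviation from a reference point (a Chebyshev-style simplification, not the paper's actual formulation). The ideal values and scales are hypothetical, and melting temperature is approximated with the Wallace rule, Tm = 2(A+T) + 4(G+C), which is a rule of thumb intended for short oligonucleotides.

```python
def gc_content(primer):
    """Fraction of G and C bases in the primer."""
    s = primer.upper()
    return (s.count("G") + s.count("C")) / len(s)

def wallace_tm(primer):
    """Wallace rule of thumb: 2 degrees C per A/T, 4 degrees C per G/C."""
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc

# Hypothetical reference point and scaling factors (assumptions, not from the paper).
IDEAL = {"length": 20, "gc": 0.5, "tm": 60.0}
SCALE = {"length": 5, "gc": 0.1, "tm": 5.0}

def reference_point_score(primer):
    """Largest scaled deviation from the ideal values; lower is better."""
    values = {
        "length": len(primer),
        "gc": gc_content(primer),
        "tm": wallace_tm(primer),
    }
    return max(abs(values[k] - IDEAL[k]) / SCALE[k] for k in IDEAL)

def best_primer(candidates):
    """Pick the candidate closest to the reference point."""
    return min(candidates, key=reference_point_score)

if __name__ == "__main__":
    print(best_primer(["ATATATATATAT", "GCATGCATGCATGCATGCAT", "GGGGCCCCGGGGCCCC"]))
```

Taking the maximum rather than the sum of deviations means that no single criterion can be far from its ideal value in the selected primer.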
15.
Sadi MS Kuo FC Ho JW Charleston MA Chen TY 《Journal of bioinformatics and computational biology》2011,9(6):729-747
Many phylogenetic inference programs are available to infer evolutionary relationships among taxa using aligned sequences of characters, typically DNA or amino acids. These programs are often used to infer the evolutionary history of species. However, in most cases it is impossible to systematically verify the correctness of the tree returned by these programs, as the correct evolutionary history is generally unknown and unknowable. In addition, it is nearly impossible to verify whether any non-trivial tree is correct in accordance with the specification of the often complicated search and scoring algorithms. This difficulty is known as the oracle problem of software testing: there is no oracle that we can use to verify the correctness of the returned tree. This makes it very challenging to test the correctness of any phylogenetic inference program. Here, we demonstrate how to apply a simple software testing technique, called Metamorphic Testing, to alleviate the oracle problem in testing phylogenetic inference programs. We have used both real and randomly generated test inputs to evaluate the effectiveness of metamorphic testing, and found that metamorphic testing can detect failures effectively in faulty phylogenetic inference programs with both types of test inputs.
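A metamorphic test checks a relation between the outputs of related inputs instead of comparing one output against a known answer. The sketch below uses one plausible relation, assumed here for illustration and not necessarily one used in the paper: reordering the input taxa must not change the inferred clades. The "program under test" is a toy UPGMA on Hamming distances, and the data set is chosen so that no distance ties occur (ties could legitimately make the result depend on input order).

```python
import itertools

def hamming(a, b):
    """Number of differing positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def upgma_clades(seqs):
    """Toy program under test: UPGMA clustering, returned as a set of clades."""
    clusters = [frozenset([name]) for name in seqs]
    clades = set()

    def avg_dist(c1, c2):
        total = sum(hamming(seqs[a], seqs[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Merge the closest pair of clusters (average linkage).
        c1, c2 = min(itertools.combinations(clusters, 2),
                     key=lambda pair: avg_dist(*pair))
        merged = c1 | c2
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
        clades.add(merged)
    return clades

def check_order_invariance(seqs):
    """Metamorphic relation: permuting taxon order leaves the clades unchanged."""
    reference = upgma_clades(seqs)
    for order in itertools.permutations(seqs):
        permuted = {name: seqs[name] for name in order}
        if upgma_clades(permuted) != reference:
            return False
    return True

# Hypothetical toy alignment with no tied distances.
TOY = {"A": "AAAAAAAA", "B": "AAAAAAAT", "C": "AAAACCCC", "D": "AAGGCCCC"}

if __name__ == "__main__":
    print(check_order_invariance(TOY))
```

A violation of the relation reveals a fault without anyone needing to know the true tree, which is how the technique sidesteps the oracle problem.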
16.
Multilocus genomic data sets can be used to infer a rich set of information about the evolutionary history of a lineage, including gene trees, species trees, and phylogenetic networks. However, user-friendly tools to run such integrated analyses are lacking, and workflows often require tedious reformatting and handling time to shepherd data through a series of individual programs. Here, we present a tool written in Python, TREEasy, that performs automated sequence alignment (with MAFFT), gene tree inference (with IQ-Tree), species inference from concatenated data (with IQ-Tree and RaxML-NG), species tree inference from gene trees (with ASTRAL, MP-EST, and STELLS2), and phylogenetic network inference (with SNaQ and PhyloNet). The tool only requires FASTA files and nine parameters as inputs. The tool can be run from the command line or through a Graphical User Interface (GUI). As examples, we reproduced a recent analysis of staghorn coral evolution, and performed a new analysis on the evolution of the “WGD clade” of yeast. The latter revealed novel patterns that were not identified by previous analyses. TREEasy represents a reliable and simple tool to accelerate research in systematic biology ( https://github.com/MaoYafei/TREEasy ).
17.
Each amino acid in a protein is considered to be an individual, mutable characteristic of the species from which the protein is extracted. For a branching tree representing the evolutionary history of the known sequences in different species, our computer programs use majority logic and parsimony of mutations to determine the most likely ancestral amino acid for each position of the protein at each node of the tree. The number of mutations necessary between the ancestral and present species is summed for each branch and the entire tree. The programs then move branches to make many different configurations, from which we select the one with the minimum number of mutations as the most likely evolutionary history. We used this method to elucidate primate phylogeny from sequences of fibrinopeptides, carbonic anhydrase, and the hemoglobin beta, delta and alpha chains. All available sequences indicate that the early Pongidae had diverged into two lines before the divergence of an ancestor for the human line alone. We have constructed some probable ancestral sequences at major points during primate evolution and have developed tentative trees showing the order of divergences and evolutionary distances among primate groups. Further questions on primate evolution could be answered in the future by the determination of the appropriate sequences.
18.
Methods for inferring phylogenies from nucleic acid sequence data by using maximum likelihood and linear invariants (total citations: 3; self-citations: 2; citations by others: 1)
Likelihood methods and methods using invariants are procedures for inferring the evolutionary relationships among species through statistical analysis of nucleic acid sequences. A likelihood-ratio test may be used to determine the feasibility of any tree for which the maximum likelihood can be computed. The method of linear invariants described by Cavender, which includes Lake's method of evolutionary parsimony as a special case, is essentially a form of the likelihood-ratio method. In the case of a small number of species (four or five), these methods may be used to find a confidence set for the correct tree. An exact version of Lake's asymptotic χ² test has been mentioned by Holmquist et al. Under very general assumptions, a one-sided exact test is appropriate, which greatly increases power.
19.
A comprehensive DNA analysis computer program was described in the second special issue of Nucleic Acids Research on the applications of computers to research on nucleic acids by Stone and Potter (1). Criteria used in designing the program were user friendliness, ability to handle large DNA sequences, low storage requirement, migratability to other computers and comprehensive analysis capability. The program has been used extensively in an industrial-research environment. This paper describes improvements to that program. These improvements include testing for methylation blockage of restriction enzyme recognition sites, homology analysis, RNA folding analysis, integration of a large DNA database (GenBank), a site-specific mutagenesis analysis, a protein database and protein searching programs. The original design of the DNA analysis program, which uses a command executive from which any analytical program can be called, has proven to be extremely versatile in integrating both developed and outside programs to the file management system employed.
20.
Interactive analysis of phylogeny and character evolution using the computer program MacClade (total citations: 15; self-citations: 0; citations by others: 15)
W P Maddison D R Maddison 《Folia primatologica; international journal of primatology》1989,53(1-4):190-202
Computer programs for phylogenetic analysis have been important tools in systematics and evolutionary biology, but most have been designed primarily for the reconstruction of phylogenetic trees and not the interpretation of patterns of character evolution. Described here is the computer program MacClade, designed for interactive analysis of character evolution and phylogeny. For a given tree and a matrix of character data, MacClade displays its reconstruction of character evolution by shading the branches of the tree to indicate ancestral states. Trees can be manipulated, for instance by picking up and moving branches. Assumptions underlying the reconstruction of character evolution can be varied extensively. With these manipulations and MacClade's graphical feedback, one can explore the relationships among phylogenetic trees, character data, assumptions and interpretations of character evolution. MacClade has extensive facilities for editing data, displaying various summaries of character evolution in charts and diagrams, and printing.