
1.

Identification of phylogenetic trees of minimal length.

M D Hendy D Penny L R Foulds 《Journal of theoretical biology》1978,71(3):441-452

The problem of determining an optimal phylogenetic tree from a set of data is an example of the Steiner problem in graphs. There is no efficient algorithm for solving this problem with reasonably large data sets. In the present paper an approach is described that proves in some cases that a given tree is optimal without testing all possible trees. The method first uses a previously described heuristic algorithm to find a tree of relatively small total length. The second part of the method independently analyses subsets of sites to determine a lower bound on the length of any tree. We simultaneously attempt to reduce the total length of the tree and increase the lower bound. When these are equal it is not possible to make a shorter tree with a given data set and given criterion. An example is given where the only two possible minimal trees are found for twelve different mammalian cytochrome c sequences. The criterion of finding the smallest number of minimum base changes was used. However, there is no general method of guaranteeing that a solution will be found in all cases and in particular better methods of improving the estimate of the lower bound need to be developed.
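To make the idea of a lower bound on tree length concrete, here is a minimal sketch of the simplest such bound: a site showing s distinct states needs at least s − 1 changes on any tree, so summing s − 1 over sites bounds the total length from below. This is not the subset-partitioning method of the paper, which analyses groups of sites jointly to obtain tighter bounds. The function name and the toy alignment are assumptions made for illustration.

```python
def per_site_lower_bound(sequences):
    """Sum over aligned sites of (number of distinct states - 1).

    Any tree needs at least that many changes at each site, so the sum
    is a (usually loose) lower bound on the parsimony length.
    """
    lengths = {len(seq) for seq in sequences}
    if len(lengths) != 1:
        raise ValueError("sequences must be non-empty and aligned to equal length")
    # zip(*sequences) iterates over alignment columns (sites).
    return sum(len(set(column)) - 1 for column in zip(*sequences))


# Hypothetical toy alignment: four sequences, four sites.
toy_alignment = ["ACGT", "ACGA", "TCGA", "TCCA"]
print(per_site_lower_bound(toy_alignment))
```

If a heuristic search finds a tree whose length equals this value, that tree is minimal; when the two differ, a tighter bound or a shorter tree is still needed.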

2.

Taxonomic colouring of phylogenetic trees of protein sequences

Gareth Palidwor Emmanuel G Reynaud Miguel A Andrade-Navarro 《BMC bioinformatics》2006,7(1):79-4

Background

Phylogenetic analyses of protein families are used to define the evolutionary relationships between homologous proteins. The interpretation of protein-sequence phylogenetic trees requires the examination of the taxonomic properties of the species associated with those sequences. However, there is no online tool to facilitate this interpretation, for example, by automatically attaching taxonomic information to the nodes of a tree, or by interactively colouring the branches of a tree according to any combination of taxonomic divisions. This is especially problematic if the tree contains on the order of hundreds of sequences, which, given the accelerated increase in the size of the protein sequence databases, is a situation that is becoming common.

3.

A graph theoretic approach to the development of minimal phylogenetic trees

L. R. Foulds M. D. Hendy David Penny 《Journal of molecular evolution》1979,13(2):127-149

Summary: The problem of determining the minimal phylogenetic tree is discussed in relation to graph theory. It is shown that this problem is an example of the Steiner problem in graphs, which is to connect a set of points by a minimal length network where new points can be added. There is no reported method of solving realistically-sized Steiner problems in reasonable computing time. A heuristic method of approaching the phylogenetic problem is presented, together with a worked example with 7 mammalian cytochrome c sequences. It is shown in this case that the method develops a phylogenetic tree that has the smallest possible number of amino acid replacements. The potential and limitations of the method are discussed. It is stressed that objective methods must be used for comparing different trees. In particular it should be determined how close a given tree is to a mathematically determined lower bound. A theorem is proved which is used to establish a lower bound on the length of any tree; if a tree is found with a length equal to the lower bound, then no shorter tree can exist.

4.

A fast method for approximating maximum likelihoods of phylogenetic trees from nucleotide sequences

Rogers JS Swofford DL 《Systematic biology》1998,47(1):77-89

We have developed a rapid parsimony method for reconstructing ancestral nucleotide states that allows calculation of initial branch lengths that are good approximations to optimal maximum-likelihood estimates under several commonly used substitution models. Use of these approximate branch lengths (rather than fixed arbitrary values) as starting points significantly reduces the time required for iteration to a solution that maximizes the likelihood of a tree. These branch lengths are close enough to the optimal values that they can be used without further iteration to calculate approximate maximum-likelihood scores that are very close to the "exact" scores found by iteration. Several strategies are described for using these approximate scores to substantially reduce times needed for maximum-likelihood tree searches.

5.

On the consistency of maximum likelihood estimation of phylogenetic trees from nucleotide sequences

Rogers JS 《Systematic biology》1997,46(2):354-357


6.

Using genetic algorithms for the construction of phylogenetic trees: application to G-protein coupled receptor sequences (Total citations: 1; self-citations: 0; citations by others: 1)

Reijmers TH Wehrens R Daeyaert FD Lewi PJ Buydens LM 《Bio Systems》1999,49(1):31-43

Many different phylogenetic clustering techniques are used currently. One approach is to first determine the topology with a common clustering method and then calculate the branch lengths of the tree. If the resulting tree is not optimal, exchanging tree branches can make some local changes in the tree topology. The whole process can be iterated until a satisfactory result has been obtained. The efficiency of this method depends entirely on the initially generated tree. Although local changes are made, the optimal tree will never be found if the initial tree is poorly chosen. In this article, genetic algorithms are applied such that the optimal tree can be found even with a bad initial tree topology. This tree generating method is tested by comparing its results with the results of the FITCH program in the PHYLIP software package. Two simulated data sets and a real data set are used.

7.

A general approach to proving the minimality of phylogenetic trees illustrated by an example with a set of 23 vertebrates

L. R. Foulds David Penny M. D. Hendy 《Journal of molecular evolution》1979,13(2):151-166

Summary: We have recently described a method of building phylogenetic trees and have outlined an approach for proving whether a particular tree is optimal for the data used. In this paper we describe in detail the method of establishing lower bounds on the length of a minimal tree by partitioning the data set into subsets. All characters that could be involved in duplications in the data are paired with all other such characters. A matching algorithm is then used to obtain the pairing of characters that reveals the most duplications in the data. This matching may still not account for all nucleotide substitutions on the tree. The structure of the tree is then used to help select subsets of three or more characters until the lower bound found by partitioning is equal to the length of the tree. The tree must then be a minimal tree since no tree can exist with a length less than that of the lower bound. The method is demonstrated using a set of 23 vertebrate cytochrome c sequences with the criterion of minimizing the total number of nucleotide substitutions. There are 131130 7045768798 9603440625 topologically distinct trees that can be constructed from this data set. The method described in this paper does identify 144 minimal tree variants. The method is general in the sense that it can be used for other data and other criteria of length. It need not, however, always be possible to prove a tree minimal, but the method will give an upper and lower bound on the length of minimal trees.
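The size of the search space mentioned in this abstract can be illustrated with the standard count of unrooted, fully resolved (binary) trees on n labelled taxa, (2n − 5)!! = 3 × 5 × … × (2n − 5). The sketch below assumes this is the kind of tree being counted; the function name is hypothetical. For n = 23 it returns a 26-digit number of about 1.3 × 10^25, the same order as the figure quoted in the abstract.

```python
def count_unrooted_binary_trees(n_taxa):
    """Number of unrooted binary tree topologies on n_taxa labelled leaves.

    Uses the double factorial (2n - 5)!! = 3 * 5 * ... * (2n - 5).
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Odd factors 3, 5, ..., 2n - 5 (the range is empty when n_taxa == 3).
    for factor in range(3, 2 * n_taxa - 4, 2):
        count *= factor
    return count


for n in (4, 5, 10, 23):
    print(n, count_unrooted_binary_trees(n))
```

Python integers are arbitrary precision, so no overflow handling is needed even for the 23-taxon case.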

8.

fastDNAml: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihood (Total citations: 39; self-citations: 1; citations by others: 38)

Olsen Gary J.; Matsuda Hideo; Hagstrom Ray; Overbeek Ross 《Bioinformatics (Oxford, England)》1994,10(1):41-48

We have developed a new tool, called fastDNAml, for constructing phylogenetic trees from DNA sequences. The program can be run on a wide variety of computers ranging from Unix workstations to massively parallel systems, and is available from the Ribosomal Database Project (RDP) by anonymous FTP. Our program uses a maximum likelihood approach and is based on version 3.3 of Felsenstein's dnaml program. Several enhancements, including algorithmic changes, significantly improve performance and reduce memory usage, making it feasible to construct even very large trees. Trees containing 40–100 taxa have been easily generated, and phylogenetic estimates are possible even when hundreds of sequences exist. We are currently using the tool to construct a phylogenetic tree based on 473 small subunit rRNA sequences from prokaryotes.

9.

Testing phylogenetic approaches with empirical data, as illustrated with the parsimony method.

M W Allard M M Miyamoto 《Molecular biology and evolution》1992,9(5):778-786


10.

MAC5: Bayesian inference of phylogenetic trees from DNA sequences incorporating gaps

McGuire G Denham MC Balding DJ 《Bioinformatics (Oxford, England)》2001,17(5):479-480

SUMMARY: MAC5 implements MCMC sampling of the posterior distribution of tree topologies from DNA sequences containing gaps by using a five state model of evolution (the four nucleotides and the gap character).

11.

Estimation of evolutionary parameters with phylogenetic trees

Wang Q Salter LA Pearl DK 《Journal of molecular evolution》2002,55(6):684-695

An important issue in the phylogenetic analysis of nucleotide sequence data using the maximum likelihood (ML) method is the underlying evolutionary model employed. We consider the problem of simultaneously estimating the tree topology and the parameters in the underlying substitution model and of obtaining estimates of the standard errors of these parameter estimates. Given a fixed tree topology and corresponding set of branch lengths, the ML estimates of standard evolutionary model parameters are asymptotically efficient, in the sense that their joint distribution is asymptotically normal with the variance–covariance matrix given by the inverse of the Fisher information matrix. We propose a new estimate of this conditional variance based on estimation of the expected information using a Monte Carlo sampling (MCS) method. Simulations are used to compare this conditional variance estimate to the standard technique of using the observed information under a variety of experimental conditions. In the case in which one wishes to estimate simultaneously the tree and parameters, we provide a bootstrapping approach that can be used in conjunction with the MCS method to estimate the unconditional standard error. The methods developed are applied to a real data set consisting of 30 papillomavirus sequences. This overall method is easily incorporated into standard bootstrapping procedures to allow for proper variance estimation.

12.

Confidence limits on the branching order of phylogenetic trees.

A. C. Shearer M. S. Johnson 《Protein science : a publication of the Protein Society》1993,2(10):1686-1696

We describe a confidence test for branching order that can aid protein phylogeny reconstruction as well as the evaluation of the optimal tree. It is proposed that the process resulting in the observed amino acid residue differences, which is the basis for the identification of the order and relative times of divergence events, is appropriately described by a modification of the negative binomial distribution. The relative total numbers of mutations (accepted and nonaccepted), which result in a given number of amino acid differences, may be obtained as the expectation of this distribution. The associated variances enable significant differences in tree branching order to be established. If the total rates of mutation of the genes encoding the compared proteins are equal, the expected total mutations and their associated variances map identically to their relative times of divergence. In addition, significantly different rates of change (due to differences in total mutation rate and/or acceptance rate) may be identified without requiring an outlying reference group. The method is equally applicable to phylogenies derived from DNA or RNA sequence information.

13.

Statistical measures of uncertainty for branches in phylogenetic trees inferred from molecular sequences by using model-based methods

Wróbel B 《Journal of applied genetics》2008,49(1):49-67

In recent years, the emphasis of theoretical work on phylogenetic inference has shifted from the development of new tree inference methods to the development of methods to measure the statistical support for the topologies. This paper reviews 3 approaches to assign support values to branches in trees obtained in the analysis of molecular sequences: the bootstrap, the Bayesian posterior probabilities for clades, and the interior branch tests. In some circumstances, these methods give different answers. This should not be surprising: their assumptions are different. Thus the interior branch tests assume that a given topology is true and only consider whether a particular branch length is longer than zero. If a tree is incorrect, a wrong branch (a low bootstrap or Bayesian support may be an indication) may have a non-zero length. If the substitution model is oversimplified, the length of a branch may be overestimated, and the Bayesian support for the branch may be inflated. The bootstrap, on the other hand, approximates the variance of the data under the real model of sequence evolution, because it involves direct resampling from this data. Thus the discrepancy between the Bayesian support and the bootstrap support may signal model inaccuracy. In practical application, use of all 3 methods is recommended, and if discrepancies are observed, then a careful analysis of their potential origins should be made.
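The "direct resampling" step that the bootstrap relies on can be sketched as follows: alignment columns (sites) are drawn with replacement until a pseudo-alignment of the original length is obtained, and a tree would then be inferred from each replicate. This is a minimal sketch of the resampling step only; tree inference and the tallying of branch support are omitted, and the function name and toy data are assumptions.

```python
import random


def bootstrap_alignment(sequences, rng):
    """Return one bootstrap replicate of an alignment.

    Sites (columns) are sampled with replacement; every sequence
    receives the same sampled columns, so the rows stay aligned.
    """
    n_sites = len(sequences[0])
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in picks) for seq in sequences]


# Hypothetical toy alignment; a fixed seed makes the run repeatable.
toy_alignment = ["ACGT", "ACGA", "TCGA"]
replicate = bootstrap_alignment(toy_alignment, random.Random(0))
print(replicate)
```

Passing the random generator in explicitly keeps the function deterministic under a fixed seed, which is convenient when comparing support values across runs.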

14.

Stability of characters and construction of phylogenetic trees.

Teresa Przytycka 《Journal of computational biology》2007,14(5):539-549

Parsimony methods infer phylogenetic trees by minimizing the number of character changes required to explain observed character states. From the perspective of applicability of parsimony methods, it is important to assess whether the characters used to infer phylogeny are likely to provide a correct tree. We introduce a graph theoretical characterization that helps to assess whether a given set of characters is appropriate to use with parsimony methods. Given a set of characters and a set of taxa, we construct a network called the character overlap graph. We show that the character overlap graph for characters that are appropriate to use in parsimony methods is characterized by significant under-representation of subnetworks known as holes, and provide a validation for this observation. This characterization explains success in constructing evolutionary trees using the parsimony method for some characters (e.g., protein domains) and lack of such success for other characters (e.g., introns). In the latter case, the understanding of obstacles to applying parsimony methods in a direct way has led us to a new approach for detecting inconsistent and/or noisy data. Namely, we introduce the concept of stable characters, which is similar to but less restrictive than the well known concept of pairwise compatible characters. Application of this approach to introns produces the evolutionary tree consistent with the Coelomata hypothesis.
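The parsimony criterion described above, counting the fewest character changes a tree needs to explain the observed states, can be sketched with Fitch's algorithm for a fixed rooted binary tree. This is a standard textbook technique supplied here for illustration, not the character overlap graph method of the paper. The nested-tuple tree encoding, function names, and toy data are assumptions.

```python
def fitch_changes(tree, leaf_states):
    """Return (candidate_states, changes) for one character.

    `tree` is a leaf name (str) or a pair (left, right) of subtrees.
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_changes(left, leaf_states)
    right_set, right_cost = fitch_changes(right, leaf_states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change here.
        return common, left_cost + right_cost
    # Disjoint state sets: one change is required at this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Sum the Fitch change counts over all sites of an alignment dict."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_changes(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


# Hypothetical toy data: four taxa, four sites, two candidate topologies.
alignment = {"a": "ACGT", "b": "ACGA", "c": "TCGA", "d": "TCCA"}
print(parsimony_length((("a", "b"), ("c", "d")), alignment))
print(parsimony_length((("a", "c"), ("b", "d")), alignment))
```

Comparing the two printed lengths shows how parsimony ranks candidate topologies: the tree needing fewer changes is preferred.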

15.

Studying the evolutionary relationships and phylogenetic trees of 21 groups of tRNA sequences based on complex networks

Fangping Wei Bowen Chen 《Journal of biological physics》2012,38(2):241-250

To find out the evolutionary relationships among different tRNA sequences of 21 amino acids, 22 networks are constructed. One is constructed from whole tRNAs, and the other 21 networks are constructed from the tRNAs which carry the same amino acids. A new method is proposed such that the alignment scores of any two amino acid groups are determined by the average degree and the average clustering coefficient of their networks. The anticodon feature of isolated tRNA and the phylogenetic trees of 21 group networks are discussed. We find that some isolated tRNA sequences in 21 networks still connect with other tRNAs outside their group, which reflects the fact that those tRNAs might evolve by intercrossing among these 21 groups. We also find that most anticodons within the same cluster differ by only one base at the same sites when S ≥ 70, and they stay in the same rank in the ladder of evolutionary relationships. These observations seem to agree that some tRNAs might mutate from the same ancestor sequences based on point mutation mechanisms.
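The two network statistics named in this abstract, average degree and average clustering coefficient, can be computed as in the sketch below for an undirected graph stored as a dictionary of neighbour sets. How the paper converts these statistics into alignment scores is not stated in the abstract and is not attempted here. The graph encoding, function names, and toy network are assumptions.

```python
def average_degree(adjacency):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adjacency.values()) / len(adjacency)


def average_clustering(adjacency):
    """Mean local clustering coefficient.

    For a node with k >= 2 neighbours, the local coefficient is the
    fraction of the k*(k-1)/2 neighbour pairs that are linked to each
    other; nodes with fewer than two neighbours contribute 0.
    """
    total = 0.0
    for nbrs in adjacency.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adjacency[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adjacency)


# Hypothetical toy network: a triangle a-b-c with a pendant node d on c.
network = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(average_degree(network), average_clustering(network))
```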

16.

Implementation of a Markov model for phylogenetic trees

Bohl E Lancaster P 《Journal of theoretical biology》2006,239(3):324-333

A recently developed mathematical model for the analysis of phylogenetic trees is applied to comparative data for 48 species. The model represents a return to fundamentals and makes no hypothesis with respect to the reversibility of the process. The species have been analysed in all subsets of three, and a measure of reliability of the results is provided. The numerical results of the computations on 17,296 triples of species are made available on the Internet. These results are discussed and the development of reliable tree structures for several species is illustrated. It is shown that, indeed, the Markov model is capable of considerably more interesting predictions than has been recognized to date.
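The figure of 17,296 triples follows from counting the unordered subsets of three among 48 species, C(48, 3) = 48 × 47 × 46 / 6. A minimal sketch that checks the count by both formula and enumeration is given below; the species labels and the function name are placeholders, not names from the paper.

```python
from itertools import combinations
from math import comb


def count_triples(n_species):
    """Count unordered triples by explicit enumeration."""
    labels = [f"species_{i}" for i in range(n_species)]  # placeholder names
    return sum(1 for _ in combinations(labels, 3))


print(comb(48, 3), count_triples(48))
```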

17.

Bootstrap method of interior-branch test for phylogenetic trees (Total citations: 5; self-citations: 2; citations by others: 5)

Sitnikova T 《Molecular biology and evolution》1996,13(4):605-611

Statistical properties of the bootstrap test of interior branch lengths of phylogenetic trees have been studied and compared with those of the standard interior-branch test in computer simulations. Examination of the properties of the tests under the null hypothesis showed that both tests for an interior branch of a predetermined topology are quite reliable when the distribution of the branch length estimate approaches a normal distribution. Unlike the standard interior-branch test, the bootstrap test appears to retain this property even when the substitution rate varies among sites. In this case, the distribution of the branch length estimate deviates from a normal distribution, and the standard interior-branch test gives conservative confidence probability values. A simple correction method was developed for both interior-branch tests to be applied for testing the reliability of tree topologies estimated from sequence data. This correction for the standard interior-branch test appears to be as effective as that obtained in our previous study, though it is much simpler. The bootstrap and standard interior-branch tests for estimated topologies become conservative as the number of sequence groups in a star-like tree increases.

18.

An efficient algorithm for statistical multiple alignment on arbitrary phylogenetic trees.

G A Lunter I Miklós Y S Song J Hein 《Journal of computational biology》2003,10(6):869-889

We present an efficient algorithm for statistical multiple alignment based on the TKF91 model of Thorne, Kishino, and Felsenstein (1991) on an arbitrary k-leaved phylogenetic tree. The existing algorithms use a hidden Markov model approach, which requires at least $O(\sqrt{5}^{k})$ states and leads to a time complexity of $O(5^{k} L^{k})$, where L is the geometric mean sequence length. Using a combinatorial technique reminiscent of inclusion/exclusion, we are able to sum away the states, thus improving the time complexity to $O(2^{k} L^{k})$ and considerably reducing memory requirements. This makes statistical multiple alignment under the TKF91 model a definite practical possibility in the case of a phylogenetic tree with a modest number of leaves.

19.

Similarity of phylogenetic trees as indicator of protein-protein interaction. (Total citations: 7; self-citations: 0; citations by others: 7)

F Pazos A Valencia 《Protein engineering》2001,14(9):609-614

Deciphering the network of protein interactions that underlies cellular operations has become one of the main tasks of proteomics and computational biology. Recently, a set of bioinformatics approaches has emerged for the prediction of possible interactions by combining sequence and genomic information. Even though the initial results are very promising, the current methods are still far from perfect. We propose here a new way of discovering possible protein-protein interactions based on the comparison of the evolutionary distances between the sequences of the associated protein families, an idea based on previous observations of correspondence between the phylogenetic trees of associated proteins in systems such as ligands and receptors. Here, we extend the approach to different test sets, including the statistical evaluation of their capacity to predict protein interactions. To demonstrate the possibilities of the system to perform large-scale predictions of interactions, we present the application to a collection of more than 67 000 pairs of E. coli proteins, of which 2742 are predicted to correspond to interacting proteins.

20.

Algorithms for constructing phylogenetic trees of maximum topological similarity (Total citations: 1; self-citations: 0; citations by others: 1)

S V Iushmanov K M Chumakov 《Molekuliarnaia genetika, mikrobiologiia i virusologiia》1988,(3):9-15

The paper concerns the practical implementation of the maximum topological similarity principle for phylogenetic reconstruction. This novel principle is described in the accompanying paper. Two algorithms, implemented in a computer program, find the unique tree when the source data admit the existence of such a tree. When numerous parallel mutations make such an exact solution impossible, the algorithms yield approximations to the maximum topological similarity trees with high computational efficiency. Examples illustrating the use of these algorithms, as well as a discussion of the biological consistency of the novel concept, are presented.