Found 20 similar articles (search time: 0 ms)
1.
2.
Richard Holmquist 《Journal of molecular evolution》1978,12(1):17-24
Summary: The augmentation procedure of G.W. Moore leads to correct estimates of the total number of nucleotide substitutions separating two genes descended from a common ancestor, provided the data base is sufficiently dense. These estimates are in agreement with the true distance values from simulations of known evolutionary pathways. The estimates, on the average, are unbiased: they neither overaugment nor underaugment seriously. The variance of the population of augmented distance values reflects accurately the variance of the population of true distance values and is thus not abnormally large due to procedural defects in the algorithm. The augmented distances are in agreement with stochastic models tested on real data when the latter take proper account of the restricted mutability of codons resulting from natural selection. When the experimental data base is not dense, the augmented distance values and population variance may underestimate both the true distance values and their variance. This has a logical consequence that there exist significant and numerous errors in the ancestral sequences reconstructed by the parsimony principle from such data bases. The restrictions, resulting from natural selection, on the mutability of different nucleotide sites are shown to bear critically on the accuracy of estimates of the total number of nucleotide replacements made by stochastic models.
3.
4.
SUMMARY: CTree has been designed for the quantification of clusters within viral phylogenetic tree topologies. Clusters are stored as individual data structures from which statistical data, such as the Subtype Diversity Ratio (SDR), Subtype Diversity Variance (SDV) and pairwise distances can be extracted. This simplifies the quantification of tree topologies in relation to inter- and intra-cluster diversity. Here the novel features incorporated within CTree, including the implementation of a heuristic algorithm for identifying clusters, are outlined along with the more usual features found within general tree viewing software. AVAILABILITY: CTree is available as an executable jar file from: http://www.manchester.ac.uk/bioinformatics/ctree
5.
SUMMARY: BAOBAB is a Java user interface dedicated to viewing and editing large phylogenetic trees. Original features include: (i) a colour-mediated overview of magnified subtrees; (ii) copy/cut/paste of (sub)trees within or between windows; (iii) compressing/uncompressing subtrees; and (iv) managing sequence files together with tree files. AVAILABILITY: http://www.univ-montp2.fr/~genetix/.
6.
Holmes S 《Theoretical population biology》2003,63(1):17-32
This paper poses the problem of estimating and validating phylogenetic trees in statistical terms. The problem is hard enough to warrant several tacks: we reason by analogy to rounding real numbers, and dealing with ranking data. These are both cases where, as in phylogeny, the parameters of interest are not real numbers. Then we pose the problem in geometrical terms, using distances and measures on a natural space of trees. We do not solve the problems of inference on tree space, but suggest some coherent ways of tackling them.
7.
8.
A recently developed mathematical model for the analysis of phylogenetic trees is applied to comparative data for 48 species. The model represents a return to fundamentals and makes no hypothesis with respect to the reversibility of the process. The species have been analysed in all subsets of three, and a measure of reliability of the results is provided. The numerical results of the computations on 17,296 triples of species are made available on the Internet. These results are discussed and the development of reliable tree structures for several species is illustrated. It is shown that, indeed, the Markov model is capable of considerably more interesting predictions than has been recognized to date.
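As a quick check of the figure quoted in this abstract, the number of unordered triples drawn from 48 species is the binomial coefficient C(48, 3). The variable names below are illustrative only.

```python
from math import comb

n_species = 48
# Number of unordered subsets of three species: 48 * 47 * 46 / 6
n_triples = comb(n_species, 3)
print(n_triples)  # 17296
```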
9.
SUMMARY: We describe an algorithm and software tool for comparing alternative phylogenetic trees. The main application of the software is to compare phylogenies obtained using different phylogenetic methods for some fixed set of species or obtained using different gene sequences from those species. The algorithm pairs up each branch in one phylogeny with a matching branch in the second phylogeny and finds the optimum 1-to-1 map between branches in the two trees in terms of a topological score. The software enables the user to explore the corresponding mapping between the phylogenies interactively, and clearly highlights those parts of the trees that differ, both in terms of topology and branch length. AVAILABILITY: The software is implemented as a Java applet at http://www.mrc-bsu.cam.ac.uk/personal/thomas/phylo_comparison/comparison_page.html. It is also available on request from the authors.
10.
We present an efficient algorithm for statistical multiple alignment based on the TKF91 model of Thorne, Kishino, and Felsenstein (1991) on an arbitrary k-leaved phylogenetic tree. The existing algorithms use a hidden Markov model approach, which requires at least $O(\sqrt{5^k})$ states and leads to a time complexity of $O(5^k L^k)$, where L is the geometric mean sequence length. Using a combinatorial technique reminiscent of inclusion/exclusion, we are able to sum away the states, thus improving the time complexity to $O(2^k L^k)$ and considerably reducing memory requirements. This makes statistical multiple alignment under the TKF91 model a definite practical possibility in the case of a phylogenetic tree with a modest number of leaves.
11.
Collections of phylogenetic trees are usually summarized using consensus methods. These methods build a single tree, supposed to be representative of the collection. However, in the case of heterogeneous collections of trees, the resulting consensus may be poorly resolved (strict consensus, majority-rule consensus, ...), or may perform arbitrary choices among mutually incompatible clades, or splits (greedy consensus). Here, we propose an alternative method, which we call the multipolar consensus (MPC). Its aim is to display all the splits having a support above a predefined threshold, in a minimum number of consensus trees, or poles. We show that the problem is equivalent to a graph-coloring problem, and propose an implementation of the method. Finally, we apply the MPC to real data sets. Our results indicate that, typically, all the splits down to a weight of 10% can be displayed in no more than 4 trees. In addition, in some cases, biologically relevant secondary signals, which would not have been present in any of the classical consensus trees, are indeed captured by our method, indicating that the MPC provides a convenient exploratory method for phylogenetic analysis. The method was implemented in a package freely available at http://www.lirmm.fr/~cbonnard/MPC.html
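The graph-coloring view in this abstract can be illustrated with a minimal sketch. It assumes splits are given as one side of a bipartition plus a weight, treats two splits as adjacent in the graph when they are incompatible, and uses a simple greedy coloring. The greedy heuristic is a stand-in chosen for brevity: it does not guarantee the minimum number of poles and is not claimed to be the authors' implementation. The function names `compatible` and `greedy_poles` are hypothetical.

```python
def compatible(side_a, side_b, taxa):
    """Two splits are compatible if at least one of the four
    pairwise intersections of their sides is empty."""
    sides_a = (side_a, taxa - side_a)
    sides_b = (side_b, taxa - side_b)
    return any(not (x & y) for x in sides_a for y in sides_b)


def greedy_poles(splits, taxa):
    """Greedy coloring (a heuristic, not an exact minimum coloring).

    splits: list of (frozenset_side, weight) pairs.
    Each split goes into the first pole whose splits are all
    compatible with it; otherwise a new pole is opened.
    """
    poles = []
    # Place heavier splits first so they land in the earliest poles.
    for side, weight in sorted(splits, key=lambda s: -s[1]):
        for pole in poles:
            if all(compatible(side, other, taxa) for other, _ in pole):
                pole.append((side, weight))
                break
        else:
            poles.append([(side, weight)])
    return poles


taxa = frozenset("ABCDE")
splits = [
    (frozenset("AB"), 0.6),  # AB | CDE
    (frozenset("AC"), 0.3),  # AC | BDE, conflicts with AB | CDE
    (frozenset("DE"), 0.5),  # DE | ABC, compatible with both
]
poles = greedy_poles(splits, taxa)
print(len(poles))  # 2
```

Here the two conflicting splits end up in different poles, which is the behaviour the abstract describes for secondary signals that a single consensus tree would drop.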
12.
Vladimir Makarenkov Alix Boc Jingxin Xie Pedro Peres-Neto François-Joseph Lapointe Pierre Legendre 《BMC evolutionary biology》2010,10(1):250
Background
Non-parametric bootstrapping is a widely-used statistical procedure for assessing confidence of model parameters based on the empirical distribution of the observed data [1] and, as such, it has become a common method for assessing tree confidence in phylogenetics [2]. Traditional non-parametric bootstrapping does not weigh each tree inferred from resampled (i.e., pseudo-replicated) sequences. Hence, the quality of these trees is not taken into account when computing bootstrap scores associated with the clades of the original phylogeny. As a consequence, traditionally, the trees with different bootstrap support or those providing a different fit to the corresponding pseudo-replicated sequences (the fit quality can be expressed through the LS, ML or parsimony score) contribute in the same way to the computation of the bootstrap support of the original phylogeny.
13.
With the huge increase of protein data, an important problem is to estimate, within a large protein family, the number of sensible subsets for subsequent in-depth structural, functional, and evolutionary analyses. To tackle this problem, we developed a new program, Secator, which implements the principle of an ascending hierarchical method using a distance matrix based on a multiple alignment of protein sequences. Dissimilarity values assigned to the nodes of a deduced phylogenetic tree are partitioned by a new stopping rule introduced to automatically determine the significant dissimilarity values. The quality of the clusters obtained by Secator is verified by a separate Jackknife study. The method is demonstrated on 24 large protein families covering a wide spectrum of structural and sequence conservation and its usefulness and accuracy with real biological data is illustrated on two well-studied protein families (the Sm proteins and the nuclear receptors).
14.
In this paper we develop a new method for testing a portion of a tree (called a clade) based on multiple tests of many 4-taxon trees. This is particularly useful when the phylogenetic tree constructed by other methods has a clade that is difficult to explain from a biological point of view. The statement about the test of the clade can be made through the multiple P values from these individual tests. By controlling the familywise error rate or the false discovery rate (FDR), 4 different tree test methods are evaluated through simulation. The simulations show that the combination of the approximately unbiased (AU) test and the FDR-controlling procedure provides strong power along with a reasonable type I error rate and lighter computation.
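To make the FDR step concrete, here is a minimal sketch of one common FDR-controlling procedure, the Benjamini-Hochberg step-up rule. The abstract does not say which FDR procedure was used, so this choice is an assumption, as are the function name and the example P values. How the individual rejections are combined into a statement about the clade is not specified in the abstract and is not modelled here.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    by the Benjamini-Hochberg step-up procedure at FDR level q."""
    m = len(p_values)
    # Indices of the P values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= k * q / m.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            max_k = rank
    # Reject the hypotheses with the max_k smallest P values.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical P values from five 4-taxon tests.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.6]
decisions = benjamini_hochberg(p_vals, q=0.05)
print(decisions)  # [True, True, False, False, False]
```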
15.
Background
Automated methods for assembling families of orthologous genes include those based on sequence similarity scores and those based on phylogenetic approaches. The first are easy to automate, but usually they do not distinguish between paralogs and orthologs or have restrictions on the number of taxa. Phylogenetic methods are often based on reconciliation of a gene tree with a known rooted species tree; a limitation of this approach, especially in the case of prokaryotes, is that the species tree is often unknown, and that from the analyses of single gene families the branching order between related organisms frequently is unresolved.
16.
Francesco Cerutti Luigi Bertolotti Tony L Goldberg Mario Giacobini 《BMC bioinformatics》2011,12(1):58
Background
Phylogenetic trees are an important tool for representing evolutionary relationships among organisms. In a phylogram or chronogram, the ordering of taxa is not considered meaningful, since complete topological information is given by the branching order and length of the branches, which are represented in the root-to-node direction. We apply a novel method based on a (λ + μ)-Evolutionary Algorithm to give meaning to the order of taxa in a phylogeny. This method applies random swaps between two taxa connected to the same node, without changing the topology of the tree. The evaluation of a new tree is based on different distance matrices, representing non-phylogenetic information such as other types of genetic distance, geographic distance, or combinations of these. To test our method we use published trees of Vesicular stomatitis virus, West Nile virus and Rice yellow mottle virus.
17.
MOTIVATION: Despite substantial efforts to develop and populate the back-ends of biological databases, front-ends to these systems often rely on taxonomic expertise. This research applies techniques from human-computer interaction research to the biodiversity domain. RESULTS: We developed an interactive node-link tool, TaxonTree, illustrating the value of a carefully designed interaction model, animation, and integrated searching and browsing towards retrieval of biological names and other information. Users tested the tool using a new, large integrated dataset of animal names with phylogenetic-based and classification-based tree structures. These techniques also translated well for a tool, DoubleTree, to allow comparison of trees using coupled interaction. Our approaches will be useful not only for biological data but as general portal interfaces.
18.
Liang Liu Lili Yu Laura Kubatko Dennis K. Pearl Scott V. Edwards 《Molecular phylogenetics and evolution》2009,53(1):320-328
We review recent models to estimate phylogenetic trees under the multispecies coalescent. Although the distinction between gene trees and species trees has come to the fore of phylogenetics, only recently have methods been developed that explicitly estimate species trees. Of the several factors that can cause gene tree heterogeneity and discordance with the species tree, deep coalescence due to random genetic drift in branches of the species tree has been modeled most thoroughly. Bayesian approaches to estimating species trees utilize two likelihood functions, one of which has been widely used in traditional phylogenetics and involves the model of nucleotide substitution, and the second of which is less familiar to phylogeneticists and involves the probability distribution of gene trees given a species tree. Other recent parametric and nonparametric methods for estimating species trees involve parsimony criteria, summary statistics, supertree and consensus methods. Species tree approaches are an appropriate goal for systematics, appear to work well in some cases where concatenation can be misleading, and suggest that sampling many independent loci will be paramount. Such methods can also be challenging to implement because of the complexity of the models and computational time. In addition, further elaboration of the simplest of coalescent models will be required to incorporate commonly known issues such as deviation from the molecular clock, gene flow and other genetic forces.
19.
Gabriel Cardona Mercè Llabrés Francesc Rosselló Gabriel Valiente 《Journal of mathematical biology》2010,61(2):253-276
Dissimilarity measures for (possibly weighted) phylogenetic trees based on the comparison of their vectors of path lengths between pairs of taxa, have been present in the systematics literature since the early seventies. For rooted phylogenetic trees, however, these vectors can only separate non-weighted binary trees, and therefore these dissimilarity measures are metrics only on this class of rooted phylogenetic trees. In this paper we overcome this problem, by splitting in a suitable way each path length between two taxa into two lengths. We prove that the resulting splitted path lengths matrices single out arbitrary rooted phylogenetic trees with nested taxa and arcs weighted in the set of positive real numbers. This allows the definition of metrics on this general class of rooted phylogenetic trees by comparing these matrices through metrics in spaces $\mathcal{M}_n(\mathbb{R})$ of real-valued $n \times n$ matrices. We conclude this paper by establishing some basic facts about the metrics for non-weighted phylogenetic trees defined in this way using $L^p$ metrics on $\mathcal{M}_n(\mathbb{R})$, with $p \in \mathbb{R}_{>0}$.
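The idea of splitting each path length into two parts can be sketched as follows. This is one reading of the abstract, stated as an assumption: the path between taxa i and j is split at their lowest common ancestor, and entry (i, j) of the matrix holds the length of the part from that ancestor down to i. The tree encoding (a child-to-parent map with arc weights), the function names, and the example trees are all hypothetical, and the sketch only handles taxa placed on distinct nodes of a tree with positive arc weights.

```python
def distances_to_ancestors(node, parent):
    """Map every ancestor of node (node included) to its distance from node."""
    dist, out = 0.0, {}
    while True:
        out[node] = dist
        if node not in parent:  # reached the root
            return out
        up, weight = parent[node]
        dist += weight
        node = up


def split_path_lengths(parent, taxa):
    """Matrix whose (i, j) entry is the distance from LCA(i, j) to taxon i."""
    up = {t: distances_to_ancestors(t, parent) for t in taxa}
    matrix = []
    for i in taxa:
        row = []
        for j in taxa:
            common = set(up[i]) & set(up[j])
            # With positive weights, the LCA is the closest common ancestor.
            lca = min(common, key=lambda a: up[i][a])
            row.append(up[i][lca])
        matrix.append(row)
    return matrix


def lp_distance(m1, m2, p=1.0):
    """Entrywise L^p distance between two matrices of equal shape."""
    total = sum(abs(a - b) ** p
                for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))
    return total ** (1.0 / p)


taxa = ["a", "b", "c"]
# child -> (parent, arc weight); the root "r" has no entry.
tree1 = {"a": ("u", 1.0), "b": ("u", 1.0), "u": ("r", 1.0), "c": ("r", 2.0)}
tree2 = {"b": ("u", 1.0), "c": ("u", 1.0), "u": ("r", 1.0), "a": ("r", 2.0)}
m1 = split_path_lengths(tree1, taxa)
m2 = split_path_lengths(tree2, taxa)
print(lp_distance(m1, m2, p=1.0))  # 4.0
```

Summing entries (i, j) and (j, i) recovers the ordinary path length between i and j, so the matrix carries at least as much information as the classical path-length vector.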
20.
Rooted phylogenetic trees constructed from different datasets (e.g. from different genes) are often conflicting with one another, i.e. they cannot be integrated into a single phylogenetic tree. Phylogenetic networks have become an important tool in molecular evolution, and rooted phylogenetic networks are able to represent conflicting rooted phylogenetic trees. Hence, the development of appropriate methods to compute rooted phylogenetic networks from rooted phylogenetic trees has attracted considerable research interest of late. The CASS algorithm proposed by van Iersel et al. is able to construct much simpler networks than other available methods, but it is extremely slow, and the networks it constructs are dependent on the order of the input data. Here, we introduce an improved CASS algorithm, BIMLR. We show that BIMLR is faster than CASS and less dependent on the input data order. Moreover, BIMLR is able to construct much simpler networks than almost all other methods. BIMLR is available at http://nclab.hit.edu.cn/wangjuan/BIMLR/.