Similar articles
 Found 20 similar articles; search took 15 ms
1.
MOTIVATION: Numerous database management systems have been developed for processing various taxonomic databases on biological classification or phylogenetic information. In this paper, we present an integrated system to deal with interacting classifications and phylogenies concerning particular taxonomic groups. RESULTS: An information-theoretic view (taxon view) has been applied to capture taxonomic concepts as taxonomic data entities. A data model which is suitable for supporting semantically interacting dynamic views of hierarchic classifications and a query method for interacting classifications have been developed. The concept of taxonomic view and the data model can also be expanded to carry phylogenetic information in phylogenetic trees. We have designed a prototype taxonomic database system called HICLAS (HIerarchical CLAssification System) based on the concept of taxon view, and the data models and query methods have been designed and implemented. This system can be effectively used in the taxonomic revisionary process, especially when databases are being constructed by specialists in particular groups, and the system can be used to compare classifications and phylogenetic trees. AVAILABILITY: Freely available at the WWW URL: http://aims.cps.msu.edu/hiclas/ CONTACT: pramanik@cps.msu.edu; lotus@wipm.whcnc.ac.cn

2.
This paper examines a new technique for the visualization of and the interaction with trees, objects frequently used to convey hierarchical relationships in biological data. Motivated by the quality of 2D tree interaction, we adapt the planar tree-of-life metaphor to a virtual, semi-immersive 3D environment. A 3D environment extends the utility of this metaphor by allowing the user to view an entire data set in a single screen. Interrogation of the tree is implemented using 3D input devices. This real-time interrogation of the tree itself provides a quick means by which to qualitatively analyse the hierarchical data. In this paper, we describe the techniques underlying the implementation of such an environment. We conclude by considering the utility of tree metaphors as a basis for the representation of highly dimensional data sets. AVAILABILITY: Arbor3D (source code, a binary executable for SGI IRIX 6.4, Perl parsers, and sample Newick data files) is available via the Internet (http://xian.tamu.edu/Arbor3D/). Arbor3D can be displayed in "CAVE simulator" mode on an SGI workstation screen, or as an interactive virtual environment on a projection workbench. CONTACT: druths@rice.edu; echen@cs.rice.edu; leland@xian.tamu.edu

3.
This paper poses the problem of estimating and validating phylogenetic trees in statistical terms. The problem is hard enough to warrant several tacks: we reason by analogy to rounding real numbers, and dealing with ranking data. These are both cases where, as in phylogeny, the parameters of interest are not real numbers. Then we pose the problem in geometrical terms, using distances and measures on a natural space of trees. We do not solve the problems of inference on tree space, but suggest some coherent ways of tackling them.

4.
Collections of phylogenetic trees are usually summarized using consensus methods. These methods build a single tree, supposed to be representative of the collection. However, in the case of heterogeneous collections of trees, the resulting consensus may be poorly resolved (strict consensus, majority-rule consensus, ...), or may perform arbitrary choices among mutually incompatible clades, or splits (greedy consensus). Here, we propose an alternative method, which we call the multipolar consensus (MPC). Its aim is to display all the splits having a support above a predefined threshold, in a minimum number of consensus trees, or poles. We show that the problem is equivalent to a graph-coloring problem, and propose an implementation of the method. Finally, we apply the MPC to real data sets. Our results indicate that, typically, all the splits down to a weight of 10% can be displayed in no more than 4 trees. In addition, in some cases, biologically relevant secondary signals, which would not have been present in any of the classical consensus trees, are indeed captured by our method, indicating that the MPC provides a convenient exploratory method for phylogenetic analysis. The method was implemented in a package freely available at http://www.lirmm.fr/~cbonnard/MPC.html
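The graph-coloring view in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which coloring algorithm the MPC package uses, so a simple greedy coloring is swapped in here, and it only gives an upper bound on the minimum number of poles. The function names, the representation of a split as a frozenset holding one side of the bipartition, and the toy support values are all assumptions made for illustration.

```python
def compatible(a, b, taxa):
    """Splits A|A' and B|B' are compatible iff one of the four intersections is empty."""
    a_c, b_c = taxa - a, taxa - b
    return not (a & b) or not (a & b_c) or not (a_c & b) or not (a_c & b_c)


def multipolar_sketch(split_support, taxa, threshold):
    """Group every split with support >= threshold into poles (colors).

    Splits within one pole are pairwise compatible, so each pole can be drawn
    as a single tree. Greedy coloring is an assumption of this sketch; it does
    not guarantee the minimum number of poles.
    """
    taxa = frozenset(taxa)
    kept = sorted(
        (s for s, w in split_support.items() if w >= threshold),
        key=lambda s: -split_support[s],  # best-supported splits are placed first
    )
    poles = []
    for s in kept:
        for pole in poles:
            if all(compatible(s, t, taxa) for t in pole):
                pole.append(s)
                break
        else:
            poles.append([s])  # no existing pole fits: open a new one
    return poles


# Hypothetical toy data: one side of each split, with its support.
support = {
    frozenset("AB"): 0.60,
    frozenset("BC"): 0.30,  # conflicts with AB|CDE
    frozenset("DE"): 0.50,
    frozenset("AC"): 0.05,  # below the threshold, dropped
}
poles = multipolar_sketch(support, "ABCDE", threshold=0.10)
for i, pole in enumerate(poles, start=1):
    print(i, ["".join(sorted(s)) for s in pole])
```

On this toy input the two conflicting splits AB|CDE and BC|ADE end up in different poles, which is the behaviour the abstract describes for secondary signals that a single consensus tree would have to discard.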

5.
6.
Dissimilarity measures for (possibly weighted) phylogenetic trees based on the comparison of their vectors of path lengths between pairs of taxa, have been present in the systematics literature since the early seventies. For rooted phylogenetic trees, however, these vectors can only separate non-weighted binary trees, and therefore these dissimilarity measures are metrics only on this class of rooted phylogenetic trees. In this paper we overcome this problem, by splitting in a suitable way each path length between two taxa into two lengths. We prove that the resulting splitted path lengths matrices single out arbitrary rooted phylogenetic trees with nested taxa and arcs weighted in the set of positive real numbers. This allows the definition of metrics on this general class of rooted phylogenetic trees by comparing these matrices through metrics in spaces $\mathcal{M}_n(\mathbb{R})$ of real-valued n × n matrices. We conclude this paper by establishing some basic facts about the metrics for non-weighted phylogenetic trees defined in this way using $L^p$ metrics on $\mathcal{M}_n(\mathbb{R})$, with $p \in \mathbb{R}_{>0}$.

7.
We review recent models to estimate phylogenetic trees under the multispecies coalescent. Although the distinction between gene trees and species trees has come to the fore of phylogenetics, only recently have methods been developed that explicitly estimate species trees. Of the several factors that can cause gene tree heterogeneity and discordance with the species tree, deep coalescence due to random genetic drift in branches of the species tree has been modeled most thoroughly. Bayesian approaches to estimating species trees utilize two likelihood functions, one of which has been widely used in traditional phylogenetics and involves the model of nucleotide substitution, and the second of which is less familiar to phylogeneticists and involves the probability distribution of gene trees given a species tree. Other recent parametric and nonparametric methods for estimating species trees involve parsimony criteria, summary statistics, supertree and consensus methods. Species tree approaches are an appropriate goal for systematics, appear to work well in some cases where concatenation can be misleading, and suggest that sampling many independent loci will be paramount. Such methods can also be challenging to implement because of the complexity of the models and computational time. In addition, further elaboration of the simplest of coalescent models will be required to incorporate commonly known issues such as deviation from the molecular clock, gene flow and other genetic forces.

8.
Taxonomic names and phylogenetic trees   Cited by: 2 (self-citations: 0; citations by others: 2)
This paper addresses the issue of philosophy of names within the context of biological taxonomy, more specifically how names refer. By contrasting two philosophies of names, one based on the idea that names can be defined and one based on the idea that they cannot, I point out some advantages of the latter within phylogenetic systematics. Due to the changing nature of phylogenetic hypotheses, the former approach tends to rob taxonomy of its unique communicative value, since a name that is defined refers to whatever fits the definition. This is particularly troublesome should the hypothesis of phylogenetic relationship change. I argue that, should we decide to accept a new phylogenetic hypothesis, it is also likely that our view of what to name may change. A system where names only refer acknowledges this, and accordingly leaves it open whether to keep a name (and accept the way it refers in the new hypothesis) or discard a name and introduce new names for the parts of the tree that we find scientifically interesting. One of the main differences between a phylogenetic system of definition (PSD) and a phylogenetic system of reference (PSR) is that the former is governed by laws of language while the latter is governed by the communicative needs of taxonomists. Thus, a PSR tends to give primacy to phylogenetic trees rather than phylogenetic definitions of names should our views of which phylogenetic hypothesis to accept change. © 1998 The Norwegian Academy of Sciences and Letters

9.
MOTIVATION: Algorithms for phylogenetic tree reconstruction based on gene order data typically repeatedly solve instances of the reversal median problem (RMP), which is to find, for three given gene orders, a fourth gene order (called median) with a minimal sum of reversal distances. All existing algorithms of this type consider only one median for each RMP instance even when a large number of medians exist. A careful selection of one of the medians might lead to better phylogenetic trees. RESULTS: We propose a heuristic algorithm amGRP for solving the multiple genome rearrangement problem (MGRP) by repeatedly solving instances of the RMP taking all medians into account. Algorithm amGRP uses a branch-and-bound method that branches over medians from a selected subset of all medians for each RMP instance. Different heuristics for selecting the subsets have been investigated. To show that the medians for RMP vary strongly with respect to different properties that are likely to be relevant for phylogenetic tree reconstruction, the set of all medians has been investigated for artificial datasets and mitochondrial DNA (mtDNA) gene orders. Phylogenetic trees have been computed for a large set of randomly generated gene orders and two sets of mtDNA gene order data for different animal taxa with amGRP and with two standard approaches for solving the MGRP (GRAPPA-DCM and MGR). The results show that amGRP outperforms both other methods with respect to solution quality and computation time on the test data. AVAILABILITY: The source code of amGRP, additional results and the test instances used in this paper are freely available from the authors.
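To make the reversal median problem in the abstract above concrete, here is a brute-force Python sketch for very small gene orders. It is not amGRP and not how practical solvers work: it simply runs a breadth-first search over every signed permutation, which is only feasible for a handful of genes. Treating gene orders as signed permutations (a reversal flips a segment and negates its signs) is an assumption of this sketch, as are all function names.

```python
from collections import deque


def reversals(perm):
    """Yield every signed permutation reachable from perm by one reversal."""
    n = len(perm)
    for i in range(n):
        for j in range(i, n):
            flipped = tuple(-x for x in reversed(perm[i:j + 1]))
            yield perm[:i] + flipped + perm[j + 1:]


def distances_from(source):
    """Reversal distance from source to every signed permutation, by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        cur = queue.popleft()
        for nxt in reversals(cur):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return dist


def all_medians(g1, g2, g3):
    """Return (minimal score, all gene orders attaining it) for three inputs."""
    d1, d2, d3 = (distances_from(g) for g in (g1, g2, g3))
    score = {p: d1[p] + d2[p] + d3[p] for p in d1}
    best = min(score.values())
    return best, sorted(p for p, s in score.items() if s == best)


# Hypothetical three-gene example.
best, medians = all_medians((1, 2, 3), (-2, -1, 3), (1, -3, -2))
print(best, len(medians))
```

Because `all_medians` returns every optimal gene order rather than a single one, it shows the situation the abstract is concerned with: an RMP instance can have several medians, and which one is picked may matter for the resulting tree.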

10.
Several indices that measure the degree of balance of a rooted phylogenetic tree have been proposed so far in the literature. In this work we define and study a new index of this kind, which we call the total cophenetic index: the sum, over all pairs of different leaves, of the depth of their lowest common ancestor. This index makes sense for arbitrary trees, can be computed in linear time and it has a larger range of values and a greater resolution power than other indices like Colless’ or Sackin’s. We compute its maximum and minimum values for arbitrary and binary trees, as well as exact formulas for its expected value for binary trees under the Yule and the uniform models of evolution. As a byproduct of this study, we obtain an exact formula for the expected value of the Sackin index under the uniform model, a result that seems to be new in the literature.
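The definition of the total cophenetic index given in the abstract above translates directly into code. The Python sketch below computes it twice: once straight from the definition (sum over leaf pairs of the depth of their lowest common ancestor, with the root at depth 0), and once in a single traversal. The single-pass version relies on a counting argument that is not taken from the abstract: a non-root node with k descendant leaves lies on the root path of the lowest common ancestor of exactly C(k, 2) leaf pairs, so it contributes C(k, 2) to the sum. The nested-tuple tree encoding and the function names are assumptions made for illustration.

```python
from itertools import combinations
from math import comb


def leaf_paths(tree, path=()):
    """Yield (leaf, path), where path lists the child indices taken from the root."""
    if isinstance(tree, str):
        yield tree, path
    else:
        for i, child in enumerate(tree):
            yield from leaf_paths(child, path + (i,))


def total_cophenetic_naive(tree):
    """Direct definition: sum over leaf pairs of the depth of their LCA."""
    paths = [p for _, p in leaf_paths(tree)]
    total = 0
    for p, q in combinations(paths, 2):
        depth = 0
        for x, y in zip(p, q):  # LCA depth = length of the common path prefix
            if x != y:
                break
            depth += 1
        total += depth
    return total


def total_cophenetic_linear(tree):
    """Single traversal: each non-root node with k leaves below adds C(k, 2)."""
    def visit(node, is_root):
        if isinstance(node, str):
            return 1, 0
        k, phi = 0, 0
        for child in node:
            child_k, child_phi = visit(child, False)
            k += child_k
            phi += child_phi
        if not is_root:
            phi += comb(k, 2)
        return k, phi

    return visit(tree, True)[1]


caterpillar = ((("a", "b"), "c"), "d")
balanced = (("a", "b"), ("c", "d"))
print(total_cophenetic_naive(caterpillar), total_cophenetic_linear(caterpillar))  # 4 4
print(total_cophenetic_naive(balanced), total_cophenetic_linear(balanced))        # 2 2
```

The two four-leaf examples show the index separating an unbalanced shape (value 4) from a balanced one (value 2), and the multifurcating case works the same way because nothing in either function assumes binary nodes.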

11.
Studying the shape of phylogenetic trees under different random models is an important issue in evolutionary biology. In this paper, we propose a general framework for deriving detailed statistical results for patterns in phylogenetic trees under the Yule–Harding model and the uniform model, two of the most fundamental random models considered in phylogenetics. Our framework will unify several recent studies which were mainly concerned with the mean value and the variance. Moreover, refined statistical results such as central limit theorems, Berry–Esseen bounds, local limit theorems, etc., are obtainable with our approach as well. A key contribution of the current study is that our results are applicable to the whole range of possible sizes of the pattern.

12.
The root of a phylogenetic tree is fundamental to its biological interpretation, but standard substitution models do not provide any information on its position. Here, we describe two recently developed models that relax the usual assumptions of stationarity and reversibility, thereby facilitating root inference without the need for an outgroup. We compare the performance of these models on a classic test case for phylogenetic methods, before considering two highly topical questions in evolutionary biology: the deep structure of the tree of life and the root of the archaeal radiation. We show that all three alignments contain meaningful rooting information that can be harnessed by these new models, thus complementing and extending previous work based on outgroup rooting. In particular, our analyses exclude the root of the tree of life from the eukaryotes or Archaea, placing it on the bacterial stem or within the Bacteria. They also exclude the root of the archaeal radiation from several major clades, consistent with analyses using other rooting methods. Overall, our results demonstrate the utility of non-reversible and non-stationary models for rooting phylogenetic trees, and identify areas where further progress can be made.

13.
Much modern work in phylogenetics depends on statistical sampling approaches to phylogeny construction to estimate probability distributions of possible trees for any given input data set. Our theoretical understanding of sampling approaches to phylogenetics remains far less developed than that for optimization approaches, however, particularly with regard to the number of sampling steps needed to produce accurate samples of tree partition functions. Despite the many advantages in principle of being able to sample trees from sophisticated probabilistic models, we have little theoretical basis for concluding that the prevailing sampling approaches do in fact yield accurate samples from those models within realistic numbers of steps. We propose a novel approach to phylogenetic sampling intended to be both efficient in practice and more amenable to theoretical analysis than the prevailing methods. The method depends on replacing the standard tree rearrangement moves with an alternative Markov model in which one solves a theoretically hard but practically tractable optimization problem on each step of sampling. The resulting method can be applied to a broad range of standard probability models, yielding practical algorithms for efficient sampling and rigorous proofs of accurate sampling for heated versions of some important special cases. We demonstrate the efficiency and versatility of the method by an analysis of uncertainty in tree inference over varying input sizes. In addition to providing a new practical method for phylogenetic sampling, the technique is likely to prove applicable to many similar problems involving sampling over combinatorial objects weighted by a likelihood model.

14.
Summary: The Summary Tree Explorer (STE) is a Java application for interactively exploring sets of phylogenetic trees using two coupled representations: a node-and-link diagram and a textual list of common clades. Selection, pruning, filtering or re-rooting in one representation is immediately reflected in the other. While summary trees are more effective at showing the relationship among clades, they can only show a consistent subset of those that appear in the textual list. Working with both representations mitigates the disadvantages of having to choose just one. Availability: STE, along with several sample datasets, is available at http://cityscape.inf.cs.cmu.edu/phylogeny/ Contact: mad{at}cs.cmu.edu Associate Editor: Martin Bishop

15.
16.
The most widely used evolutionary model for phylogenetic trees is the equal-rates Markov (ERM) model. A problem is that the ERM model predicts less imbalance than observed for trees inferred from real data; in fact, the observed imbalance tends to fall between the values predicted by the ERM model and those predicted by the proportional-to-distinguishable-arrangements (PDA) model. Here, a continuous multi-rate (MR) family of evolutionary models is presented which contains entire subfamilies corresponding to both the PDA and ERM models. Furthermore, this MR family covers an entire range from 'completely balanced' to 'completely unbalanced' models. In particular, the MR family contains other known evolutionary models. The MR family is very versatile and virtually free of assumptions on the character of evolution; yet it is highly susceptible to rigorous analyses. In particular, such analyses help to uncover adaptability, quasi-stabilization and prolonged stasis as major possible causes of the imbalance. However, the MR model is functionally simple and requires only three parameters to reproduce the observed imbalance.

17.
A recently developed mathematical model for the analysis of phylogenetic trees is applied to comparative data for 48 species. The model represents a return to fundamentals and makes no hypothesis with respect to the reversibility of the process. The species have been analysed in all subsets of three, and a measure of reliability of the results is provided. The numerical results of the computations on 17,296 triples of species are made available on the Internet. These results are discussed and the development of reliable tree structures for several species is illustrated. It is shown that, indeed, the Markov model is capable of considerably more interesting predictions than has been recognized to date.
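The figure of 17,296 triples in the abstract above is the number of three-element subsets of 48 species, which can be checked with a short Python snippet (the variable names are illustrative only):

```python
from math import comb

n_species = 48
n_triples = comb(n_species, 3)  # 48 * 47 * 46 / 6
print(n_triples)  # 17296
```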

18.
Bootstrap method of interior-branch test for phylogenetic trees   Cited by: 5 (self-citations: 2; citations by others: 5)
Statistical properties of the bootstrap test of interior branch lengths of phylogenetic trees have been studied and compared with those of the standard interior-branch test in computer simulations. Examination of the properties of the tests under the null hypothesis showed that both tests for an interior branch of a predetermined topology are quite reliable when the distribution of the branch length estimate approaches a normal distribution. Unlike the standard interior-branch test, the bootstrap test appears to retain this property even when the substitution rate varies among sites. In this case, the distribution of the branch length estimate deviates from a normal distribution, and the standard interior-branch test gives conservative confidence probability values. A simple correction method was developed for both interior-branch tests to be applied for testing the reliability of tree topologies estimated from sequence data. This correction for the standard interior-branch test appears to be as effective as that obtained in our previous study, though it is much simpler. The bootstrap and standard interior-branch tests for estimated topologies become conservative as the number of sequence groups in a star-like tree increases.

19.
We present a dimensionless fit index for phylogenetic trees that have been constructed from distance matrices. It is designed to measure the quality of the fit of the data to a tree in absolute terms, independent of linear transformations on the distance matrix. The index can be used as an absolute measure to evaluate how well a set of data fits to a tree, or as a relative measure to compare different methods that are expected to produce the same tree. The usefulness of the index is demonstrated in three examples.

20.
