Similar literature
 20 similar articles found
1.
A new method, PATHd8, for estimating ultrametric trees from trees with edge (branch) lengths proportional to the number of substitutions is proposed. The method allows for an arbitrary number of reference nodes for time calibration, each defined either as absolute age, minimum age, or maximum age, and the tree need not be fully resolved. The method is based on estimating node ages by mean path lengths from the node to the leaves but correcting for deviations from a molecular clock suggested by reference nodes. As opposed to most existing methods allowing substitution rate variation, the new method smoothes substitution rates locally, rather than simultaneously over the whole tree, thus allowing for analysis of very large trees. The performance of PATHd8 is compared with other frequently used methods for estimating divergence times. In analyses of three separate data sets, PATHd8 gives similar divergence times to other methods, the largest difference being between crown group ages, where unconstrained nodes get younger ages when analyzed with PATHd8. Overall, chronograms obtained from other methods appear smoother, whereas PATHd8 preserves more of the heterogeneity seen in the original edge lengths. Divergence times are most evenly spread over the chronograms obtained from the Bayesian implementation and the clock-based Langley-Fitch method, and these two methods produce very similar ages for most nodes. Evaluations of PATHd8 using simulated data suggest that PATHd8 is slightly less precise compared with penalized likelihood, but it gives more sensible answers for extreme data sets. A clear advantage with PATHd8 is that it is more or less instantaneous even with trees having several thousand leaves, whereas other programs often run into problems when analyzing trees with hundreds of leaves. PATHd8 is implemented in freely available software.
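
The mean-path-length idea in the abstract above can be illustrated with a minimal Python sketch. This is not the PATHd8 implementation: it covers only the uncorrected step (node age proportional to the mean path length from the node to its descendant leaves) and assumes a single calibration, a fixed age for the root. The `Node` class, the function names, and the example tree are hypothetical, and the local correction for clock deviations at reference nodes is not attempted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A tree node; `length` is the length of the edge leading to this node."""
    name: str
    length: float = 0.0
    children: List["Node"] = field(default_factory=list)


def leaf_path_lengths(node: Node) -> List[float]:
    """Return the path length from `node` to every leaf below it."""
    if not node.children:
        return [0.0]
    paths: List[float] = []
    for child in node.children:
        paths.extend(child.length + p for p in leaf_path_lengths(child))
    return paths


def mean_path_length(node: Node) -> float:
    """Average the path lengths from `node` to its descendant leaves."""
    paths = leaf_path_lengths(node)
    return sum(paths) / len(paths)


def relative_ages(root: Node, root_age: float = 1.0) -> Dict[str, float]:
    """Map node name -> age, scaling mean path lengths so the root has `root_age`.

    Assumption: one calibration only (the root); no rate correction is applied.
    """
    root_mpl = mean_path_length(root)
    ages: Dict[str, float] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        ages[node.name] = root_age * mean_path_length(node) / root_mpl
        stack.extend(node.children)
    return ages


# Hypothetical example tree: ((A:1,B:3)X:2,C:4)R with the root fixed at age 100.
example_root = Node("R", 0.0, [
    Node("X", 2.0, [Node("A", 1.0), Node("B", 3.0)]),
    Node("C", 4.0),
])
example_ages = relative_ages(example_root, root_age=100.0)
```

Because each node is handled using only the paths below it, the cost per node stays local, which is consistent with the abstract's point about very large trees; the recomputation of path lengths at every node here is chosen for clarity rather than speed.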

2.
Visualizing large hierarchical clusters in hyperbolic space   Total citations: 9, self-citations: 0, citations by others: 9
SUMMARY: HyperTree is an application to visualize and navigate large trees in hyperbolic space. It includes color-coding, search mechanisms and navigational aids, as well as focus+context viewing, allowing enormous trees to fit within the fixed space of a computer screen or printed page.

3.
The challenge of constructing large phylogenetic trees   Total citations: 3, self-citations: 0, citations by others: 3
The amount of sequence data available to reconstruct the evolutionary history of genes and species has increased 20-fold in the past decade. Consequently the size of phylogenetic analyses has grown as well, and phylogenetic methods, algorithms and their implementations have struggled to keep pace. Computational and other challenges raised by this burgeoning database emerge at several stages of analysis, from the optimal assembly of large data matrices from sequence databases, to the efficient construction of trees from these large matrices and the piece-wise assembly of 'supertrees' from those trees in turn. A final challenge is posed by the difficulty of visualizing and making inferences from trees that might soon routinely contain thousands of species.

4.
SUMMARY: BAOBAB is a Java user interface dedicated to viewing and editing large phylogenetic trees. Original features include: (i) a colour-mediated overview of magnified subtrees; (ii) copy/cut/paste of (sub)trees within or between windows; (iii) compressing/uncompressing subtrees; and (iv) managing sequence files together with tree files. AVAILABILITY: http://www.univ-montp2.fr/~genetix/.

5.
6.
How will the emerging possibility of inferring ultra-large phylogenies influence our ability to identify shifts in diversification rate? For several large angiosperm clades (Angiospermae, Monocotyledonae, Orchidaceae, Poaceae, Eudicotyledonae, Fabaceae, and Asteraceae), we explore this issue by contrasting two approaches: (1) using small backbone trees with an inferred number of extant species assigned to each terminal clade and (2) using a mega-phylogeny of 55473 seed plant species represented in GenBank. The mega-phylogeny approach assumes that the sample of species in GenBank is at least roughly proportional to the actual species diversity of different lineages, as appears to be the case for many major angiosperm lineages. Using both approaches, we found that diversification rate shifts are not directly associated with the major named clades examined here, with the sole exception of Fabaceae in the GenBank mega-phylogeny. These agreements are encouraging and may support a generality about angiosperm evolution: major shifts in diversification may not be directly associated with major named clades, but rather with clades that are nested not far within these groups. An alternative explanation is that there have been increased extinction rates in early-diverging lineages within these clades. Based on our mega-phylogeny, the shifts in diversification appear to be distributed quite evenly throughout the angiosperms. Mega-phylogenetic studies of diversification hold great promise for revealing new patterns, but we will need to focus more attention on properly specifying null expectation.

7.

Background  

Research in evolution requires software for visualizing and editing phylogenetic trees for increasingly large datasets, such as those arising in expression analysis or metagenomics. It would be desirable to have a program that provides these services in an efficient and user-friendly way, and that can be easily installed and run on all major operating systems. Although a large number of tree visualization tools are freely available, some as a part of more comprehensive analysis packages, all have drawbacks in one or more domains. They either lack some of the standard tree visualization techniques or basic graphics and editing features, or they are restricted to small trees containing only tens of thousands of taxa. Moreover, many programs are difficult to install or are not available for all common operating systems.

8.
9.
10.
GeoPhylo is a scalable online service for developing 3‐dimensional geographic visualizations of phylogenetic trees in the Keyhole Markup Language (KML). These geographic phylogenies, geophylogenies, can then be viewed in Google Earth or NASA's World Wind. Advanced features provide users the ability to change many aspects such as scaling and coloring of branches. The GeoPhylo engine has been deployed on the Google App Engine in order to be scalable, sustainable and easily updated, while providing long‐term support for stable releases. These features will allow developers to use GeoPhylo as a service in their own applications without concerns of incompatible changes made in future updates.

11.
Highly accurate estimation of phylogenetic trees for large data sets is difficult, in part because multiple sequence alignments must be accurate for phylogeny estimation methods to be accurate. Coestimation of alignments and trees has been attempted but currently only SATé estimates reasonably accurate trees and alignments for large data sets in practical time frames (Liu K., Raghavan S., Nelesen S., Linder C.R., Warnow T. 2009b. Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science. 324:1561-1564). Here, we present a modification to the original SATé algorithm that improves upon SATé (which we now call SATé-I) in terms of speed and of phylogenetic and alignment accuracy. SATé-II uses a different divide-and-conquer strategy than SATé-I and so produces smaller more closely related subsets than SATé-I; as a result, SATé-II produces more accurate alignments and trees, can analyze larger data sets, and runs more efficiently than SATé-I. Generally, SATé is a metamethod that takes an existing multiple sequence alignment method as an input parameter and boosts the quality of that alignment method. SATé-II-boosted alignment methods are significantly more accurate than their unboosted versions, and trees based upon these improved alignments are more accurate than trees based upon the original alignments. Because SATé-I used maximum likelihood (ML) methods that treat gaps as missing data to estimate trees and because we found a correlation between the quality of tree/alignment pairs and ML scores, we explored the degree to which SATé's performance depends on using ML with gaps treated as missing data to determine the best tree/alignment pair. We present two lines of evidence that using ML with gaps treated as missing data to optimize the alignment and tree produces very poor results. 
First, we show that the optimization problem where a set of unaligned DNA sequences is given and the output is the tree and alignment of those sequences that maximize likelihood under the Jukes-Cantor model is uninformative in the worst possible sense. For all inputs, all trees optimize the likelihood score. Second, we show that a greedy heuristic that uses GTR+Gamma ML to optimize the alignment and the tree can produce very poor alignments and trees. Therefore, the excellent performance of SATé-II and SATé-I is not because ML is used as an optimization criterion for choosing the best tree/alignment pair but rather due to the particular divide-and-conquer realignment techniques employed.

12.
In this paper we develop a new method for testing a portion of a tree (called a clade) based on multiple tests of many 4-taxon trees. This is particularly useful when the phylogenetic tree constructed by other methods has a clade that is difficult to explain from a biological point of view. The statement about the test of the clade can be made through the multiple P values from these individual tests. By controlling the familywise error rate or the false discovery rate (FDR), 4 different tree test methods are evaluated through simulation. The simulations show that the combination of the approximately unbiased (AU) test and the FDR-controlling procedure provides strong power along with a reasonable type I error rate and lighter computation.
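
As an illustration of how a set of P values from individual tests can be combined under FDR control, here is a minimal Python sketch of the Benjamini-Hochberg step-up procedure. The abstract does not say which FDR-controlling procedure was used, so Benjamini-Hochberg is an assumption here, and the P values in the example are hypothetical rather than results from 4-taxon tests.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return one flag per hypothesis: True if rejected at FDR level `alpha`.

    Assumption: the standard Benjamini-Hochberg step-up rule, i.e. find the
    largest rank k with p_(k) <= (k / m) * alpha and reject the k smallest.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# Hypothetical P values, one per individual test.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
example_flags = benjamini_hochberg(example_p, alpha=0.05)
```

How the per-test decisions are turned into a single statement about the clade is not specified in the abstract and is deliberately left out of the sketch.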

13.
MOTIVATION: The computation of large phylogenetic trees with statistical models such as maximum likelihood or Bayesian inference is computationally extremely intensive. It has repeatedly been demonstrated that these models are able to recover the true tree or a tree which is topologically closer to the true tree more frequently than less elaborate methods such as parsimony or neighbor joining. Due to the combinatorial and computational complexity, the size of trees which can be computed on a biologist's PC workstation within reasonable time is limited to trees containing approximately 100 taxa. RESULTS: In this paper we present the latest release of our program RAxML-III for rapid maximum likelihood-based inference of large evolutionary trees, which allows for computation of 1,000-taxon trees in less than 24 hours on a single PC processor. We compare RAxML-III to the currently fastest implementations for maximum likelihood and Bayesian inference: PHYML and MrBayes. Whereas RAxML-III performs worse than PHYML and MrBayes on synthetic data, it clearly outperforms both programs on all real data alignments used in terms of speed and final likelihood values. AVAILABILITY AND SUPPLEMENTARY INFORMATION: RAxML-III including all alignments and final trees mentioned in this paper is freely available as open source code at http://wwwbode.cs.tum/~stamatak CONTACT: stamatak@cs.tum.edu.

14.
A recent large-scale phylogenomic study has shown the great degree of topological variation that can be found among eukaryotic phylogenetic trees constructed from single genes, highlighting the problems that can be associated with gene sampling in phylogenetic studies.

15.

Background

The rate of evolution varies spatially along genomes and temporally in time. The presence of evolutionary rate variation is an informative signal that often marks functional regions of genomes and historical selection events. There exist many tests for temporal rate variation, or heterotachy, that start by partitioning sampled sequences into two or more groups and testing rate homogeneity among the groups. I develop a Bayesian method to infer phylogenetic trees with a divergence point, or dramatic temporal shifts in selection pressure that affect many nucleotide sites simultaneously, located at an unknown position in the tree.

Results

Simulation demonstrates that the method is most able to detect divergence points when rate variation and the number of affected sites are high, but not beyond biologically relevant values. The method is applied to two viral data sets. A divergence point is identified separating the B and C subtypes, two genetically distinct variants of HIV that have spread into different human populations with the AIDS epidemic. In contrast, no strong signal of temporal rate variation is found in a sample of F and H genotypes, two genetic variants of HBV that have likely evolved with humans during their immigration and expansion into the Americas.

Conclusion

Temporal shifts in evolutionary rate of sufficient magnitude are detectable in the history of sampled sequences. The ability to detect such divergence points without the need to specify a prior hypothesis about the location or timing of the divergence point should help scientists identify historically important selection events and decipher mechanisms of evolution.

16.
Studying the shape of phylogenetic trees under different random models is an important issue in evolutionary biology. In this paper, we propose a general framework for deriving detailed statistical results for patterns in phylogenetic trees under the Yule–Harding model and the uniform model, two of the most fundamental random models considered in phylogenetics. Our framework will unify several recent studies which were mainly concerned with the mean value and the variance. Moreover, refined statistical results such as central limit theorems, Berry–Esseen bounds, local limit theorems, etc., are obtainable with our approach as well. A key contribution of the current study is that our results are applicable to the whole range of possible sizes of the pattern.
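
To make the notion of a pattern statistic under the Yule–Harding model concrete, the following Python sketch grows a random tree by repeatedly splitting a uniformly chosen leaf and counts cherries (pairs of leaves sharing a parent). Choosing cherries as the pattern is an assumption made for illustration; the abstract concerns patterns of any size, and the function name is hypothetical.

```python
import random
from typing import Dict, Optional


def yule_harding_cherries(n_leaves: int, rng: random.Random) -> int:
    """Grow a Yule-Harding tree to `n_leaves` leaves and return its cherry count.

    Only the information needed for counting is kept: for every current leaf,
    the id of its sibling leaf if the two form a cherry, otherwise None.
    """
    if n_leaves < 2:
        raise ValueError("need at least two leaves")
    sibling: Dict[int, Optional[int]] = {0: 1, 1: 0}  # start from one cherry
    next_id = 2
    cherries = 1
    while len(sibling) < n_leaves:
        # Pick a uniformly random leaf and replace it by a new cherry.
        leaf = rng.choice(list(sibling))
        partner = sibling.pop(leaf)
        if partner is not None:
            # The split leaf was part of a cherry, which is now destroyed.
            sibling[partner] = None
            cherries -= 1
        a, b = next_id, next_id + 1
        next_id += 2
        sibling[a] = b
        sibling[b] = a
        cherries += 1
    return cherries


# Empirical distribution of the cherry count over hypothetical replicates.
rng = random.Random(2024)
counts = [yule_harding_cherries(50, rng) for _ in range(200)]
empirical_mean = sum(counts) / len(counts)
```

Repeating the simulation gives an empirical mean and variance that can be set against analytical results of the kind the abstract describes.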

17.

Background  

Accurate taxonomy is best maintained if species are arranged as hierarchical groups in phylogenetic trees. This is especially important as trees grow larger as a consequence of a rapidly expanding sequence database. Hierarchical group names are typically manually assigned in trees, an approach that becomes unfeasible for very large topologies.

18.
SUMMARY: ProfDist is a user-friendly software package using the profile-neighbor-joining method (PNJ) in inferring phylogenies based on profile distances on DNA or RNA sequences. It is a tool for reconstructing and visualizing large phylogenetic trees providing new and standard features with a special focus on time efficiency, robustness and accuracy. AVAILABILITY: A Windows version of ProfDist comes with a graphical user interface and is freely available at http://profdist.bioapps.biozentrum.uni-wuerzburg.de

19.
Few issues in evolutionary biology have received as much attention over the years or have generated as much controversy as those involving evolutionary rates. One unresolved issue is whether or not shifts in speciation and/or extinction rates are closely tied to the origin of 'key' innovations in evolution. This discussion has long been dominated by 'time-based' methods using data from the fossil record. Recently, however, attention has shifted to 'tree-based' methods, in which time, if it plays any role at all, is incorporated secondarily, usually based on molecular data. Tests of hypotheses about key innovations do require information about phylogenetic relationships, and some of these tests can be implemented without any information about time. However, every effort should be made to obtain information about time, which greatly increases the power of such tests.

20.
One of the main problems in phylogenetics is to develop systematic methods for constructing evolutionary or phylogenetic trees. For a set of species X, an edge-weighted phylogenetic X-tree or phylogenetic tree is a (graph theoretical) tree with leaf set X and no degree 2 vertices, together with a map assigning a non-negative length to each edge of the tree. Within phylogenetics, several methods have been proposed for constructing such trees that work by trying to piece together quartet trees on X, i.e. phylogenetic trees each having four leaves in X. Hence, it is of interest to characterise when a collection of quartet trees corresponds to a (unique) phylogenetic tree. Recently, Dress and Erdös provided such a characterisation for binary phylogenetic trees, that is, phylogenetic trees all of whose internal vertices have degree 3. Here we provide a new characterisation for arbitrary phylogenetic trees.
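
The relationship between a phylogenetic tree and its quartet trees can be sketched in Python as follows. The sketch goes in the easy direction only: given a tree, list the resolved quartets it displays. It assumes the tree is represented by its splits (the leaf bipartitions induced by its edges) and ignores edge lengths; the function name and the five-leaf example are hypothetical, and the characterisation problem discussed in the abstract (going from quartets back to a tree) is not attempted.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set

Quartet = FrozenSet[FrozenSet[str]]  # e.g. {{a, b}, {c, d}} stands for ab|cd


def displayed_quartets(leaves: Iterable[str],
                       splits: Iterable[FrozenSet[str]]) -> Set[Quartet]:
    """Return every resolved quartet ab|cd displayed by the tree.

    Each element of `splits` is one side of a split; the other side is the
    rest of the leaf set. A quartet ab|cd is displayed when some split has
    a and b on one side and c and d on the other.
    """
    leaf_set = frozenset(leaves)
    quartets: Set[Quartet] = set()
    for side in splits:
        other = leaf_set - side
        for pair1 in combinations(sorted(side), 2):
            for pair2 in combinations(sorted(other), 2):
                quartets.add(frozenset([frozenset(pair1), frozenset(pair2)]))
    return quartets


# Hypothetical binary tree ((a,b),c,(d,e)) given by its two non-trivial splits.
example_leaves = "abcde"
example_splits = [frozenset("ab"), frozenset("de")]
example_quartets = displayed_quartets(example_leaves, example_splits)
```

For this binary example all five four-leaf subsets are resolved; a tree with higher-degree internal vertices has fewer splits and therefore leaves some four-leaf subsets unresolved, which is the case the abstract's new characterisation addresses.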
