Similar literature
 20 similar documents found (search time: 15 ms)
1.
BEAST 2: A Software Platform for Bayesian Evolutionary Analysis   Total citations: 1, self-citations: 0, citations by others: 1
We present a new open source, extensible and flexible software platform for Bayesian evolutionary analysis called BEAST 2. This software platform is a re-design of the popular BEAST 1 platform to correct structural deficiencies that became evident as the BEAST 1 software evolved. Key among those deficiencies was the lack of post-deployment extensibility. BEAST 2 now has a fully developed package management system that allows third party developers to write additional functionality that can be directly installed to the BEAST 2 analysis platform via a package manager without requiring a new software release of the platform. This package architecture is showcased with a number of recently published new models encompassing birth-death-sampling tree priors, phylodynamics and model averaging for substitution models and site partitioning. A second major improvement is the ability to read/write the entire state of the MCMC chain to/from disk allowing it to be easily shared between multiple instances of the BEAST software. This facilitates checkpointing and better support for multi-processor and high-end computing extensions. Finally, the functionality in new packages can be easily added to the user interface (BEAUti 2) by a simple XML template-based mechanism because BEAST 2 has been re-designed to provide greater integration between the analysis engine and the user interface so that, for example BEAST and BEAUti use exactly the same XML file format.
This is a PLOS Computational Biology Software Article.

2.
Bayesian phylogenetics with BEAUti and the BEAST 1.7   Total citations: 7, self-citations: 0, citations by others: 7
Computational evolutionary biology, statistical phylogenetics and coalescent-based population genetics are becoming increasingly central to the analysis and understanding of molecular sequence data. We present the Bayesian Evolutionary Analysis by Sampling Trees (BEAST) software package version 1.7, which implements a family of Markov chain Monte Carlo (MCMC) algorithms for Bayesian phylogenetic inference, divergence time dating, coalescent analysis, phylogeography and related molecular evolutionary analyses. This package includes an enhanced graphical user interface program called Bayesian Evolutionary Analysis Utility (BEAUti) that enables access to advanced models for molecular sequence and phenotypic trait evolution that were previously available to developers only. The package also provides new tools for visualizing and summarizing multispecies coalescent and phylogeographic analyses. BEAUti and BEAST 1.7 are open source under the GNU lesser general public license and available at http://beast-mcmc.googlecode.com and http://beast.bio.ed.ac.uk.

3.
MOTIVATION: Bayesian analysis is one of the most popular methods in phylogenetic inference. The most commonly used methods fix a single multiple alignment and consider only substitutions as phylogenetically informative mutations, though alignments and phylogenies should be inferred jointly as insertions and deletions also carry informative signals. Methods addressing these issues have been developed only recently and there has not been so far a user-friendly program with a graphical interface that implements these methods. RESULTS: We have developed an extendable software package in the Java programming language that samples from the joint posterior distribution of phylogenies, alignments and evolutionary parameters by applying the Markov chain Monte Carlo method. The package also offers tools for efficient on-the-fly summarization of the results. It has a graphical interface to configure, start and supervise the analysis, to track the status of the Markov chain and to save the results. The background model for insertions and deletions can be combined with any substitution model. It is easy to add new substitution models to the software package as plugins. The samples from the Markov chain can be summarized in several ways, and new postprocessing plugins may also be installed.

4.
To refine the location of a disease gene within the bounds provided by linkage analysis, many scientists use the pattern of linkage disequilibrium between the disease allele and alleles at nearby markers. We describe a method that seeks to refine location by analysis of "disease" and "normal" haplotypes, thereby using multivariate information about linkage disequilibrium. Under the assumption that the disease mutation occurs in a specific gap between adjacent markers, the method first combines parsimony and likelihood to build an evolutionary tree of disease haplotypes, with each node (haplotype) separated, by a single mutational or recombinational step, from its parent. If required, latent nodes (unobserved haplotypes) are incorporated to complete the tree. Once the tree is built, its likelihood is computed from probabilities of mutation and recombination. When each gap between adjacent markers is evaluated in this fashion and these results are combined with prior information, they yield a posterior probability distribution to guide the search for the disease mutation. We show, by evolutionary simulations, that an implementation of these methods, called "FineMap," yields substantial refinement and excellent coverage for the true location of the disease mutation. Moreover, by analysis of hereditary hemochromatosis haplotypes, we show that FineMap can be robust to genetic heterogeneity.

5.
6.
MRBAYES: Bayesian inference of phylogenetic trees   Total citations: 108, self-citations: 0, citations by others: 108
SUMMARY: The program MRBAYES performs Bayesian inference of phylogeny using a variant of Markov chain Monte Carlo. AVAILABILITY: MRBAYES, including the source code, documentation, sample data files, and an executable, is available at http://brahms.biology.rochester.edu/software.html.

7.
SUMMARY: QDist is a program for computing the quartet distance between two unrooted trees, i.e. the number of quartet topology differences between the trees, where a quartet topology is the topological subtree induced by four species. The program is based on an algorithm with running time O(n log^2 n), which makes it practical to compare large trees. Available under GNU license. AVAILABILITY: http://www.birc.dk/Software/QDist
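
To make the definition of quartet distance concrete, here is a minimal brute-force sketch in Python. It is not the algorithm QDist uses: it enumerates every set of four leaves and is far slower than the running time quoted in the abstract. The adjacency-dictionary tree format and the names `leaf_distances`, `quartet_topology` and `quartet_distance` are assumptions made for this illustration only. The topology of each quartet is read off from edge-count path lengths: the pairing with the strictly smallest sum of within-pair distances is the induced split, and a tie means the quartet is unresolved.

```python
from collections import deque
from itertools import combinations


def leaf_distances(adj):
    """Return (sorted leaf names, dict of edge-count distances between leaves)."""
    leaves = sorted(v for v, nbrs in adj.items() if len(nbrs) == 1)
    dist = {}
    for src in leaves:
        # Breadth-first search from each leaf gives path lengths in edges.
        seen = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen[w] = seen[u] + 1
                    queue.append(w)
        for dst in leaves:
            dist[src, dst] = seen[dst]
    return leaves, dist


def quartet_topology(dist, a, b, c, d):
    """Return 0 for ab|cd, 1 for ac|bd, 2 for ad|bc, or None if unresolved."""
    sums = (
        dist[a, b] + dist[c, d],
        dist[a, c] + dist[b, d],
        dist[a, d] + dist[b, c],
    )
    smallest = min(sums)
    if sums.count(smallest) > 1:
        return None  # star quartet: no pairing is strictly shortest
    return sums.index(smallest)


def quartet_distance(adj1, adj2):
    """Count the four-leaf subsets whose induced topology differs."""
    leaves1, dist1 = leaf_distances(adj1)
    leaves2, dist2 = leaf_distances(adj2)
    if leaves1 != leaves2:
        raise ValueError("both trees must have the same leaf names")
    return sum(
        quartet_topology(dist1, *q) != quartet_topology(dist2, *q)
        for q in combinations(leaves1, 4)
    )


# Hypothetical example trees: ((a,b),c,(d,e)) and ((a,c),b,(d,e)).
tree1 = {"a": ["x"], "b": ["x"], "c": ["y"], "d": ["z"], "e": ["z"],
         "x": ["a", "b", "y"], "y": ["x", "c", "z"], "z": ["y", "d", "e"]}
tree2 = {"a": ["x"], "c": ["x"], "b": ["y"], "d": ["z"], "e": ["z"],
         "x": ["a", "c", "y"], "y": ["x", "b", "z"], "z": ["y", "d", "e"]}
print(quartet_distance(tree1, tree2))  # 2
```

In this example only the quartets {a, b, c, d} and {a, b, c, e} change topology when b and c swap places, so the distance is 2.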

8.
9.
A Bayesian analysis of multiple-recapture sampling for a closed population   Total citations: 3, self-citations: 0, citations by others: 3
Castledine B. J. Biometrika, 1981, 68(1): 197-210

10.
11.
We have sequenced four new mitochondrial genomes to improve the stability of the tree for placental mammals; they are two insectivores (a gymnure, Echinosorex gymnurus and Formosan shrew Soriculus fumidus); a Formosan lesser horseshoe bat (Rhinolophus monoceros); and the New Zealand fur seal (Arctocephalus forsteri). A revision to the hedgehog sequence (Erinaceus europaeus) is also reported. All five are from the Laurasiatheria grouping of eutherian mammals. On this new data set there is a strong tendency for the hedgehog and its relative, the gymnure, to join with the other Laurasiatherian insectivores (mole and shrews). To quantify the stability of trees from this data we define, based on nuclear sequences, a major four-way split in Laurasiatherians. This ([Xenarthra, Afrotheria], [Laurasiatheria, Supraprimates]) split is also found from mitochondrial genomes using either protein-coding or RNA (rRNA and tRNA) data sets. The high similarity of the mitochondrial and nuclear-derived trees allows a quantitative estimate of the stability of trees from independent data sets, as detected from a triplet Markov analysis. There are significant changes in the mutational processes within placental mammals that are ignored by current tree programs. On the basis of our quantitative results, we expect the evolutionary tree for mammals to be resolved quickly, and this will allow other problems to be solved.

12.
MOTIVATION: Maximum likelihood (ML) is an increasingly popular optimality criterion for selecting evolutionary trees. Yet the computational complexity of ML was open for over 20 years, and only recently resolved by the authors for the Jukes-Cantor model of substitution and its generalizations. It was proved that reconstructing the ML tree is computationally intractable (NP-hard). In this work we explore three directions, which extend that result. RESULTS: (1) We show that ML under the assumption of molecular clock is still computationally intractable (NP-hard). (2) We show that not only is it computationally intractable to find the exact ML tree, even approximating the logarithm of the ML for any multiplicative factor smaller than 1.00175 is computationally intractable. (3) We develop an algorithm for approximating log-likelihood under the condition that the input sequences are sparse. It employs any approximation algorithm for parsimony, and asymptotically achieves the same approximation ratio. We note that ML reconstruction for sparse inputs is still hard under this condition, and furthermore many real datasets satisfy it.

13.
Estimating the reliability of evolutionary trees   Total citations: 8, self-citations: 1, citations by others: 8
Six protein sequences from the same 11 mammalian taxa were used to estimate the accuracy and reliability of phylogenetic trees using real, rather than simulated, data. A tree comparison metric was used to measure the increase in similarity of minimal trees as larger, randomly selected subsets of nucleotide positions were taken. The ratio of the observed to the expected number of incompatibilities for each nucleotide position (character) is a good predictor of the number of changes required at that position on the minimal (most-parsimonious) tree. This allows a higher weighting of nucleotide positions that have changed more slowly and should result in the minimal length tree converging to the correct tree as more sequences are obtained. An estimate was made of the smallest subset of trees that need to be considered to include the actual historical tree for a given set of data. It was concluded that it is possible to give a reasonable estimate of the reliability of the final tree, at least when several sequences are combined. With the present data, resolving the rodent-primate-lagomorph (rabbit) trichotomy is the least certain aspect of the final tree, followed then by establishing the position of dog. In our opinion, it is unreasonable to publish an evolutionary tree derived from sequence data without giving an idea of the reliability of the tree.

14.
From the measures of evolutionary distance between pairs of sequences in a set, it is possible to infer the genetic tree or trees which best fit these known data. DENDRON is a new program, written in FORTRAN 66, which computes an initial tree from the bottom-up, then searches among increasingly divergent trees for a better fit. As a check on the consistency of the measures, the program tests all triplets for the triangle inequality. DENDRON also calculates a single ‘top-down’ tree, progressing from the trunk to the twigs, for comparison with the ‘bottom-up’ trees. Received on August 17, 1987; accepted on June 1, 1988
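
The triplet consistency check mentioned in this abstract can be sketched as follows. This is a minimal Python illustration, not DENDRON's FORTRAN 66 code; the function name `triangle_violations`, the nested-list distance matrix and the tolerance `tol` are all assumptions made for this example. For every triplet of sequences it tests each of the three sides against the sum of the other two.

```python
from itertools import combinations


def triangle_violations(dist, tol=1e-12):
    """Return (a, b, c) index triples where dist[a][b] > dist[a][c] + dist[c][b].

    `dist` is assumed to be a symmetric square matrix (list of lists) of
    pairwise evolutionary distances; `tol` absorbs floating-point noise.
    """
    violations = []
    for i, j, k in combinations(range(len(dist)), 3):
        # Each side of the triangle is checked against the detour via the third point.
        for a, b, c in ((i, j, k), (i, k, j), (j, k, i)):
            if dist[a][b] > dist[a][c] + dist[c][b] + tol:
                violations.append((a, b, c))
    return violations


# Hypothetical distances: 0-2 is longer than the detour 0-1-2 (5 > 1 + 2).
example = [
    [0, 1, 5],
    [1, 0, 2],
    [5, 2, 0],
]
print(triangle_violations(example))  # [(0, 2, 1)]
```

An empty result means the distance measures are consistent with a metric on every triplet; each reported triple names the two sequences whose direct distance exceeds the path through the third.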

15.
Evolutionary branching, a coevolutionary phenomenon in which two or more distinct traits develop from a single trait in a population, is the subject of recent studies on adaptive dynamics. Previous studies revealed that trait variance is a minimum requirement for evolutionary branching, and that it does not play an important role in the formation of an evolutionary pattern of branching. Here we demonstrate that trait evolution exhibits various evolutionary branching paths, starting from an identical initial trait and leading to different evolutionary terminus traits, as determined solely by changing the assumption of trait variance. The key feature of this phenomenon is the topological configuration of equilibria and the initial point in the manifold of dimorphism from which dimorphic branches develop. This suggests that the existing monomorphic or polymorphic set in a population is not a unique inevitable consequence of an identical initial phenotype.

16.
The study of evolutionary quantitative genetics has been advanced by the use of methods developed in animal and plant breeding. These methods have proved to be very useful, but they have some shortcomings when used in the study of wild populations and evolutionary questions. Problems arise from the small size of data sets typical of evolutionary studies, and the additional complexity of the questions asked by evolutionary biologists. Here, we advocate the use of Bayesian methods to overcome these and related problems. Bayesian methods naturally allow errors in parameter estimates to propagate through a model and can also be written as a graphical model, giving them an inherent flexibility. As packages for fitting Bayesian animal models are developed, we expect the application of Bayesian methods to evolutionary quantitative genetics to grow, particularly as genomic information becomes more and more associated with environmental data.

17.
Adaptive sampling for Bayesian variable selection   Total citations: 1, self-citations: 0, citations by others: 1
Nott David J.; Kohn Robert. Biometrika, 2005, 92(4): 747-763

18.
Some Bayesian stratified two-phase sampling results   Total citations: 2, self-citations: 0, citations by others: 2

19.
Evolutionists dream of a tree-reconstruction method that is efficient (fast), powerful, consistent, robust and falsifiable. These criteria are at present conflicting in that the fastest methods are weak (in their use of information in the sequences) and inconsistent (even with very long sequences they may lead to an incorrect tree). But there has been exciting progress in new approaches to tree inference, in understanding general properties of methods, and in developing ideas for estimating the reliability of trees. New phylogenetic invariant methods allow selected parameters of the underlying model to be estimated directly from sequences. There is still a need for more theoretical understanding and assistance in applying what is already known.

20.
MOTIVATION: Evolutionary conservation estimated from a multiple sequence alignment is a powerful indicator of the functional significance of a residue and helps to predict active sites, ligand binding sites, and protein interaction interfaces. Many algorithms that calculate conservation work well, provided an accurate and balanced alignment is used. However, such a strong dependence on the alignment makes the results highly variable. We attempted to improve the conservation prediction algorithm by making it more robust and less sensitive to (1) local alignment errors, (2) overrepresentation of sequences in some branches and (3) occasional presence of unrelated sequences. RESULTS: A novel method is presented for robust constrained Bayesian estimation of evolutionary rates that avoids overfitting independent rates and satisfies the above requirements. The method is evaluated and compared with an entropy-based conservation measure on a set of 1494 protein interfaces. We demonstrated that approximately 62% of the analyzed protein interfaces are more conserved than the remaining surface at the 5% significance level. A consistent method to incorporate alignment reliability is proposed and demonstrated to reduce arbitrary variation of calculated rates upon inclusion of distantly related or unrelated sequences into the alignment.
