1.
The field of phylogenetic tree estimation has been dominated by three broad classes of methods: distance-based approaches, parsimony and likelihood-based methods (including maximum likelihood (ML) and Bayesian approaches). Here we introduce two new approaches to tree inference: pairwise likelihood estimation and a distance-based method that estimates the number of substitutions along the paths through the tree. Our results include the derivation of the formulae for the probability that two leaves will be identical at a site given a number of substitutions along the path connecting them. We also derive the posterior probability of the number of substitutions along a path between two sequences. The calculations for the posterior probabilities are exact for group-based, symmetric models of character evolution, but are only approximate for more general models.
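The kind of formula this abstract describes can be illustrated with a minimal sketch. It assumes the Jukes-Cantor model, which the abstract does not name (it only mentions group-based, symmetric models), so that every substitution moves to one of the three other nucleotides with equal probability. Under that assumption, the probability that two leaves are identical at a site after k substitutions on the connecting path is 1/4 + (3/4)(-1/3)^k. The function names are hypothetical.

```python
from fractions import Fraction


def prob_identical_jc(k: int) -> Fraction:
    """Closed form for P(leaves identical | k substitutions on the path).

    Assumes Jukes-Cantor: each substitution picks one of the 3 other
    nucleotides uniformly at random (an assumption, not from the abstract).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    return Fraction(1, 4) + Fraction(3, 4) * Fraction(-1, 3) ** k


def prob_identical_by_recursion(k: int) -> Fraction:
    """Same quantity, computed step by step as a cross-check."""
    p = Fraction(1)  # zero substitutions: the two leaves are identical
    for _ in range(k):
        # If identical, one more substitution always makes them differ;
        # if different, it restores identity with probability 1/3.
        p = (1 - p) * Fraction(1, 3)
    return p
```

For example, `prob_identical_jc(2)` is 1/3: after two substitutions the second one returns to the starting nucleotide in one of three equally likely ways.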
2.
Liang Liu Lili Yu Laura Kubatko Dennis K. Pearl Scott V. Edwards 《Molecular phylogenetics and evolution》2009,53(1):320-328
We review recent models to estimate phylogenetic trees under the multispecies coalescent. Although the distinction between gene trees and species trees has come to the fore of phylogenetics, only recently have methods been developed that explicitly estimate species trees. Of the several factors that can cause gene tree heterogeneity and discordance with the species tree, deep coalescence due to random genetic drift in branches of the species tree has been modeled most thoroughly. Bayesian approaches to estimating species trees utilize two likelihood functions, one of which has been widely used in traditional phylogenetics and involves the model of nucleotide substitution, and the second of which is less familiar to phylogeneticists and involves the probability distribution of gene trees given a species tree. Other recent parametric and nonparametric methods for estimating species trees involve parsimony criteria, summary statistics, supertree and consensus methods. Species tree approaches are an appropriate goal for systematics, appear to work well in some cases where concatenation can be misleading, and suggest that sampling many independent loci will be paramount. Such methods can also be challenging to implement because of the complexity of the models and computational time. In addition, further elaboration of the simplest of coalescent models will be required to incorporate commonly known issues such as deviation from the molecular clock, gene flow and other genetic forces.
3.
Every weighted tree corresponds naturally to a cooperative game that we call a tree game; it assigns to each subset of leaves the sum of the weights of the minimal subtree spanned by those leaves. In the context
of phylogenetic trees, the leaves are species and this assignment captures the diversity present in the coalition of species considered. We consider the Shapley value of tree games and suggest a biological interpretation.
We determine the linear transformation M that shows the dependence of the Shapley value on the edge weights of the tree, and we also compute a null space basis of
M. Both depend on the split counts of the tree. Finally, we characterize the Shapley value on tree games by four axioms, a counterpart to Shapley’s original
theorem on the larger class of cooperative games. We also include a brief discussion of the core of tree games.
Research of Francis Edward Su was partially supported by NSF Grants DMS-0301129 and DMS-0701308.
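The tree game described in this abstract can be sketched by brute force for a small tree. The sketch below is an illustration under stated assumptions, not the authors' method: it treats the tree as unrooted, encodes each edge as a hypothetical pair `(weight, leaves_on_one_side)`, and averages marginal contributions over all leaf orderings, which is only feasible for a handful of leaves. An edge belongs to the minimal subtree spanned by a coalition exactly when the coalition has leaves on both sides of that edge.

```python
from fractions import Fraction
from itertools import permutations


def tree_game_value(coalition, splits):
    """Total weight of the minimal subtree spanned by the coalition.

    splits: list of (weight, frozenset of leaves on one side of the edge).
    An edge counts only if the coalition has leaves on both of its sides.
    """
    members = set(coalition)
    return sum(w for w, side in splits if members & side and members - side)


def shapley_values(leaves, splits):
    """Average each leaf's marginal contribution over all orderings."""
    totals = {leaf: Fraction(0) for leaf in leaves}
    orders = list(permutations(leaves))
    for order in orders:
        seen = []
        for leaf in order:
            before = tree_game_value(seen, splits)
            seen.append(leaf)
            totals[leaf] += tree_game_value(seen, splits) - before
    return {leaf: total / len(orders) for leaf, total in totals.items()}


# Hypothetical example: a star tree with three leaves and edge weights 1, 2, 3.
star_splits = [(1, frozenset("a")), (2, frozenset("b")), (3, frozenset("c"))]
star_shapley = shapley_values("abc", star_splits)
```

In this example the values are 3/2, 2 and 5/2 for leaves a, b and c, and they sum to 6, the total weight of the tree.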
4.
Phylogenetic trees are used to represent evolutionary relationships among biological species or organisms. The construction of phylogenetic trees is based on the similarities or differences of their physical or genetic features. Traditional approaches to constructing phylogenetic trees mainly focus on physical features. The recent advancement of high-throughput technologies has led to the accumulation of huge amounts of biological data, which in turn has changed biological studies in various respects. In this paper, we report our approach of building phylogenetic trees using the information of interacting pathways. We have applied hierarchical clustering to two domains of organisms, eukaryotes and prokaryotes. Our preliminary results have shown the effectiveness of using the interacting pathways in revealing evolutionary relationships.
5.
David Faraggi R. Simon E. Yaskil A. Kramar 《Biometrical journal. Biometrische Zeitschrift》1997,39(5):519-532
Neural networks are considered by many to be very promising tools for classification and prediction. The flexibility of neural network models often results in over-fit. Shrinking the parameters using a penalized likelihood is often used to overcome such over-fit. In this paper we extend the approach proposed by FARAGGI and SIMON (1995a) to modeling censored survival data using the input-output relationship associated with a single hidden layer feed-forward neural network. Instead of estimating the neural network parameters using the method of maximum likelihood, we place normal prior distributions on the parameters and make inferences based on derived posterior distributions of the parameters. This Bayesian formulation will result in shrinking the parameters of the neural network model and will reduce the over-fit compared with the maximum likelihood estimators. We illustrate our proposed method on a simulated and a real example.
6.
Evolutionary trees from DNA sequences: A maximum likelihood approach (total citations: 129; self-citations: 0; citations by others: 129)
Joseph Felsenstein 《Journal of molecular evolution》1981,17(6):368-376
Summary: The application of maximum likelihood techniques to the estimation of evolutionary trees from nucleic acid sequence data is discussed. A computationally feasible method for finding such maximum likelihood estimates is developed, and a computer program is available. This method has advantages over the traditional parsimony algorithms, which can give misleading results if rates of evolution differ in different lineages. It also allows the testing of hypotheses about the constancy of evolutionary rates by likelihood ratio tests, and gives a rough indication of the error of the estimate of the tree.
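A computationally feasible likelihood for a tree can be sketched with the pruning recursion, which computes the likelihood of one alignment site by combining partial likelihoods from the leaves up to the root. The sketch below is illustrative only: it assumes the Jukes-Cantor substitution model with branch lengths in expected substitutions per site and a uniform root distribution, none of which is stated in the abstract, and the tree encoding and function names are hypothetical.

```python
import math

STATES = "ACGT"


def jc_prob(i, j, t):
    """Jukes-Cantor probability of going from state i to state j in time t."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if i == j else 0.25 - 0.25 * decay


def partial_likelihoods(node):
    """Likelihood of the data below `node`, for each possible state at `node`.

    A leaf is a one-letter string such as "A". An internal node is a tuple of
    (child, branch_length) pairs, e.g. (("A", 0.1), ("C", 0.2)).
    """
    if isinstance(node, str):
        return {s: 1.0 if s == node else 0.0 for s in STATES}
    result = {s: 1.0 for s in STATES}
    for child, t in node:
        child_l = partial_likelihoods(child)
        for s in STATES:
            # Sum over the unknown state at the lower end of this branch.
            result[s] *= sum(jc_prob(s, x, t) * child_l[x] for x in STATES)
    return result


def site_likelihood(root):
    """Likelihood of one site, with a uniform prior over root states."""
    at_root = partial_likelihoods(root)
    return sum(0.25 * at_root[s] for s in STATES)
```

For a two-leaf tree such as `(("A", 0.1), ("A", 0.1))`, the result equals 0.25 times the Jukes-Cantor probability of no net change over a total path length of 0.2, which gives a simple check on the recursion. The likelihood of a full alignment would be the product of such site likelihoods, usually accumulated as a sum of logarithms.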
7.
An introduction to methods for constructing phylogenetic trees from DNA sequences (total citations: 14; self-citations: 0; citations by others: 14)
Phylogenetic analysis using DNA sequences is an essential tool in molecular evolutionary studies. Methods for constructing phylogenetic trees include distance-based methods, parsimony, maximum likelihood, and Bayesian inference. To resolve a particular phylogenetic problem, one must first select appropriate taxa and sequences, keeping bias in the data as low as possible; then choose a tree-construction method; and finally evaluate the results and interpret them in evolutionary terms. This paper discusses the principles of data selection and the problems involved, introduces the basic principles and steps of several tree-construction methods, and lists their advantages and disadvantages.
8.
Joel D. Velasco 《Biology & philosophy》2008,23(4):455-473
Bayesian methods have become among the most popular methods in phylogenetics, but theoretical opposition to this methodology remains. After providing an introduction to Bayesian theory in this context, I attempt to tackle the problem mentioned most often in the literature: the “problem of the priors”, that is, how to assign prior probabilities to tree hypotheses. I first argue that a recent objection, that an appropriate assignment of priors is impossible, is based on a misunderstanding of what ignorance and bias are. I then consider different methods of assigning prior probabilities to trees. I argue that priors need to be derived from an understanding of how distinct taxa have evolved and that the appropriate evolutionary model is captured by the Yule birth–death process. This process leads to a well-known statistical distribution over trees. Though further modifications may be necessary to model more complex aspects of the branching process, they must be modifications to parameters in an underlying Yule model. Ignoring these Yule priors commits a fallacy leading to mistaken inferences both about the trees themselves and about macroevolutionary processes more generally.
9.
Despite the importance of molecular phylogenetics, few of its assumptions have been tested with real data. It is commonly assumed that nonparametric bootstrap values are an underestimate of the actual support, Bayesian posterior probabilities are an overestimate of the actual support, and among-gene phylogenetic conflict is low. We directly tested these assumptions by using a well-supported yeast reference tree. We found that bootstrap values were not significantly different from accuracy. Bayesian support values were, however, significant overestimates of accuracy but still had low false-positive error rates (0% to 2.8%) at the highest values (>99%). Although we found evidence for a branch-length bias contributing to conflict, there was little evidence for widespread, strongly supported among-gene conflict from bootstraps. The results demonstrate that caution is warranted concerning conclusions of conflict based on the assumption of underestimation for support values in real data.
10.
11.
Mikael Falconnet 《Mathematical biosciences》2010,228(1):90-99
We show that the Bayesian star paradox, first proved mathematically by Steel and Matsen for a specific class of prior distributions, occurs in a wider context including less regular, possibly discontinuous, prior distributions.
12.
Tanabe AS 《Molecular ecology resources》2011,11(5):914-921
Proportional and separate models, which can apply a different combination of substitution rate matrix (SRM) and among-site rate variation model (ASRVM) to each locus, are frequently used in phylogenetic studies of multilocus data. A proportional model assumes that branch lengths are proportional among partitions and a separate model assumes that each partition has an independent set of branch lengths. However, the selection from among nonpartitioned (i.e., a common combination of models is applied to all-loci concatenated sequences), proportional and separate models is usually based on the researcher's preference rather than on any information criteria. This study describes two programs, 'Kakusan4' (for DNA sequences) and 'Aminosan' (for amino-acid sequences), which allow the selection of evolutionary models based on several types of information criteria. The programs can handle both multilocus and single-locus data, in addition to providing an easy-to-use wizard interface and a noninteractive command line interface. In the case of multilocus data, SRMs and ASRVMs are compared at each locus and at all-loci concatenated sequences, after which nonpartitioned, proportional and separate models are compared based on information criteria. The programs also provide model configuration files for MrBayes, PAUP*, PhyML, RAxML and Treefinder to support further phylogenetic analysis using a selected model. When likelihoods were optimized by Treefinder, the best-fit models were found to differ depending on the data set. Furthermore, differences in the information criteria among nonpartitioned, proportional and separate models were much larger than those among the nonpartitioned models. These findings suggest that selecting from nonpartitioned, proportional and separate models results in a better phylogenetic tree. Kakusan4 and Aminosan are available at http://www.fifthdimension.jp/. They are licensed under the GNU GPL Ver. 2, and are able to run on Windows, MacOS X and Linux.
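Selection among nonpartitioned, proportional and separate models by information criteria can be sketched as follows. This is not how Kakusan4 or Aminosan are implemented; it only applies the standard definitions of AIC, AICc and BIC to log-likelihoods and parameter counts. The numbers in the example are invented purely for illustration, and the function names are hypothetical.

```python
import math


def aic(lnl, k):
    """Akaike information criterion for log-likelihood lnl and k parameters."""
    return 2 * k - 2 * lnl


def aicc(lnl, k, n):
    """AIC with the small-sample correction for n observations (sites)."""
    return aic(lnl, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(lnl, k, n):
    """Bayesian information criterion."""
    return k * math.log(n) - 2 * lnl


def best_model(candidates, n, criterion="bic"):
    """Return the name of the candidate with the lowest criterion value.

    candidates maps a model name to (log-likelihood, number of parameters).
    """
    def score(item):
        lnl, k = item[1]
        if criterion == "aic":
            return aic(lnl, k)
        if criterion == "aicc":
            return aicc(lnl, k, n)
        if criterion == "bic":
            return bic(lnl, k, n)
        raise ValueError(f"unknown criterion: {criterion}")

    return min(candidates.items(), key=score)[0]


# Invented values, for illustration only.
example_candidates = {
    "nonpartitioned": (-5200.0, 10),
    "proportional": (-5100.0, 22),
    "separate": (-5080.0, 40),
}
```

With these invented numbers and 1500 sites, AIC prefers the separate model while BIC prefers the proportional model, because BIC penalizes the extra branch-length parameters more heavily. That difference is one reason to report which criterion was used.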
13.
Piontkivska H 《Molecular phylogenetics and evolution》2004,31(3):865-873
Choice of a substitution model is a crucial step in the maximum likelihood (ML) method of phylogenetic inference, and investigators tend to prefer complex mathematical models to simple ones. However, when complex models with many parameters are used, the extent of noise in statistical inferences increases, and thus complex models may not produce the true topology with a higher probability than simple ones. This problem was studied using computer simulation. When the number of nucleotides used was relatively large (1000 bp), the HKY+Gamma model showed a smaller d(T) (topological distance between the inferred and the true trees) than the JC and Kimura models. In the case of shorter sequences (300 bp), a simpler model and search algorithm, such as the JC model and SA+NNI search, were found to be as efficient as more complicated searches and models in terms of topological distances, although the topologies obtained under the HKY+Gamma model had the highest likelihood values. The performance of the relatively simple search algorithm SA+NNI was found to be essentially the same as that of the more extensive SA+TBR search under all models studied. Similarly to the conclusions reached by Takahashi and Nei [Mol. Biol. Evol. 17 (2000) 1251], our results indicate that simple models can be as efficient as complex models, and that use of complex models does not necessarily give more reliable trees compared with simple models.
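A topological distance between an inferred tree and a true tree can be sketched as a count of the bipartitions (splits) found in one tree but not the other. The abstract does not define d(T), so the split-based symmetric difference below is an assumption, as are the nested-tuple tree encoding and the function names.

```python
def leaf_set(tree):
    """All leaf names in a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Nontrivial bipartitions of the leaves, ignoring where the root sits.

    Each split is stored as the side that does not contain a fixed
    reference leaf, so equivalent splits compare equal across rootings.
    """
    all_leaves = leaf_set(tree)
    ref = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - below if ref in below else below
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def topological_distance(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must have the same leaves")
    return len(splits(tree_a) ^ splits(tree_b))
```

For four leaves, `(("A", "B"), ("C", "D"))` and `(("A", "C"), ("B", "D"))` are at distance 2, while `(("A", "B"), ("C", "D"))` and `("A", ("B", ("C", "D")))` are at distance 0 because they differ only in root placement.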
14.
A recently developed mathematical model for the analysis of phylogenetic trees is applied to comparative data for 48 species. The model represents a return to fundamentals and makes no hypothesis with respect to the reversibility of the process. The species have been analysed in all subsets of three, and a measure of reliability of the results is provided. The numerical results of the computations on 17,296 triples of species are made available on the Internet. These results are discussed and the development of reliable tree structures for several species is illustrated. It is shown that, indeed, the Markov model is capable of considerably more interesting predictions than has been recognized to date.
15.
Jaxk H. Reeves 《Journal of molecular evolution》1992,35(1):17-31
Summary: Several forms of maximum likelihood models are applied to aligned amino acid sequence data coded for in the mitochondrial DNA of six species (chicken, frog, human, bovine, mouse, and rat). These models range in form from relatively simple models of the type currently used for inferring phylogenetic tree structure to models more complex than those that have been used previously. No major discrepancies between the optimal trees inferred by any of these methods are found, but there are huge differences in adequacy of fit. A very significant finding is that the fit of any of these models is vastly improved by allowing a certain proportion of the amino acid sites to be invariant. An even more important, although disquieting, finding is that none of these models fits well, as judged by standard statistical criteria. The primary reason for this is that amino acid sites undergo substitution according to a process that is very heterogeneous. Because most phylogenetic inference is accomplished by choosing the optimal tree under the assumption that a homogeneous process is acting on the sites, the potential invalidity of some such conclusions is raised by this article's results. The seriousness of this problem depends upon the robustness of the phylogenetic inferential procedure to departures from the underlying model.
16.
Summary: In this paper we present an iterative character weighting method for the construction of phyletic trees. An initial tree is used to calculate the character weights, which are the number of mutations normalized so that the possible range is corrected for. The weights obtained are used to adjust the tree; this process is iterated until a stable tree is found. Using data generated according to a model tree, we show that the trees constructed by the iterative character weighting method converge to the true underlying tree. Using biological data, the trees become closer to the systematic classification of the species concerned, and patterns conflicting with the phylogenetic pattern can be singled out. The method involves a combination of minimal length methods and similarity methods, whereby the strict parsimony criterion is relaxed.
17.
In this paper, we provide an introductory overview of the field of phylogenetic analysis, which has wide applications in modern biology.
18.
Johnson KA Holland BR Heslewood MM Crayn DM 《Molecular phylogenetics and evolution》2012,62(1):146-158
For the predominantly southern hemisphere plant group Styphelioideae (Ericaceae), published sequence datasets of five markers are now available for all except one of the 38 recognised genera. However, several markers are highly incomplete, so missing data are problematic for producing a genus level phylogeny. We explore the relative utility of supertree and supermatrix approaches for addressing this challenge, and examine the effects of missing data on tree topology and resolution. Although the supertree approach returned a more conservative hypothesis, overall, both supermatrix and supertree analyses concurred in the topologies they returned. Using multiple genes and a dataset of variably complete taxa we found improved support for the monophyly and position of the tribes and genus level relationships. However, there was mixed support for the Richeeae tribe appearing one node basal to the Cosmelieae tribe or vice versa. It is probable that this will only be resolved through further sequencing. Our study supports previous findings that the amount of data is more critical than the completeness of the dataset in estimating well-resolved trees. Our results suggest that a “serendipitous” scaffolding approach that includes a mixture of well and poorly sequenced taxa can lead to robust phylogenetic hypotheses.
19.
20.
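A phylogenetic study of myxomycetes of the order Physarida based on rDNA ITS sequences (total citations: 1; self-citations: 0; citations by others: 1)
Physarida is the largest order in the class Myxogastria, and studies of its phylogenetic relationships have so far relied on morphological characters. To investigate the phylogenetic relationships of Physarida, and of Myxogastria more broadly, at the molecular level, the rDNA ITS regions of 8 species from 5 genera of Physarida were amplified and sequenced using universal myxomycete rDNA ITS primers. These sequences were combined with myxomycete rDNA ITS sequences already available in GenBank, and phylogenetic trees were constructed using Bayesian inference (BI) and maximum likelihood (ML). The results show that the rDNA ITS regions of different Physarida species differ markedly in base composition and length, with lengths of 777–1,445 bp and G+C content of 53.4–61.9 mol%. Physarida and Stemonitida clustered into two distinct clades. Within the Physarida clade, Physaraceae and Didymiaceae each formed a separate branch, supporting the morphological view that these two families are distinguished by whether the capillitium is calcareous. The Didymiaceae, represented by multiple Didymium squamulosum specimens of different geographic origin, further split into three branches, confirming that this morphospecies is a complex of distinct biological species with broad geographic origins, differing reproductive compatibility, and considerable genetic variation.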