
1.

Ancestral Sequence Reconstruction with Maximum Parsimony

Lina Herbst Mareike Fischer 《Bulletin of mathematical biology》2017,79(12):2865-2886


2.

Shrinkage Effect in Ancestral Maximum Likelihood

《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(1):126-133

Ancestral maximum likelihood (AML) is a method that simultaneously reconstructs a phylogenetic tree and ancestral sequences from extant data (sequences at the leaves). The tree and ancestral sequences maximize the probability of observing the given data under a Markov model of sequence evolution, in which branch lengths are also optimized but constrained to take the same value on any edge across all sequence sites. AML differs from the more usual form of maximum likelihood (ML) in phylogenetics because ML averages over all possible ancestral sequences. ML has long been known to be statistically consistent - that is, it converges on the correct tree with probability approaching 1 as the sequence length grows. However, the statistical consistency of AML has not been formally determined, despite informal remarks in a literature that dates back 20 years. In this short note we prove a general result that implies that AML is statistically inconsistent. In particular we show that AML can 'shrink' short edges in a tree, resulting in a tree that has no internal resolution as the sequence length grows. Our results apply to any number of taxa.

3.

Revisiting an Equivalence Between Maximum Parsimony and Maximum Likelihood Methods in Phylogenetics

Mareike Fischer Bhalchandra Thatte 《Bulletin of mathematical biology》2010,72(1):208-220

Tuffley and Steel (Bull. Math. Biol. 59:581–607, 1997) proved that maximum likelihood and maximum parsimony methods in phylogenetics are equivalent for sequences of characters under a simple symmetric model of substitution with no common mechanism. This result has been widely cited ever since. We show that small changes to the model assumptions suffice to make the two methods inequivalent. In particular, we analyze the case of bounded substitution probabilities as well as the molecular clock assumption. We show that in these cases, even under no common mechanism, maximum parsimony and maximum likelihood might make conflicting choices. We also show that if there is an upper bound on the substitution probabilities which is ‘sufficiently small’, every maximum likelihood tree is also a maximum parsimony tree (but not vice versa).

4.

Biases in Maximum Likelihood and Parsimony: A Simulation Approach to a 10-Taxon Case (total citations: 2; self-citations: 1; citations by others: 1)

Diego Pol Mark E. Siddall 《Cladistics : the international journal of the Willi Hennig Society》2001,17(3):266-281

Biases present in maximum likelihood and parsimony are investigated through a simulation study in a 10-taxon case in which several long branches coexist with short branches in the modeled topology. The performance of these methods is explored while increasing the length of the long branches with different amounts of data. Also, simulations with different taxonomic sampling schemes are examined through this study. The presence of a strong bias in parsimony is corroborated: the well-known long-branch attraction. Likelihood performance is found to be sensitive to the mere presence of extreme branch length disparity, retrieving topologies compatible with long-branch attraction and long-branch repulsion, irrespective of the correctness of the model used.

5.

Non-hereditary Maximum Parsimony trees

Fischer M 《Journal of mathematical biology》2012,65(2):293-308

In this paper, we investigate a conjecture by Arndt von Haeseler concerning the Maximum Parsimony method for phylogenetic estimation, which was published by the Newton Institute in Cambridge on a list of open phylogenetic problems in 2007. This conjecture deals with the question of whether Maximum Parsimony trees are hereditary. The conjecture suggests that a Maximum Parsimony tree for a particular (DNA) alignment necessarily has subtrees of all possible sizes which are most parsimonious for the corresponding subalignments. We answer the conjecture affirmatively for binary alignments on 5 taxa but also show how to construct examples for which Maximum Parsimony trees are not hereditary. Apart from showing that a most parsimonious tree cannot generally be reduced to a most parsimonious tree on fewer taxa, we also show that compatible most parsimonious quartets do not have to provide a most parsimonious supertree. Last, we show that our results can be generalized to Maximum Likelihood for certain nucleotide substitution models.

6.

Maximum Parsimony on Phylogenetic networks

Kannan L Wheeler WC 《Algorithms for molecular biology : AMB》2012,7(1):9-10

Background

Phylogenetic networks are generalizations of phylogenetic trees that are used to model evolutionary events in various contexts. Several different methods and criteria have been introduced for reconstructing phylogenetic trees. Maximum Parsimony is a character-based approach that infers a phylogenetic tree by minimizing the total number of evolutionary steps required to explain a given set of data assigned to the leaves. Exact solutions for optimizing parsimony scores on phylogenetic trees have been introduced in the past.

Results

In this paper, we define the parsimony score on networks as the sum of the substitution costs along all the edges of the network, and show that certain well-known algorithms that calculate the optimum parsimony score on trees, such as the Sankoff and Fitch algorithms, extend naturally to networks, barring conflicting assignments at the reticulate vertices. We provide heuristics for finding the optimum parsimony scores on networks. Our algorithms can be applied to any cost matrix that may contain unequal substitution costs of transforming between different characters along different edges of the network. We analyzed this for experimental data on 10 leaves or fewer with at most 2 reticulations and found that for almost all networks, the bounds returned by the heuristics matched the exhaustively determined optimum parsimony scores.

Conclusion

The parsimony score we define here does not directly reflect the cost of the best tree in the network that displays the evolution of the character. However, when searching for the most parsimonious network that describes a collection of characters, it becomes necessary to add additional cost considerations to prefer simpler structures, such as trees over networks. The parsimony score on a network that we describe here takes into account the substitution costs along the additional edges incident on each reticulate vertex, in addition to the substitution costs along the other edges which are common to all the branching patterns introduced by the reticulate vertices. Thus the score contains an in-built cost for the number of reticulate vertices in the network, and would provide a criterion that is comparable among all networks. Although the problem of finding the parsimony score on the network is believed to be computationally hard to solve, heuristics such as the ones described here would be beneficial in our efforts to find a most parsimonious network.
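
For orientation, the tree case that the abstract above builds on can be sketched as follows. This is a minimal illustration of Fitch's algorithm for a single character on a rooted binary tree with unit substitution costs, not the authors' network extension. The nested-tuple tree encoding, the taxon names `t1` to `t4`, and the character states are assumptions made up for the example.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical encoding: a tree is a leaf name or a pair (left, right) of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_score(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Return (candidate state set at the root, minimum number of changes)."""
    if isinstance(tree, str):
        # A leaf contributes its observed state and no changes.
        return {states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], states)
    right_set, right_cost = fitch_score(tree[1], states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # The children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Made-up character on four taxa, scored on two alternative topologies.
states = {"t1": "A", "t2": "C", "t3": "A", "t4": "C"}
tree_a = (("t1", "t2"), ("t3", "t4"))
tree_b = (("t1", "t3"), ("t2", "t4"))
score_a = fitch_score(tree_a, states)[1]
score_b = fitch_score(tree_b, states)[1]
```

Under these assumptions `tree_b` needs fewer changes than `tree_a`, so it would be the more parsimonious of the two for this character.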

7.

Maximum Parsimony for Tree Mixtures

《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(1):97-102


8.

Characterizing Local Optima for Maximum Parsimony

Ellen Urheim Eric Ford Katherine St. John 《Bulletin of mathematical biology》2016,78(5):1058-1075


9.

Bootstrap-based Support of HGT Inferred by Maximum Parsimony

Hyun Jung Park Guohua Jin Luay Nakhleh 《BMC evolutionary biology》2010,10(1):131

Background

Maximum parsimony is one of the most commonly used criteria for reconstructing phylogenetic trees. Recently, Nakhleh and co-workers extended this criterion to enable reconstruction of phylogenetic networks, and demonstrated its application to detecting reticulate evolutionary relationships. However, one of the major problems with this extension has been that it favors more complex evolutionary relationships over simpler ones, thus having the potential for overestimating the amount of reticulation in the data. An ad hoc solution to this problem that has been used entails inspecting the improvement in the parsimony length as more reticulation events are added to the model, and stopping when the improvement is below a certain threshold.

10.

Maximum Likelihood Estimation of Population Parameters (total citations: 10; self-citations: 5; citations by others: 5)

Y. X. Fu W. H. Li 《Genetics》1993,134(4):1261-1270

One of the most important parameters in population genetics is θ = 4N(e)μ where N(e) is the effective population size and μ is the rate of mutation per gene per generation. We study two related problems, using the maximum likelihood method and the theory of coalescence. One problem is the potential improvement of accuracy in estimating the parameter θ over existing methods and the other is the estimation of parameter λ which is the ratio of two θ's. The minimum variances of estimates of the parameter θ are derived under two idealized situations. These minimum variances serve as the lower bounds of the variances of all possible estimates of θ in practice. We then show that Watterson's estimate of θ based on the number of segregating sites is asymptotically an optimal estimate of θ. However, for a finite sample of sequences, substantial improvement over Watterson's estimate is possible when θ is large. The maximum likelihood estimate of λ = θ(1)/θ(2) is obtained and the properties of the estimate are discussed.
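
The abstract above does not spell out Watterson's estimate. In its standard form it divides the number of segregating sites S by the harmonic sum 1 + 1/2 + ... + 1/(n − 1) for a sample of n sequences. A minimal sketch of that standard formula follows; the function name and the sample values are illustrative assumptions, not taken from the paper.

```python
def watterson_theta(segregating_sites: int, sample_size: int) -> float:
    """Watterson's estimate of theta: S divided by sum_{i=1}^{n-1} 1/i."""
    if sample_size < 2:
        raise ValueError("at least two sequences are needed")
    # Harmonic sum over i = 1 .. n-1.
    a_n = sum(1.0 / i for i in range(1, sample_size))
    return segregating_sites / a_n


# Made-up example: 10 segregating sites observed in a sample of 5 sequences.
theta_hat = watterson_theta(10, 5)
```

Here the harmonic sum is 25/12, so the estimate for the made-up sample is 10 / (25/12) = 4.8.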

11.

Maximum Likelihood Neural Network Prediction Models

David Faraggi Richard Simon 《Biometrical journal. Biometrische Zeitschrift》1995,37(6):713-725

Neural networks have received much attention in recent years, mostly from non-statisticians. The purpose of this paper is to incorporate neural networks in a non-linear regression model and obtain maximum likelihood estimates of the network parameters using a standard Newton-Raphson algorithm. We use maximum likelihood estimators instead of the usual back-propagation technique and compare the neural network predictions with predictions of quadratic regression models and with non-parametric nearest neighbor predictions. These comparisons are made using data generated from a variety of functions. Because of the number of parameters involved, neural network models can easily over-fit the data, hence validation of results is crucial.

12.

Sparse Sampling and Maximum Likelihood Estimation for Boolean Models

G. Ayala J. R. Ferrandiz F. Montes 《Biometrical journal. Biometrische Zeitschrift》1991,33(2):237-245

A condition for practical independence of contact distribution functions in Boolean models is obtained. This result allows the authors to use maximum likelihood methods, via sparse sampling, for estimating unknown parameters of an isotropic Boolean model. The second part of this paper is devoted to a simulation study of the proposed method. AMS classification: 60D05

13.

Maximum Likelihood Estimation of the Mortality Rate Function

B. N. Dimitrov S. T. Rachev A. Yu. Yakovlev 《Biometrical journal. Biometrische Zeitschrift》1985,27(3):317-326

A maximum likelihood estimator is obtained for the mortality rate function of a specific type appearing in survival data analysis. Strict consistency of this estimator is proved.

14.

Analysis of the Distribution of Bootstrap Tree Lengths Using the Maximum Parsimony Method

Debashish Bhattacharya 《Molecular phylogenetics and evolution》1996,6(3):339-350

The bootstrap is an important tool for estimating the confidence interval of monophyletic groups within phylogenies. Although bootstrap analyses are used in most evolutionary studies, there is no clear consensus as to how best to interpret bootstrap probability values. To study the bootstrap method further, nine small subunit ribosomal DNA (SSU rDNA) data sets were submitted to bootstrapped maximum parsimony (MP) analyses using unweighted and weighted sequence positions. Analyses of the lengths (i.e., parsimony steps) of the bootstrap trees show that the shape and mean of the bootstrap tree distribution may provide important insights into the evolutionary signal within the sequence data. With complex phylogenies containing nodes defined by short internal branches (multifurcations), the mean of the bootstrap tree distribution may differ by 2 standard deviations from the length of the best tree found from the original data set. Weighting sequence positions significantly increases the bootstrap values at internal nodes. There may, however, be strong bootstrap support for conflicting species groupings among different data sets. This phenomenon appears to result from a correlation between the topology of the tree used to create the weights and the topology of the bootstrap consensus tree inferred from the MP analysis of these weighted data. The analyses also show that characteristics of the bootstrap tree distribution (e.g., skewness) may be used to choose between alternative weighting schemes for phylogenetic analyses.
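
The resampling step behind such bootstrap analyses can be sketched as below: alignment columns (sequence positions) are drawn with replacement to build a pseudo-replicate of the same length, which would then be passed to a tree search. This is a generic illustration of the nonparametric bootstrap, not the procedure or software used in the paper. The dictionary-based alignment format, the toy sequences, and the fixed seed are assumptions made for the example.

```python
import random
from typing import Dict


def bootstrap_alignment(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Resample alignment columns with replacement, keeping the original length."""
    n_sites = len(next(iter(alignment.values())))
    # Draw one column index per site; the same column may be drawn repeatedly.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(sequence[c] for c in columns)
        for taxon, sequence in alignment.items()
    }


# Made-up toy alignment; a seeded generator keeps the replicate reproducible.
alignment = {"t1": "ACGTAC", "t2": "ACGTTC", "t3": "AGGTAC"}
replicate = bootstrap_alignment(alignment, random.Random(0))
```

Each replicate keeps every taxon and the original number of sites, and every column in it is a copy of some column in the input alignment.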

15.

Boolean Models: Maximum Likelihood Estimation from Circular Clumps

G. Ayala J. Ferrandiz F. Montes 《Biometrical journal. Biometrische Zeitschrift》1990,32(1):73-78

This paper deals with the problem of making inferences on the maximum radius and the intensity of the Poisson point process associated with a Boolean Model of circular primary grains with uniformly distributed random radii. The only sample information used is observed radii of circular clumps (DUPAC, 1980). The behaviour of maximum likelihood estimation has been evaluated by means of Monte Carlo methods.

16.

Maximum Parsimony and the Skewness Test: A Simulation Study of the Limits of Applicability

Jussi Määttä Teemu Roos 《PloS one》2016,11(4)

The maximum parsimony (MP) method for inferring phylogenies is widely used, but little is known about its limitations in non-asymptotic situations. This study employs large-scale computations with simulated phylogenetic data to estimate the probability that MP succeeds in finding the true phylogeny for up to twelve taxa and 256 characters. The set of candidate phylogenies is taken to be unrooted binary trees; for each simulated data set, the tree lengths of all (2n − 5)!! candidates are computed to evaluate quantities related to the performance of MP, such as the probability of finding the true phylogeny, the probability that the tree with the shortest length is unique, the probability that the true phylogeny has the shortest tree length, and the expected inverse of the number of trees sharing the shortest length. The tree length distributions are also used to evaluate and extend the skewness test of Hillis for distinguishing between random and phylogenetic data. The results indicate, for example, that the critical point after which MP achieves a success probability of at least 0.9 is roughly around 128 characters. The skewness test is found to perform well on simulated data and the study extends its scope to up to twelve taxa.
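
To give a sense of the search space, the count (2n − 5)!! of unrooted binary trees on n labelled taxa quoted in the abstract above can be evaluated directly as the product of the odd numbers 1, 3, ..., 2n − 5. The sketch below is illustrative only; the function name is an assumption.

```python
def count_unrooted_binary_trees(n_taxa: int) -> int:
    """Number of unrooted binary trees on n labelled taxa: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("the formula applies for three or more taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (empty product for n = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


trees_for_twelve_taxa = count_unrooted_binary_trees(12)
```

For twelve taxa this gives 19!! = 654,729,075 candidate trees per data set, which suggests why an exhaustive evaluation of this kind is limited to small numbers of taxa.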

17.

Maximum Likelihood Estimates for Binary Data with Random Effects

Haiganoush K. Preisler 《Biometrical journal. Biometrische Zeitschrift》1988,30(3):339-350

The purpose of this paper is to present a procedure for obtaining approximate maximum likelihood estimates for compound binary response models. The extra binomial variation is incorporated into the model by adding random effects to the fixed effects on the probit (or logit) scale. Numerical integration techniques are used to arrive at a solution of the likelihood equations. The paper also presents an illustrative numerical example based on a large toxicological data set. The computations are carried out within the GLIM statistical package.

18.

Maximum Likelihood Estimation of Simultaneous Pairwise Linear Structural Relationships

Heleno Bolfarine Manuel Galea Rojas 《Biometrical journal. Biometrische Zeitschrift》1995,37(6):673-689

The problem of assessing the relative calibrations and relative accuracies of a set of p instruments, each designed to measure the same characteristic on a common group of individuals, is considered by using the EM algorithm. As shown, the EM algorithm provides a general solution for this problem. Its implementation is simple and in its most general form requires no extra iterative procedures within the M step. One important feature of the algorithm in this setup is that the error variance estimates are always positive. Thus, it can be seen as a kind of restricted maximization procedure. The expected information matrix for the maximum likelihood estimators is derived, upon which the large sample estimated covariance matrix for the maximum likelihood estimators can be computed. The problem of testing hypotheses about the calibration lines can be approached by using the Wald statistics. The approach is illustrated by re-analysing two data sets in the literature.

19.

Maximum Likelihood Analysis of Population Differences in Allelic Frequencies (total citations: 3; self-citations: 2; citations by others: 1)

Peter E. Smouse Ken-Ichi Kojima 《Genetics》1972,72(4):709-719

Statistical techniques are presented for the analysis of geographic variation in allelic frequencies. Likelihood ratio test criteria are derived from a multinomial sampling distribution, and are used to answer three questions. (1) Are there geographic differences in allelic frequencies? (2) Are population differences in allelic frequencies associated with environmental differences? (3) Is there any residual "lack of fit" variation among populations, after accounting for that variation associated with environmental differences? The two- and three-allele cases are explicitly treated, and the extension to more alleles is indicated.

20.

Success of Parsimony in the Four-Taxon Case: Long-Branch Repulsion by Likelihood in the Farris Zone

Mark E Siddall 《Cladistics : the international journal of the Willi Hennig Society》1998,14(3):209-220

The accuracy of phylogenetic methods is reinvestigated for the four-taxon case with a two-edge rate and a three-edge rate. Unlike previous studies involving computer simulations, the two-edge rate relates to branches that are sister taxa in the model tree. As with previous studies, certain methods are found to behave inaccurately in a portion of the parameter space where the two-edge rate is proportionally large. This phenomenon, to which parsimony is immune, is termed “long-branch repulsion” and the region of poor performance is called the Farris Zone. Maximum likelihood methods are shown to be particularly prone to failure when closely related taxa have long branches. Long-branch repulsion is demonstrated with an empirical case involving Strepsiptera and Diptera.