Similar literature
 Found 20 similar documents; the search took 15 ms
1.
We explore the maximum parsimony (MP) and ancestral maximum likelihood (AML) criteria in phylogenetic tree reconstruction. Both problems are NP-hard, so we seek approximate solutions. We formulate the two problems as Steiner tree problems under appropriate distances. The gist of our approach is the succinct characterization of Steiner trees for a small number of leaves for the two distances. This enables the use of known Steiner tree approximation algorithms. The approach leads to a 16/9 approximation ratio for AML and asymptotically to a 1.55 approximation ratio for MP.

2.
In this paper, we investigate a conjecture by Arndt von Haeseler concerning the Maximum Parsimony method for phylogenetic estimation, which was published by the Newton Institute in Cambridge on a list of open phylogenetic problems in 2007. This conjecture deals with the question of whether Maximum Parsimony trees are hereditary. The conjecture suggests that a Maximum Parsimony tree for a particular (DNA) alignment necessarily has subtrees of all possible sizes which are most parsimonious for the corresponding subalignments. We answer the conjecture affirmatively for binary alignments on 5 taxa but also show how to construct examples for which Maximum Parsimony trees are not hereditary. Apart from showing that a most parsimonious tree cannot generally be reduced to a most parsimonious tree on fewer taxa, we also show that compatible most parsimonious quartets do not have to provide a most parsimonious supertree. Last, we show that our results can be generalized to Maximum Likelihood for certain nucleotide substitution models.

3.
4.

Background

Phylogenetic networks are generalizations of phylogenetic trees that are used to model evolutionary events in various contexts. Several different methods and criteria have been introduced for reconstructing phylogenetic trees. Maximum Parsimony is a character-based approach that infers a phylogenetic tree by minimizing the total number of evolutionary steps required to explain a given set of data assigned to the leaves. Exact solutions for optimizing parsimony scores on phylogenetic trees have been introduced in the past.

Results

In this paper, we define the parsimony score on networks as the sum of the substitution costs along all the edges of the network, and show that certain well-known algorithms that calculate the optimum parsimony score on trees, such as the Sankoff and Fitch algorithms, extend naturally to networks, barring conflicting assignments at the reticulate vertices. We provide heuristics for finding the optimum parsimony scores on networks. Our algorithms can be applied to any cost matrix that may contain unequal substitution costs of transforming between different characters along different edges of the network. We analyzed this for experimental data on 10 leaves or fewer with at most 2 reticulations and found that for almost all networks, the bounds returned by the heuristics matched the exhaustively determined optimum parsimony scores.

Conclusion

The parsimony score we define here does not directly reflect the cost of the best tree in the network that displays the evolution of the character. However, when searching for the most parsimonious network that describes a collection of characters, it becomes necessary to add additional cost considerations to prefer simpler structures, such as trees over networks. The parsimony score on a network that we describe here takes into account the substitution costs along the additional edges incident on each reticulate vertex, in addition to the substitution costs along the other edges which are common to all the branching patterns introduced by the reticulate vertices. Thus the score contains an in-built cost for the number of reticulate vertices in the network, and would provide a criterion that is comparable among all networks. Although the problem of finding the parsimony score on the network is believed to be computationally hard to solve, heuristics such as the ones described here would be beneficial in our efforts to find a most parsimonious network.
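
For readers unfamiliar with the tree case that this abstract builds on, the following is a minimal sketch of the classic Fitch algorithm for a single character on a rooted binary tree with unit substitution costs. It does not reproduce the authors' network extension. The nested-tuple tree encoding, the function name `fitch_score`, and the example data are assumptions made for illustration only.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical encoding: a leaf is a taxon name, an internal node is a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_score(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Return (candidate state set at this node, minimum number of changes below it)."""
    if isinstance(tree, str):
        # A leaf carries its observed state and needs no changes.
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_score(left, states)
    right_set, right_cost = fitch_score(right, states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change is needed here.
        return common, left_cost + right_cost
    # The children disagree: keep the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


# Example data (invented): four taxa, one site.
example_tree = (("A", "B"), ("C", "D"))
example_site = {"A": "G", "B": "G", "C": "T", "D": "G"}
root_states, changes = fitch_score(example_tree, example_site)
```

The Sankoff algorithm generalizes this bottom-up pass to arbitrary cost matrices by storing a cost per state at each node instead of a state set.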

5.
6.
7.
The time-dependent-asymmetric-linear parsimony is an ancestral state reconstruction method which extends the standard linear parsimony (a.k.a. Wagner parsimony) approach by taking into account both branch lengths and asymmetric evolutionary costs for reconstructing quantitative characters (asymmetric costs amount to assuming an evolutionary trend toward the direction with the lowest cost). A formal study of the influence of the asymmetry parameter shows that the time-dependent-asymmetric-linear parsimony infers states which are all taken among the known states, except for some degenerate cases corresponding to special values of the asymmetry parameter. This remarkable property holds in particular for the Wagner parsimony. This study leads to a polynomial algorithm which determines, and provides a compact representation of, the parametric reconstruction of a phylogenetic tree, that is, for all the unknown nodes, the set of all the possible reconstructed states associated with the asymmetry parameters leading to them. The time-dependent-asymmetric-linear parsimony is finally illustrated with the parametric reconstruction of the body size of cetaceans.

8.
Tuffley and Steel (Bull. Math. Biol. 59:581–607, 1997) proved that maximum likelihood and maximum parsimony methods in phylogenetics are equivalent for sequences of characters under a simple symmetric model of substitution with no common mechanism. This result has been widely cited ever since. We show that small changes to the model assumptions suffice to make the two methods inequivalent. In particular, we analyze the case of bounded substitution probabilities as well as the molecular clock assumption. We show that in these cases, even under no common mechanism, maximum parsimony and maximum likelihood might make conflicting choices. We also show that if there is an upper bound on the substitution probabilities which is ‘sufficiently small’, every maximum likelihood tree is also a maximum parsimony tree (but not vice versa).

9.

Background  

Maximum parsimony is one of the most commonly used criteria for reconstructing phylogenetic trees. Recently, Nakhleh and co-workers extended this criterion to enable reconstruction of phylogenetic networks, and demonstrated its application to detecting reticulate evolutionary relationships. However, one of the major problems with this extension has been that it favors more complex evolutionary relationships over simpler ones, thus having the potential for overestimating the amount of reticulation in the data. An ad hoc solution to this problem that has been used entails inspecting the improvement in the parsimony length as more reticulation events are added to the model, and stopping when the improvement is below a certain threshold.
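
The ad hoc stopping rule mentioned at the end of this abstract can be sketched as follows. This is only an illustration under stated assumptions: the function name `choose_reticulation_count`, the input layout (`scores[k]` is the best parsimony length found with `k` reticulations), and the numbers are all hypothetical and do not come from the cited work.

```python
from typing import Sequence


def choose_reticulation_count(scores: Sequence[float], threshold: float) -> int:
    """Pick the number of reticulations by a diminishing-returns rule.

    scores[k] is assumed to be the best parsimony length with k reticulations.
    Stop at the first k whose improvement over k - 1 falls below the threshold.
    """
    for k in range(1, len(scores)):
        improvement = scores[k - 1] - scores[k]
        if improvement < threshold:
            return k - 1
    # Every added reticulation improved the score by at least the threshold.
    return len(scores) - 1


# Invented lengths for 0, 1, 2 and 3 reticulations.
example_scores = [120, 100, 97, 96]
chosen = choose_reticulation_count(example_scores, threshold=5)
```

The result depends entirely on the chosen threshold, which is what makes the rule ad hoc.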

10.
Ancestral maximum likelihood (AML) is a method that simultaneously reconstructs a phylogenetic tree and ancestral sequences from extant data (sequences at the leaves). The tree and ancestral sequences maximize the probability of observing the given data under a Markov model of sequence evolution, in which branch lengths are also optimized but constrained to take the same value on any edge across all sequence sites. AML differs from the more usual form of maximum likelihood (ML) in phylogenetics because ML averages over all possible ancestral sequences. ML has long been known to be statistically consistent - that is, it converges on the correct tree with probability approaching 1 as the sequence length grows. However, the statistical consistency of AML has not been formally determined, despite informal remarks in a literature that dates back 20 years. In this short note we prove a general result that implies that AML is statistically inconsistent. In particular we show that AML can 'shrink' short edges in a tree, resulting in a tree that has no internal resolution as the sequence length grows. Our results apply to any number of taxa.
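
The contrast between ML and AML drawn in this abstract can be written compactly. The notation below is an assumption introduced for illustration and is not taken from the paper: $T$ is the tree, $\theta$ the branch lengths shared across sites, $x_i$ the pattern observed at the leaves for site $i$ of $k$ sites, and $a$ an assignment of states to the internal vertices.

```latex
\[
L_{\mathrm{ML}}(T,\theta) \;=\; \prod_{i=1}^{k} \sum_{a} \Pr(x_i, a \mid T, \theta),
\qquad
L_{\mathrm{AML}}(T,\theta) \;=\; \prod_{i=1}^{k} \max_{a} \Pr(x_i, a \mid T, \theta).
\]
```

ML sums over every ancestral assignment at each site, whereas AML keeps only the single most probable one. Both criteria are then maximized over $T$ and $\theta$.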

11.
In tRNA maturation, CCA-addition by tRNA nucleotidyltransferase is a unique and highly accurate reaction. While the mechanism of nucleotide selection and polymerization is well understood, it remains a mystery why bacterial and eukaryotic enzymes exhibit an unexpected and surprisingly low tRNA substrate affinity while they efficiently catalyze the CCA-addition. To get insights into the evolution of this high-fidelity RNA synthesis, the reconstruction and characterization of ancestral enzymes is a versatile tool. Here, we investigate a reconstructed candidate of a 2-billion-year-old CCA-adding enzyme from Gammaproteobacteria and compare it to the corresponding modern enzyme of Escherichia coli. We show that the ancestral candidate catalyzes an error-free CCA-addition, but has a much higher tRNA affinity compared with the extant enzyme. The consequence of this increased substrate binding is an enhanced reverse reaction, where the enzyme removes the CCA end from the mature tRNA. As a result, the ancestral candidate exhibits a lower catalytic efficiency in vitro as well as in vivo. Furthermore, the efficient tRNA interaction leads to a processive polymerization, while the extant enzyme catalyzes nucleotide addition in a distributive way. Thus, the modern enzymes increased their polymerization efficiency by lowering the binding affinity to tRNA, so that CCA synthesis is efficiently promoted due to a reduced reverse reaction. Hence, the puzzling and at first glance contradictory and detrimental weak substrate interaction represents a distinct activity enhancement in the evolution of CCA-adding enzymes.

12.
The postsynaptic density extends across the postsynaptic dendritic spine with discs large (DLG) as the most abundant scaffolding protein. DLG dynamically alters the structure of the postsynaptic density, thus controlling the function and distribution of specific receptors at the synapse. DLG contains three PDZ domains and one important interaction governing postsynaptic architecture is that between the PDZ3 domain from DLG and a protein called cysteine-rich interactor of PDZ3 (CRIPT). However, little is known regarding functional evolution of the PDZ3:CRIPT interaction. Here, we subjected PDZ3 and CRIPT to ancestral sequence reconstruction, resurrection, and biophysical experiments. We show that the PDZ3:CRIPT interaction is an ancient interaction, which was likely present in the last common ancestor of Eukaryotes, and that high affinity is maintained in most extant animal phyla. However, affinity is low in nematodes and insects, raising questions about the physiological function of the interaction in species from these animal groups. Our findings demonstrate how an apparently established protein–protein interaction involved in cellular scaffolding in bilaterians can suddenly be subject to dynamic evolution including possible loss of function.

13.
14.
The relation of sequence with specificity in membrane transporters is challenging to explore. Most relevant studies until now rely on comparisons of present-day homologs. In this work, we study a set of closely related transporters by employing an evolutionary, ancestral-reconstruction approach and reveal unexpected new specificity determinants. We analyze a monophyletic group represented by the xanthine-specific XanQ of Escherichia coli in the Nucleobase-Ascorbate Transporter/Nucleobase-Cation Symporter-2 (NAT/NCS2) family. We reconstructed AncXanQ, the putative common ancestor of this clade, expressed it in E. coli K-12, and found that, in contrast to XanQ, it encodes a high-affinity permease for both xanthine and guanine, which also recognizes adenine, hypoxanthine, and a range of analogs. AncXanQ conserves all binding-site residues of XanQ and differs substantially in only five intramembrane residues outside the binding site. We subjected both homologs to rationally designed mutagenesis and present evidence that these five residues are linked with the specificity change. In particular, we reveal Ser377 of XanQ (Gly in AncXanQ) as a major determinant. Replacement of this Ser with Gly enlarges the specificity of XanQ towards an AncXanQ-phenotype. The ortholog from Neisseria meningitidis retaining Gly at this position is also a xanthine/guanine transporter with extended substrate profile like AncXanQ. Molecular Dynamics shows that the S377G replacement tilts transmembrane helix 12 resulting in rearrangement of Phe376 relative to Phe94 in the XanQ binding pocket. This effect may rationalize the enlarged specificity. On the other hand, the specificity effect of S377G can be masked by G27S or other mutations through epistatic interactions.

15.
Obligate symbionts typically exhibit high evolutionary rates. Consequently, their proteins may differ considerably from their modern and ancestral homologs in terms of both sequence and properties, thus providing excellent models to study protein evolution. Also, obligate symbionts are challenging to culture in the lab and proteins from uncultured organisms must be produced in heterologous hosts using recombinant DNA technology. Obligate symbionts thus replicate a fundamental scenario of metagenomics studies aimed at the functional characterization and biotechnological exploitation of proteins from the bacteria in soil. Here, we use the thioredoxin from Candidatus Photodesmus katoptron, an uncultured symbiont of flashlight fish, to explore evolutionary and engineering aspects of protein folding in heterologous hosts. The symbiont protein is a standard thioredoxin in terms of 3D-structure, stability and redox activity. However, its folding outside the original host is severely impaired, as shown by a very slow refolding in vitro and an inefficient expression in E. coli that leads mostly to insoluble protein. By contrast, resurrected Precambrian thioredoxins express efficiently in E. coli, plausibly reflecting an ancient adaptation to unassisted folding. We have used a statistical-mechanical model of the folding landscape to guide back-to-ancestor engineering of the symbiont protein. Remarkably, we find that the efficiency of heterologous expression correlates with the in vitro (i.e., unassisted) folding rate and that the ancestral expression efficiency can be achieved with only 1–2 back-to-ancestor replacements. These results demonstrate a minimal-perturbation, sequence-engineering approach to rescue inefficient heterologous expression which may potentially be useful in metagenomics efforts targeting recent adaptations.

16.
17.

Background  

The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments.

18.
The bootstrap is an important tool for estimating the confidence interval of monophyletic groups within phylogenies. Although bootstrap analyses are used in most evolutionary studies, there is no clear consensus as to how best to interpret bootstrap probability values. To study the bootstrap method further, nine small subunit ribosomal DNA (SSU rDNA) data sets were submitted to bootstrapped maximum parsimony (MP) analyses using unweighted and weighted sequence positions. Analyses of the lengths (i.e., parsimony steps) of the bootstrap trees show that the shape and mean of the bootstrap tree distribution may provide important insights into the evolutionary signal within the sequence data. With complex phylogenies containing nodes defined by short internal branches (multifurcations), the mean of the bootstrap tree distribution may differ by 2 standard deviations from the length of the best tree found from the original data set. Weighting sequence positions significantly increases the bootstrap values at internal nodes. There may, however, be strong bootstrap support for conflicting species groupings among different data sets. This phenomenon appears to result from a correlation between the topology of the tree used to create the weights and the topology of the bootstrap consensus tree inferred from the MP analysis of these weighted data. The analyses also show that characteristics of the bootstrap tree distribution (e.g., skewness) may be used to choose between alternative weighting schemes for phylogenetic analyses.
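
The resampling step behind a phylogenetic bootstrap can be sketched as below: alignment columns are drawn with replacement to build a pseudo-replicate of the same length, on which a tree would then be inferred. This is a generic illustration, not the procedure used in the study. The function name `bootstrap_alignment`, the dictionary layout, and the toy sequences are assumptions.

```python
import random
from typing import Dict


def bootstrap_alignment(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Resample alignment columns with replacement, keeping the original length."""
    length = len(next(iter(alignment.values())))
    # The same sampled column indices are applied to every taxon so columns stay intact.
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        taxon: "".join(sequence[i] for i in columns)
        for taxon, sequence in alignment.items()
    }


# Toy alignment (invented data): three taxa, four sites.
toy_alignment = {"A": "ACGT", "B": "AGGT", "C": "TCGA"}
replicate = bootstrap_alignment(toy_alignment, random.Random(0))
```

Repeating this many times and recording the length of the best tree for each replicate would give the bootstrap tree-length distribution the abstract refers to.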

19.
The maximum parsimony (MP) method for inferring phylogenies is widely used, but little is known about its limitations in non-asymptotic situations. This study employs large-scale computations with simulated phylogenetic data to estimate the probability that MP succeeds in finding the true phylogeny for up to twelve taxa and 256 characters. The set of candidate phylogenies is taken to be unrooted binary trees; for each simulated data set, the tree lengths of all (2n − 5)!! candidates are computed to evaluate quantities related to the performance of MP, such as the probability of finding the true phylogeny, the probability that the tree with the shortest length is unique, the probability that the true phylogeny has the shortest tree length, and the expected inverse of the number of trees sharing the shortest length. The tree length distributions are also used to evaluate and extend the skewness test of Hillis for distinguishing between random and phylogenetic data. The results indicate, for example, that the critical point after which MP achieves a success probability of at least 0.9 is roughly 128 characters. The skewness test is found to perform well on simulated data and the study extends its scope to up to twelve taxa.
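
To give a sense of the scale of the exhaustive search described above, the count (2n − 5)!! of unrooted binary trees on n labelled taxa can be evaluated directly. The function name `num_unrooted_binary_trees` is an assumption made for this sketch.

```python
def num_unrooted_binary_trees(n: int) -> int:
    """Return (2n - 5)!!, the number of unrooted binary trees on n >= 3 labelled leaves."""
    if n < 3:
        raise ValueError("the formula applies to n >= 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (an empty product for n = 3).
    for k in range(3, 2 * n - 4, 2):
        count *= k
    return count


trees_for_twelve_taxa = num_unrooted_binary_trees(12)  # 654729075
```

With twelve taxa there are already more than 650 million candidate trees per data set, which is why the study calls its computations large-scale.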

20.