Similar articles
20 similar articles found.
1.
Parsimony methods infer phylogenetic trees by minimizing the number of character changes required to explain observed character states. From the perspective of applicability of parsimony methods, it is important to assess whether the characters used to infer phylogeny are likely to provide a correct tree. We introduce a graph theoretical characterization that helps to assess whether a given set of characters is appropriate to use with parsimony methods. Given a set of characters and a set of taxa, we construct a network called the character overlap graph. We show that the character overlap graph for characters that are appropriate to use in parsimony methods is characterized by significant under-representation of subnetworks known as holes, and provide a validation for this observation. This characterization explains success in constructing evolutionary trees using the parsimony method for some characters (e.g., protein domains) and lack of such success for other characters (e.g., introns). In the latter case, the understanding of obstacles to applying parsimony methods in a direct way has led us to a new approach for detecting inconsistent and/or noisy data. Namely, we introduce the concept of stable characters, which is similar to but less restrictive than the well-known concept of pairwise compatible characters. Application of this approach to introns produces the evolutionary tree consistent with the Coelomata hypothesis.

2.
D. Huson and M. Steel showed that for any two binary phylogenetic trees on the same set of n taxa, there exists a sequence of multistate characters that is homoplasy-free only on the first tree but perfectly additive only on the second one. The original construction of such a sequence required n - 1 character states and it remained an open question whether a sequence using fewer character states can always be found. In the present note we will answer this question by showing that three character states suffice to construct such misleading sequences, even if we insist that they conform to an ultrametric (i.e., fit a molecular clock).

3.
ESTIMATING CHARACTER WEIGHTS DURING TREE SEARCH   Total citations: 9, self-citations: 2, citations by others: 7
Abstract: A new method for weighting characters according to their homoplasy is proposed; the method is non-iterative and does not require independent estimations of weights. It is based on searching trees with maximum total fit, with character fits defined as a concave function of homoplasy. Then, when comparing trees, differences in steps occurring in characters which show more homoplasy on the trees are less influential. The reliability of the characters is estimated, during the analysis, as a logical implication of the trees being compared. The "fittest" trees imply that the characters are maximally reliable and, given character conflict, have fewer steps for the characters which fit the tree better. If other trees save steps in some characters, it will be at the expense of gaining them in characters with less homoplasy.
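To make the weighting idea concrete, here is a minimal sketch. It assumes the concave fit takes the form k / (k + h), where h is a character's extra steps (homoplasy) and k is a concavity constant; that specific form, the value k = 3, and the step counts are illustrative assumptions and are not stated in the abstract.

```python
def character_fit(extra_steps, k=3.0):
    """Concave fit (assumed form): 1.0 with no homoplasy, falling ever more slowly."""
    return k / (k + extra_steps)


def total_fit(extra_steps_per_character, k=3.0):
    """Total fit of a tree: the sum of the fits of its characters."""
    return sum(character_fit(h, k) for h in extra_steps_per_character)


# Hypothetical extra steps for three characters on two competing trees.
# Both trees require the same total number of extra steps (4).
tree_a = [0, 0, 4]  # homoplasy concentrated in one unreliable character
tree_b = [1, 1, 2]  # saves steps in that character, gains them in cleaner ones

fit_a = total_fit(tree_a)
fit_b = total_fit(tree_b)
```

Under these assumptions tree_a has the higher total fit even though the two trees are equally long, which is the behaviour the abstract describes: steps saved in a highly homoplastic character do not compensate for steps gained in characters with less homoplasy.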

4.
Several empirical studies suggest that sexually selected characters, including bird plumage, may evolve rapidly and show high levels of convergence and other forms of homoplasy. However, the processes that might generate such convergence have not been explored theoretically. Furthermore, no studies have rigorously addressed this issue using a robust phylogeny and a large number of signal characters. We scored the appearance of 44 adult male plumage characters that varied across New World orioles (Icterus). We mapped the plumage characters onto a molecular phylogeny based on two mitochondrial genes. Reconstructing the evolution of these characters revealed evidence of convergence or reversal in 42 of the 44 plumage characters. No plumage character states are restricted to any groups of species higher than superspecies in the oriole phylogeny. The high frequency of convergence and reversal is reflected in the low overall retention index (RI = 0.66) and the low overall consistency index (CI = 0.28). We found similar results when we mapped plumage changes onto a total evidence tree. Our findings reveal that plumage patterns and colors are highly labile between species of orioles, but highly conserved within the oriole genus. Furthermore, there are at least two overall plumage types that have convergently evolved repeatedly in the three oriole clades. This overall convergence leads to significant conflict between the molecular and plumage data. It is not clear what evolutionary processes lead to this homoplasy in individual characters or convergence in overall pattern. However, evolutionary constraints such as developmental limitations and genetic correlations between characters are likely to play a role. Our results are consistent with the belief that avian plumage and other sexually selected characters may evolve rapidly and may exhibit high homoplasy. 
The overall convergence in oriole plumage patterns is an interesting evolutionary phenomenon, but it cautions against heavy reliance on plumage characters for constructing phylogenies.
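The two indices reported above can be computed from per-character step counts. The sketch below uses the standard ensemble definitions, CI = M / S and RI = (G - S) / (G - M), where M, S and G are the summed minimum, observed and maximum possible steps. The step counts are hypothetical and are not the oriole data.

```python
def consistency_index(min_steps, observed_steps):
    """Ensemble CI: total minimum possible steps over total observed steps."""
    return sum(min_steps) / sum(observed_steps)


def retention_index(min_steps, observed_steps, max_steps):
    """Ensemble RI: fraction of the possible homoplasy that is NOT observed."""
    m, s, g = sum(min_steps), sum(observed_steps), sum(max_steps)
    return (g - s) / (g - m)


# Hypothetical per-character step counts on some tree.
min_steps = [1, 1, 2]
observed_steps = [3, 4, 5]
max_steps = [5, 6, 8]

ci = consistency_index(min_steps, observed_steps)
ri = retention_index(min_steps, observed_steps, max_steps)
```

Both indices equal 1 when no character shows homoplasy and decrease as convergence and reversal accumulate.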

5.
Circularity and Independence in Phylogenetic Tests of Ecological Hypotheses   Total citations: 5, self-citations: 0, citations by others: 5
It has been asserted that in order to avoid circularity in phylogenetic tests of ecological hypotheses, one must exclude from the cladistic analysis any characters that might be correlated with that hypothesis. The argument assumes that selective correlation leads to lack of independence among characters and may thus bias the analysis. This argument conflates the idea of independence between the ecological hypothesis and the phylogeny with independence among characters used to construct the tree. We argue that adaptation or selection does not necessarily result in the non-independence of characters, and that characters for a cladistic analysis should be evaluated as homology statements rather than functional ones. As with any partitioning of data, character exclusion may lead to weaker phylogenetic hypotheses, and the practice of mapping characters onto a tree, rather than including them in the analysis, should be avoided. Examples from pollination biology are used to illustrate some of the theoretical and practical problems inherent in character exclusion.

6.
The notion that two characters evolve independently is of interest for two reasons. First, theories of biological integration often predict that change in one character requires complementary change in another. Second, character independence is a basic assumption of most phylogenetic inference methods, and dependent characters might confound attempts at phylogenetic inference. Previously proposed tests of correlated character evolution require a model phylogeny and therefore assume that nonphylogenetic correlation has a negligible effect on initial tree construction. This paper develops "tree-free" methods for testing the independence of cladistic characters. These methods can test the character independence model as a hypothesis before phylogeny reconstruction, or can be used simply to test for correlated evolution. We first develop an approach for visualizing suites of correlated characters by using character compatibility. Two characters are compatible if they can be used to construct a tree without homoplasy. The approach is based on the examination of mutual compatibilities between characters. The number of times two characters i and j share compatibility with a third character is calculated, and a pairwise shared compatibility matrix is constructed. From this matrix, an association matrix analogous to a dissimilarity matrix is derived. Eigenvector analyses of this association matrix reveal suites of characters with similar compatibility patterns. A priori character subsets can be tested for significant correlation on these axes. Monte Carlo tests are performed to determine the expected distribution of mutual compatibilities, given various criteria from the original data set. These simulated distributions are then used to test whether the observed amounts of nonphylogenetic correlation in character suites can be attributed to chance alone. We have applied these methods to published morphological data for caecilian amphibians. 
The analyses corroborate instances of dependent evolution hypothesized by previous workers and also identify novel partitions. Phylogenetic analysis is performed after reducing correlated suites to single characters. The resulting cladogram has greater topological resolution and implies appreciably less change among the remaining characters than does a tree derived from the raw data matrix.
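The shared compatibility matrix described in this abstract can be sketched for the simplest case. The code assumes binary characters without missing data, for which two characters are compatible unless all four state combinations (00, 01, 10, 11) occur among the taxa. The function names and the toy matrix are illustrative; the abstract does not give the authors' implementation.

```python
from itertools import combinations


def compatible(char_a, char_b):
    """Binary characters are compatible unless all four state pairs occur."""
    return len(set(zip(char_a, char_b))) < 4


def shared_compatibility(chars):
    """Entry [i][j] counts the third characters compatible with both i and j."""
    n = len(chars)
    compat = [[compatible(chars[i], chars[j]) for j in range(n)] for i in range(n)]
    shared = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        count = sum(
            1
            for k in range(n)
            if k not in (i, j) and compat[i][k] and compat[j][k]
        )
        shared[i][j] = shared[j][i] = count
    return shared


# Toy data: four binary characters scored for five taxa (one string per character).
chars = ["11000", "11100", "00011", "10100"]
shared = shared_compatibility(chars)
```

In this toy matrix the first and fourth characters conflict with each other, so neither counts as a shared compatible partner for pairs involving the other.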

7.

Background

Phylogenetic networks are generalizations of phylogenetic trees that are used to model evolutionary events in various contexts. Several different methods and criteria have been introduced for reconstructing phylogenetic trees. Maximum Parsimony is a character-based approach that infers a phylogenetic tree by minimizing the total number of evolutionary steps required to explain a given set of data assigned on the leaves. Exact solutions for optimizing parsimony scores on phylogenetic trees have been introduced in the past.

Results

In this paper, we define the parsimony score on networks as the sum of the substitution costs along all the edges of the network, and show that certain well-known algorithms that calculate the optimum parsimony score on trees, such as the Sankoff and Fitch algorithms, extend naturally to networks, barring conflicting assignments at the reticulate vertices. We provide heuristics for finding the optimum parsimony scores on networks. Our algorithms can be applied to any cost matrix that may contain unequal substitution costs of transforming between different characters along different edges of the network. We analyzed this for experimental data on 10 leaves or fewer with at most 2 reticulations and found that for almost all networks, the bounds returned by the heuristics matched the exhaustively determined optimum parsimony scores.

Conclusion

The parsimony score we define here does not directly reflect the cost of the best tree in the network that displays the evolution of the character. However, when searching for the most parsimonious network that describes a collection of characters, it becomes necessary to add additional cost considerations to prefer simpler structures, such as trees over networks. The parsimony score on a network that we describe here takes into account the substitution costs along the additional edges incident on each reticulate vertex, in addition to the substitution costs along the other edges which are common to all the branching patterns introduced by the reticulate vertices. Thus the score contains an in-built cost for the number of reticulate vertices in the network, and would provide a criterion that is comparable among all networks. Although the problem of finding the parsimony score on the network is believed to be computationally hard to solve, heuristics such as the ones described here would be beneficial in our efforts to find a most parsimonious network.
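For orientation, the tree case that this paper builds on can be sketched briefly. The code below is a minimal version of the Fitch algorithm for one character on a rooted binary tree with unit substitution costs. It does not handle reticulate vertices or unequal costs, and the nested-tuple tree encoding is an assumption made for this example.

```python
def fitch_score(tree, leaf_states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree: a leaf name (str) or a 2-tuple of subtrees.
    leaf_states: dict mapping each leaf name to its observed state.
    """
    def visit(node):
        if isinstance(node, str):
            return {leaf_states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:
            # The children agree on at least one state: no extra change needed.
            return common, left_cost + right_cost
        # The children disagree: take the union and count one change.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


tree = (("A", "B"), ("C", "D"))
conflicting = {"A": "0", "B": "1", "C": "0", "D": "1"}
congruent = {"A": "0", "B": "0", "C": "1", "D": "1"}

score_conflicting = fitch_score(tree, conflicting)
score_congruent = fitch_score(tree, congruent)
```

The character that groups A with B and C with D needs a single change on this tree, while the character that cuts across those groups needs two.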

8.
Abstract: The stability of each clade resolved by a data set can be assessed as the minimum number of characters that, when removed, cause resolution of the clade to be lost; a clade is regarded as having been lost when it does not occur in the strict consensus tree. The clade stability index (CSI) is the ratio of this minimum number of characters to the number of informative characters in the data set. The CSI of a clade can range from 0 (absence from the consensus tree of the complete data set) to 1 (all informative characters must be removed for the clade to fail to be resolved). Minimum character removal scores are discoverable by a procedure known as successive character removal, in which separate cladistic analyses are conducted of all possible data sets derived by the removal of individual characters and character combinations of successively increasing number.
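The successive character removal procedure can be sketched as an exhaustive search. In the code below, `clade_is_resolved` is a hypothetical callback standing in for a full cladistic analysis followed by a check of the strict consensus tree. The toy callback and character labels are assumptions for illustration, and every character passed in is assumed to be informative.

```python
from itertools import combinations


def clade_stability_index(characters, clade_is_resolved):
    """CSI: smallest number of removed characters that loses the clade,
    divided by the number of (informative) characters."""
    n = len(characters)
    if not clade_is_resolved(characters):
        return 0.0  # clade absent even with the complete data set
    for r in range(1, n + 1):  # removal sets of successively increasing size
        for removed in combinations(range(n), r):
            kept = [c for i, c in enumerate(characters) if i not in removed]
            if not clade_is_resolved(kept):
                return r / n
    return 1.0


# Toy stand-in: the clade survives as long as one supporting character ("s") remains.
def toy_resolved(chars):
    return "s" in chars


csi = clade_stability_index(["s", "s", "n", "n"], toy_resolved)
```

Because every subset is analysed, the cost grows combinatorially with the number of characters, which is why a real analysis of this kind is expensive.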

9.
Random trees and random characters can be used in null models for testing phylogenetic hypotheses. We consider three interpretations of random trees: first, that trees are selected from the set of all possible trees with equal probability; second, that trees are formed by random speciation or coalescence (equivalent); and third, that trees are formed by a series of random partitions of the taxa. We consider two interpretations of random characters: first, that the number of taxa with each state is held constant, but the states are randomly reshuffled among the taxa; and second, that the probability each taxon is assigned a particular state is constant from one taxon to the next. Under null models representing various combinations of randomizations of trees and characters, exact recursion equations are given to calculate the probability distribution of the number of character state changes required by a phylogenetic tree. Possible applications of these probability distributions are discussed. They can be used, for example, to test for a panmictic population structure within a species or to test phylogenetic inertia in a character's evolution. Whether and how a null model incorporates tree randomness makes little difference to the probability distribution in many but not all circumstances. The null model's sense of character randomness appears more critical. The difficult issue of choosing a null model is discussed.
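The two interpretations of a random character can be illustrated by simulation. This is only a sketch of the sampling schemes: the paper derives exact recursion equations rather than simulating, and the function names, state frequencies and seed below are assumptions.

```python
import random


def shuffle_states(states, rng):
    """First interpretation: keep the count of each state fixed and
    reassign the observed states to taxa at random."""
    shuffled = list(states)
    rng.shuffle(shuffled)
    return shuffled


def draw_states(n_taxa, state_probs, rng):
    """Second interpretation: each taxon receives a state independently,
    with the same probabilities for every taxon."""
    states = list(state_probs)
    weights = [state_probs[s] for s in states]
    return rng.choices(states, weights=weights, k=n_taxa)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
observed = ["0", "0", "0", "1", "1"]
permuted = shuffle_states(observed, rng)
independent = draw_states(len(observed), {"0": 0.6, "1": 0.4}, rng)
```

The first scheme always preserves the observed state counts, whereas the second lets the counts vary from one random character to the next.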

10.
Given a collection of discrete characters (e.g., aligned DNA sites, gene adjacencies), a common measure of distance between taxa is the proportion of characters for which taxa have different character states. Tree reconstruction based on these (uncorrected) distances can be statistically inconsistent and can lead to trees different from those obtained using character-based methods such as maximum likelihood or maximum parsimony. However, in these cases the distance data often reveal their unreliability by some deviation from additivity, as indicated by conflicting support for more than one tree. We describe two results that show how uncorrected (and miscorrected) distance data can be simultaneously perfectly additive and misleading. First, multistate character data can be perfectly compatible and define one tree, and yet the uncorrected distances derived from these characters are perfectly treelike (and obey a molecular clock), only for a completely different tree. Second, under a Markov model of character evolution a similar phenomenon can occur; not only is there statistical inconsistency using uncorrected distances, but there is no evidence of this inconsistency because the distances look perfectly treelike (this does not occur in the classic two-parameter Felsenstein zone). We characterize precisely when uncorrected distances are additive on the true (and on a false) tree for four taxa. We also extend this result to a more general setting that applies to distances corrected according to an incorrect model.
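The uncorrected distance and the additivity check for four taxa can be sketched as follows. The four-point condition used here is the standard one: distances fit a quartet tree when the two largest of the three pairwise sums are equal, and the smallest sum identifies the split. The sequences are made up for illustration and are not the misleading examples constructed in the paper.

```python
import math


def p_distance(seq_a, seq_b):
    """Uncorrected distance: proportion of characters with different states."""
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def four_point_split(dist, a, b, c, d):
    """Return the best-supported split and whether the distances are additive."""
    sums = sorted([
        (dist(a, b) + dist(c, d), f"{a}{b}|{c}{d}"),
        (dist(a, c) + dist(b, d), f"{a}{c}|{b}{d}"),
        (dist(a, d) + dist(b, c), f"{a}{d}|{b}{c}"),
    ])
    additive = math.isclose(sums[1][0], sums[2][0])
    return sums[0][1], additive


# Made-up data: site 1 separates {a, b} from {c, d}; sites 2-5 are unique to one taxon.
seqs = {"a": "AGAAA", "b": "AAGAA", "c": "CAAGA", "d": "CAAAG"}


def dist(x, y):
    return p_distance(seqs[x], seqs[y])


split, additive = four_point_split(dist, "a", "b", "c", "d")
```

A perfectly additive result like this one is exactly what the abstract warns about: additivity alone does not guarantee that the supported split matches the tree defined by the characters.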

11.
The Loranthaceae is the largest plant family with aerial branch parasites termed mistletoes. Three genera of Loranthaceae are terrestrial root parasites and the remaining 72 genera are aerial parasites. Several characters, including habit, haustorial type, germination pattern, pollen morphology, chromosome number, inflorescence morphology and flower merosity, fusion, symmetry and size, are considered to reflect evolutionary relationships within the family. Convergence is a common evolutionary pattern and can confound interpretations of evolution. We investigated character evolution by mapping character states onto a phylogenetic tree based on the nuclear ITS and chloroplast trnL–trnF regions. Convergences in form were found in several characters, including habit, haustorial type, flower symmetry and merosity. These convergences typically correspond to ecological parameters such as pollination syndrome or stresses associated with the canopy habit. Other characters such as chromosome number and germination pattern illustrate divergent evolution among clades. © 2006 The Linnean Society of London, Botanical Journal of the Linnean Society, 2006, 150, 101–113.

12.
Because Pee-Wee (version 2.5.1 and earlier versions) treats all within-terminal steps (internal steps) as homoplasies, if the command "ccode = (space).;" is in effect, it reports incorrect, and consistently too low, fit values for characters having states that occur only within polymorphic terminals (WPTO states). For matrices with WPTO states, this property of Pee-Wee may bias tree search in favour of characters having fewer WPTO states, other things being equal. Contrary to the intended purpose of the Goloboff criterion, Pee-Wee may even, in certain cases involving WPTO states, assign higher weight (fit) to characters with more homoplasy and therefore resolve character conflict in favour of less reliable characters.

13.
Current phylogenetic methods attempt to account for evolutionary rate variation across characters in a matrix. This is generally achieved by the use of sophisticated evolutionary models, combined with dense sampling of large numbers of characters. However, systematic biases and superimposed substitutions make this task very difficult. Model adequacy can sometimes be achieved at the cost of adding large numbers of free parameters, with each parameter being optimized according to some criterion, resulting in increased computation times and large variances in the model estimates. In this study, we develop a simple approach that estimates the relative evolutionary rate of each homologous character. The method that we describe uses the similarity between characters as a proxy for evolutionary rate. In this article, we work on the premise that if the character-state distribution of a homologous character is similar to many other characters, then this character is likely to be relatively slowly evolving. If the character-state distribution of a homologous character is not similar to many or any of the rest of the characters in a data set, then it is likely to be the result of rapid evolution. We show that in some test cases, at least, the premise can hold and the inferences are robust. Importantly, the method does not use a "starting tree" to make the inference and therefore is tree independent. We demonstrate that this approach can work as well as a maximum likelihood (ML) approach, though the ML method needs to have a known phylogeny, or at least a very good estimate of that phylogeny. We then demonstrate some uses for this method of analysis, including the improvement in phylogeny reconstruction for both deep-level and recent relationships and overcoming systematic biases such as base composition bias. Furthermore, we compare this approach to two well-established methods for reweighting or removing characters. 
These other methods are tree-based and we show that they can be systematically biased. We feel this method can be useful for phylogeny reconstruction, understanding evolutionary rate variation, and for understanding selection variation on different characters.

14.
Recent studies have shown that addition or deletion of taxa from a data matrix can change the estimate of phylogeny. I used 29 data sets from the literature to examine the effect of taxon sampling on phylogeny estimation within data sets. I then used multiple regression to assess the effect of number of taxa, number of characters, homoplasy, strength of support, and tree symmetry on the sensitivity of data sets to taxonomic sampling. Sensitivity to sampling was measured by mapping characters from a matrix of culled taxa onto optimal trees for that reduced matrix and onto the pruned optimal tree for the entire matrix, then comparing the length of the reduced tree to the length of the pruned complete tree. Within-data-set patterns can be described by a second-order equation relating fraction of taxa sampled to sensitivity to sampling. Multiple regression analyses found number of taxa to be a significant predictor of sensitivity to sampling; retention index, number of informative characters, total support index, and tree symmetry were nonsignificant predictors. I derived a predictive regression equation relating fraction of taxa sampled and number of taxa potentially sampled to sensitivity to taxonomic sampling and calculated values for this equation within the bounds of the variables examined. The length difference between the complete tree and a subsampled tree was generally small (average difference of 0-2.9 steps), indicating that subsampling taxa is probably not an important problem for most phylogenetic analyses using up to 20 taxa.

15.
New examples are presented, showing that supertree methods such as matrix representation with parsimony, minimum flip trees, and compatibility analysis of the matrix representing the input trees, produce supertrees that cannot be interpreted as displaying the groups present in the majority of the input trees. These methods may produce a supertree displaying some groups present in the minority of the trees, and contradicted by the majority. Of the three methods, compatibility analysis is the least used, but it seems to be the one that differs the least from majority rule consensus. The three methods are similar in that they choose the supertree(s) that best fit the set of input trees (quantified as some measure of the fit to the matrix representation of the input trees); in the case of complete trees, it is argued that, for a supertree method to be equivalent to majority rule or frequency difference consensus, two necessary (but not sufficient) conditions must be met. First, the measure of fit between a supertree and an input tree must be symmetrical. Second, the fit for a character representing a group must be measured as absolute: either it fits or it does not fit. In the restricted case of complete and equally resolved input trees, compatibility analysis (unlike MRP and minimum flipping) fulfils these two conditions: it is symmetrical (i.e., as long as the trees have the same taxon sets and are equally resolved, the number of characters in the matrix representation of tree A that require homoplasy in tree B is always the same as the number of characters in the matrix representation of tree B that require homoplasy in tree A) and it measures fit as all‐or‐none. In the case of just two complete and equally resolved input trees, the two conditions (symmetry and absolute fit) are necessary and sufficient, which explains why the compatibility analysis of such trees behaves as majority consensus. 
With more than two such trees, these conditions are still necessary but no longer sufficient for the equivalence; in such cases, the compatibility supertree may differ significantly from the majority rule consensus, even when these conditions apply (as shown by example). MRP and minimum flipping are asymmetric and measure various degrees of fit for each character, which explains why they often behave very differently from majority rule procedures, and why they are very likely to have groups contradicted by each of the input trees, or groups supported by a minority of the input trees. © The Willi Hennig Society 2005.

16.
Abstract. Historically, characters from early animal development have been a potentially rich source of phylogenetic information, but many traits associated with the gametes and larval stages of animals with complex life cycles are widely suspected to have evolved frequent convergent similarities. Such convergences will confound true phylogenetic relationships. We compared phylogenetic inferences based on early life history traits with those from mitochondrial DNA sequences for sea stars in the genera Asterina, Cryptasterina, and Patiriella (Valvatida: Asterinidae). Analysis of these two character sets produced phylogenies that shared few clades. We quantified the degree of homoplasy in each character set when mapped onto the phylogeny inferred from the alternative characters. The incongruence between early life history and nucleotide characters implies more homoplasy in the life history character set. We suggest that the early life history traits in this case are most likely to be misleading as phylogenetic characters because simple adaptive models predict convergence in early life histories. We show that adding early life history characters may slightly improve a phylogeny based on nucleotide sequences, but adding nucleotide characters may be critically important to improving inferences from phylogenies based on early life history characters.

17.
THE EFFECT OF ORDERED CHARACTERS ON PHYLOGENETIC RECONSTRUCTION   Total citations: 2, self-citations: 0, citations by others: 2
Abstract: Morphological structures are likely to undergo more than a single change during the course of evolution. As a result, multistate characters are common in systematic studies and must be dealt with. Particularly interesting is the question of whether multistate characters should be treated as ordered (additive) or unordered (non-additive). In accepting a particular hypothesis of order, numerous others are necessarily rejected. We review some of the criteria often used to order character states and the underlying assumptions inherent in these criteria.
The effects that ordered multistate characters can have on phylogenetic reconstruction are examined using 27 data sets. It has been suggested that hypotheses of character state order are more informative than hypotheses of unorder and may restrict the number of equally parsimonious trees as well as increase tree resolution. Our results indicate that ordered characters can produce more, the same number of, or fewer equally parsimonious trees and can increase, decrease or have no effect on tree resolution. The effect on tree resolution can be a simple gain in resolution or a dramatic change in sister-taxa relationships. In cases where several outgroups are included in the data matrix, hypotheses of order can change character polarities by altering outgroup topology. Ordered characters result in a different topology from unordered characters only when the hierarchy of the cladogram disagrees with the investigator's a priori hypothesis of order. If the best criterion for assessing character evolution is congruence with other characters, the practice of ordering multistate characters is inappropriate.

18.
Many questions in evolutionary biology are best addressed by comparing traits in different species. Often such studies involve mapping characters on phylogenetic trees. Mapping characters on trees allows the nature, number, and timing of the transformations to be identified. The parsimony method is the only method available for mapping morphological characters on phylogenies. Although the parsimony method often makes reasonable reconstructions of the history of a character, it has a number of limitations. These limitations include the inability to consider more than a single change along a branch on a tree and the uncoupling of evolutionary time from amount of character change. We extended a method described by Nielsen (2002, Syst. Biol. 51:729-739) to the mapping of morphological characters under continuous-time Markov models and demonstrate here the utility of the method for mapping characters on trees and for identifying character correlation.

19.
In comparative biology, pairwise comparisons of species or genes (terminal taxa) are used to detect character associations. For instance, if pairs of species contrasting in the state of a particular character are examined, the member of a pair with a particular state might be more likely than the other member to show a particular state in a second character. Pairs are chosen so as to be phylogenetically separate, that is, the path between members of a pair, along the branches of the tree, does not touch the path of any other pair. On a given phylogenetic tree, pairs must be chosen carefully to achieve the maximum possible number of pairs while maintaining phylogenetic separation. Many alternative sets of pairs may have this maximum number. Algorithms are developed that find all taxon pairings that maximize the number of pairs without constraint, or with the constraint that members of each pair have contrasting states in a binary character, or that they have contrasting states in two binary characters. The comparisons chosen by these algorithms, although phylogenetically separate on the tree, are not necessarily statistically independent.

20.
Many phylogenetic analyses that include numerous terminals but few genes show high resolution and branch support for relatively recently diverged clades, but lack of resolution and/or support for "basal" clades of the tree. The various benefits of increased taxon and character sampling have been widely discussed in the literature, albeit primarily based on simulations rather than empirical data. In this study, we used a well-sampled gene-tree analysis (based on 100 mitochondrial genomes of higher teleost fishes) to test empirically the efficiency of different methods of data sampling and phylogenetic inference to "correctly" resolve the basal clades of a tree (based on congruence with the reference tree constructed using all 100 taxa and 7990 characters). By itself, increased character sampling was an inefficient method by which to decrease the likelihood of "incorrect" resolution (i.e., incongruence with the reference tree) for parsimony analyses. Although increased taxon sampling was a powerful approach to alleviate "incorrect" resolution for parsimony analyses, it had the general effect of increasing the number of, and support for, "incorrectly" resolved clades in the Bayesian analyses. For both the parsimony and Bayesian analyses, increased taxon sampling, by itself, was insufficient to help resolve the basal clades, making this sampling strategy ineffective for that purpose. For this empirical study, the most efficient of the six approaches considered to resolve the basal clades when adding nucleotides to a dataset that consists of a single gene sampled for a small, but representative, number of taxa, is to increase character sampling and analyze the characters using the Bayesian method.
