Similar articles
 20 similar articles found (search time: 15 ms)
1.
Although long-branch attraction, the incorrect grouping of long lineages in a phylogeny because of systematic error, has been identified as a potential source of error in phylogenetic analysis for almost two decades, no empirical examples of the phenomenon exist. Here, I outline several criteria for identifying long-branch attraction and apply these criteria to 18S ribosomal DNA (rDNA) sequence data for 13 insects. Parsimony and minimum evolution with p distances group the two longest branches together (those leading to Strepsiptera and Diptera). Simulation studies show that the long branches are long enough to attract. When a tree is assumed in which Strepsiptera and Diptera are separated and many data sets are simulated for that tree (using the parameter estimates for that tree for the original data), parsimony analysis of the simulated data consistently groups Strepsiptera and Diptera. Analyses of the 18S rDNA sequences using methods that are less sensitive to the problem of long-branch attraction estimate trees in which the long branches are separate.
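The simulation logic described in this abstract (assume a tree in which the two long branches are separated, simulate many data sets on it, and count how often parsimony groups the long branches anyway) can be sketched in a few lines. This is only an illustrative toy, not the study's actual procedure: it assumes a four-taxon tree ((A,B),(C,D)), a two-state symmetric model, and made-up change probabilities `p_long` and `q_short`; all function names are hypothetical.

```python
import random


def simulate_site(rng, p_long, q_short):
    """Simulate one binary character on the tree ((A,B),(C,D)).

    A and C sit on long branches (change probability p_long); B, D and the
    internal branch are short (change probability q_short). All of these
    values are illustrative assumptions, not estimates from real data.
    """
    def evolve(state, prob):
        # Flip the state with the given per-branch change probability.
        return state ^ 1 if rng.random() < prob else state

    u = rng.randint(0, 1)      # state at the ancestor of A and B
    v = evolve(u, q_short)     # state at the ancestor of C and D
    return (evolve(u, p_long), evolve(u, q_short),
            evolve(v, p_long), evolve(v, q_short))


def parsimony_choice(sites):
    """Return the four-taxon split supported by the most informative sites.

    With four taxa, each parsimony-informative site saves one step on exactly
    one of the three trees, so counting supporting sites picks the most
    parsimonious tree. Ties fall to the first split listed.
    """
    support = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for a, b, c, d in sites:
        if a == b and c == d and a != c:
            support["AB|CD"] += 1
        elif a == c and b == d and a != b:
            support["AC|BD"] += 1
        elif a == d and b == c and a != b:
            support["AD|BC"] += 1
    return max(support, key=support.get)


def fraction_grouping_long_branches(p_long, q_short, n_sites=500,
                                    n_reps=200, seed=1):
    """Fraction of simulated data sets in which parsimony joins A with C."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        sites = [simulate_site(rng, p_long, q_short) for _ in range(n_sites)]
        hits += parsimony_choice(sites) == "AC|BD"
    return hits / n_reps


if __name__ == "__main__":
    print(fraction_grouping_long_branches(0.4, 0.05))   # long branches present
    print(fraction_grouping_long_branches(0.05, 0.05))  # all branches short
```

In this toy setting the true tree always separates A and C, so any replicate that returns `AC|BD` is a case of the long branches being grouped incorrectly.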

2.
The accuracy of phylogenetic methods is reinvestigated for the four-taxon case with a two-edge rate and a three-edge rate. Unlike previous studies involving computer simulations, the two-edge rate relates to branches that are sister taxa in the model tree. As with previous studies, certain methods are found to behave inaccurately in a portion of the parameter space where the two-edge rate is proportionally large. This phenomenon, to which parsimony is immune, is termed “long-branch repulsion” and the region of poor performance is called the Farris Zone. Maximum likelihood methods are shown to be particularly prone to failure when closely related taxa have long branches. Long-branch repulsion is demonstrated with an empirical case involving Strepsiptera and Diptera.

3.
Although long-branch attraction (LBA) is frequently cited as the cause of anomalous phylogenetic groupings, few examples of LBA involving real sequence data are known. We have found several cases of probable LBA by analyzing subsamples from an alignment of 18S rDNA sequences for 133 metazoans. In one example, maximum parsimony analysis of sequences from two rotifers, a ctenophore, and a polychaete annelid resulted in strong support for a tree grouping two "long-branch taxa" (a rotifer and the ctenophore). Maximum-likelihood analysis of the same sequences yielded strong support for a more biologically reasonable "rotifer monophyly" tree. Attempts to break up long branches for problematic subsamples through increased taxon sampling reduced, but did not eliminate, LBA problems. Exhaustive analyses of all quartets for a subset of 50 sequences were performed in order to compare the performance of maximum likelihood, equal-weights parsimony, and two additional variants of parsimony; these methods do differ substantially in their rates of failure to recover trees consistent with well established, but highly unresolved phylogenies. Power analyses using simulations suggest that some incorrect inferences by maximum parsimony are due to statistical inconsistency and that when estimates of central branch lengths for certain quartets are very low, maximum-likelihood analyses have difficulty recovering accepted phylogenies even with large amounts of data. These examples demonstrate that LBA problems can occur in real data sets, and they provide an opportunity to investigate causes of incorrect inferences.

4.
Long-Branch Abstractions   Total citations: 11, self-citations: 1, citations by others: 11
Recent attention has been focused on the sensitivities of various tree reconstructing algorithms to sequence rate heterogeneity (long-branch attraction). Phylogenetic conclusions from two recent empirical studies have been indicted as artifacts attributable to long-branch attraction. Siddall et al. (1995) concluded that Myxozoa are cnidarians and sister group to Polypodium based on 18S rDNA and morphology. Hanelt et al. (1996) argued that this result is due to long-branch attraction. Whiting et al. (1997) concluded that the Strepsiptera are sister group to Diptera based on parsimony analysis of 18S rDNA, 28S rDNA, and morphology. Huelsenbeck (1997) argued that this result also is attributable to long-branch attraction. We demonstrate that the analyses and arguments dismissing these results as the effects of long-branch attraction are fundamentally flawed. The criteria employed by these authors were applied arbitrarily by them to the groups that they did not want, and yet using those same criteria, there is more reason to exclude other taxa besides Polypodium and there is more reason to disbelieve monophyly of Diptera than monophyly of Strepsiptera with Diptera. Moreover, it is asserted, long-branch attraction cannot explain the presence of nematocysts in Myxozoa and halteres in Strepsiptera. For these reasons, and in light of the demonstration that long branches cannot attract each other in their mutual absence, we conclude that the monophyly of Myxozoa + Polypodium and Strepsiptera + Diptera is not due to long-branch attraction. We suggest that maximum likelihood methods are extremely sensitive to taxon and character sampling and that these data sets are demonstrative of the long-branch repulsion problem.

5.
Taxon sampling may be critically important for phylogenetic accuracy because adding taxa can help to subdivide misleading long branches. Although the idea that added taxa can break up long branches was exemplified by a study of "incomplete" fossil taxa, the issue of taxon completeness (i.e., proportion of missing data) has been largely ignored in most subsequent discussions of taxon sampling and long-branch attraction. In this article, I use simulations to test the ability of incomplete taxa to subdivide long branches and improve phylogenetic accuracy in situations of potential long-branch attraction. The results show that for most methods and conditions examined, adding taxa that are only 50% complete may provide similar benefits to adding the same number of complete taxa (suggesting that the advantages of increased taxon sampling may be obtained with less data than previously considered). For parsimony, taxa that are less complete (5% to 25% complete) may often have limited ability to rescue analyses from long-branch attraction. In contrast, highly incomplete taxa can be surprisingly beneficial when using model-based methods. The results also suggest the importance of model-based methods in phylogenetic analyses that combine molecular and fossil data.

6.
Felsenstein (1978, Syst. Zool. 27:401-410) showed that the method of maximum parsimony can be inconsistent, i.e., lead to an incorrect result with an infinite amount of data. The situation in which this inconsistency occurs is often called the "Felsenstein zone," the phenomenon also known as "long-branch attraction." Felsenstein derived a sufficient inconsistency condition from a model for four taxa with only two different parameters for the probability of change on the five branches connecting the four taxa. In the present paper, his approach is used to derive the inconsistency condition of maximum parsimony from the most general model for four taxa, i.e., with five different parameters for the probabilities of change on the five branches and, for the first time, for characters with k states (k = 2, 3, 4, 5, 6, ...). This is used to determine the factors that can cause the inconsistency of maximum parsimony. It is shown that the probability of change on all five branches and the number of character states play a role in causing inconsistency.
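To make the idea of an inconsistency condition concrete, the sketch below computes exact expected frequencies of the three parsimony-informative site patterns on a four-taxon tree. It covers only the simplest assumed case (two character states, a symmetric model, two change probabilities `p_long` and `q_short`), not the general five-parameter, k-state condition derived in the paper; the function names are hypothetical. Under these assumptions, and for `q_short` below 1/2, the comparison reduces algebraically to `p_long**2 > q_short * (1 - q_short)`.

```python
from itertools import product


def pattern_probability(tips, branch_probs, internal_prob):
    """Exact probability of one tip pattern on the tree ((A,B),(C,D)).

    Assumes a two-state symmetric model: each branch flips the state with
    its own change probability. Sums over both internal-node states.
    """
    def step(prob, start, end):
        return prob if start != end else 1.0 - prob

    a, b, c, d = tips
    p_a, p_b, p_c, p_d = branch_probs
    total = 0.0
    for u, v in product((0, 1), repeat=2):
        total += (0.5 * step(internal_prob, u, v)
                  * step(p_a, u, a) * step(p_b, u, b)
                  * step(p_c, v, c) * step(p_d, v, d))
    return total


def informative_pattern_probs(p_long, q_short):
    """Expected frequency of each informative pattern.

    A and C are the long branches; B, D and the internal branch are short.
    The factor 2 accounts for the complementary pattern (0 and 1 swapped).
    """
    branch_probs = (p_long, q_short, p_long, q_short)
    patterns = {"AB|CD": (0, 0, 1, 1),
                "AC|BD": (0, 1, 0, 1),
                "AD|BC": (0, 1, 1, 0)}
    return {name: 2.0 * pattern_probability(tips, branch_probs, q_short)
            for name, tips in patterns.items()}


def parsimony_is_misled(p_long, q_short):
    """True when the pattern joining the long branches outweighs the true split."""
    probs = informative_pattern_probs(p_long, q_short)
    return probs["AC|BD"] > probs["AB|CD"]


if __name__ == "__main__":
    print(informative_pattern_probs(0.4, 0.05))
    print(parsimony_is_misled(0.4, 0.05), parsimony_is_misled(0.1, 0.05))
```

Because these are expected frequencies rather than sampled counts, a `True` result means that adding more characters would only strengthen support for the wrong tree in this toy model.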

7.
The behavior of nodal support and stability in the presence of long branches was examined under simulations and in an analysis of real data. Relatively short branches were typically correctly resolved, received high bootstrap support, and were stable in sensitivity analyses. Longer branches received lower support and stability measures, and were often incorrectly resolved due to long-branch attraction. Support and stability do not always correlate, and in the case of the mammalian mitochondrial tree, well supported but unstable nodes were typically associated with long-branch attraction. Very long branches, on the other hand, may be incorrectly resolved with high support and stability indices. These patterns were observed both in simulations and in the real data. The results indicate that sensitivity analysis may help to reveal phylogenetic uncertainty hidden behind artificially high support.

8.
Long branches in a true phylogeny tend to disrupt hierarchical character covariation (phylogenetic signal) in the distribution of traits among organisms. The distortion of hierarchical structure in character-state matrices can lead to errors in the estimation of phylogenetic relationships and inconsistency of methods of phylogenetic inference. Examination of trees distorted by long-branch attraction will not reveal the identities of problematic taxa, in part because the distortion can mask long branches by reducing inferred branch lengths and through errors in branching order. Here we present a simple method for the detection of taxa whose placement in evolutionary trees is made difficult by the effects of long-branch attraction. The method is an extension of a tree-independent conceptual framework of phylogenetic data exploration (RASA). Taxa that are likely to attract are revealed because long branches leave distinct footprints in the distribution of character states among taxa, and these traces can be directly observed in the error structure of the RASA regression. Problematic taxa are identified using a new diagnostic plot called the taxon variance plot, in which the apparent cladistic and phenetic variances contributed by individual taxa are compared. The procedure for identifying long edges employs algorithms solved in polynomial time and can be applied to morphological, molecular, and mixed characters. The efficacy of the method is demonstrated using simulated evolution and empirical evidence of long branches in a set of recently published sequences. We show that the accuracy of evolutionary trees can be improved by detecting and combating the potentially misleading influences of long-branch taxa.

9.
Recent studies based on different types of data (i.e., morphology, molecules) have found strongly conflicting phylogenies for the genera of iguanid lizards but have been unable to explain the basis for this incongruence. We reanalyze published data from morphology and from the mitochondrial ND4, cytochrome b, 12S, and 16S genes to explore the sources of incongruence and resolve these conflicts. Much of the incongruence centers on the genus Cyclura, which is the sister taxon of Iguana, according to parsimony analyses of the morphology and the ribosomal genes, but is the sister taxon of all other Iguanini, according to the protein-coding genes. Maximum likelihood analyses show that there has been an increase in the rate of nucleotide substitution in Cyclura in the two protein-coding genes (ND4 and cytochrome b), although this increase is not as clear when parsimony is used to estimate branch lengths. Parametric simulations suggest that Cyclura may be misplaced by the protein-coding genes as a result of long-branch attraction; even when Cyclura and Iguana are sister taxa in a simulated phylogeny, Cyclura is still placed as the basal member of the Iguanini by parsimony analysis in 55% of the replicates. A similar long-branch attraction problem may also exist in the morphological data with regard to the placement of Sauromalus with the Galápagos iguanas (Amblyrhynchus and Conolophus). The results have many implications for the analysis of diverse data sets, the impact of long branches on parsimony and likelihood methods, and the use of certain protein-coding genes in phylogeny reconstruction.

10.
Intraspecific variation is abundant in all types of systematic characters but is rarely addressed in simulation studies of phylogenetic method performance. We compared the accuracy of 15 phylogenetic methods using simulations to (1) determine the most accurate method(s) for analyzing polymorphic data (under simplified conditions) and (2) test if generalizations about the performance of phylogenetic methods based on previous simulations of fixed (nonpolymorphic) characters are robust to a very different evolutionary model that explicitly includes intraspecific variation. Simulated data sets consisted of allele frequencies that evolved by genetic drift. The phylogenetic methods included eight parsimony coding methods, continuous maximum likelihood, and three distance methods (UPGMA, neighbor joining, and Fitch-Margoliash) applied to two genetic distance measures (Nei's and the modified Cavalli-Sforza and Edwards chord distance). Two sets of simulations were performed. The first examined the effects of different branch lengths, sample sizes (individuals sampled per species), numbers of characters, and numbers of alleles per locus in the eight-taxon case. The second examined more extensively the effects of branch length in the four-taxon, two-allele case. Overall, the most accurate methods were likelihood, the additive distance methods (neighbor joining and Fitch-Margoliash), and the frequency parsimony method. Despite the use of a very different evolutionary model in the present article, many of the results are similar to those from simulations of fixed characters. Similarities include the presence of the "Felsenstein zone," where methods often fail, which suggests that long-branch attraction may occur among closely related species through genetic drift. 
Differences between the results of fixed and polymorphic data simulations include the following: (1) UPGMA is as accurate or more accurate than nonfrequency parsimony methods across nearly all combinations of branch lengths, and (2) likelihood and the additive distance methods are not positively misled under any combination of branch lengths tested (even when the assumptions of the methods are violated and few characters are sampled). We found that sample size is an important determinant of accuracy and affects the relative success of methods (i.e., distance and likelihood methods outperform parsimony at small sample sizes). Attempts to generalize about the behavior of phylogenetic methods should consider the extreme examples offered by fixed-mutation models of DNA sequence data and genetic-drift models of allele frequencies.

11.
Maximum likelihood and maximum parsimony are two key methods for phylogenetic tree reconstruction. Under certain conditions, each of these two methods can perform more or less efficiently, resulting in unresolved or disputed phylogenies. We show that a neural network can distinguish between four-taxon alignments that were evolved under conditions susceptible to either long-branch attraction or long-branch repulsion. When likelihood and parsimony methods are discordant, the neural network can provide insight as to which tree reconstruction method is best suited to the alignment. When applied to the contentious case of Strepsiptera evolution, our method shows robust support for the current scientific view, that is, it places Strepsiptera with beetles, distant from flies.

12.
Long-branch attraction is a well-known source of systematic error that can mislead phylogenetic methods; it is frequently invoked post hoc, upon recovering a different tree from the one expected based on prior evidence. We demonstrate that methods that do not force the data onto a single tree, such as spectral analysis, Neighbor-Net, and consensus networks, can be used to detect conflicting signals within the data, including those caused by long-branch attraction. We illustrate this approach using a set of taxa from three unambiguously monophyletic families within the Pelecaniformes: the darters, the cormorants and shags, and the gannets and boobies. These three families are universally acknowledged as forming a monophyletic group, but the relationship between the families remains contentious. Using sequence data from three mitochondrial genes (12S, ATPase 6, and ATPase 8) we demonstrate that the relationship between these three families is difficult to resolve because they are separated by a short internal branch and there are conflicting signals due to long-branch attraction, which are confounded with nonhomogeneous sequence evolution across the different genes. Spectral analysis, Neighbor-Net, and consensus networks reveal conflicting signals regarding the placement of one of the darters, with support found for darter monophyly, but also support for a conflicting grouping with the outgroup, pelicans. Furthermore, parsimony and maximum-likelihood analyses produced different trees, with one of the two most parsimonious trees not supporting the monophyly of the darters. Monte Carlo simulations, however, were not sensitive enough to reveal long-branch attraction unless the branches are longer than those actually observed. These results indicate that spectral analysis, Neighbor-Net, and consensus networks offer a powerful approach to detecting and understanding the source of conflicting signals within phylogenetic data.

13.
On RASA   Total citations: 2, self-citations: 1, citations by others: 1
Relative Apparent Synapomorphy Analysis (RASA) was recently proposed as a way to measure phylogenetic signal, choose "optimal" outgroups, find long branches, and eliminate long-branch attraction. In this paper it is shown with simple examples that RASA has several problems. The null regression model used by RASA to measure phylogenetic signal does not have a straightforward relation to phylogenetic information. RASA detects long branches, but does not discriminate between long branches that mislead an analysis and those that do not. Rooted RASA, which is used for "optimal outgroup analysis," is shown to be an inappropriate measure of "plesiomorphy content".

14.
Sequences of two chloroplast photosystem genes, psaA and psbB, together comprising about 3,500 bp, were obtained for all five major groups of extant seed plants and several outgroups among other vascular plants. Strongly supported, but significantly conflicting, phylogenetic signals were obtained in parsimony analyses from partitions of the data into first and second codon positions versus third positions. In the former, both genes agreed on monophyletic gymnosperms, with Gnetales closely related to certain conifers. In the latter, Gnetales are inferred to be the sister group of all other seed plants, with gymnosperms paraphyletic. None of the data supported the modern "anthophyte hypothesis," which places Gnetales as the sister group of flowering plants. A series of simulation studies was undertaken to examine the error rate for parsimony inference. Three kinds of errors were examined: random error, systematic bias (both properties of finite data sets), and statistical inconsistency owing to long-branch attraction (an asymptotic property). Parsimony reconstructions were extremely biased for third-position data for psbB. Regardless of the true underlying tree, a tree in which Gnetales are sister to all other seed plants was likely to be reconstructed for these data. None of the combinations of genes or partitions permits the anthophyte tree to be reconstructed with high probability. Simulations of progressively larger data sets indicate the existence of long-branch attraction (statistical inconsistency) for third-position psbB data if either the anthophyte tree or the gymnosperm tree is correct. This is also true for the anthophyte tree using either psaA third positions or psbB first and second positions. A factor contributing to bias and inconsistency is extremely short branches at the base of the seed plant radiation, coupled with extremely high rates in Gnetales and nonseed plant outgroups.

15.
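Study on the molecular evolution of the apolipoprotein multigene family   Total citations: 2, self-citations: 2, citations by others: 0
王乐, 柴建华. 《遗传学报》 (Acta Genetica Sinica), 1994, 21(2): 81-95
Apolipoprotein genes involved in lipid transport constitute a complex multigene family. To explore the evolutionary patterns of a gene family with such a long evolutionary history, this paper first establishes a simple method for calculating the length of any branch in a phylogenetic tree under non-uniform evolutionary rates; on this basis, the expected divergence times in an unrooted phylogenetic tree can be calculated. The method is then applied to a phylogenetic analysis of 26 apolipoproteins from 10 species and genera. The results suggest that: (1) the common ancestor of ApoA-I, ApoA-IV, ApoE, and ApoA-II may already have existed in Ordovician aquatic vertebrates ...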

16.
Phylogenetic analyses were applied to 269 families of putative orthologs represented by a single member in the genomes of human, mouse, dog, and chicken. Five methods were used: maximum parsimony (MP), neighbor-joining (NJ) with Poisson and gamma distances, and maximum likelihood (ML) with JTT and JTT+gamma models. When applied to the concatenated sequence of all families, all methods strongly supported a tree in which mouse branched before human and dog. In analyses of individual families, the same topology was supported more than any other. Although there was evidence of an increased rate of amino acid replacement in the mouse lineage in comparison to the other two mammals, there was no evidence that support for the mouse's basal position was due to long-branch attraction; rather, this topology was seen in the families with the lowest rate variation among the three mammalian branches. In families with highly divergent mouse sequences, ML with both JTT and JTT+gamma and NJ with the gamma distance tended to support a topology in which the dog, rather than the mouse, branched first. Thus, in these data, a tendency of long and short branches to cluster together ("opposite-branch attraction") seemed to be more of a problem than long-branch attraction.
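For readers unfamiliar with the Poisson and gamma distances named above, the following sketch shows the textbook formulas for amino acid sequences: the proportion of differing sites p, the Poisson correction d = -ln(1 - p), and the gamma distance d = a[(1 - p)^(-1/a) - 1] with shape parameter a. The abstract does not say which gamma shape value or software was used, so the `shape` argument, the example sequences, and the function names are all illustrative assumptions.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of aligned, gap-free sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free sites to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def poisson_distance(p):
    """Poisson-corrected distance; assumes equal rates across sites."""
    return -math.log(1.0 - p)


def gamma_distance(p, shape):
    """Gamma distance; a smaller shape means stronger rate variation among sites."""
    return shape * ((1.0 - p) ** (-1.0 / shape) - 1.0)


if __name__ == "__main__":
    # Two made-up aligned peptide fragments, for illustration only.
    p = p_distance("MKTAYIAK", "MKSAYLAK")
    print(p, poisson_distance(p), gamma_distance(p, shape=1.0))
```

For the same p, the gamma distance is larger than the Poisson distance, and the gap widens as divergence grows; this is one reason the two corrections can treat highly divergent sequences differently.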

17.
Nuclear-encoded SSU rDNA sequences have been obtained from 64 strains of conjugating green algae (Zygnemophyceae, Streptophyta, Viridiplantae). Molecular phylogenetic analyses of 90 SSU rDNA sequences of Viridiplantae (including 78 from the Zygnemophyceae) were performed using complex evolutionary models and maximum likelihood, distance, and maximum parsimony methods. The significance of the results was tested by bootstrap analyses, deletion of long-branch taxa, relative rate tests, and Kishino-Hasegawa tests with user-defined trees. All results support the monophyly of the class Zygnemophyceae and of the order Desmidiales. The second order, Zygnematales, forms a series of early-branching clades in paraphyletic succession, with the two traditional families Mesotaeniaceae and Zygnemataceae not recovered as lineages. Instead, a long-branch Spirogyra/Sirogonium clade and the later-diverging Netrium and Roya clades represent independent clades. Within the order Desmidiales, the families Gonatozygaceae and Closteriaceae are monophyletic, whereas the Peniaceae (represented only by Penium margaritaceum) and the Desmidiaceae represent a single weakly supported lineage. Within the Desmidiaceae short internal branches and varying rates of sequence evolution among taxa reduce the phylogenetic resolution significantly. The SSU rDNA-based phylogeny is largely congruent with a published analysis of the rbcL phylogeny of the Zygnemophyceae (McCourt et al. 2000) and is also in general agreement with classification schemes based on cell wall ultrastructure. The extended taxon sampling at the subgenus level provides solid evidence that many genera in the Zygnemophyceae are not monophyletic and that the genus concept in the group needs to be revised.

18.
We introduce a distance-based phylogeny reconstruction method called "weighted neighbor joining," or "Weighbor" for short. As in neighbor joining, two taxa are joined in each iteration; however, the Weighbor criterion for choosing a pair of taxa to join takes into account that errors in distance estimates are exponentially larger for longer distances. The criterion embodies a likelihood function on the distances, which are modeled as correlated Gaussian random variables with different means and variances, computed under a probabilistic model for sequence evolution. The Weighbor criterion consists of two terms, an additivity term and a positivity term, that quantify the implications of joining the pair. The first term evaluates deviations from additivity of the implied external branches, while the second term evaluates confidence that the implied internal branch has a positive branch length. Compared with maximum-likelihood phylogeny reconstruction, Weighbor is much faster, while building trees that are qualitatively and quantitatively similar. Weighbor appears to be relatively immune to the "long branches attract" and "long branch distracts" drawbacks observed with neighbor joining, BIONJ, and parsimony.
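As a point of reference for the pair-selection step that Weighbor modifies, here is a minimal sketch of the criterion used by ordinary neighbor joining, which treats all distances as equally reliable: Q(i, j) = (n - 2) d(i, j) - Σ_k d(i, k) - Σ_k d(j, k), with the pair of smallest Q joined first. This is plain neighbor joining, not the Weighbor criterion (whose additivity and positivity terms are not specified in the abstract); the function names and the small distance matrix are illustrative assumptions.

```python
def nj_q_matrix(dist):
    """Standard neighbor-joining Q matrix for a symmetric distance matrix."""
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    return [[0.0 if i == j
             else (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
             for j in range(n)]
            for i in range(n)]


def nj_pick_pair(dist):
    """Return the index pair (i, j), i < j, with the smallest Q value."""
    q = nj_q_matrix(dist)
    n = len(dist)
    candidates = ((i, j) for i in range(n) for j in range(i + 1, n))
    return min(candidates, key=lambda pair: q[pair[0]][pair[1]])


if __name__ == "__main__":
    # Made-up distances among five taxa, indexed 0..4.
    example = [
        [0, 5, 9, 9, 8],
        [5, 0, 10, 10, 9],
        [9, 10, 0, 8, 7],
        [9, 10, 8, 0, 3],
        [8, 9, 7, 3, 0],
    ]
    print(nj_pick_pair(example))
```

Note that in this example taxa 3 and 4 have the smallest raw distance, yet the criterion joins taxa 0 and 1, because Q also subtracts each taxon's total distance to all others.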

19.
The effects on phylogenetic accuracy of adding characters and/or taxa were explored using data generated by computer simulation. The conditions of this study were constrained but allowed for systematic investigation of certain parameters. The starting point for the study was a four-taxon tree in the "Felsenstein zone," representing a difficult phylogenetic problem with an extreme situation of long-branch attraction. Taxa were added sequentially to this tree in a manner specifically designed to break up the long branches, and for each tree data matrices of different sizes were simulated. Phylogenetic trees were reconstructed from these data using the criteria of parsimony and maximum likelihood. Phylogenetic accuracy was measured in three ways: (1) proportion of trees that are completely correct, (2) proportion of correctly reconstructed branches in all trees, and (3) proportion of trees in which the original four-taxon statement is correctly reconstructed. Accuracy improved dramatically with the addition of taxa and much more slowly with the addition of characters. If taxa can be added to break up long branches, it is far preferable to add taxa than characters.

20.
Closely related outgroups are optimal for rooting phylogenetic trees; however, such ideal outgroups are not always available. A phylogeny of the marattioid ferns (Marattiaceae), an ancient lineage with no close relatives, was reconstructed using nucleotide sequences of multiple chloroplast regions (rps4 + rps4-trnS spacer, trnS-trnG spacer + trnG intron, rbcL, atpB), from 88 collections, selected to cover the broadest possible range of morphologies and geographic distributions within the extant taxa. Because marattioid ferns are phylogenetically isolated from other lineages, and internal branches are relatively short, rooting was problematic. Root placement was strongly affected by long-branch attraction under maximum parsimony and by model choice under maximum likelihood. A multifaceted approach to rooting was employed to isolate the sources of bias and produce a consensus root position. In a statistical comparison of all possible root positions with three different outgroups, most root positions were not significantly less optimal than the maximum likelihood root position, including the consensus root position. This phylogeny has several important taxonomic implications for marattioid ferns: Marattia in the broad sense is paraphyletic; the Hawaiian endemic Marattia douglasii is most closely related to tropical American taxa; and Angiopteris is monophyletic only if Archangiopteris and Macroglossum are included.
