Similar articles
1.
Since branch lengths provide important information about the timing and the extent of evolutionary divergence among taxa, accurate resolution of evolutionary history depends as much on branch length estimates as on recovery of the correct topology. However, the empirical relationship between the choice of genes to sequence and the quality of branch length estimation remains ill defined. To address this issue, we evaluated the accuracy of branch lengths estimated from subsets of the mitochondrial genome for a mammalian phylogeny with known subordinal relationships. Using maximum-likelihood methods, we estimated branch lengths from an 11-kb sequence of all 13 protein-coding genes and compared them with estimates from single genes (0.2-1.8 kb) and from 7 different combinations of genes (2-3.5 kb). For each sequence, we separated the component of the log-likelihood deviation due to branch length differences associated with alternative topologies from that due to those that are independent of the topology. Even among the sequences that recovered the same tree topology, some produced significantly better branch length estimates than others did. The combination of correct topology and significantly better branch length estimation suggests that these gene combinations may prove useful in estimating phylogenetic relationships for mammalian divergences below the ordinal level. Thus, the proper choice of genes to sequence is a critical factor for reliable estimation of evolutionary history from molecular data.

2.
Yang Z. Systematic Biology, 1998, 47(1): 125-133
The effect of the evolutionary rate of a gene on the accuracy of phylogeny reconstruction was examined by computer simulation. The evolutionary rate is measured by the tree length, that is, the expected total number of nucleotide substitutions per site on the phylogeny. DNA sequence data were simulated using both fixed trees with specified branch lengths and random trees with branch lengths generated from a model of cladogenesis. The parsimony and likelihood methods were used for phylogeny reconstruction, and the proportion of correctly recovered branch partitions by each method was estimated. Phylogenetic methods including parsimony appear quite tolerant of multiple substitutions at the same site. The optimum levels of sequence divergence were even higher than upper limits previously suggested for saturation of substitutions, indicating that the problem of saturation may have been exaggerated. Instead, the lack of information at low levels of divergence should be seriously considered in evaluation of a gene's phylogenetic utility, especially when the gene sequence is short. The performance of parsimony, relative to that of likelihood, does not necessarily decrease with the increase of the evolutionary rate.
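The tree length described in this abstract (the expected total number of substitutions per site on the phylogeny) is the sum of all branch lengths. A minimal Python sketch of that sum follows. It assumes the tree is given as a simple Newick string with unquoted taxon labels; the function name `tree_length` and the toy tree are hypothetical illustrations, not taken from the paper.

```python
import re


def tree_length(newick: str) -> float:
    """Sum every branch length that follows a ':' in a Newick string.

    Assumption: taxon labels are unquoted and contain no ':' characters.
    """
    pattern = r":\s*([0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)"
    return sum(float(value) for value in re.findall(pattern, newick))


# Hypothetical four-taxon tree; branch lengths are in expected
# substitutions per site.
toy_tree = "((A:0.1,B:0.2):0.05,(C:0.3,D:0.4):0.05);"
print(round(tree_length(toy_tree), 6))
```

A dedicated tree parser would be the safer choice for real data; the regular expression here only keeps the sketch free of dependencies.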

3.
Intraspecific variation is abundant in all types of systematic characters but is rarely addressed in simulation studies of phylogenetic method performance. We compared the accuracy of 15 phylogenetic methods using simulations to (1) determine the most accurate method(s) for analyzing polymorphic data (under simplified conditions) and (2) test if generalizations about the performance of phylogenetic methods based on previous simulations of fixed (nonpolymorphic) characters are robust to a very different evolutionary model that explicitly includes intraspecific variation. Simulated data sets consisted of allele frequencies that evolved by genetic drift. The phylogenetic methods included eight parsimony coding methods, continuous maximum likelihood, and three distance methods (UPGMA, neighbor joining, and Fitch-Margoliash) applied to two genetic distance measures (Nei's and the modified Cavalli-Sforza and Edwards chord distance). Two sets of simulations were performed. The first examined the effects of different branch lengths, sample sizes (individuals sampled per species), numbers of characters, and numbers of alleles per locus in the eight-taxon case. The second examined more extensively the effects of branch length in the four-taxon, two-allele case. Overall, the most accurate methods were likelihood, the additive distance methods (neighbor joining and Fitch-Margoliash), and the frequency parsimony method. Despite the use of a very different evolutionary model in the present article, many of the results are similar to those from simulations of fixed characters. Similarities include the presence of the "Felsenstein zone," where methods often fail, which suggests that long-branch attraction may occur among closely related species through genetic drift. 
Differences between the results of fixed and polymorphic data simulations include the following: (1) UPGMA is as accurate as or more accurate than nonfrequency parsimony methods across nearly all combinations of branch lengths, and (2) likelihood and the additive distance methods are not positively misled under any combination of branch lengths tested (even when the assumptions of the methods are violated and few characters are sampled). We found that sample size is an important determinant of accuracy and affects the relative success of methods (i.e., distance and likelihood methods outperform parsimony at small sample sizes). Attempts to generalize about the behavior of phylogenetic methods should consider the extreme examples offered by fixed-mutation models of DNA sequence data and genetic-drift models of allele frequencies.

4.
Long branches in a true phylogeny tend to disrupt hierarchical character covariation (phylogenetic signal) in the distribution of traits among organisms. The distortion of hierarchical structure in character-state matrices can lead to errors in the estimation of phylogenetic relationships and inconsistency of methods of phylogenetic inference. Examination of trees distorted by long-branch attraction will not reveal the identities of problematic taxa, in part because the distortion can mask long branches by reducing inferred branch lengths and through errors in branching order. Here we present a simple method for the detection of taxa whose placement in evolutionary trees is made difficult by the effects of long-branch attraction. The method is an extension of a tree-independent conceptual framework of phylogenetic data exploration (RASA). Taxa that are likely to attract are revealed because long branches leave distinct footprints in the distribution of character states among taxa, and these traces can be directly observed in the error structure of the RASA regression. Problematic taxa are identified using a new diagnostic plot called the taxon variance plot, in which the apparent cladistic and phenetic variances contributed by individual taxa are compared. The procedure for identifying long edges employs algorithms solved in polynomial time and can be applied to morphological, molecular, and mixed characters. The efficacy of the method is demonstrated using simulated evolution and empirical evidence of long branches in a set of recently published sequences. We show that the accuracy of evolutionary trees can be improved by detecting and combating the potentially misleading influences of long-branch taxa.

5.
To elucidate potential ecological and evolutionary processes associated with the assembly of plant communities, there is now widespread use of estimates of phylogenetic diversity that are based on a variety of DNA barcode regions and phylogenetic construction methods. However, relatively few studies consider how estimates of phylogenetic diversity may be influenced by single DNA barcodes incorporated into a sequence matrix (conservative regions vs. hypervariable regions) and the use of a backbone family‐level phylogeny. Here, we use general linear mixed‐effects models to examine the influence of different combinations of core DNA barcodes (rbcL, matK, ITS, and ITS2) and phylogeny construction methods on a series of estimates of community phylogenetic diversity for two subtropical forest plots in Guangdong, southern China. We ask: (a) What are the relative influences of single DNA barcodes on estimates of phylogenetic diversity metrics? and (b) What is the effect of using a backbone family‐level phylogeny to estimate topology‐based phylogenetic diversity metrics? The combination of more than one barcode (i.e., rbcL + matK + ITS) and the use of a backbone family‐level phylogeny provided the most parsimonious explanation of variation in estimates of phylogenetic diversity. The use of a backbone family‐level phylogeny showed a stronger effect on phylogenetic diversity metrics that are based on tree topology compared to those that are based on branch lengths. In addition, the variation in the estimates of phylogenetic diversity that was explained by the top‐rank models ranged from 0.1% to 31% and was dependent on the type of phylogenetic community structure metric.
Our study underscores the importance of incorporating a multilocus DNA barcode and the use of a backbone family‐level phylogeny to infer phylogenetic diversity, where the type of DNA barcode employed and the phylogenetic construction method used can serve as a significant source of variation in estimates of phylogenetic community structure.

6.
Quartet-mapping, a generalization of the likelihood-mapping procedure. (Cited 5 times: 0 self-citations, 5 citations by others)
Likelihood-mapping (LM) was suggested as a method of displaying the phylogenetic content of an alignment. However, statistical properties of the method have not been studied. Here we analyze the special case of a four-species tree generated under a range of evolution models and compare the results with those of a natural extension of the likelihood-mapping approach, geometry-mapping (GM), which is based on the method of statistical geometry in sequence space. The methods are compared in their abilities to indicate the correct topology. The performance of both methods in detecting the star topology is especially explored. Our results show that LM tends to reject a star tree more often than GM. When assumptions about the evolutionary model of the maximum-likelihood reconstruction are not matched by the true process of evolution, then LM shows a tendency to favor one tree, whereas GM correctly detects the star tree except for very short outer branch lengths with a statistical significance of >0.95 for all models. LM, on the other hand, reconstructs the correct bifurcating tree with a probability of >0.95 for most branch length combinations even under models with varying substitution rates. The parameter domain for which GM recovers the true tree is much smaller. When the exterior branch lengths are larger than an (analytically derived) threshold value depending on the tree shape (rather than the evolutionary model), GM reconstructs a star tree rather than the true tree. We suggest a combined approach of LM and GM for the evaluation of starlike trees. This approach offers the possibility of testing for significant positive interior branch lengths without extensive statistical and computational efforts.

7.
Although long-branch attraction (LBA) is frequently cited as the cause of anomalous phylogenetic groupings, few examples of LBA involving real sequence data are known. We have found several cases of probable LBA by analyzing subsamples from an alignment of 18S rDNA sequences for 133 metazoans. In one example, maximum parsimony analysis of sequences from two rotifers, a ctenophore, and a polychaete annelid resulted in strong support for a tree grouping two "long-branch taxa" (a rotifer and the ctenophore). Maximum-likelihood analysis of the same sequences yielded strong support for a more biologically reasonable "rotifer monophyly" tree. Attempts to break up long branches for problematic subsamples through increased taxon sampling reduced, but did not eliminate, LBA problems. Exhaustive analyses of all quartets for a subset of 50 sequences were performed in order to compare the performance of maximum likelihood, equal-weights parsimony, and two additional variants of parsimony; these methods do differ substantially in their rates of failure to recover trees consistent with well-established, but highly unresolved phylogenies. Power analyses using simulations suggest that some incorrect inferences by maximum parsimony are due to statistical inconsistency and that when estimates of central branch lengths for certain quartets are very low, maximum-likelihood analyses have difficulty recovering accepted phylogenies even with large amounts of data. These examples demonstrate that LBA problems can occur in real data sets, and they provide an opportunity to investigate causes of incorrect inferences.
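To give a sense of scale for the exhaustive quartet analysis mentioned in this abstract, the Python sketch below counts the four-taxon subsets of 50 sequences and lists the three possible unrooted bifurcating topologies for one quartet. The function name `quartet_topologies` and the taxon labels are hypothetical; the abstract does not describe how the authors enumerated quartets.

```python
from itertools import combinations
from math import comb


def quartet_topologies(a, b, c, d):
    """Return the three unrooted bifurcating topologies for four taxa,
    each written as a split of the quartet into two pairs."""
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


# Number of distinct quartets in a subset of 50 sequences.
n_quartets = comb(50, 4)
print(n_quartets)  # 230300

# Enumerating quartets for a small, hypothetical set of taxon labels.
taxa = ["t1", "t2", "t3", "t4", "t5"]
for quartet in combinations(taxa, 4):
    print(quartet, quartet_topologies(*quartet))
```

Each quartet would then be scored under every method being compared, which is why the number of quartets drives the cost of such an analysis.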

8.
Recent studies have observed that Bayesian analyses of sequence data sets using the program MrBayes sometimes generate extremely large branch lengths, with posterior credibility intervals for the tree length (sum of branch lengths) excluding the maximum likelihood estimates. Suggested explanations for this phenomenon include the existence of multiple local peaks in the posterior, lack of convergence of the chain in the tail of the posterior, mixing problems, and misspecified priors on branch lengths. Here, we analyze the behavior of Bayesian Markov chain Monte Carlo algorithms when the chain is in the tail of the posterior distribution and note that all these phenomena can occur. In Bayesian phylogenetics, the likelihood function approaches a constant instead of zero when the branch lengths increase to infinity. The flat tail of the likelihood can cause poor mixing and undue influence of the prior. We suggest that the main cause of the extreme branch length estimates produced in many Bayesian analyses is the poor choice of a default prior on branch lengths in current Bayesian phylogenetic programs. The default prior in MrBayes assigns independent and identical distributions to branch lengths, imposing strong (and unreasonable) assumptions about the tree length. The problem is exacerbated by the strong correlation between the branch lengths and parameters in models of variable rates among sites or among site partitions. To resolve the problem, we suggest two multivariate priors for the branch lengths (called compound Dirichlet priors) that are fairly diffuse and demonstrate their utility in the special case of branch length estimation on a star phylogeny. Our analysis highlights the need for careful thought in the specification of high-dimensional priors in Bayesian analyses.

9.
The maximum-likelihood (ML) solution to a simple phylogenetic estimation problem is obtained analytically. The problem is estimation of the rooted tree for three species using binary characters with a symmetrical rate of substitution under the molecular clock. ML estimates of branch lengths and log-likelihood scores are obtained analytically for each of the three rooted binary trees. Estimation of the tree topology is equivalent to partitioning the sample space (space of possible data outcomes) into subspaces, within each of which one of the three binary trees is the ML tree. Distance-based least squares and parsimony-like methods produce essentially the same estimate of the tree topology, although differences exist among methods even under this simple model. This seems to be the simplest case, but has many of the conceptual and statistical complexities involved in phylogeny estimation. The solution to this real phylogeny estimation problem will be useful for studying the problem of significance evaluation.

10.
While there has been strong support for Amborella and Nymphaeales (water lilies) as branching from basal-most nodes in the angiosperm phylogeny, this hypothesis has recently been challenged by phylogenetic analyses of 61 protein-coding genes extracted from the chloroplast genome sequences of Amborella, Nymphaea, and 12 other available land plant chloroplast genomes. These character-rich analyses placed the monocots, represented by three grasses (Poaceae), as sister to all other extant angiosperm lineages. We have extracted protein-coding regions from draft sequences for six additional chloroplast genomes to test whether this surprising result could be an artifact of long-branch attraction due to limited taxon sampling. The added taxa include three monocots (Acorus, Yucca, and Typha), a water lily (Nuphar), a ranunculid (Ranunculus), and a gymnosperm (Ginkgo). Phylogenetic analyses of the expanded DNA and protein data sets together with microstructural characters (indels) provided unambiguous support for Amborella and the Nymphaeales as branching from the basal-most nodes in the angiosperm phylogeny. However, their relative positions proved to be dependent on the method of analysis, with parsimony favoring Amborella as sister to all other angiosperms and maximum likelihood (ML) and neighbor-joining methods favoring an Amborella + Nymphaeales clade as sister. The ML phylogeny supported the latter hypothesis, but the likelihood for the former hypothesis was not significantly different. Parametric bootstrap analysis, single-gene phylogenies, estimated divergence dates, and conflicting indel characters all help to illuminate the nature of the conflict in resolution of the most basal nodes in the angiosperm phylogeny. Molecular dating analyses provided median age estimates of 161 MYA for the most recent common ancestor (MRCA) of all extant angiosperms and 145 MYA for the MRCA of monocots, magnoliids, and eudicots.
Whereas long sequences reduce variance in branch lengths and molecular dating estimates, the impact of improved taxon sampling on the rooting of the angiosperm phylogeny together with the results of parametric bootstrap analyses demonstrate how long-branch attraction might mislead genome-scale phylogenetic analyses.

11.
In this paper we use hypothetical and empirical data matrices to evaluate the ability of relative apparent synapomorphy analysis (RASA) to measure phylogenetic signal, select outgroups, and identify terminals subject to long-branch attraction. In all cases, except for equal character-state frequencies, RASA indicated extraordinarily high levels of phylogenetic information for hypothetical data matrices that are uninformative regarding relationships among the terminals. Yet, regardless of the number of characters or character-state frequencies, RASA failed to detect phylogenetic signal for hypothetical matrices with strong phylogenetic signal. In our empirical example, RASA indicated increasing phylogenetic signal for matrices for which the strict consensus of the most parsimonious trees is increasingly poorly resolved, clades are increasingly poorly supported, and for which many relationships are in conflict with more widely sampled analyses. RASA is an ineffective approach to identify outgroup terminal(s) with the most plesiomorphic character states for the ingroup. Our hypothetical example demonstrated that RASA preferred outgroup terminals with increasing numbers of convergent character states with ingroup terminals, and rejected the outgroup terminal with all plesiomorphic character states. Our empirical example demonstrated that RASA, in all three cases examined, selected an ingroup terminal, rather than an outgroup terminal, as the best outgroup. In no case was one of the two outgroup terminals even close to being considered the optimal outgroup by RASA. RASA is an ineffective means of identifying problematic long-branch terminals. In our hypothetical example, RASA indicated a terminal as being a problematic long-branch terminal in spite of the terminal being on a zero-length branch and having no possibility of undergoing long-branch attraction with another terminal. 
RASA also failed to identify actual problematic long-branch terminals that did undergo long-branch attraction, but only after following Lyons-Weiler and Hoelzer's (1997) three-step process to identify and remove terminals subject to long-branch attraction. We conclude that RASA should not be used for any of these purposes.

12.
Planipapillus, a clade of onychophorans from southeastern Australia, exhibits substantial chromosomal variation. In the context of a robust phylogeny based on nuclear and mitochondrial sequence data, we evaluate models of chromosomal evolution and speciation that differ in the roles assigned to selection, mutation, and drift. Permutation tests suggest that all chromosome rearrangements in the clade have been centric fusions and, on the basis of parsimony and maximum-likelihood methods with independent estimates of branch lengths, we conclude that at least 31 centric fusions have been fixed in Planipapillus. A likelihood-ratio test approach, which is independent of our point estimates of ancestral states, rejects an evolutionary model in which the mutation rate is constant and centric fusions are effectively neutral. In contrast to the nucleotide sequence data, which are consistent with neutrality and rate constancy, centric fusions in Planipapillus are underdominant, spontaneous fusion rates vary among lineages, or both. We predict an inverse relationship between rates of chromosomal evolution and historical population size. Chromosomal evolution may play a role in speciation in Planipapillus, both by interactions between centric fusions with monobrachial homology and by the accumulation of multiple weakly underdominant fusions.

13.
Long-branch attraction is a well-known source of systematic error that can mislead phylogenetic methods; it is frequently invoked post hoc, upon recovering a different tree from the one expected based on prior evidence. We demonstrate that methods that do not force the data onto a single tree, such as spectral analysis, Neighbor-Net, and consensus networks, can be used to detect conflicting signals within the data, including those caused by long-branch attraction. We illustrate this approach using a set of taxa from three unambiguously monophyletic families within the Pelecaniformes: the darters, the cormorants and shags, and the gannets and boobies. These three families are universally acknowledged as forming a monophyletic group, but the relationship between the families remains contentious. Using sequence data from three mitochondrial genes (12S, ATPase 6, and ATPase 8) we demonstrate that the relationship between these three families is difficult to resolve because they are separated by a short internal branch and there are conflicting signals due to long-branch attraction, which are confounded with nonhomogeneous sequence evolution across the different genes. Spectral analysis, Neighbor-Net, and consensus networks reveal conflicting signals regarding the placement of one of the darters, with support found for darter monophyly, but also support for a conflicting grouping with the outgroup, pelicans. Furthermore, parsimony and maximum-likelihood analyses produced different trees, with one of the two most parsimonious trees not supporting the monophyly of the darters. Monte Carlo simulations, however, were not sensitive enough to reveal long-branch attraction unless the branches are longer than those actually observed. These results indicate that spectral analysis, Neighbor-Net, and consensus networks offer a powerful approach to detecting and understanding the source of conflicting signals within phylogenetic data.

14.
Currently available methods for model selection used in phylogenetic analysis are based on an initial fixed-tree topology. Once a model is picked based on this topology, a rigorous search of the tree space is run under that model to find the maximum-likelihood estimate of the tree (topology and branch lengths) and the maximum-likelihood estimates of the model parameters. In this paper, we propose two extensions to the decision-theoretic (DT) approach that relax the fixed-topology restriction. We also relax the fixed-topology restriction for the Bayesian information criterion (BIC) and the Akaike information criterion (AIC) methods. We compare the performance of the different methods (the relaxed, restricted, and the likelihood-ratio test [LRT]) using simulated data. This comparison is done by evaluating the relative complexity of the models resulting from each method and by comparing the performance of the chosen models in estimating the true tree. We also compare the methods relative to one another by measuring the closeness of the estimated trees corresponding to the different chosen models under these methods. We show that varying the topology does not have a major impact on model choice. We also show that the outcome of the two proposed extensions is identical and is comparable to that of the BIC, Extended-BIC, and DT. Hence, using the simpler methods in choosing a model for analyzing the data is more computationally feasible, with results comparable to the more computationally intensive methods. Another outcome of this study is that earlier conclusions about the DT approach are reinforced. That is, LRT, Extended-AIC, and AIC result in more complicated models that do not contribute to the performance of the phylogenetic inference, yet cause a significant increase in the time required for data analysis.
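For readers unfamiliar with the two information criteria named in this abstract, the Python sketch below applies their standard textbook definitions, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, to a set of candidate models. The log-likelihoods, parameter counts, and alignment length are invented purely for illustration, and the helper names are hypothetical; the sketch does not reproduce the paper's DT or extended criteria.

```python
from math import log


def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * log(n) - 2 * log_likelihood


def best_model(scores: dict) -> str:
    """Return the name of the model with the lowest criterion value."""
    return min(scores, key=scores.get)


# Hypothetical candidates: name -> (maximized log-likelihood, free parameters).
candidates = {
    "JC69": (-5200.0, 0),
    "HKY85": (-5100.0, 4),
    "GTR+G": (-5090.0, 9),
}
n_sites = 1000  # hypothetical alignment length, used as the sample size

aic_scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
bic_scores = {name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()}

print("AIC picks:", best_model(aic_scores))
print("BIC picks:", best_model(bic_scores))
```

With these made-up numbers AIC selects the most parameter-rich model while BIC selects a simpler one, because BIC's penalty grows with ln(n). That mirrors, in a toy setting only, the abstract's remark that AIC tends to yield more complicated models.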

15.
An experimental phylogeny was constructed using bacteriophage T7 and a propagation protocol, in the presence of the mutagen N-methyl-N′-nitro-N-nitrosoguanidine, based on Hillis et al. [Hillis, D.M., Bull, J.J., White, M.E., Badgett, M.R., Molineux, I.J., 1992. Experimental phylogenetics, generation of a known phylogeny. Science 255, 589–592]. The topology presented in this study has a considerable variation in branch lengths and is less symmetric than the one presented by Hillis et al. (1992). These features are known to present additional difficulties to phylogenetic inference methods. The performance of several phylogenetic methods (conventional and less conventional) was tested using restriction site and nucleotide data. Only methods that encompassed a molecular clock or those based on sequence signatures recovered the true phylogeny. Nevertheless, a likelihood ratio test rejected the hypothesis of the existence of a molecular clock when the whole sequence data set was considered. This fact or the particular substitution pattern (mainly G → A and C → T) may be related to the unexpected performance of distance methods based on sequence signatures. To test if the results could have been predicted by simulation studies we estimated the evolution parameters from the real phylogeny and used them to simulate evolution along the same tree (parametric bootstrap). We found that simulation could predict most but not all of the problems encountered by phylogenetic inference methods in the real phylogeny. Short interior branches may be more prone to error than predicted by theoretical studies.

16.
There is accumulating evidence that the general shape of the ribosomal DNA-based phylogeny of Eukaryotes is strongly biased by the long-branch attraction phenomenon, leading to an artifactual basal clustering of groups that are probably highly derived. Among these groups, Foraminifera are of particular interest, because their deep phylogenetic position in ribosomal trees contrasts with their Cambrian appearance in the fossil record. A recent actin-based phylogeny of Eukaryotes has proposed that Foraminifera might be closely related to Cercozoa and, thus, branch among the so-called crown of Eukaryotes. Here, we reanalyze the small-subunit ribosomal RNA gene (SSU rDNA) phylogeny by removing all long-branching lineages that could artifactually attract foraminiferan sequences to the base of the tree. Our analyses reveal that Foraminifera branch together with the marine testate filosean Gromia oviformis as a sister group to Cercozoa, in agreement with actin phylogeny. Our study confirms the utility of SSU rDNA as a phylogenetic marker of megaevolutionary history, provided that the artifacts due to the heterogeneity of substitution rates in ribosomal genes are circumvented.

17.
Until recently, phylogenetic analyses have been routinely based on homologous sequences of a single gene. Given the vast number of gene sequences now available, phylogenetic studies are now based on the analysis of multiple genes. Thus, it has become necessary to devise statistical methods to combine multiple molecular data sets. Here, we compare several models for combining different genes for the purpose of evaluating the likelihood of tree topologies. Three methods of branch length estimation were studied: assuming all genes have the same branch lengths (concatenate model), assuming that branch lengths are proportional among genes (proportional model), or assuming that each gene has a separate set of branch lengths (separate model). We also compared three models of among-site rate variation: the homogeneous model, a model that assumes one gamma parameter for all genes, and a model that assumes one gamma parameter for each gene. On the basis of two nuclear and one mitochondrial amino acid data sets, our results suggest that, depending on the data set chosen, either the separate model or the proportional model represents the most appropriate method for branch length analysis. For all the data sets examined, one gamma parameter for each gene represents the best model for among-site rate variation. Using these models we analyzed alternative mammalian tree topologies, and we describe the effect of the assumed model on the maximum likelihood tree. We show that the choice of the model has an impact on the best phylogeny obtained.
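The three branch length models in this abstract differ mainly in how many branch length parameters they estimate. The Python sketch below counts those parameters under assumptions that are not stated in the abstract: an unrooted, fully bifurcating tree (2n - 3 branches for n taxa), and a proportional model with one shared set of branch lengths plus one rate multiplier per gene, with one multiplier fixed. The function name and example sizes are hypothetical.

```python
def branch_length_parameters(n_taxa: int, n_genes: int) -> dict:
    """Count free branch length parameters under three combination models.

    Assumes an unrooted, fully bifurcating tree, which has 2n - 3 branches.
    """
    if n_taxa < 3 or n_genes < 1:
        raise ValueError("need at least 3 taxa and 1 gene")
    n_branches = 2 * n_taxa - 3
    return {
        # One set of branch lengths shared by all genes.
        "concatenate": n_branches,
        # Shared branch lengths plus a rate multiplier for each gene,
        # with one multiplier fixed to keep the model identifiable.
        "proportional": n_branches + (n_genes - 1),
        # An independent set of branch lengths for every gene.
        "separate": n_branches * n_genes,
    }


# Hypothetical example: 10 taxa and 3 genes.
print(branch_length_parameters(10, 3))
```

Under these assumptions the proportional model adds only a few parameters to the concatenate model, while the separate model multiplies the count by the number of genes. That trade-off is the usual motivation for comparing such models with a likelihood-based criterion.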

18.
The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum-likelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distance-based and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page: http://www.lirmm.fr/w3ifa/MAAS/.

19.
Nuclear-encoded SSU rDNA sequences have been obtained from 64 strains of conjugating green algae (Zygnemophyceae, Streptophyta, Viridiplantae). Molecular phylogenetic analyses of 90 SSU rDNA sequences of Viridiplantae (including 78 from the Zygnemophyceae) were performed using complex evolutionary models and maximum likelihood, distance, and maximum parsimony methods. The significance of the results was tested by bootstrap analyses, deletion of long-branch taxa, relative rate tests, and Kishino-Hasegawa tests with user-defined trees. All results support the monophyly of the class Zygnemophyceae and of the order Desmidiales. The second order, Zygnematales, forms a series of early-branching clades in paraphyletic succession, with the two traditional families Mesotaeniaceae and Zygnemataceae not recovered as lineages. Instead, a long-branch Spirogyra/Sirogonium clade and the later-diverging Netrium and Roya clades represent independent clades. Within the order Desmidiales, the families Gonatozygaceae and Closteriaceae are monophyletic, whereas the Peniaceae (represented only by Penium margaritaceum) and the Desmidiaceae represent a single weakly supported lineage. Within the Desmidiaceae short internal branches and varying rates of sequence evolution among taxa reduce the phylogenetic resolution significantly. The SSU rDNA-based phylogeny is largely congruent with a published analysis of the rbcL phylogeny of the Zygnemophyceae (McCourt et al. 2000) and is also in general agreement with classification schemes based on cell wall ultrastructure. The extended taxon sampling at the subgenus level provides solid evidence that many genera in the Zygnemophyceae are not monophyletic and that the genus concept in the group needs to be revised.

20.
Nucleotide sequences from a 434-bp region of the 16S rRNA gene were analyzed for 65 taxa of Hymenoptera (ants, bees, wasps, parasitoid wasps, sawflies) to examine the patterns of variation within the gene fragment and the taxonomic levels for which it shows maximum utility in phylogeny estimation. A hierarchical approach was adopted in the study through comparison of levels of sequence variation among taxa at different taxonomic levels. As previously reported for many holometabolous insects, the 16S data reported here for Hymenoptera are highly AT-rich and exhibit strong site-to-site variation in substitution rate. More precise estimates of the shape parameter (alpha) of the gamma distribution and the proportion of invariant sites were obtained in this study by employing a reference phylogeny and utilizing maximum-likelihood estimation. The effectiveness of this approach to recovering expected phylogenies of selected hymenopteran taxa has been tested against the use of maximum parsimony. This study finds that the 16S gene is most informative for phylogenetic analysis at two different levels: among closely related species or populations, and among tribes, subfamilies, and families. Maximization of the phylogenetic signal extracted from the 16S gene at higher taxonomic levels may require consideration of the base composition bias and the site-to-site rate variation in a maximum-likelihood framework.
