Similar articles
 20 similar articles found (search time: 125 ms)
1.
The effect of missing data on phylogenetic methods is a potentially important issue in our attempts to reconstruct the Tree of Life. If missing data are truly problematic, then it may be unwise to include species in an analysis that lack data for some characters (incomplete taxa) or to include characters that lack data for some species. Given the difficulty of obtaining data from all characters for all taxa (e.g., fossils), missing data might seriously impede efforts to reconstruct a comprehensive phylogeny that includes all species. Fortunately, recent simulations and empirical analyses suggest that missing data cells are not themselves problematic, and that incomplete taxa can be accurately placed as long as the overall number of characters in the analysis is large. However, these studies have so far only been conducted on parsimony, likelihood, and neighbor-joining methods. Although Bayesian phylogenetic methods have become widely used in recent years, the effects of missing data on Bayesian analysis have not been adequately studied. Here, we conduct simulations to test whether Bayesian analyses can accurately place incomplete taxa despite extensive missing data. In agreement with previous studies of other methods, we find that Bayesian analyses can accurately reconstruct the position of highly incomplete taxa (i.e., 95% missing data), as long as the overall number of characters in the analysis is large. These results suggest that highly incomplete taxa can be safely included in many Bayesian phylogenetic analyses.
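
As a rough illustration of what "95% missing data" means for a single taxon, the sketch below replaces a chosen fraction of one sequence's sites with the missing-data symbol `?`. The function name `mask_taxon`, the toy sequence, and the random seed are assumptions made for this example; the abstract does not describe how its simulations were implemented.

```python
import random


def mask_taxon(sequence, missing_fraction, rng):
    """Return a copy of `sequence` with a fraction of its sites replaced by '?'."""
    n_sites = len(sequence)
    n_missing = round(n_sites * missing_fraction)
    # Choose which sites become missing-data cells, without replacement.
    masked_positions = set(rng.sample(range(n_sites), n_missing))
    return "".join(
        "?" if i in masked_positions else base
        for i, base in enumerate(sequence)
    )


rng = random.Random(0)           # fixed seed so the example is repeatable
complete = "ACGT" * 250          # hypothetical 1000-site sequence
incomplete = mask_taxon(complete, 0.95, rng)
print(incomplete.count("?"), "of", len(incomplete), "sites are missing")
```

With 1000 sites and 95% missing data, only 50 informative sites remain for that taxon, which is why the total number of characters in the analysis matters.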

2.
The well-known ciliate Mesodinium Stein, 1863 is of great importance to marine microbial food webs and is associated with "red tides". However, it is possibly one of the most confusing ciliate taxa in terms of its systematic position: both the morphological and the molecular data exclude it from all other known assemblages or groups. In the current work, the sequences of small subunit ribosomal RNA (SSU rRNA) genes for all available isolates are analysed and the secondary structure patterns of related groups are examined. The results indicate that (1) Mesodinium invariably represents a completely separate and isolated clade positioned between two subphyla of ciliates with very deep branching, which indicates that it should be a primitive or ancestral group for the subphylum Intramacronucleata; (2) the secondary structure of the SSU rRNA of Mesodinium species is unusual in that, while the secondary structure of V4 in Mesodinium sp. has the deletions common to all litostome ciliates, it has more extensive deletions in helix E238 and a longer helix E231; (3) combining the phylogenetic and morphological information, we suggest establishing Mesodiniea cl. nov., including the order Mesodiniida Grain, 1994, belonging to the subphylum Intramacronucleata.

3.
Recent phylogenetic analyses revealed a grade with Ranunculales, Sabiales, Proteales, Trochodendrales, and Buxales as first branching eudicots, with the respective positions of Proteales and Sabiales still lacking statistical confidence. As previous analyses of conserved plastid genes remain inconclusive, we aimed to use and evaluate a representative set of plastid introns (group I: trnL; group II: petD, rpl16, trnK) and intergenic spacers (trnL-F, petB-petD, atpB-rbcL, rps3-rpl16) in comparison to the rapidly evolving matK and slowly evolving atpB and rbcL genes. Overall patterns of microstructural mutations converged across genomic regions, underscoring the existence of a general mutational pattern throughout the plastid genome. Phylogenetic signal differed strongly between functionally and structurally different genomic regions and was highest in matK, followed by spacers, then group II and group I introns. The more conserved atpB and rbcL coding regions showed distinctly lower phylogenetic information content. Parsimony, maximum likelihood, and Bayesian phylogenetic analyses based on the combined dataset of non-coding and rapidly evolving regions (14 000 aligned characters) converged to a backbone topology of eudicots with Ranunculales branching first, a Proteales-Sabiales clade second, followed by Trochodendrales and Buxales. Gunnerales generally appeared as sister to all remaining core eudicots with maximum support. Our results show that a small number of intron and spacer sequences allow similar insights into phylogenetic relationships of eudicots compared to datasets of many combined genes. The non-coding proportion of the plastid genome thus can be considered an important information source for plastid phylogenomics.

4.
5.
The taxonomic treatment within the unigeneric tribe Yinshanieae (Brassicaceae) is controversial, owing to differences in generic delimitation applied to its species. In this study, sequences from nuclear ITS and chloroplast trnL-F regions were used to test the monophyly of Yinshanieae, while two nuclear markers (ITS, ETS) and four chloroplast markers (trnL-F, trnH-psbA, rps16, rpl32-trnL) were used to elucidate the phylogenetic relationships within the tribe. Using maximum parsimony, maximum likelihood, and Bayesian inference methods, we reconstructed the phylogeny of Brassicaceae and Yinshanieae. The results show that Yinshanieae is not a monophyletic group, with the taxa splitting into two distantly related clades: one clade contains four taxa and falls in Lineage I, whereas the other includes all species previously placed in Hilliella and is embedded in the Expanded Lineage II. The tribe Yinshanieae is redefined, and a new tribe, Hillielleae, is proposed based on combined evidence from molecular phylogeny, morphology, and cytology.

6.
A new species of the genus Gekko (Squamata: Gekkonidae) is described from the border of Sichuan and Yunnan Provinces, southwest China, based on distinct morphological and molecular features. Gekko jinjiangensis sp. nov. is distinguished from congeners by a combination of the following characters: size small (SVL 50.2–61.6 mm, n = 13); nares in contact with rostral; interorbital scales between anterior corners of the eyes 20–24; ventral scales between mental and cloacal slit 146–169; midbody scale rows 111–149; ventral scale rows 31–47; subdigital lamellae on first toe 8–11, on fourth toe 11–15; no webbing in the fingers and toes; tubercles present on upper surface of fore and hind limbs; precloacal pores 4–5 in males; postcloacal unilateral tubercles 1–2; dorsal surface of body with 8–9 large greyish brown markings between nape and sacrum. In molecular analyses, the new species is sister to G. scabridus, but separated from it by approximately 9.9%–12.2% genetic divergence, as shown by a fragment of the mitochondrial ND2 gene. The new species is the highest-elevation Gekko, with an elevational range of 2000 to 2476 m. Further surveys are recommended to better understand the occurrence and population status of the new species.

7.
The ciliate genus Protocruzia is one of the most ambiguous taxa with respect to its systematic position: it is possibly a member of the classes Heterotrichea, Spirotrichea, or Karyorelictea, and is tentatively placed in Spirotrichea in Lynn's 2008 system. To test these hypotheses, multigene trees (Bayesian inference, evolutionary distance, maximum parsimony, and maximum likelihood) were constructed using the small subunit rRNA (SSU rRNA) gene, internal transcribed spacer 2 (ITS2) and a protein coding gene (histone H...

8.
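A full-length cDNA library was constructed from the head kidney of the sea bass (Lateolabrax japonicus). The ribosomal protein L8 gene of the sea bass was amplified by PCR; it is 848 bp long, encodes 257 amino acids, and contains two conserved domains, L2 and L2-C. Phylogenetic analysis showed that evolutionary assessments using L8 as the reference closely match those based on 18S, the classical molecular standard, so the ribosomal protein L8 gene may serve as a new standard for assessing the evolutionary status of species.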

9.
The Composition Vector Tree (CVTree) is a parameter-free and alignment-free method to infer prokaryotic phylogeny from their complete genomes. It is distinct from the traditional 16S rRNA analysis in both the input data and the methodology. The prokaryotic phylogenetic trees constructed by using the CVTree method agree well with Bergey's taxonomy in all major groupings and fine branching patterns. Thus, combined use of the CVTree approach and the 16S rRNA analysis may provide an objective and reliable reconstruction of the prokaryotic branch of the Tree of Life.
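
The abstract does not spell out how a composition vector is built, so the following is only a minimal sketch of the general idea as it is usually described for CVTree: count k-strings, subtract the frequency predicted from shorter strings by a Markov-type estimate, and compare genomes through the cosine of the resulting vectors. The toy DNA strings, the choice k = 3, and all function names are assumptions for illustration; published CVTree analyses typically work on whole-proteome amino-acid sequences with larger k.

```python
from collections import Counter
from math import sqrt


def kmer_freqs(seq, k):
    """Relative frequency of every length-k substring observed in `seq`."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}


def composition_vector(seq, k):
    """Relative deviation of observed k-string frequencies from a Markov-type prediction."""
    f_k = kmer_freqs(seq, k)
    f_k1 = kmer_freqs(seq, k - 1)
    f_k2 = kmer_freqs(seq, k - 2)
    vector = {}
    for prefix in f_k1:
        for letter in set(seq):
            kmer = prefix + letter
            suffix = kmer[1:]
            if suffix not in f_k1:
                continue  # predicted frequency is zero, so no component is defined
            # Predicted frequency from the two overlapping (k-1)-strings.
            expected = f_k1[prefix] * f_k1[suffix] / f_k2[kmer[1:-1]]
            vector[kmer] = (f_k.get(kmer, 0.0) - expected) / expected
    return vector


def cv_distance(u, v):
    """Distance (1 - cosine) / 2 between two sparse composition vectors."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return (1.0 - dot / (norm_u * norm_v)) / 2.0


# Hypothetical toy "genomes"; real input would be complete genomes or proteomes.
genome_a = composition_vector("ATGGCGTACGTTAGCATCGATCGGATCCATGCAAGT", 3)
genome_b = composition_vector("ATGCCGTTCGTAAGCTTCGAACGGTTCCATGGAAGA", 3)
print(cv_distance(genome_a, genome_b))
```

A matrix of such pairwise distances can then be passed to a distance-based tree-building method such as neighbor joining; no sequence alignment is needed at any step.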

10.
Members of the Structural Maintenance of Chromosome (SMC) family have long been of interest to molecular and evolutionary biologists for their role in chromosome structural dynamics, particularly sister chromatid cohesion, condensation, and DNA repair. SMC and related proteins are found in all major groups of living organisms and share a common structure of conserved N and C globular domains separated from the conserved hinge domain by long coiled-coil regions. In eukaryotes there are six paralogous proteins that form three heterodimeric pairs, whereas in prokaryotes there is only one SMC protein that homodimerizes. From recently completed genome sequences, we have identified SMC genes from 34 eukaryotes that have not been described in previous reports. Our phylogenetic analysis of these and previously identified SMC genes supports an origin for the vertebrate meiotic SMC1 in the most recent common ancestor since the divergence from invertebrate animals. Additionally, we have identified duplicate copies due to segmental duplications for some of the SMC paralogs in plants and yeast, mainly SMC2 and SMC6, and detected evidence that duplicates of other paralogs were lost, suggesting differential evolution for these genes. Our analysis indicates that the SMC paralogs have been stably maintained at very low copy numbers, even after segmental (genome-wide) duplications. It is possible that such low copy numbers might be selected during eukaryotic evolution, although other possibilities are not ruled out.

11.
Increased taxon sampling greatly reduces phylogenetic error   (total citations: 1; self-citations: 0; citations by others: 1)
Several authors have argued recently that extensive taxon sampling has a positive and important effect on the accuracy of phylogenetic estimates. However, other authors have argued that there is little benefit of extensive taxon sampling, and so phylogenetic problems can or should be reduced to a few exemplar taxa as a means of reducing the computational complexity of the phylogenetic analysis. In this paper we examined five aspects of study design that may have led to these different perspectives. First, we considered the measurement of phylogenetic error across a wide range of taxon sample sizes, and conclude that the expected error based on randomly selecting trees (which varies by taxon sample size) must be considered in evaluating error in studies of the effects of taxon sampling. Second, we addressed the scope of the phylogenetic problems defined by different samples of taxa, and argue that phylogenetic scope needs to be considered in evaluating the importance of taxon-sampling strategies. Third, we examined the claim that fast and simple tree searches are as effective as more thorough searches at finding near-optimal trees that minimize error. We show that a more complete search of tree space reduces phylogenetic error, especially as the taxon sample size increases. Fourth, we examined the effects of simple versus complex simulation models on taxonomic sampling studies. Although benefits of taxon sampling are apparent for all models, data generated under more complex models of evolution produce higher overall levels of error and show greater positive effects of increased taxon sampling. Fifth, we asked if different phylogenetic optimality criteria show different effects of taxon sampling. Although we found strong differences in effectiveness of different optimality criteria as a function of taxon sample size, increased taxon sampling improved the results from all the common optimality criteria. 
Nonetheless, the method that showed the lowest overall performance (minimum evolution) also showed the least improvement from increased taxon sampling. Taking each of these results into account reinforces the conclusion that increased sampling of taxa is one of the most important ways to increase overall phylogenetic accuracy.
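
The abstract does not say which measure of phylogenetic error was used. One common choice, assumed here purely for illustration, is the Robinson-Foulds distance between the true and estimated trees. The sketch below uses a rooted (clade-based) variant on trees written as nested tuples, and normalizes by the maximum possible value, which depends on the number of taxa; this dependence is one reason error values are hard to compare across taxon sample sizes.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of leaf names."""
    found = set()

    def leaves_below(node):
        if isinstance(node, str):          # a leaf is just its taxon name
            return frozenset([node])
        leaf_set = frozenset().union(*(leaves_below(child) for child in node))
        found.add(leaf_set)
        return leaf_set

    root = leaves_below(tree)
    found.discard(root)                    # the root clade is shared by every tree
    return found


def rf_distance(tree_1, tree_2):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(tree_1) ^ clades(tree_2))


def normalized_rf(tree_1, tree_2, n_taxa):
    """RF distance divided by its maximum, 2 * (n - 2), for rooted binary trees."""
    return rf_distance(tree_1, tree_2) / (2 * (n_taxa - 2))


# Hypothetical five-taxon example: the estimate misplaces taxon B.
true_tree = ((("A", "B"), "C"), ("D", "E"))
estimated_tree = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(true_tree, estimated_tree))
print(normalized_rf(true_tree, estimated_tree, 5))
```

Here the two trees differ only in the clades {A, B} and {A, C}, so the raw distance is 2 out of a possible 6.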

12.
The phylogenetic relationships of species are fundamental to any biological investigation, including all evolutionary studies. Accurate inferences of sister group relationships provide the researcher with an historical framework within which the attributes or geographic origin of species (or supraspecific groups) evolved. Taken out of this phylogenetic context, interpretations of evolutionary processes or origins, geographic distributions, or speciation rates and mechanisms, are subject to nothing less than a biological experiment without controls. Cypriniformes is the most diverse clade of freshwater fishes with estimates of diversity of nearly 3,500 species. These fishes display an amazing array of morphological, ecological, behavioral, and geographic diversity and offer a tremendous opportunity to enhance our understanding of the biotic and abiotic factors associated with diversification and adaptation to environments. Given the nearly global distribution of these fishes, they serve as an important model group for a plethora of biological investigations, including indicator species for future climatic changes. The occurrence of the zebrafish, Danio rerio, in this order makes this clade a critical component in understanding and predicting the relationship between mutagenesis and phenotypic expressions in vertebrates, including humans. With the tremendous diversity in Cypriniformes, our understanding of their phylogenetic relationships has not proceeded at an acceptable rate, despite a plethora of morphological and more recent molecular studies. Most studies are pre-Hennigian in origin or include relatively small numbers of taxa. 
Given that analyses of small numbers of taxa for molecular characters can be compromised by peculiarities of long-branch attraction and nodal-density effect, it is critical that significant progress in our understanding of the relationships of these important fishes occurs with increasing sampling of species to mitigate these potential problems. The recent Cypriniformes Tree of Life initiative is an effort to achieve this goal with morphological and molecular (mitochondrial and nuclear) data. In this early synthesis of our understanding of the phylogenetic relationships of these fishes, all types of data have contributed historically to improving our understanding, but not all analyses are complementary in taxon sampling, thus precluding direct understanding of the impact of taxon sampling on achieving accurate phylogenetic inferences. However, recent molecular studies do provide some insight and in some instances taxon sampling can be implicated as a variable that can influence sister group relationships. Other instances may also exist but without inclusion of more taxa for both mitochondrial and nuclear genes, one cannot distinguish between inferences being dictated by taxon sampling or the origins of the molecular data.

13.
The effect of taxonomic sampling on phylogenetic accuracy under parsimony is examined by simulating nucleotide sequence evolution. Random error is minimized by using very large numbers of simulated characters. This allows estimation of the consistency behavior of parsimony, even for trees with up to 100 taxa. Data were simulated on 8 distinct 100-taxon model trees and analyzed as stratified subsets containing either 25 or 50 taxa, in addition to the full 100-taxon data set. Overall accuracy decreased in a majority of cases when taxa were added. However, the magnitude of change in the cases in which accuracy increased was larger than the magnitude of change in the cases in which accuracy decreased, so, on average, overall accuracy increased as more taxa were included. A stratified sampling scheme was used to assess accuracy for an initial subsample of 25 taxa. The 25-taxon analyses were compared to 50- and 100-taxon analyses that were pruned to include only the original 25 taxa. On average, accuracy for the 25 taxa was improved by taxon addition, but there was considerable variation in the degree of improvement among the model trees and across different rates of substitution.

14.
Recent studies have shown that addition or deletion of taxa from a data matrix can change the estimate of phylogeny. I used 29 data sets from the literature to examine the effect of taxon sampling on phylogeny estimation within data sets. I then used multiple regression to assess the effect of number of taxa, number of characters, homoplasy, strength of support, and tree symmetry on the sensitivity of data sets to taxonomic sampling. Sensitivity to sampling was measured by mapping characters from a matrix of culled taxa onto optimal trees for that reduced matrix and onto the pruned optimal tree for the entire matrix, then comparing the length of the reduced tree to the length of the pruned complete tree. Within-data-set patterns can be described by a second-order equation relating fraction of taxa sampled to sensitivity to sampling. Multiple regression analyses found number of taxa to be a significant predictor of sensitivity to sampling; retention index, number of informative characters, total support index, and tree symmetry were nonsignificant predictors. I derived a predictive regression equation relating fraction of taxa sampled and number of taxa potentially sampled to sensitivity to taxonomic sampling and calculated values for this equation within the bounds of the variables examined. The length difference between the complete tree and a subsampled tree was generally small (average difference of 0-2.9 steps), indicating that subsampling taxa is probably not an important problem for most phylogenetic analyses using up to 20 taxa.

15.
Density of taxon sampling and number/kind of characters are central to achieving the ultimate goals in phylogenetic reconstruction: tree robustness and improved accuracy. In molecular phylogenetics, DNA sequence repositories such as GenBank are potential sources for expanding datasets in two dimensions, taxa and characters, to the level of “supermatrices.” However, the issue of missing characters/genomic regions is generally considered a major impediment to this endeavor. Here we used the angiosperm order Caryophyllales to systematically address the impact of missing data when expanding taxon sampling and number of characters in phylogenetic reconstruction. Our analyses show that expansion of taxon sampling by ~13-fold resulted in improved phylogenetic assessment of the Caryophyllales despite up to 38% missing data. Expanding the number of characters in the dataset by allowing for up to a 100-fold increase in the amount of missing data and inclusion of entries with about 40% missing genomic regions did not negatively impact tree structure or robustness, but to the contrary improved both. These results are timely regarding the ongoing efforts to achieve detailed assessment of the tree of life.

16.
Phylogenetic estimation of evolutionary timescales has become routine in biology, forming the basis of a wide range of evolutionary and ecological studies. However, there are various sources of bias that can affect these estimates. We investigated whether tree imbalance, a property that is commonly observed in phylogenetic trees, can lead to reduced accuracy or precision of phylogenetic timescale estimates. We analysed simulated data sets with calibrations at internal nodes and at the tips, taking into consideration different calibration schemes and levels of tree imbalance. We also investigated the effect of tree imbalance on two empirical data sets: mitogenomes from primates and serial samples of the African swine fever virus. In analyses calibrated using dated, heterochronous tips, we found that tree imbalance had a detrimental impact on precision and produced a bias in which the overall timescale was underestimated. A pronounced effect was observed in analyses with shallow calibrations. The greatest decreases in accuracy usually occurred in the age estimates for medium and deep nodes of the tree. In contrast, analyses calibrated at internal nodes did not display a reduction in estimation accuracy or precision due to tree imbalance. Our results suggest that molecular-clock analyses can be improved by increasing taxon sampling, with the specific aims of including deeper calibrations, breaking up long branches and reducing tree imbalance.
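
Tree imbalance can be quantified in several ways, and the abstract does not state which statistic was used. As one possible illustration, the sketch below computes the Colless index, a standard imbalance measure for rooted binary trees, on trees written as nested tuples. The example trees and the function name are assumptions made for this example.

```python
def colless_index(tree):
    """Sum, over internal nodes, of |leaves under left child - leaves under right child|."""
    total = 0

    def count_leaves(node):
        nonlocal total
        if isinstance(node, str):          # a leaf is just its taxon name
            return 1
        left, right = node                 # assumes a strictly binary tree
        n_left, n_right = count_leaves(left), count_leaves(right)
        total += abs(n_left - n_right)
        return n_left + n_right

    count_leaves(tree)
    return total


balanced = (("A", "B"), ("C", "D"))        # fully symmetric four-taxon tree
ladder = ((("A", "B"), "C"), "D")          # fully pectinate ("caterpillar") tree
print(colless_index(balanced), colless_index(ladder))
```

For four taxa the index ranges from 0 (fully balanced) to 3 (fully pectinate), so the two example trees sit at the extremes.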

17.
JJ Wiens, J Tiu. PLoS ONE, 2012, 7(8): e42925

Background

Phylogenies are essential to many areas of biology, but phylogenetic methods may give incorrect estimates under some conditions. A potentially common scenario of this type is when few taxa are sampled and terminal branches for the sampled taxa are relatively long. However, the best solution in such cases (i.e., sampling more taxa versus more characters) has been highly controversial. A widespread assumption in this debate is that added taxa must be complete (no missing data) in order to save analyses from the negative impacts of limited taxon sampling. Here, we evaluate whether incomplete taxa can also rescue analyses under these conditions (empirically testing predictions from an earlier simulation study).

Methodology/Principal Findings

We utilize DNA sequence data from 16 vertebrate species with well-established phylogenetic relationships. In each replicate, we randomly sample 4 species, estimate their phylogeny (using Bayesian, likelihood, and parsimony methods), and then evaluate whether adding in the remaining 12 species (which have 50, 75, or 90% of their data replaced with missing data cells) can improve phylogenetic accuracy relative to analyzing the 4 complete taxa alone. We find that in those cases where sampling few taxa yields an incorrect estimate, adding taxa with 50% or 75% missing data can frequently (>75% of relevant replicates) rescue Bayesian and likelihood analyses, recovering accurate phylogenies for the original 4 taxa. Even taxa with 90% missing data can sometimes be beneficial.

Conclusions

We show that adding taxa that are highly incomplete can improve phylogenetic accuracy in cases where analyses are misled by limited taxon sampling. These surprising empirical results confirm those from simulations, and show that the benefits of adding taxa may be obtained with unexpectedly small amounts of data. These findings have important implications for the debate on sampling taxa versus characters, and for studies attempting to resolve difficult phylogenetic problems.

18.
A phylogeny for 21 species of spatangoid sea urchins is constructed using data from three genes and results compared with morphology-based phylogenies derived for the same taxa and for a much larger sample of 88 Recent and fossil taxa. Different data sets and methods of analysis generate different phylogenetic hypotheses, although congruence tests show that all molecular approaches produce trees that are congruent with each other. By contrast, the trees generated from morphological data differ significantly according to taxon sampling density and only those with dense sampling (after a posteriori weighting) are congruent with molecular estimates. With limited taxon sampling, secondary reversals in deep-water taxa are interpreted as plesiomorphies, pulling them to a basal position. The addition of fossil taxa with their unique character combinations reveals hidden homoplasy and generates a phylogeny that is compatible with molecular estimates. As homoplasy levels were found to be broadly similar across different anatomical structures in the echinoid test, no one suite of morphological characters can be considered to provide more reliable phylogenetic information. Some traditional groupings are supported, including the grouping of Loveniidae, Brissidae and Spatangidae within the Micrasterina, but the Asterostomatidae is shown to be polyphyletic with members scattered amongst at least five different clades. As these are mostly deep-sea taxa, this finding implies multiple independent invasions into the deep sea.

19.
Previous studies of the phylogeny of land plants based on analysis of 18S ribosomal DNA (rDNA) sequences have generally found weak support for the relationships recovered and at least some obviously spurious relationships, resulting in equivocal inferences of land plant phylogeny. We hypothesized that greater sampling of both characters and taxa would improve inferences of land plant phylogeny based on 18S rDNA sequences. We therefore conducted a phylogenetic analysis of complete (or nearly complete) 18S rDNA sequences for 93 species of land plants and 7 green algal relatives. Parsimony analyses with equal weighting of characters and character-state changes and parsimony analyses weighting (1) stem bases half as much as loop bases and (2) transitions half as much as transversions did not produce substantially different topologies. Although the general structure of the shortest trees is consistent with most hypotheses of land plant phylogeny, several relationships, particularly among major groups of land plants, appear spurious. Increased character and taxon sampling did not substantially improve the performance of 18S rDNA in phylogenetic analyses of land plants, nor did analyses designed to accommodate variation in evolutionary rates among sites. The rate and pattern of 18S rDNA evolution across land plants may limit the usefulness of this gene for phylogeny reconstruction at deep levels of plant phylogeny. We conclude that the mosaic structure of 18S rDNA, consisting of highly conserved and highly variable regions, may contain historical signal at two levels. Rapidly evolving regions are informative for relatively recent divergences (e.g., within angiosperms, seed plants, and ferns), but homoplasy at these sites makes it difficult to resolve relationships among these groups. At deeper levels, changes in the highly conserved regions of small-subunit rDNAs provide signal across all of life. 
Because constraints imposed by the secondary structure of the rRNA may affect the phylogenetic information content of 18S rDNA, we suggest that 18S rDNA sequences be combined with other data and that methods of analysis be employed to accommodate these differences in evolutionary patterns, particularly across deep divergences in the tree of life.
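
To make the weighting of transitions "half as much as transversions" concrete, the sketch below scores a single alignment site on a fixed tree with Sankoff's dynamic-programming algorithm for weighted parsimony, charging 1 for a transition and 2 for a transversion. The tree, taxon names, and site patterns are hypothetical, and the abstract does not say which software or algorithm the authors used, so this is an illustration of the weighting scheme rather than a reconstruction of their analysis.

```python
PURINES = {"A", "G"}
STATES = "ACGT"
INF = float("inf")


def change_cost(x, y):
    """0 for no change, 1 for a transition, 2 for a transversion."""
    if x == y:
        return 0
    same_class = (x in PURINES) == (y in PURINES)
    return 1 if same_class else 2


def sankoff(node, site_states):
    """Map each state to the minimum cost of the subtree if `node` has that state."""
    if isinstance(node, str):              # leaf: only the observed state is allowed
        return {s: (0 if s == site_states[node] else INF) for s in STATES}
    child_costs = [sankoff(child, site_states) for child in node]
    return {
        s: sum(
            min(change_cost(s, t) + costs[t] for t in STATES)
            for costs in child_costs
        )
        for s in STATES
    }


def site_length(tree, site_states):
    """Weighted parsimony length of one site on a fixed rooted tree."""
    return min(sankoff(tree, site_states).values())


tree = (("t1", "t2"), ("t3", "t4"))
mixed_site = {"t1": "A", "t2": "G", "t3": "C", "t4": "C"}
transition_site = {"t1": "A", "t2": "G", "t3": "A", "t4": "G"}
print(site_length(tree, mixed_site), site_length(tree, transition_site))
```

Both sites need two changes on this tree, but the first requires a transversion and therefore costs 3 instead of 2. Down-weighting stem bases relative to loop bases would work at a different level: each site's length would be multiplied by a per-site weight before summing over the alignment.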

20.
Long branches in a true phylogeny tend to disrupt hierarchical character covariation (phylogenetic signal) in the distribution of traits among organisms. The distortion of hierarchical structure in character-state matrices can lead to errors in the estimation of phylogenetic relationships and inconsistency of methods of phylogenetic inference. Examination of trees distorted by long-branch attraction will not reveal the identities of problematic taxa, in part because the distortion can mask long branches by reducing inferred branch lengths and through errors in branching order. Here we present a simple method for the detection of taxa whose placement in evolutionary trees is made difficult by the effects of long-branch attraction. The method is an extension of a tree-independent conceptual framework of phylogenetic data exploration (RASA). Taxa that are likely to attract are revealed because long branches leave distinct footprints in the distribution of character states among taxa, and these traces can be directly observed in the error structure of the RASA regression. Problematic taxa are identified using a new diagnostic plot called the taxon variance plot, in which the apparent cladistic and phenetic variances contributed by individual taxa are compared. The procedure for identifying long edges employs algorithms solved in polynomial time and can be applied to morphological, molecular, and mixed characters. The efficacy of the method is demonstrated using simulated evolution and empirical evidence of long branches in a set of recently published sequences. We show that the accuracy of evolutionary trees can be improved by detecting and combating the potentially misleading influences of long-branch taxa.
