Similar literature
 20 similar articles found
1.
Taxon sampling and the accuracy of phylogenetic analyses   Total citations: 4, self-citations: 2, citations by others: 4
Appropriate and extensive taxon sampling is one of the most important determinants of accurate phylogenetic estimation. In addition, accuracy of inferences about evolutionary processes obtained from phylogenetic analyses is improved significantly by thorough taxon sampling efforts. Many recent efforts to improve phylogenetic estimates have focused instead on increasing sequence length or the number of overall characters in the analysis, and this often does have a beneficial effect on the accuracy of phylogenetic analyses. However, phylogenetic analyses of few taxa (but each represented by many characters) can be subject to strong systematic biases, which in turn produce high measures of repeatability (such as bootstrap proportions) in support of incorrect or misleading phylogenetic results. Thus, it is important for phylogeneticists to consider both the sampling of taxa, as well as the sampling of characters, in designing phylogenetic studies. Taxon sampling also improves estimates of evolutionary parameters derived from phylogenetic trees, and is thus important for improved applications of phylogenetic analyses. Analysis of sensitivity to taxon inclusion, the possible effects of long-branch attraction, and sensitivity of parameter estimation for model-based methods should be a part of any careful and thorough phylogenetic analysis. Furthermore, recent improvements in phylogenetic algorithms and in computational power have removed many constraints on analyzing large, thoroughly sampled data sets. Thorough taxon sampling is thus one of the most practical ways to improve the accuracy of phylogenetic estimates, as well as the accuracy of biological inferences that are based on these phylogenetic trees.  相似文献   

2.
3.
The phylogenetic relationships of species are fundamental to any biological investigation, including all evolutionary studies. Accurate inferences of sister group relationships provide the researcher with an historical framework within which the attributes or geographic origin of species (or supraspecific groups) evolved. Taken out of this phylogenetic context, interpretations of evolutionary processes or origins, geographic distributions, or speciation rates and mechanisms, are subject to nothing less than a biological experiment without controls. Cypriniformes is the most diverse clade of freshwater fishes with estimates of diversity of nearly 3,500 species. These fishes display an amazing array of morphological, ecological, behavioral, and geographic diversity and offer a tremendous opportunity to enhance our understanding of the biotic and abiotic factors associated with diversification and adaptation to environments. Given the nearly global distribution of these fishes, they serve as an important model group for a plethora of biological investigations, including indicator species for future climatic changes. The occurrence of the zebrafish, Danio rerio, in this order makes this clade a critical component in understanding and predicting the relationship between mutagenesis and phenotypic expressions in vertebrates, including humans. With the tremendous diversity in Cypriniformes, our understanding of their phylogenetic relationships has not proceeded at an acceptable rate, despite a plethora of morphological and more recent molecular studies. Most studies are pre-Hennigian in origin or include relatively small numbers of taxa. 
Given that analyses of small numbers of taxa for molecular characters can be compromised by peculiarities of long-branch attraction and nodal-density effect, it is critical that significant progress in our understanding of the relationships of these important fishes occurs with increasing sampling of species to mitigate these potential problems. The recent Cypriniformes Tree of Life initiative is an effort to achieve this goal with morphological and molecular (mitochondrial and nuclear) data. In this early synthesis of our understanding of the phylogenetic relationships of these fishes, all types of data have contributed historically to improving our understanding, but not all analyses are complementary in taxon sampling, thus precluding direct understanding of the impact of taxon sampling on achieving accurate phylogenetic inferences. However, recent molecular studies do provide some insight and in some instances taxon sampling can be implicated as a variable that can influence sister group relationships. Other instances may also exist but without inclusion of more taxa for both mitochondrial and nuclear genes, one cannot distinguish between inferences being dictated by taxon sampling or the origins of the molecular data.  相似文献   

4.
A major assumption of many molecular phylogenetic methods is the homogeneity of nucleotide frequencies among taxa, which refers to the equality of the nucleotide frequency bias among species. Changes in nucleotide frequency among different lineages in a data set are thought to lead to erroneous phylogenetic inference because unrelated clades may appear similar because of evolutionarily unrelated similarities in nucleotide frequencies. We tested the effects of the heterogeneity of nucleotide frequency bias on phylogenetic inference, along with the interaction between this heterogeneity and stratified taxon sampling, by means of computer simulations using evolutionary parameters derived from genomic databases. We found that the phylogenetic trees inferred from data sets simulated under realistic, observed levels of heterogeneity for mammalian genes were reconstructed with accuracy comparable to those simulated with homogeneous nucleotide frequencies; the results hold for Neighbor-Joining, minimum evolution, maximum parsimony, and maximum-likelihood methods. The LogDet distance method, specifically designed to deal with heterogeneous nucleotide frequencies, does not perform better than distance methods that assume substitution pattern homogeneity among sequences. In these specific simulation conditions, we did not find a significant interaction between phylogenetic accuracy and substitution pattern heterogeneity among lineages, even when the taxon sampling is increased.  相似文献   
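The homogeneity assumption described in this abstract can be checked on an alignment before any tree is built. The following is a minimal sketch, not the simulation procedure used in the study: it tabulates per-taxon base frequencies and reports the spread in G+C content across taxa. The function names and the toy alignment are hypothetical and chosen only for illustration.

```python
from collections import Counter


def base_frequencies(seq):
    """Relative frequency of A, C, G, T in one sequence.

    Gaps and ambiguity codes are ignored (an assumption of this sketch).
    """
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return {b: counts[b] / total for b in "ACGT"}


def gc_content_range(alignment):
    """Spread (max - min) of G+C content across taxa.

    A value of 0 means every taxon has the same G+C content.
    """
    gc = [f["G"] + f["C"] for f in map(base_frequencies, alignment.values())]
    return max(gc) - min(gc)


# Hypothetical toy alignment: taxon name -> aligned sequence.
toy_alignment = {
    "taxon_A": "ATGCATGCAT",
    "taxon_B": "GCGCGCGCAT",
    "taxon_C": "ATATATGCAT",
}

if __name__ == "__main__":
    for name, seq in toy_alignment.items():
        print(name, base_frequencies(seq))
    print("G+C range:", gc_content_range(toy_alignment))
```

A large spread only flags compositional heterogeneity; whether it actually misleads tree inference is the empirical question the abstract addresses.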

5.
    
Ecologists are increasingly making use of molecular phylogenies, especially in the fields of community ecology and conservation. However, these phylogenies are often used without full appreciation of their underlying assumptions and uncertainties. A frequent practice in ecological studies is inferring a phylogeny with molecular data from taxa only within the community of interest. These “inferred community phylogenies” are inherently biased in their taxon sampling. Despite the importance of comprehensive sampling in constructing phylogenies, the implications of using inferred community phylogenies in ecological studies have not been examined. Here, we evaluate how taxon sampling affects the quantification and comparison of community phylogenetic diversity using both simulated and empirical data sets. We demonstrate that inferred community trees greatly underestimate phylogenetic diversity and that the probability of incorrectly ranking community diversity can reach up to 25%, depending on the dating methods employed. We argue that to reach reliable conclusions, ecological studies must improve their taxon sampling and generate the best phylogeny possible.  相似文献   
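The abstract does not say which phylogenetic diversity index was used. As an illustration only, the sketch below assumes Faith's PD in its rooted form: the summed length of every branch on a path from a community member to the root. The tree encoding (child mapped to parent and branch length) and the toy tree are hypothetical.

```python
def faith_pd(parent, community):
    """Sum the lengths of all branches on a path from any community
    member to the root, counting each branch once.

    `parent` maps each non-root node to (parent_node, branch_length).
    """
    visited = set()
    total = 0.0
    for tip in community:
        node = tip
        # Walk towards the root; stop at the root or at a branch
        # that an earlier tip has already contributed.
        while node in parent and node not in visited:
            visited.add(node)
            up, length = parent[node]
            total += length
            node = up
    return total


# Hypothetical toy tree: ((A:1, B:1):2, C:3)
toy_tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}

if __name__ == "__main__":
    print(faith_pd(toy_tree, ["A", "B"]))
    print(faith_pd(toy_tree, ["A", "C"]))
```

Because the index is a sum of branch lengths, any change in estimated branch lengths or topology caused by restricted taxon sampling feeds directly into the diversity value.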

6.
    
  1. The amount and patterns of phylodiversity in a community are often used to draw inferences about the local and historical factors affecting community assembly and can be used to prioritize communities and locations for conservation. Because measures of phylodiversity are based on the topology and branch lengths of phylogenetic trees, which are affected by the number and diversity of taxa in the tree, these analyses may be sensitive to changes in taxon sampling and tree reconstruction methods.
  2. To investigate the effects of taxon sampling and tree reconstruction methods on measures of phylodiversity, we investigated the community phylogenetics of the Ordway‐Swisher Biological Station (Florida), which is home to over 600 species of vascular plants. We studied the effects of (a) the number of taxa included in the regional phylogeny; (b) random versus targeted sampling of species to assemble the regional species pool; (c) including only species from specific clades rather than broad sampling; (d) using trees reconstructed directly for the taxa under study compared to trees pruned from a larger reconstructed tree; and (e) using phylograms compared to chronograms.
  3. We found that including more taxa in a study increases the likelihood of observing significantly nonrandom phylogenetic patterns. However, there were no consistent trends in the phylodiversity patterns based on random taxon sampling compared to targeted sampling, or within individual clades compared to the complete dataset. Using pruned and reconstructed phylogenies resulted in similar patterns of phylodiversity, while chronograms in some cases led to significantly different results from phylograms.
  4. The methods commonly used in community phylogenetic studies can significantly impact the results, potentially influencing both inferences of community assembly and conservation decisions. We highlight the need for both careful selection of methods in community phylogenetic studies and appropriate interpretation of results, depending on the specific questions to be addressed.

7.
8.
    
Outgroup sampling is a fundamental step in the design of phylogenetic analyses, independent of optimality criterion, taxonomic group, or source of evidence. Studies have demonstrated the efficient analysis of many thousands of terminals, all of which could be included in any empirical investigation, yet outgroup samples typically include only a small number of terminals. Most discussion of outgroup sampling centers on employing “correct” or “appropriate” outgroup terminals to increase “accuracy” or “reliability” by preventing “errors” such as long branch attraction and “incorrect” ingroup rooting. As an alternative, I develop a theory of outgroup sampling grounded in the logic of scientific discovery, whereby the objective is to test nested hypotheses of ingroup topology and character‐state transformation as severely as possible by incorporating outgroup terminals in unconstrained, simultaneous analysis, using background knowledge to select the terminals that have the greatest chance of refuting those hypotheses. This framework provides a logical basis for selecting outgroup taxa but does not provide grounds for limiting the outgroup sample, given that, ceteris paribus, testability and explanatory power increase with the inclusion of additional terminals. Therefore, I propose the ancillary procedure of successively expanding the outgroup sample until ingroup hypotheses become stable (insensitive) to increased sampling, with each expansion guided by the scientific objectives of outgroup sampling. This is a heuristic procedure that does not prevent more outgroup terminals from being sampled or guarantee that ingroup hypotheses will remain insensitive to further outgroup expansion, and it has no bearing on the objective support of a given hypothesis. Nevertheless, it provides an objective, empirical basis for limiting outgroup sampling in a given research cycle. 
I illustrate this procedure by examining the effect of successive outgroup expansion on the relationships among the poison frog genera Adelphobates, Dendrobates, and Oophaga.
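The successive-expansion procedure described in this abstract can be outlined as a simple loop. This is a hedged sketch under stated assumptions, not the author's implementation: `infer_ingroup_topology` is a hypothetical callable standing in for a full phylogenetic analysis, the batches of outgroup terminals are assumed to have been ordered beforehand using background knowledge, and "stable" is taken to mean that the ingroup topology is identical in two consecutive rounds.

```python
def expand_outgroup_until_stable(ingroup, outgroup_batches, infer_ingroup_topology):
    """Add outgroup batches one at a time and stop as soon as the
    inferred ingroup topology is unchanged from the previous round.

    Returns (outgroup_used, last_topology, became_stable).
    """
    outgroup = []
    previous = None
    for batch in outgroup_batches:
        outgroup.extend(batch)
        # Hypothetical analysis step; any comparable result object works.
        current = infer_ingroup_topology(ingroup, list(outgroup))
        if previous is not None and current == previous:
            return outgroup, current, True
        previous = current
    # Ran out of candidate terminals before two rounds agreed.
    return outgroup, previous, False


def toy_inference(ingroup, outgroup):
    """Stand-in for a real analysis: the result flips once at least
    three outgroup terminals are present (purely illustrative)."""
    return "T1" if len(outgroup) < 3 else "T2"


if __name__ == "__main__":
    batches = [["o1"], ["o2", "o3"], ["o4"], ["o5"]]
    print(expand_outgroup_until_stable(["i1", "i2", "i3"], batches, toy_inference))
```

As the abstract notes, stopping here is a heuristic: agreement between two rounds does not guarantee that further expansion would leave the ingroup hypothesis unchanged.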

9.
Almost a decade ago, a new phylogeny of bilaterian animals was inferred from small-subunit ribosomal RNA (rRNA) that claimed the monophyly of two major groups of protostome animals: Ecdysozoa (e.g., arthropods, nematodes, onychophorans, and tardigrades) and Lophotrochozoa (e.g., annelids, molluscs, platyhelminths, brachiopods, and rotifers). However, it received little additional support. In fact, several multigene analyses strongly argued against this new phylogeny. These latter studies were based on a large amount of sequence data and therefore showed an apparently strong statistical support. Yet, they covered only a few taxa (those for which complete genomes were available), making systematic artifacts of tree reconstruction more probable. Here we expand this sparse taxonomic sampling and analyze a large data set (146 genes, 35,371 positions) from a diverse sample of animals (35 species). Our study demonstrates that the incongruences observed between rRNA and multigene analyses were indeed due to long-branch attraction artifacts, illustrating the enormous impact of systematic biases on phylogenomic studies. A refined analysis of our data set excluding the most biased genes provides strong support in favor of the new animal phylogeny and in addition suggests that urochordates are more closely related to vertebrates than are cephalochordates. These findings have important implications for the interpretation of morphological and genomic data.  相似文献   

10.
The "star paradox" in phylogenetics is the tendency for a particular resolved tree to be sometimes strongly supported even when the data is generated by an unresolved ("star") tree. There have been contrary claims as to whether this phenomenon persists when very long sequences are considered. This note settles one aspect of this debate by proving mathematically that the chance that a resolved tree could be strongly supported stays above some strictly positive number, even as the length of the sequences becomes very large.

11.
12.
Random sampling is an important statistical assumption, but it is virtually impossible when sampling a wild species because we cannot know where all the individuals are. While interpopulation or intrataxa sampling methods have been developed, there are currently few intrataxon sampling methods for deciding objectively where to sample wild taxa. We suggest a new sampling method that computes appropriate sampling locations from coordinates, assuming geographical autocorrelation of phylogeny within a taxon (isolation‐by‐distance). The computed locations encompass the highest genetic diversity, providing a genetically representative sample. In addition, the method can use presence/absence information gathered during sampling to reoptimize the sampling scheme. Compared with the single existing method with a similar purpose, ours has two merits: it requires no environmental data, which makes it easy to apply, and it is theoretically derived. We tested this method using published phylogeographical data. The results were generally encouraging, but the method failed where species showed uniform genetic structure or recent distribution expansion, both of which violate the assumption of geographical autocorrelation of phylogeny. Though simple, our method provides a methodological and statistical foundation for sampling wild species and is applicable to taxonomic revision and conservation biology.

13.
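Barcode of life and tree of life   Total citations: 1, self-citations: 0, citations by others: 1
Research on and applications of the barcode of life and the tree of life have attracted wide attention over the past decade and have become two hot topics in the life sciences. This paper reviews the conceptual origins, current state of research, problems, and proposed solutions for both, and discusses their prospects. The concept of the tree of life has deep historical roots, whereas DNA barcoding was proposed and implemented only about ten years ago; both have benefited from the rapid development of sequencing and bioinformatics technologies. Their aims differ, however: barcode-of-life technology aims at rapid species identification, while the main goal of tree-of-life research is to reconstruct the origin and evolutionary history of the living world and the relationships among groups of organisms. Development strategies and top-level design should therefore be tailored to the distinct goals of each. We describe the bottlenecks and problems currently facing research on the barcode of life and the tree of life and propose corresponding solutions. Finally, the authors recommend that researchers in China seize the opportunity and collaborate widely with scholars and engineers in many fields to advance both DNA barcode identification technology and theoretical research on the tree of life.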

14.
  Total citations: 2, self-citations: 0, citations by others: 2
Inferring the relationships among Bilateria has been an active and controversial research area since Haeckel. The lack of a sufficient number of phylogenetically reliable characters was the main limitation of traditional phylogenies based on morphology. With the advent of molecular data, this problem has been replaced by another one, statistical inconsistency, which stems from an erroneous interpretation of convergences induced by multiple changes. The analysis of alignments rich in both genes and species, combined with a probabilistic method (maximum likelihood or Bayesian) using sophisticated models of sequence evolution, should alleviate these two major limitations. We applied this approach to a dataset of 94 genes and 79 species using CAT, a previously developed model accounting for site-specific amino acid replacement patterns. The resulting tree is in good agreement with current knowledge: the monophyly of most major groups (e.g. Chordata, Arthropoda, Lophotrochozoa, Ecdysozoa, Protostomia) was recovered with high support. Two results are surprising and are discussed in an evo-devo framework: the sister-group relationship of Platyhelminthes and Annelida to the exclusion of Mollusca, contradicting the Neotrochozoa hypothesis, and, with a lower statistical support, the paraphyly of Deuterostomia. These results, in particular the status of deuterostomes, need further confirmation, both through increased taxonomic sampling, and future improvements of probabilistic models.  相似文献   

15.
In recent studies, phylogenetic networks have been derived from so-called multilabeled trees in order to understand the origins of certain polyploids. Although the trees used in these studies were constructed using sophisticated techniques in phylogenetic analysis, the presented networks were inferred using ad hoc arguments that cannot be easily extended to larger, more complicated examples. In this paper, we present a general method for constructing such networks, which takes as input a multilabeled phylogenetic tree and outputs a phylogenetic network with certain desirable properties. To illustrate the applicability of our method, we discuss its use in reconstructing the evolutionary history of plant allopolyploids. We conclude with a discussion concerning possible future directions. The network construction method has been implemented and is freely available for use from http://www.uea.ac.uk/~a043878/padre.html.

16.
The relative contribution of taxon number and gene number to accuracy in phylogenetic inference is a major issue in phylogenetics and of central importance to the choice of experimental strategies for the successful reconstruction of a broad sketch of the tree of life. Maximization of the number of taxa sampled is the strategy favored by most phylogeneticists, although its necessity remains the subject of debate. Vast increases in gene number are now possible due to advances in genomics, but large numbers of genes will be available for only modest numbers of taxa, raising the question of whether such genome-scale phylogenies will be robust to the addition of taxa. To examine the relative benefit of increasing taxon number or gene number to phylogenetic accuracy, we have developed an assay that utilizes the symmetric difference tree distance as a measure of phylogenetic accuracy. We have applied this assay to a genome-scale data matrix containing 106 genes from 14 yeast species. Our results show that increasing taxon number correlates with a slight decrease in phylogenetic accuracy. In contrast, increasing gene number has a significant positive effect on phylogenetic accuracy. Analyses of an additional taxon-rich data matrix from the same yeast clade show that taxon number does not have a significant effect on phylogenetic accuracy. The positive effect of gene number and the lack of effect of taxon number on phylogenetic accuracy are also corroborated by analyses of two data matrices from mammals and angiosperm plants, respectively. We conclude that, for typical data sets, the number of genes utilized may be a more important determinant of phylogenetic accuracy than taxon number.  相似文献   
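The symmetric difference tree distance mentioned in this abstract counts the groupings found in one tree but not the other. The sketch below is an assumption-laden illustration rather than the assay from the study: it handles rooted trees written as nested tuples of leaf names, compares their clades (sets of descendant leaves), and requires both trees to contain the same taxa. The function names and example trees are hypothetical.

```python
def clades(tree):
    """Return (leaf_set, clade_set) for a rooted tree given as nested
    tuples of leaf names, e.g. (("A", "B"), ("C", "D")).

    Each clade is the frozenset of leaves below one internal node.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


def symmetric_difference_distance(t1, t2):
    """Number of clades present in exactly one of the two trees."""
    leaves1, c1 = clades(t1)
    leaves2, c2 = clades(t2)
    if leaves1 != leaves2:
        raise ValueError("trees must contain the same taxa")
    # The root clade (all leaves) occurs in both trees and cancels out.
    return len(c1 ^ c2)


if __name__ == "__main__":
    reference = (("A", "B"), ("C", "D"))
    estimate = ((("A", "B"), "C"), "D")
    print(symmetric_difference_distance(reference, estimate))
```

A distance of 0 means the two topologies agree on every clade; with a trusted reference tree, smaller distances correspond to more accurate estimates.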

17.
Summary: The existence of two families of genes coding for hexameric glutamate dehydrogenases has been deduced from the alignment of 21 primary sequences and the determination of the percentages of similarity between each pair of proteins. Each family could also be characterized by specific motifs. One family (Family I) was composed of gdh genes from six eubacteria and six lower eukaryotes (the primitive protozoan Giardia lamblia, the green alga Chlorella sorokiniana, and several fungi and yeasts). The other one (Family II) was composed of gdh genes from two eubacteria, two archaebacteria, and five higher eukaryotes (vertebrates). Reconstruction of phylogenetic trees using several parsimony and distance methods confirmed the existence of these two families. Therefore, these results reinforced our previously proposed hypothesis that two close but already different gdh genes were present in the last common ancestor to the three Ur-kingdoms (eubacteria, archaebacteria, and eukaryotes). The branching order of the different species of Family I was found to be the same whatever the method of tree reconstruction, although it varied slightly according to the region analyzed. Similarly, the topological positions of eubacteria and eukaryotes of Family II were independent of the method used. However, the branching of the two archaebacteria in Family II appeared to be unexpected: (1) the thermoacidophilic Sulfolobus solfataricus was found clustered with the two eubacteria of this family both in parsimony and distance trees, a situation not predicted by either one of the contradictory trees recently proposed; and (2) the branching of the halophilic Halobacterium salinarium varied according to the method of tree construction: it was closer to the eubacteria in the maximum parsimony tree and to eukaryotes in distance trees. Therefore, whatever the actual position of the halophilic species, archaebacteria did not appear to be monophyletic in these gdh gene trees.
This result questions the firmness of the presently accepted interpretation of previous protein trees which were supposed to root unambiguously the universal tree of life and place the archaebacteria in this tree. Offprint requests to: B. Labedan

18.
    
In order to study the tempo and the mode of spider orb web evolution and diversification, we conducted a phylogenetic analysis using six genetic markers along with a comprehensive taxon sample. The present analyses are the first to recover the monophyly of orb-weaving spiders based solely on DNA sequence data and an extensive taxon sample. We present the first dated orb weaver phylogeny. Our results suggest that orb weavers appeared by the Middle Triassic and underwent a rapid diversification during the end of the Triassic and Early Jurassic. By the second half of the Jurassic, most of the extant orb-weaving families and web designs were already present. The processes that may have given origin to this diversification of lineages and web architectures are discussed. A combination of biotic factors, such as key innovations in web design and silk composition, as well as abiotic environmental changes, may have played important roles in the diversification of orb weavers. Our analyses also show that increased taxon sampling density in both ingroups and outgroups greatly improves phylogenetic accuracy even when extensive data are missing. This effect is particularly important when addition of character data improves gene overlap.  相似文献   

19.
20.
We provide 15 new primers for amplifying and sequencing the mitochondrial ND4/ND5 gene region of the Cypriniformes in an attempt to resolve relationships of this diverse group of freshwater fishes with extensive taxonomic sampling. Sequences from this region have the following desirable characteristics for phylogenetic analyses, some of which are lacking from the more commonly used cyt b and 12S/16S rRNA genes: they are (1) easy to align, (2) relatively long (ca. 3.4 kb), and (3) contain more phylogenetically informative variation at 1st and 2nd codon positions. Moreover, the ND4/ND5 gene region is easy to amplify and sequence when employing the protocol suggested herein.  相似文献   
