Similar articles
 20 similar articles found (search time: 46 ms)
1.

Background

Whenever different data sets arrive at conflicting phylogenetic hypotheses, only testable causal explanations of sources of errors in at least one of the data sets allow us to critically choose among the conflicting hypotheses of relationships. The large (28S) and small (18S) subunit rRNAs are among the most popular markers for studies of deep phylogenies. However, some nodes supported by these data are suspected of being artifacts caused by peculiarities of the evolution of these molecules. Arthropod phylogeny is an especially controversial subject dotted with conflicting hypotheses which are dependent on data set and method of reconstruction. We assume that phylogenetic analyses based on these genes can be improved further by i) enlarging the taxon sample, ii) employing more realistic models of sequence evolution incorporating non-stationary substitution processes, and iii) considering covariation and pairing of sites in rRNA genes.

Results

We analyzed a large set of arthropod sequences, applied new tools for quality control of data prior to tree reconstruction, and increased the biological realism of substitution models. Although the split-decomposition network indicated a high noise content in the data set, our measures were able to both improve the analyses and give causal explanations for some incongruities reported in analyses of rRNA sequences. However, misleading effects did not completely disappear.

Conclusion

Analyses of data sets that result in ambiguous phylogenetic hypotheses demand methods that not only filter stochastic noise but also differentiate phylogenetic signal from systematic biases. Such methods can only rely on our findings regarding the evolution of the analyzed data. Analyses of independent data sets are then crucial to test the plausibility of the results. Our approach can easily be extended to genomic data as well, setting up layers of quality assessment that are applicable to phylogenetic reconstructions in general.

2.
A morphological data set and three sources of data from the chloroplast genome (two genes and a restriction site survey) were used to reconstruct the phylogenetic history of the pickerelweed family Pontederiaceae. The chloroplast data converged towards a single tree, presumably the true chloroplast phylogeny of the family. Unrooted trees estimated from each of the three chloroplast data sets were identical or extremely similar in shape to each other and mostly robustly supported. There was no evidence of significant heterogeneity among the data sets, and the few topological differences seen among unrooted trees from each chloroplast data set are probably artifacts of sampling error on short branches. Despite well-documented differences in rates of evolution for different characters in individual data sets, equally weighted parsimony permits accurate reconstructions of chloroplast relationships in Pontederiaceae. A separate morphology-based data set yielded trees that were very different from the chloroplast trees. Although there was substantial support from the morphological evidence for several major clades supported by chloroplast trees, most of the conflicting phylogenetic structure on the morphology trees was not robust. Nonetheless, several statistical tests of incongruence indicate significant heterogeneity between molecules and morphology. The source of this apparent incongruence appears to be a low ratio of phylogenetic signal to noise in the morphological data.

3.
Probabilistic tests of topology offer a powerful means of evaluating competing phylogenetic hypotheses. The performance of the nonparametric Shimodaira-Hasegawa (SH) test, the parametric Swofford-Olsen-Waddell-Hillis (SOWH) test, and Bayesian posterior probabilities was explored for five data sets for which all the phylogenetic relationships are known with a very high degree of certainty. These results are consistent with previous simulation studies that have indicated a tendency for the SOWH test to be prone to generating Type 1 errors because of model misspecification coupled with branch length heterogeneity. These results also suggest that the SOWH test may accord overconfidence in the true topology when the null hypothesis is in fact correct. In contrast, the SH test was observed to be much more conservative, even under high substitution rates and branch length heterogeneity. For some of those data sets where the SOWH test proved misleading, the Bayesian posterior probabilities were also misleading. The results of all tests were strongly influenced by the exact substitution model assumptions. Simple models, especially those that assume rate homogeneity among sites, had a higher Type 1 error rate and were more likely to generate misleading posterior probabilities. For some of these data sets, the commonly used substitution models appear to be inadequate for estimating appropriate levels of uncertainty with the SOWH test and Bayesian methods. Reasons for the differences in statistical power between the two maximum likelihood tests are discussed and are contrasted with the Bayesian approach.

4.
The standard paradigm postulates that the human mitochondrial genome (mtDNA) is strictly maternally inherited and that, consequently, mtDNA lineages are clonal. As a result of mtDNA clonality, phylogenetic and population genetic analyses should therefore be free of the complexities imposed by biparental recombination. The use of mtDNA in analyses of human molecular evolution is contingent, in fact, on clonality, which is also a condition that is critical both for forensic studies and for understanding the transmission of pathogenic mtDNA mutations within families. This paradigm, however, has been challenged recently by Eyre-Walker and colleagues. Using two different tests, they have concluded that recombination has contributed to the distribution of mtDNA polymorphisms within the human population. We have assembled a database that comprises the complete sequences of 64 European and 2 African mtDNAs. When this set of sequences was analyzed using any of three measures of linkage disequilibrium, one of the tests of Eyre-Walker and colleagues, there was no evidence for mtDNA recombination. When their test for excess homoplasies was applied to our set of sequences, only a slight excess of homoplasies was observed. We discuss possible reasons that our results differ from those of Eyre-Walker and colleagues. When we take the various results together, our conclusion is that mtDNA recombination has not been sufficiently frequent during human evolution to overturn the standard paradigm.

5.
Since recombination leads to the generation of mosaic genomes that violate the assumption of traditional phylogenetic methods that sequence evolution can be accurately described by a single tree, results and conclusions based on phylogenetic analysis of data sets including recombinant sequences can be severely misleading. Many methods are able to adequately detect recombination between diverse sequences, for example between different HIV-1 subtypes. More problematic is the identification of recombinants among closely related sequences such as a viral population within a host. We describe a simple algorithmic procedure that enables detection of intra-host recombinants based on split-decomposition networks and a robust statistical test for recombination. By applying this algorithm to several published HIV-1 data sets we conclude that intra-host recombination was significantly underestimated in previous studies and that up to one-third of the env sequences longitudinally sampled from a given subject can be of recombinant origin. The results show that our procedure can be a valuable exploratory tool for detection of recombinant sequences before phylogenetic analysis, and also suggest that HIV-1 recombination in vivo is far more frequent and significant than previously thought.

6.
The gametophytic self-incompatibility locus has been thought to be a nonrecombining genomic region. Inferences have been made, however, about the functional importance of different parts of the S-locus, based on differences in the levels of variability along the gene, and this is valid only if recombination occurs. It is thus important to test whether recombination occurs within and near the S-locus. Several recent attempts to test this have reached conflicting conclusions. In this study, we examine a large data set on sequence variation at the S-locus in several species with gametophytic self-incompatibility systems, in the Solanaceae, Rosaceae and Scrophulariaceae. We use the longest sequences available to test for recombination based on linkage disequilibrium between polymorphic sites in the S-locus. The relationship between linkage disequilibrium and physical distance between the sites suggests rare intragenic exchange in the evolutionary history of four species of Solanaceae and two species of Rosaceae.
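
The kind of test described in this abstract, relating linkage disequilibrium to the physical distance between polymorphic sites, can be illustrated with a minimal sketch. The abstract does not say which disequilibrium measure or which statistic was used, so the squared allelic correlation (r^2), the Pearson correlation, the 0/1 coding of biallelic sites, and all function names below are assumptions made only for illustration. A negative correlation (disequilibrium declining with distance) is the pattern usually read as a sign of recombination.

```python
from itertools import combinations
from math import sqrt


def r_squared(site_a, site_b):
    """Squared allelic correlation between two biallelic sites coded 0/1."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # classical disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else float("nan")


def ld_vs_distance(haplotypes, positions):
    """Return (distance, r^2) for every pair of polymorphic sites."""
    columns = list(zip(*haplotypes))
    pairs = []
    for i, j in combinations(range(len(columns)), 2):
        r2 = r_squared(columns[i], columns[j])
        if r2 == r2:  # skip NaN produced by monomorphic sites
            pairs.append((abs(positions[j] - positions[i]), r2))
    return pairs


def pearson(xs, ys):
    """Plain Pearson correlation; assumes neither input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy data: four haplotypes, three sites at made-up positions.
toy_haplotypes = [[0, 0, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1]]
toy_positions = [10, 20, 500]
toy_pairs = ld_vs_distance(toy_haplotypes, toy_positions)
toy_correlation = pearson([d for d, _ in toy_pairs], [r for _, r in toy_pairs])
```

In practice the significance of such a correlation would be assessed by permutation, because pairs of sites are not independent observations.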

7.
Previous analyses of diploid nuclear genotypes have concluded that recombination has occurred in populations of the yeast Candida albicans. To address the possibilities of clonality and recombination in an effectively haploid genome, we sequenced seven regions of mitochondrial DNA (mtDNA) in 45 strains of C. albicans from human immunodeficiency virus-positive patients in Toronto, Canada, and 3 standard reference isolates of C. albicans, CA, CAI4, and WO-1. Among a total of 2,553 nucleotides in the seven regions, 62 polymorphic nucleotide sites and seven indels defined nine distinct mtDNA haplotypes among the 48 strains. Five of these haplotypes occurred in more than one strain, indicating clonal proliferation of mtDNA. Phylogenetic analysis of mtDNA haplotypes resulted in one most-parsimonious tree. Most of the nucleotide sites undergoing parallel change in this tree were clustered in blocks that corresponded to sequenced regions. Because of the existence of these blocks, the apparent homoplasy can be attributed to infrequent, past genetic exchange and recombination between individuals and cannot be attributed to parallel mutation. Among strains sharing the same mtDNA haplotypes, multilocus nuclear genotypes were more similar than expected from a random comparison of nuclear DNA genotypes, suggesting that clonal proliferation of the mitochondrial genome was accompanied by clonal proliferation of the nuclear genome.

8.
9.
Micropathogens (viruses, bacteria, fungi, parasitic protozoa) share a common trait, which is partial clonality, with wide variance in the respective influence of clonality and sexual recombination on the dynamics and evolution of taxa. The discrimination of distinct lineages and the reconstruction of their phylogenetic history provide key information for inferring their biomedical properties. However, the phylogenetic picture is often clouded by occasional events of recombination across divergent lineages, limiting the relevance of classical phylogenetic analysis and dichotomic trees. We have applied a network analysis based on graph theory to illustrate the relationships among genotypes of Trypanosoma cruzi, the parasitic protozoan responsible for Chagas disease, to identify major lineages and to unravel their past history of divergence and possible recombination events. At the scale of T. cruzi subspecific diversity, graph theory-based networks applied to 22 isoenzyme loci (262 distinct Multi-Locus-Enzyme-Electrophoresis, MLEE) and 19 microsatellite loci (66 Multi-Locus-Genotypes, MLG) fully confirm the high clustering of genotypes into major lineages or "near-clades". The release of the dichotomic constraint associated with phylogenetic reconstruction usually applied to multilocus data allows putative hybrids and their parental lineages to be identified. Reticulate topology suggests a slightly different history for some of the main "near-clades", and a possibly more complex origin for the putative hybrids than hitherto proposed. Finally, the sub-network of the near-clade T. cruzi I (28 MLG) shows a clustering subdivision into three differentiated lesser near-clades ("Russian doll pattern"), which confirms the hypothesis recently proposed by other investigators. The present study broadens and clarifies the hypotheses previously obtained from classical markers on the same sets of data, which demonstrates the added value of this approach.
This underlines the potential of graph theory-based network analysis for describing the nature and relationships of major pathogens, thereby opening stimulating prospects to unravel the organization, dynamics and history of major micropathogen lineages.

10.
Most phylogenetic tree estimation methods assume that there is a single set of hierarchical relationships among sequences in a data set for all sites along an alignment. Mosaic sequences produced by past recombination events will violate this assumption and may lead to misleading results from a phylogenetic analysis due to the imposition of a single tree along the entire alignment. Therefore, the detection of past recombination is an important first step in an analysis. A Bayesian model for the changes in topology caused by recombination events is described here. This model relaxes the assumption of one topology for all sites in an alignment and uses the theory of Hidden Markov models to facilitate calculations, the hidden states being the underlying topologies at each site in the data set. Changes in topology along the multiple sequence alignment are estimated by means of the maximum a posteriori (MAP) estimate. The performance of the MAP estimate is assessed by application of the model to data sets of four sequences, both simulated and real.
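
The idea of treating the topology at each site as a hidden state can be sketched with the standard Viterbi algorithm, which returns the most probable state path of a hidden Markov model with fixed parameters. This is only an illustration and not the model of the abstract: the per-site log-likelihoods are assumed to be precomputed elsewhere, and the single `switch_prob` parameter, the uniform starting prior, and the function name are hypothetical choices. Four sequences admit three unrooted topologies, hence three states in the toy input.

```python
import math


def viterbi_topologies(site_loglik, switch_prob):
    """Most probable topology path along an alignment.

    site_loglik[s][k] is the (assumed precomputed) log-likelihood of
    site s under topology k; switch_prob is the assumed probability of
    changing topology between adjacent sites.
    """
    n_states = len(site_loglik[0])
    log_stay = math.log(1.0 - switch_prob)
    log_switch = math.log(switch_prob / (n_states - 1))
    log_prior = -math.log(n_states)  # uniform prior on the first topology

    score = [log_prior + ll for ll in site_loglik[0]]
    backpointers = []
    for lls in site_loglik[1:]:
        new_score, pointers = [], []
        for k in range(n_states):
            # Best predecessor state for ending in topology k at this site.
            best_prev = max(
                range(n_states),
                key=lambda j: score[j] + (log_stay if j == k else log_switch),
            )
            trans = log_stay if best_prev == k else log_switch
            new_score.append(score[best_prev] + trans + lls[k])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical toy input: three sites favour topology 0, then three favour topology 2.
toy_loglik = [[-1.0, -10.0, -10.0]] * 3 + [[-10.0, -10.0, -1.0]] * 3
toy_path = viterbi_topologies(toy_loglik, switch_prob=0.01)
```

A change of state along the returned path marks a candidate recombination breakpoint; a fully Bayesian treatment would also account for uncertainty in the model parameters, which this sketch holds fixed.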

11.
Detecting the node-density artifact in phylogeny reconstruction   Total citations: 4, self-citations: 0, citations by others: 4
The node-density effect is an artifact of phylogeny reconstruction that can cause branch lengths to be underestimated in areas of the tree with fewer taxa. Webster, Payne, and Pagel (2003, Science 301:478) introduced a statistical procedure (the "delta" test) to detect this artifact, and here we report the results of computer simulations that examine the test's performance. In a sample of 50,000 random data sets, we find that the delta test detects the artifact in 94.4% of cases in which it is present. When the artifact is not present (n = 10,000 simulated data sets) the test showed a type I error rate of approximately 1.69%, incorrectly reporting the artifact in 169 data sets. Three measures of tree shape or "balance" failed to predict the size of the node-density effect. This may reflect the relative homogeneity of our randomly generated topologies, but emphasizes that nearly any topology can suffer from the artifact, the effect not being confined only to highly unevenly sampled or otherwise imbalanced trees. The ability to screen phylogenies for the node-density artifact is important for phylogenetic inference and for researchers using phylogenetic trees to infer evolutionary processes, including their use in molecular clock dating.

12.
genclone 1.0 is designed for studying clonality and its spatial components using genotype data with molecular markers from haploid or diploid organisms. genclone 1.0 performs the following tasks: (i) it discriminates distinct multilocus genotypes (MLGs) and uses permutation and resampling approaches to test for the reliability of sets of loci and sampling units for estimating genotypic and genetic diversity (a procedure also useful for nonclonal organisms); (ii) it computes statistics to test for clonal propagation or clonal identity of replicates; (iii) it computes various indices describing genotypic diversity; and (iv) it summarizes the spatial organization of MLGs with adapted spatial autocorrelation methods and clonal subrange estimates.
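
Tasks (i) and (iii) can be illustrated with a short sketch that groups sampling units into multilocus genotypes and computes two widely used genotypic diversity indices: clonal richness R = (G - 1)/(N - 1) and the unbiased Simpson index. This is not genclone's code or file format; the data layout (one tuple of alleles per locus), the function names, and the choice of these two indices are assumptions for illustration only.

```python
from collections import Counter


def multilocus_genotypes(samples):
    """Count sampling units per distinct multilocus genotype (MLG).

    Each sample is a sequence of loci; each locus is a tuple of alleles.
    Alleles are sorted within a locus so that (120, 124) equals (124, 120).
    """
    return Counter(
        tuple(tuple(sorted(locus)) for locus in sample) for sample in samples
    )


def clonal_richness(mlg_counts):
    """R = (G - 1) / (N - 1); requires at least two sampling units."""
    n = sum(mlg_counts.values())
    g = len(mlg_counts)
    return (g - 1) / (n - 1)


def simpson_diversity(mlg_counts):
    """Unbiased Simpson index: 1 - sum(n_i * (n_i - 1)) / (N * (N - 1))."""
    n = sum(mlg_counts.values())
    return 1 - sum(c * (c - 1) for c in mlg_counts.values()) / (n * (n - 1))


# Hypothetical diploid microsatellite data: five sampling units, two loci.
toy_samples = [
    [(120, 124), (200, 200)],
    [(124, 120), (200, 200)],  # same MLG as the first, alleles listed in reverse
    [(120, 124), (200, 200)],
    [(120, 120), (200, 204)],
    [(118, 124), (204, 204)],
]
toy_counts = multilocus_genotypes(toy_samples)
toy_richness = clonal_richness(toy_counts)
toy_simpson = simpson_diversity(toy_counts)
```

Exact matching of genotypes, as done here, ignores scoring errors and somatic mutations, which is one reason dedicated tools also test whether identical MLGs truly belong to the same clone.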

13.
Short interspersed nuclear elements (SINEs) have been used to generate unambiguous phylogenetic topologies relating eukaryotic taxa. The irreversible nature of SINE retroposition is supported by a large body of comparative genome data and is a fundamental assumption inherent in the value of this qualitative method of inference. Here, we assess the key assumption of unidirectional SINE insertion by comparing the SINE insertion-derived topology and the phylogenetic tree based on seven independent loci of five taxa in the order Cetartiodactyla (Cetacea + Artiodactyla). The data sets and analyses were largely independent, but the loci were, by definition, linked, and thus their consistency supported an irreversible pattern of SINE retroposition. Moreover, our analyses of the flanking sequences provided estimates of divergence times among cetartiodactyl lineages unavailable from SINE insertion analysis alone. Unexpected rate heterogeneity among sites of SINE-flanking sequences and other noncoding DNA sequences was observed. Sequence simulations suggest that this rate heterogeneity may be an artifact resulting from the inaccuracies of the substitution model used.

14.
Standardizing methods to address clonality in population studies   Total citations: 3, self-citations: 1, citations by others: 2
Although clonal species are dominant in many habitats, from unicellular organisms to plants and animals, ecological and particularly evolutionary studies on clonal species have been strongly limited by the difficulty in assessing the number, size and longevity of genetic individuals within a population. The development of molecular markers has allowed progress in this area, and although allozymes remain of limited use due to their typically low level of polymorphism, more polymorphic markers have been discovered during the last decades, supplying powerful tools to overcome the problem of clonality assessment. However, population genetics studies on clonal organisms lack a standardized framework to assess clonality, and to adapt conventional data analyses to account for the potential bias due to the possible replication of the same individuals in the sampling. Moreover, existing studies used a variety of indices to describe clonal diversity and structure such that comparison among studies is difficult at best. We emphasize the need for standardizing studies on clonal organisms, and particularly on clonal plants, in order to clarify the way clonality is taken into account in sampling designs and data analysis, and to allow further comparison of results reported in distinct studies. In order to provide a first step towards a standardized framework to address clonality in population studies, we review, on the basis of a thorough revision of the literature on population structure of clonal plants and of a complementary revision on other clonal organisms, the indices and statistics used so far to estimate genotypic or clonal diversity and to describe clonal structure in plants. We examine their advantages and weaknesses as well as various conceptual issues associated with statistical analyses of population genetics data on clonal organisms. 
We do so by testing them on results from simulations, as well as on two empirical data sets of microsatellites of the seagrasses Posidonia oceanica and Cymodocea nodosa. Finally, we also propose a selection of new indices and methods to estimate clonal diversity and describe clonal structure in a way that should facilitate comparison between future studies on clonal plants, most of which may be of interest for clonal organisms in general.

15.
The mammalian tick-borne flavivirus group (MTBFG) contains viruses associated with important human and animal diseases such as encephalitis and hemorrhagic fever. In contrast to mosquito-borne flaviviruses where recombination events are frequent, the evolutionary dynamic within the MTBFG was believed to be essentially clonal. This assumption was challenged with the recent report of several homologous recombinations within the Tick-borne encephalitis virus (TBEV). We performed a thorough analysis of publicly available genomes in this group and found no compelling evidence for the previously identified recombinations. However, our results show for the first time that demonstrable recombination (i.e., with large statistical support and strong phylogenetic evidence) has occurred in the MTBFG, more specifically within the Louping ill virus lineage. Putative parents, recombinant strains and breakpoints were further tested for statistical significance using phylogenetic methods. We investigated the time of divergence between the recombinant and parental strains in a Bayesian framework. The recombination was estimated to have occurred during a window of 282 to 76 years before the present. By unravelling the temporal setting of the event, we adduce hypotheses about the ecological conditions that could account for the observed recombination.

16.
Backbone relationships within the large eupolypod II clade, which includes nearly a third of extant fern species, have resisted elucidation by both molecular and morphological data. Earlier studies suggest that much of the phylogenetic intractability of this group is due to three factors: (i) a long root that reduces apparent levels of support in the ingroup; (ii) long ingroup branches subtended by a series of very short backbone internodes (the "ancient rapid radiation" model); and (iii) significantly heterogeneous lineage-specific rates of substitution. To resolve the eupolypod II phylogeny, with a particular emphasis on the backbone internodes, we assembled a data set of five plastid loci (atpA, atpB, matK, rbcL, and trnG-R) from a sample of 81 accessions selected to capture the deepest divergences in the clade. We then evaluated our phylogenetic hypothesis against potential confounding factors, including those induced by rooting, ancient rapid radiation, rate heterogeneity, and the Bayesian star-tree paradox artifact. While the strong support we inferred for the backbone relationships proved robust to these potential problems, their investigation revealed unexpected model-mediated impacts of outgroup composition and divergent effects of methods for countering the star-tree paradox artifact, and gave no support to concerns about the applicability of the unrooted model to data sets with heterogeneous lineage-specific rates of substitution. This study is among the few to investigate these factors with empirical data, and the first to compare the performance of the two primary methods for overcoming the Bayesian star-tree paradox artifact. Among the significant phylogenetic results are the near-complete support along the eupolypod II backbone, the demonstrated paraphyly of Woodsiaceae as currently circumscribed, and the well-supported placement of the enigmatic genera Homalosorus, Diplaziopsis, and Woodsia.

17.
TT virus (TTV) has a remarkable genetic heterogeneity. To study TTV evolution, phylogenetic analyses were performed on 739 DNA sequences mapping in the N22 region of ORF1. Analysis of neighbor-joining consensus trees shows significant differences between DNA and protein phylogeny. Phylogenetic clustering with median-joining networks indicates that DNA sequence analysis is biased by homoplasy (i.e., genetic variability not originated by descent), indicative of either hypermutation or recombination. Statistical analysis shows that the significant excess of homoplasy is due to frequent recombination among closely related strains. Recombination events imply that the transmission of TTV is not clonal and provide the necessary basis to explain (i) the high degree of genetic divergence between TTV isolates, (ii) the lack of population structure on a world scale, and (iii) the number of highly divergent strains that seems typical of this virus. We show that recombination phenomena can be detected by phylogenetic analyses in very short sequences when a sufficiently large data set is available.

18.
Consequences of recombination on traditional phylogenetic analysis   Total citations: 38, self-citations: 0, citations by others: 38
Schierup MH, Hein J. Genetics 2000, 156(2): 879-891
We investigate the shape of a phylogenetic tree reconstructed from sequences evolving under the coalescent with recombination. The motivation is that evolutionary inferences are often made from phylogenetic trees reconstructed from population data even though recombination may well occur (mtDNA or viral sequences) or does occur (nuclear sequences). We investigate the size and direction of biases when a single tree is reconstructed ignoring recombination. Standard software (PHYLIP) was used to construct the best phylogenetic tree from sequences simulated under the coalescent with recombination. With recombination present, the length of terminal branches and the total branch length are larger, and the time to the most recent common ancestor smaller, than for a tree reconstructed from sequences evolving with no recombination. The effects are pronounced even for small levels of recombination that may not be immediately detectable in a data set. The phylogenies when recombination is present superficially resemble phylogenies for sequences from an exponentially growing population. However, exponential growth has a different effect on statistics such as Tajima's D. Furthermore, ignoring recombination leads to a large overestimation of the substitution rate heterogeneity and the loss of the molecular clock. These results are discussed in relation to viral and mtDNA data sets.

19.
It has been suggested that clonality provides reproductive assurance in cross-fertilizing species subject to pollen limitation, relieving one of the main selective pressures favoring the evolution of self-fertilization. According to this hypothesis, cross-fertilizing species subject to pollen limitation should often be clonal. Here, we investigated the association between clonality and a genetic mechanism enforcing outcrossing, self-incompatibility, in Solanum (Solanaceae). We collected self-incompatibility and clonality information on 87 species, and looked for an association between these two traits. To account for the contribution of shared evolutionary history to this association, we incorporated phylogenetic information from chloroplast (NADH dehydrogenase subunit F) sequence data. We found that self-incompatibility is strongly associated with clonal reproduction: all self-incompatible species reproduce clonally, while the absence of clonality is widespread among self-compatible taxa. The observed correlation persists after taking into account shared phylogenetic history, assumptions about the evolutionary history of self-incompatibility, uncertainty associated with phylogeny estimation, and associations with life history (annual/perennial). Our results are consistent with the hypothesis that clonality provides reproductive assurance, and suggest that the consequences of clonal growth in the evolution of plant reproductive strategies may be more significant than previously thought.

20.
Reproduction in the genus Penicillium is thought to be completely asexual. Sexual reproduction, as occurs in the related genus Eupenicillium, is thought to provide evolutionary benefits because it allows for new combinations of alleles and therefore increases the amount of variation within the species. This hypothesis was tested using inter-simple sequence repeats (ISSRs) to assess the amount of intraspecific and intra-population variation within Penicillium miczynskii and the closely related Eupenicillium shearii. The data for both genera were also used to test for clonal reproduction against the null hypothesis of panmixis, using measures of genotypic diversity, linkage disequilibrium and phylogenetic tree length. The ISSR fingerprints indicated that the 70 Eupenicillium strains actually included two distinct species, Eupenicillium shearii and Eupenicillium tropicum sp. nov., each represented by populations in both Costa Rica and India. While none of the species or populations were found to be randomly recombining, the relative strength of the clonal component differed among the species. Penicillium miczynskii had the smallest clonal component, with the highest genotypic diversity, lowest Index of Association, 40 % of alleles non-randomly associated, and phylogenetic tree length closer to that of recombined data sets than to the minimum possible. Eupenicillium tropicum showed nearly complete clonal reproduction with the lowest genotypic diversity and 100 % of alleles non-randomly associated in both populations. On the other hand, it also had the greatest amount of intraspecific variation, with as little as 38 % similarity among strains. The results indicate that Penicillium species may, on rare occasion, genetically recombine; the regular occurrence of meiosis in the life cycle of Eupenicillium species does not facilitate recombination; and the greatest amount of genetic variation was not associated with recombination, but with clonal propagation.
