Similar articles
 20 similar articles found
1.
Applying phylogenetic inference methods to data for a set of independent genes sampled randomly throughout the genome often results in substantial incongruence among the single-gene phylogenetic estimates. Among the processes known to produce discord between single-gene phylogenies, two of the best studied in a phylogenetic context are hybridization and incomplete lineage sorting. Much recent attention has focused on the development of methods for estimating species phylogenies in the presence of incomplete lineage sorting, but phylogenetic models that allow for hybridization have been more limited. Here we propose a model that allows incongruence in single-gene phylogenies to be due to both hybridization and incomplete lineage sorting, with the goal of determining the contribution of hybridization to observed gene tree incongruence in the presence of incomplete lineage sorting. Using our model, we propose methods for estimating the extent of the role of hybridization in both a likelihood and a Bayesian framework. The performance of our methods is examined using both simulated and empirical data.

2.
The use of diverse data sets in phylogenetic studies that aim to understand the evolutionary histories of species can yield conflicting inferences. Phylogenetic conflicts observed in animal and plant systems have often been explained by hybridization, incomplete lineage sorting (ILS), or horizontal gene transfer. Here, we used target enrichment data, species tree, and species network approaches to infer the backbone phylogeny of the family Caprifoliaceae, while distinguishing among sources of incongruence. We used 713 nuclear loci and 46 complete plastome sequences from 43 samples representing 38 species from all major clades to reconstruct the phylogeny of the family using concatenation and coalescence approaches. We found significant nuclear gene tree conflict as well as cytonuclear discordance. Additionally, coalescent simulations and phylogenetic species network analyses suggested putative ancient hybridization among subfamilies of Caprifoliaceae, which seems to be the main source of phylogenetic discordance. Ancestral state reconstruction of six morphological characters revealed some homoplasy for each character examined. By dating the branching events, we inferred the origin of Caprifoliaceae at approximately 66.65 Ma in the late Cretaceous. By integrating evidence from molecular phylogeny, divergence times, and morphology, we here recognize Zabelioideae as a new subfamily in Caprifoliaceae. This work shows the necessity of using a combination of multiple approaches to identify the sources of gene tree discordance. Our study also highlights the importance of using data from both nuclear and plastid genomes to reconstruct deep and shallow phylogenies of plants.

3.
Yu Y, Degnan JH, Nakhleh L. PLoS Genetics 2012, 8(4): e1002660
Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa.

4.
The multispecies coalescent (MSC) is a statistical framework that models how gene genealogies grow within the branches of a species tree. The field of computational phylogenetics has witnessed an explosion in the development of methods for species tree inference under MSC, owing mainly to the accumulating evidence of incomplete lineage sorting in phylogenomic analyses. However, the evolutionary history of a set of genomes, or species, could be reticulate due to the occurrence of evolutionary processes such as hybridization or horizontal gene transfer. We report on a novel method for Bayesian inference of genome and species phylogenies under the multispecies network coalescent (MSNC). This framework models gene evolution within the branches of a phylogenetic network, thus incorporating reticulate evolutionary processes, such as hybridization, in addition to incomplete lineage sorting. As phylogenetic networks with different numbers of reticulation events correspond to points of different dimensions in the space of models, we devise a reversible-jump Markov chain Monte Carlo (RJMCMC) technique for sampling the posterior distribution of phylogenetic networks under MSNC. We implemented the methods in the publicly available, open-source software package PhyloNet and studied their performance on simulated and biological data. The work extends the reach of Bayesian inference to phylogenetic networks and enables new evolutionary analyses that account for reticulation.

5.
We examined the phylogenetic history of Linaria with special emphasis on the Mediterranean sect. Supinae (44 species). We revealed extensive highly supported incongruence among two nuclear (ITS, AGT1) and two plastid regions (rpl32-trnL(UAG), trnS-trnG). Coalescent simulations, a hybrid detection test and species tree inference in *BEAST revealed that incomplete lineage sorting and hybridization may both be responsible for the incongruent pattern observed. Additionally, we present a multilabelled *BEAST species tree as an alternative approach that allows the possibility of observing multiple placements in the species tree for the same taxa. That permitted the incorporation of processes such as hybridization within the tree while not violating the assumptions of the *BEAST model. This methodology is presented as a functional tool to disclose the evolutionary history of species complexes that have experienced both hybridization and incomplete lineage sorting. The drastic climatic events that have occurred in the Mediterranean since the late Miocene, including the Quaternary-type climatic oscillations, may have made both processes highly recurrent in the Mediterranean flora.

6.

Background  

The ever-increasing wealth of genomic sequence information provides an unprecedented opportunity for large-scale phylogenetic analysis. However, species phylogeny inference is obfuscated by incongruence among gene trees due to evolutionary events such as gene duplication and loss, incomplete lineage sorting (deep coalescence), and horizontal gene transfer. Gene tree parsimony (GTP) addresses this issue by seeking a species tree that requires the minimum number of evolutionary events to reconcile a given set of incongruent gene trees. Despite its promise, the use of gene tree parsimony has been limited by the fact that existing software is either not fast enough to tackle large data sets or is restricted in the range of evolutionary events it can handle.

7.
Incongruence among phylogenetic results has become a common occurrence in analyses of genome-scale data sets. Incongruence originates from uncertainty in underlying evolutionary processes (e.g., incomplete lineage sorting) and from difficulties in determining the best analytical approaches for each situation. To overcome these difficulties, more studies are needed that identify incongruences and demonstrate practical ways to confidently resolve them. Here, we present results of a phylogenomic study based on the analysis of 197 taxa and 2,526 ultraconserved element (UCE) loci. We investigate evolutionary relationships of Eucerinae, a diverse subfamily of apid bees (relatives of honey bees and bumble bees) with >1,200 species. We sampled representatives of all tribes within the group and >80% of genera, including two mysterious South American genera, Chilimalopsis and Teratognatha. Initial analysis of the UCE data revealed two conflicting hypotheses for relationships among tribes. To resolve the incongruence, we tested concatenation and species tree approaches and used a variety of additional strategies including locus filtering, partitioned gene-tree searches, and gene-based topological tests. We show that within-locus partitioning improves gene tree and subsequent species-tree estimation, and that this approach confidently resolves the incongruence observed in our data set. After exploring our proposed analytical strategy on eucerine bees, we validated its efficacy in resolving hard phylogenetic problems by applying it to a published UCE data set of Adephaga (Insecta: Coleoptera). Our results provide a robust phylogenetic hypothesis for Eucerinae and demonstrate a practical strategy for resolving incongruence in other phylogenomic data sets.

8.
Lineage sorting has been suggested as a major force in generating incongruent phylogenetic signal when multiple gene partitions are examined. The degree of lineage sorting can be estimated using the coalescent process, and simulation studies have also pointed to a major role for incomplete lineage sorting as a factor in phylogenetic inference. Some recent empirical studies point to an extreme role for this phenomenon, with up to 50-60% of all informative genes showing incongruence as a result of lineage sorting. Here, we examine seven large multi-partition genome-level data sets over a large range of taxonomic representation. We took the approach of examining outgroup choice and its impact on tree topology by swapping outgroups into analyses with successively larger genetic distances to the ingroup. Our results indicate a linear relationship of outgroup distance with incongruence in the data sets we examined, suggesting a strong random rooting effect. In addition, we attempted to estimate the degree of lineage sorting in several large genome-level data sets by examining triads of very closely related taxa. This exercise resulted in much lower estimates of incongruent genes that could be the result of lineage sorting, with an overall estimate of around 10% of the total number of genes in a genome showing incongruence as a result of true lineage sorting. Finally, we examined the behavior of likelihood and parsimony approaches on the random rooting phenomenon. Likelihood tends to stabilize incongruence as outgroups get further and further away from the ingroup. In one extreme case, likelihood overcompensates for sequence divergence but increases random rooting, causing long branch repulsion.

9.
Estimating phylogenetic relationships among closely related species can be extremely difficult when there is incongruence among gene trees and between the gene trees and the species tree. Here we show that incorporating a model of the stochastic loss of gene lineages by genetic drift into the phylogenetic estimation procedure can provide a robust estimate of species relationships, despite widespread incomplete sorting of ancestral polymorphism. This approach is applied to a group of montane Melanoplus grasshoppers for which genealogical discordance among loci and incomplete lineage sorting obscure any obvious phylogenetic relationships among species. Unlike traditional treatments where gene trees estimated using standard phylogenetic methods are implicitly equated with the species tree, with the coalescent-based approach the species tree is modeled probabilistically from the estimated gene trees. The estimated species phylogeny (the ESP) is calculated for the grasshoppers from multiple gene trees reconstructed for nuclear loci and a mitochondrial gene. This empirical application is coupled with a simulation study to explore the performance of the coalescent-based approach. Specifically, we test the accuracy of the ESP given the data based on analyses of simulated data matching the multilocus data collected in Melanoplus (i.e., data were simulated for each locus with the same number of base pairs and locus-specific mutational models). The results of the study show that ESPs can be computed using the coalescent-based approach long before reciprocal monophyly has been achieved, and that these statistical estimates are accurate. This contrasts with analyses of the empirical data collected in Melanoplus and simulated data based on concatenation of multiple loci, for which the incomplete lineage sorting of recently diverged species posed significant problems. The strengths and potential challenges associated with incorporating an explicit model of gene-lineage coalescence into the phylogenetic procedure to obtain an ESP, as illustrated by application to Melanoplus, versus concatenation and consensus approaches are discussed. This study represents a fundamental shift in how species relationships are estimated: the relationship between the gene trees and the species phylogeny is modeled probabilistically rather than equating gene trees with a species tree.

10.
Reticulate evolution is a common and important driving force in angiosperm evolution. In this study, we analyzed the phylogenetic signals of genomic regions with different inheritance patterns to understand the evolutionary process of organisms using species-rich Himalaya–Hengduan taxa of bamboos (Fargesia Franchet and Yushania Keng). We constructed phylogenetic trees using different sampling strategies and reconstruction methods based on genome skimming and double digest restriction-site-associated DNA sequencing data. We assessed the congruence of topologies generated from different datasets and employed several approaches to reveal the causes of phylogenetic incongruence, including the detection of hybridization and introgression using PhyloNetworks and the D-statistic test (ABBA-BABA test). We found that, in the plastome-based phylogeny, Fargesia bamboos can be clustered into three groups and Yushania was nested within one of them, which contradicts the nuclear–double digest restriction-site-associated DNA sequencing-based phylogeny. Moreover, the genetic variation of chloroplast DNA is significantly correlated with geographical distribution. The strong signal of incomplete lineage sorting, hybridization, introgression, and cytoplasmic gene flow found among genera and species suggests that reticulate evolution is the main cause for the phylogenetic incongruence between nuclear and chloroplast datasets. Our results add evidence that genomes with different inheritance patterns can reveal distinct evolutionary histories of species and suggest that reticulate evolution is prevalent in rapidly diversifying groups.
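The D-statistic (ABBA-BABA test) named in the abstract above can be illustrated with a minimal sketch. It assumes four aligned sequences in the order P1, P2, P3, outgroup, treats the outgroup allele as ancestral, and uses the standard definition D = (ABBA - BABA) / (ABBA + BABA). The function name `d_statistic` and its inputs are hypothetical and are not taken from the study or from any particular software package; significance testing (for example a block jackknife) is not shown.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Patterson's D from four aligned sequences (illustrative sketch).

    Returns (D, n_abba, n_baba). The outgroup base is taken as the
    ancestral allele "A"; any other shared base is the derived allele "B".
    """
    abba = baba = 0
    for a, b, c, o in zip(p1.upper(), p2.upper(), p3.upper(), outgroup.upper()):
        # Skip gaps, missing data and ambiguity codes.
        if any(x not in "ACGT" for x in (a, b, c, o)):
            continue
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    total = abba + baba
    if total == 0:
        raise ValueError("no ABBA or BABA sites in the alignment")
    return (abba - baba) / total, abba, baba


if __name__ == "__main__":
    d, n_abba, n_baba = d_statistic("AAGA", "GAAC", "GGGC", "AAAA")
    print(d, n_abba, n_baba)
```

Under incomplete lineage sorting alone the two site patterns are expected in equal numbers, so D is expected to be close to zero; a marked excess of one pattern is commonly read as a signal of introgression between P3 and either P1 or P2.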

11.
A species tree was reconstructed for the mainly African terrestrial orchid genus Satyrium. Separate phylogenetic analyses of plastid and nuclear ribosomal DNA sequences for 63 species revealed extensive topological conflict. Here we describe a detailed protocol to deal with incongruence involving three steps: identifying incongruence and testing its significance, assessing the cause of incongruence, and reconstructing the species tree. The Incongruence Length Difference test revealed that many cases of incongruence were non-significant. For the remaining significant cases, results from taxon jack-knifing experiments and parametric bootstrap suggested that non-biological artefacts such as sparse taxon sampling and long-branch attraction could be excluded as causes for the observed incongruence. In order to evaluate biological causes, such as orthology/paralogy conflation, lineage sorting, and hybridization, we counted the number of events that need to be invoked a posteriori to explain the observed pattern. In most cases where incongruence was significant, this resulted in a similar number of events for each of these different causes. Only for the three species from Southeast Asia, which form a monophyletic clade, was hybridization favoured over the alternative causes. This conclusion is based on the large number of events that need to be invoked for either orthology/paralogy conflation or lineage sorting to have caused the incongruence, together with morphological evidence. The final species tree presented here is the product of the combined analysis of plastid and ITS sequences for all non-incongruent species and a posteriori grafting of the incongruent clades or accessions onto the tree.

12.
Introgression and incomplete lineage sorting (ILS) are two of the main sources of gene-tree incongruence; both can confound the assessment of phylogenetic relationships among closely related species. The Triatoma phyllosoma species group is a clade of partially co-distributed and cross-fertile Chagas disease vectors. Despite previous efforts, the phylogeny of this group remains unresolved, largely because of substantial gene-tree incongruence. Here, we sequentially address introgression and ILS to provide a robust phylogenetic hypothesis for the T. phyllosoma species group. To identify likely instances of introgression prior to molecular scrutiny, we assessed biogeographic data and information on fertility of inter-specific crosses. We first derived a few explicit hybridization hypotheses by considering the degree of spatial overlap within each species pair. Then, we assessed the plausibility of these hypotheses in the light of each species pair's cross-fertility. Using this contextual information, we evaluated mito-nuclear (cyt b, ITS-2) gene-tree incongruence and found evidence suggesting introgression within two species pairs. Finally, we modeled ILS using a Bayesian multispecies coalescent approach and either (a) a “complete” dataset with all the specimens in our sample, or (b) a “filtered” dataset without putatively introgressed specimens. The “filtered tree” had higher posterior-probability support, as well as more plausible topology and divergence times, than the “complete tree.” Detecting and filtering out introgression and modeling ILS allowed us to derive an improved phylogenetic hypothesis for the T. phyllosoma species group. Our results illustrate how biogeographic and ecological-reproductive contextual information can help clarify the systematics and evolution of recently diverged taxa prone to introgression and ILS.

13.
We investigate the roles of mitochondrial introgression and incomplete lineage sorting during the phylogenetic history of crotaphytid lizards. Our Bayesian phylogenetic estimate for Crotaphytidae is based on analysis of mitochondrial DNA sequence data for 408 individuals representing the 12 extant species of Crotaphytus and Gambelia. The mitochondrial phylogeny disagrees in several respects with a previously published morphological tree, as well as with conventional species designations, and we conclude that some of this disagreement stems from hybridization-mediated mitochondrial introgression, as well as from incomplete lineage sorting. Unidirectional introgression of Crotaphytus collaris (western collared lizard) mitochondria into C. reticulatus (reticulate collared lizard) populations in the Rio Grande Valley of Texas has resulted in the replacement of ancestral C. reticulatus mitochondria over approximately two-thirds of the total range of the species, a linear distance of approximately 270 km. Introgression of C. collaris mitochondria into C. bicinctores (Great Basin collared lizard) populations in southwestern Arizona requires a more complex scenario because at least three temporally separated and superimposed introgression events appear to have occurred in this region. We propose an "introgression conveyor" model to explain this unique pattern of mitochondrial variation in this region. We show with ecological niche modeling that the predicted geographical ranges of C. collaris, C. bicinctores, and C. reticulatus during glacial maxima could have provided enhanced opportunities for past hybridization. Our analyses suggest that incomplete lineage sorting and/or introgression has further confounded the phylogenetic placements of additional species including C. nebrius, C. vestigium, C. insularis, C. grismeri, and perhaps G. copei. Despite many independent instances of interspecific hybridization among crotaphytid lizards, the species continue to maintain morphological and geographic cohesiveness throughout their ranges.

14.
Phylogenetic networks are necessary to represent the tree of life expanded by edges to represent events such as horizontal gene transfers, hybridizations or gene flow. Not all species follow the paradigm of vertical inheritance of their genetic material. While a great deal of research has flourished into the inference of phylogenetic trees, statistical methods to infer phylogenetic networks are still limited and under development. The main disadvantage of existing methods is a lack of scalability. Here, we present a statistical method to infer phylogenetic networks from multi-locus genetic data in a pseudolikelihood framework. Our model accounts for incomplete lineage sorting through the coalescent model, and for horizontal inheritance of genes through reticulation nodes in the network. Computation of the pseudolikelihood is fast and simple, and it avoids the burdensome calculation of the full likelihood which can be intractable with many species. Moreover, estimation at the quartet-level has the added computational benefit that it is easily parallelizable. Simulation studies comparing our method to a full likelihood approach show that our pseudolikelihood approach is much faster without compromising accuracy. We applied our method to reconstruct the evolutionary relationships among swordtails and platyfishes (Xiphophorus: Poeciliidae), which is characterized by widespread hybridizations.
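The quartet-level pseudolikelihood idea in the abstract above can be sketched in a simplified form. The sketch assumes tree-like quartets only, for which the coalescent gives expected quartet concordance factors of 1 - (2/3)e^(-t) for the topology matching the species tree and (1/3)e^(-t) for each of the two alternatives, where t is the internal branch length in coalescent units. Reticulation nodes, which the abstract's method handles and which change these expectations, are deliberately left out. The names `quartet_cfs` and `log_pseudolikelihood` are hypothetical and do not correspond to the authors' software.

```python
import math


def quartet_cfs(t):
    """Expected concordance factors for a tree-like quartet.

    t is the internal branch length in coalescent units. The first value
    is for the quartet topology that matches the species tree.
    """
    discordant = math.exp(-t) / 3.0
    return (1.0 - 2.0 * discordant, discordant, discordant)


def log_pseudolikelihood(quartet_counts, branch_lengths):
    """Sum of multinomial log-likelihood terms, one per quartet.

    quartet_counts: list of (n_concordant, n_alt1, n_alt2) gene-tree counts.
    branch_lengths: internal branch length assumed for each quartet.
    Quartets are treated as independent, which is what makes this a
    pseudolikelihood rather than a full likelihood.
    """
    total = 0.0
    for counts, t in zip(quartet_counts, branch_lengths):
        for n, cf in zip(counts, quartet_cfs(t)):
            if n > 0:
                total += n * math.log(cf)
    return total


if __name__ == "__main__":
    counts = [(70, 16, 14), (40, 31, 29)]
    print(log_pseudolikelihood(counts, [1.0, 0.1]))
```

Because each quartet contributes an independent term to the sum, the terms can be evaluated in parallel, which is the computational benefit the abstract points to.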

15.
Species complexes undergoing rapid radiation present a challenge in molecular systematics because of the possibility that ancestral polymorphism is retained in component gene trees. Coalescent theory has demonstrated that gene trees often fail to match lineage trees when taxon divergence times are less than the ancestral effective population sizes. Suggestions to increase the number of loci and the number of individuals per taxon have been proposed; however, phylogenetic methods to adequately analyze these data in a coalescent framework are scarce. We compare two approaches to estimating lineage (species) trees using multiple individuals and multiple loci: the commonly used partitioned Bayesian analysis of concatenated sequences and a modification of a newly developed hierarchical Bayesian method (BEST) that simultaneously estimates gene trees and species trees from multilocus data. We test these approaches on a phylogeny of rapidly radiating species, the rodent genus Thomomys, in which divergence times are likely to be smaller than effective population sizes and incomplete lineage sorting is known to occur. We use seven independent noncoding nuclear sequence loci (total approximately 4300 bp) and between 1 and 12 individuals per taxon to construct a phylogenetic hypothesis for eight Thomomys species. The majority-rule consensus tree from the partitioned concatenated analysis included 14 strongly supported bipartitions, corroborating monophyletic species status of five of the eight named species. The BEST tree strongly supported only the split between the two subgenera and showed very low support for any other clade. Comparison of both lineage trees to individual gene trees revealed that the concatenation method appears to ignore conflicting signals among gene trees, whereas the BEST tree considers conflicting signals and downweights support for those nodes. Bayes factor analysis of posterior tree distributions from both analyses strongly favors the model underlying the BEST analysis. This comparison underscores the risks of overreliance on results from concatenation, and of ignoring the properties of coalescence, especially in cases of recent, rapid radiations.

16.
Sang T, Zhong Y. Systematic Biology 2000, 49(3): 422-434
Hybridization is an important evolutionary mechanism in plants and has been increasingly documented in animals. Difficulty in reconstruction of reticulate evolution, however, has been a long-standing problem in phylogenetics. Consequently, hybrid speciation may play a major role in causing topological incongruence between gene trees. The incongruence, in turn, offers an opportunity to detect hybrid speciation. Here we characterized certain distinctions between hybridization and other biological processes, including lineage sorting, paralogy, and lateral gene transfer, that are responsible for topological incongruence between gene trees. Consider two incongruent gene trees with three taxa, A, B, and C, where B is a sister group of A on gene tree 1 but a sister group of C on gene tree 2. With a theoretical model based on the molecular clock, we demonstrate that time of divergence of each gene between taxa A and C is nearly equal in the case of hybridization (B is a hybrid) or lateral gene transfer, but differs significantly in the case of lineage sorting or paralogy. After developing a bootstrap test to test these alternative hypotheses, we extended the model and test to account for incongruent gene trees with numerous taxa. Computer simulation studies supported the validity of the theoretical model and bootstrap test when each gene evolved at a constant rate. The computer simulation also suggested that the model remained valid as long as the rate heterogeneity was occurring proportionally in the same taxa for both genes. Although the model could not test hypotheses of hybridization versus lateral gene transfer as the cause of incongruence, these two processes may be distinguished by comparing phylogenies of multiple unlinked genes.

17.
One of the longstanding questions in phylogenetic systematics is how to address incongruence among phylogenies obtained from multiple markers and how to determine the causes. This study presents a detailed analysis of incongruent patterns between plastid and ITS/ETS phylogenies of Tribe Senecioneae (Asteraceae). This approach revealed widespread and strongly supported incongruence, which complicates conclusions about evolutionary relationships at all taxonomic levels. The patterns of incongruence that were resolved suggest that incomplete lineage sorting (ILS) and/or ancient hybridization are the most likely explanations. These phenomena are, however, extremely difficult to distinguish because they may result in similar phylogenetic patterns. We present a novel approach to evaluate whether ILS can be excluded as an explanation for incongruent patterns. This coalescence-based method uses molecular dating estimates of the duration of the putative ILS events to determine if invoking ILS as an explanation for incongruence would require unrealistically high effective population sizes. For four of the incongruent patterns identified within the Senecioneae, this approach indicates that ILS cannot be invoked to explain the observed incongruence. Alternatively, these patterns are more realistically explained by ancient hybridization events.
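The reasoning in the abstract above, that ILS can be excluded when it would require an unrealistically large effective population size, can be illustrated with a rough calculation. The abstract does not give the authors' exact procedure, so what follows is only a sketch under stated assumptions: a diploid Wright-Fisher population of constant size, and the standard coalescent result that two lineages fail to coalesce over T generations with probability exp(-T / (2 Ne)). The function name `min_ne_for_ils` and the 5% plausibility threshold are hypothetical choices made for illustration.

```python
import math


def min_ne_for_ils(duration_years, generation_time_years, min_prob=0.05):
    """Smallest Ne at which two lineages persist uncoalesced over the
    given duration with probability at least min_prob.

    Solves exp(-T / (2 * Ne)) = min_prob for Ne, with T in generations.
    """
    if not 0.0 < min_prob < 1.0:
        raise ValueError("min_prob must lie strictly between 0 and 1")
    if duration_years <= 0 or generation_time_years <= 0:
        raise ValueError("durations must be positive")
    generations = duration_years / generation_time_years
    return generations / (-2.0 * math.log(min_prob))


if __name__ == "__main__":
    # Hypothetical inputs: a 2-million-year interval, 5-year generations.
    print(round(min_ne_for_ils(2_000_000, 5.0)))
```

If the Ne returned for a dated interval is far above what is biologically credible for the group, ILS becomes an unlikely explanation for the incongruence across that interval, which is the kind of argument the abstract describes.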

18.

Background

Genus Citrus (Rutaceae) comprises many important cultivated species that generally hybridize easily. Phylogenetic study of a group showing extensive hybridization is challenging. Since the genus Citrus has diverged recently (4–12 Ma), incomplete lineage sorting of ancestral polymorphisms is also likely to cause discrepancies among genes in phylogenetic inferences. Incongruence of gene trees is observed and it is essential to unravel the processes that cause inconsistencies in order to understand the phylogenetic relationships among the species.

Methodology and Principal Findings

(1) We generated phylogenetic trees using haplotype sequences of six low copy nuclear genes. (2) Published simple sequence repeat data were re-analyzed to study population structure and the results were compared with the phylogenetic trees constructed using sequence data and coalescence simulations. (3) To distinguish between hybridization and incomplete lineage sorting, we developed and utilized a coalescence simulation approach. In other studies, species trees have been inferred despite the possibility of hybridization having occurred and used to generate null distributions of the effect of lineage sorting alone (by coalescent simulation). Since this is problematic, we instead generate these distributions directly from observed gene trees. Of the six trees generated, we used the most resolved three to detect hybrids. We found that 11 of 33 samples appear to be affected by historical hybridization. Analysis of the remaining three genes supported the conclusions from the hybrid detection test.

Conclusions

We have identified or confirmed probable hybrid origins for several Citrus cultivars using three different approaches: gene phylogenies, population structure analysis and coalescence simulation. Hybridization and incomplete lineage sorting were identified primarily based on differences among gene phylogenies with reference to null expectations via coalescence simulations. We conclude that identifying hybridization as a frequent cause of incongruence among gene trees is critical to correctly infer the phylogeny among species of Citrus.

19.
The systematics and speciation literature is rich with discussion relating to the potential for gene tree/species tree discordance. Numerous mechanisms have been proposed to generate discordance, including differential selection, long-branch attraction, gene duplication, genetic introgression, and/or incomplete lineage sorting. For speciose clades in which divergence has occurred recently and rapidly, recovering the true species tree can be particularly problematic due to incomplete lineage sorting. Unfortunately, the availability of multilocus or "phylogenomic" data sets does not simply solve the problem, particularly when the data are analyzed with standard concatenation techniques. In our study, we conduct a phylogenetic study for a nearly complete species sample of the dwarf and mouse lemur clade, Cheirogaleidae. Mouse lemurs (genus Microcebus) have been intensively studied over the past decade for reasons relating to their high level of cryptic species diversity, and although there has been emerging consensus regarding the evolutionary diversity contained within the genus, there is no agreement as to the inter-specific relationships within the group. We attempt to resolve cheirogaleid phylogeny, focusing especially on the mouse lemurs, by employing a large multilocus data set. We compare the results of Bayesian concordance methods with those of standard gene concatenation, finding that though concatenation yields the strongest results as measured by statistical support, these results are found to be highly misleading. By employing an approach where individual alleles are treated as operational taxonomic units, we show that phylogenetic results are substantially influenced by the selection of alleles in the concatenation process.

20.
The phylogenetic relationship of the now fully sequenced species Drosophila erecta and D. yakuba with respect to the D. melanogaster species complex has been a subject of controversy. All three possible groupings of the species have been reported in the past, though recent multi-gene studies suggest that D. erecta and D. yakuba are sister species. Using the whole genomes of each of these species as well as the four other fully sequenced species in the subgenus Sophophora, we set out to investigate the placement of D. erecta and D. yakuba in the D. melanogaster species group and to understand the cause of the past incongruence. Though we find that the phylogeny grouping D. erecta and D. yakuba together is the best supported, we also find widespread incongruence in nucleotide and amino acid substitutions, insertions and deletions, and gene trees. The time inferred to span the two key speciation events is short enough that under the coalescent model, the incongruence could be the result of incomplete lineage sorting. Consistent with the lineage-sorting hypothesis, substitutions supporting the same tree were spatially clustered. Support for the different trees was found to be linked to recombination such that adjacent genes support the same tree most often in regions of low recombination and substitutions supporting the same tree are most enriched roughly on the same scale as linkage disequilibrium, also consistent with lineage sorting. The incongruence was found to be statistically significant and robust to model and species choice. No systematic biases were found. We conclude that phylogenetic incongruence in the D. melanogaster species complex is the result, at least in part, of incomplete lineage sorting. Incomplete lineage sorting will likely cause phylogenetic incongruence in many comparative genomics datasets. Methods to infer the correct species tree, the history of every base in the genome, and comparative methods that control for and/or utilize this information will be valuable advancements for the field of comparative genomics.
