Similar literature
1.

Background  

The ever-increasing wealth of genomic sequence information provides an unprecedented opportunity for large-scale phylogenetic analysis. However, species phylogeny inference is obfuscated by incongruence among gene trees due to evolutionary events such as gene duplication and loss, incomplete lineage sorting (deep coalescence), and horizontal gene transfer. Gene tree parsimony (GTP) addresses this issue by seeking a species tree that requires the minimum number of evolutionary events to reconcile a given set of incongruent gene trees. Despite its promise, the use of gene tree parsimony has been limited by the fact that existing software is either not fast enough to tackle large data sets or is restricted in the range of evolutionary events it can handle.

2.
Two different methods of using paralogous genes for phylogenetic inference have been proposed: reconciled trees (or gene tree parsimony) and uninode coding. Gene tree parsimony suffers from 10 serious problems, including differential weighting of nucleotide and gap characters, undersampling which can be misinterpreted as synapomorphy, all of the characters not being allowed to interact, and conflict between gene trees being given equal weight, regardless of branch support. These problems are largely avoided by using uninode coding. The uninode coding method is elaborated to address multiple gene duplications within a single gene tree family and handle problems caused by lack of gene tree resolution. An example of vertebrate phylogeny inferred from nine genes is reanalyzed using uninode coding. We suggest that uninode coding be used instead of gene tree parsimony for phylogenetic inference from paralogous genes.

3.
Most plant phylogenetic inference has used DNA sequence data from the plastid genome. This genome represents a single genealogical sample with no recombination among genes, potentially limiting the resolution of evolutionary relationships in some contexts. In contrast, nuclear DNA is inherently more difficult to employ for phylogeny reconstruction because major mutational events in the genome, including polyploidization, gene duplication, and gene extinction, can result in homologous gene copies that are difficult to identify as orthologs or paralogs. Gene tree parsimony (GTP) can be used to infer the rooted species tree by fitting gene genealogies to species trees while simultaneously minimizing the estimated number of duplications needed to reconcile conflicts among them. Here, we use GTP for five nuclear gene families and a previously published plastid data set to reconstruct the phylogenetic backbone of the aquatic plant family Pontederiaceae. Plastid-based phylogenetic studies strongly supported extensive paraphyly of Eichhornia (one of the four major genera) but also depicted considerable ambiguity concerning the true root placement for the family. Our results indicate that species trees inferred from the nuclear genes (alone and in combination with the plastid data) are highly congruent with gene trees inferred from plastid data alone. Consideration of optimal and suboptimal gene tree reconciliations places the root of the family at (or near) a branch leading to the rare and locally restricted E. meyeri. We also explore methods to incorporate uncertainty in individual gene trees during reconciliation by considering their individual bootstrap profiles, and we relate inferred excesses of gene duplication events on individual branches to whole-genome duplication events inferred for the same branches. Our study improves understanding of the phylogenetic history of Pontederiaceae and also demonstrates the utility of GTP for phylogenetic analysis.

4.
Jackrabbits and hares, members of the genus Lepus, comprise over half of the species within the family Leporidae (Lagomorpha). Despite their ecological importance, potential economic impact, and worldwide distribution, the evolution of hares and jackrabbits has been poorly studied. We provide an initial phylogenetic framework for jackrabbits and hares so that explicit hypotheses about their evolution can be developed and tested. To this end, we have collected DNA sequence data from a 702-bp region of the mitochondrial cytochrome b gene and reconstructed the evolutionary history (via parsimony, neighbor joining, and maximum likelihood) of 11 species of Lepus, focusing on North American taxa. Due to problems of saturation, induced by multiple substitutions, at synonymous coding positions between the ingroup taxa and the outgroups (Oryctolagus and Sylvilagus), both rooted and unrooted trees were examined. Variation in tree topologies generated by different reconstruction methods was observed in analyses including the outgroups, but not in the analyses of unrooted ingroup networks. Apparently, substitutional saturation hindered the analyses when outgroups were considered. The trees based on the cytochrome b data indicate that the taxonomic status of some species needs to be reassessed and that species of Lepus within North America do not form a monophyletic entity.

5.
Phylogenetic analyses using genome-scale data sets must confront incongruence among gene trees, which in plants is exacerbated by frequent gene duplications and losses. Gene tree parsimony (GTP) is a phylogenetic optimization criterion in which a species tree that minimizes the number of gene duplications induced among a set of gene trees is selected. The run time performance of previous implementations has limited its use on large-scale data sets. We used new software that incorporates recent algorithmic advances to examine the performance of GTP on a plant data set consisting of 18,896 gene trees containing 510,922 protein sequences from 136 plant taxa (giving a combined alignment length of >2.9 million characters). The relationships inferred from the GTP analysis were largely consistent with previous large-scale studies of backbone plant phylogeny and resolved some controversial nodes. The placement of taxa that were present in few gene trees generally varied the most among GTP bootstrap replicates. Excluding these taxa either before or after the GTP analysis revealed high levels of phylogenetic support across plants. The analyses supported magnoliids sister to a eudicot + monocot clade and did not support the eurosid I and II clades. This study presents a nuclear genomic perspective on the broad-scale phylogenetic relationships among plants, and it demonstrates that nuclear genes with a history of duplication and loss can be phylogenetically informative for resolving the plant tree of life.
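
The duplication criterion described in this abstract can be illustrated with a small sketch. It is not the software used in the study: it assumes rooted binary trees written as nested tuples, gene-tree leaves labelled directly with species names, and the common LCA-mapping rule (a gene-tree node counts as a duplication when it maps to the same species-tree node as one of its children). All function names here are hypothetical, and the exhaustive comparison of candidate species trees stands in for the search heuristics a real program would need.

```python
# Illustrative sketch only; representation and names are assumptions.

def leaf_set(tree):
    """Return the set of species names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def clades(species_tree):
    """List every node of the species tree as a frozenset of species."""
    found = []
    def walk(node):
        found.append(leaf_set(node))
        if not isinstance(node, str):
            for child in node:
                walk(child)
    walk(species_tree)
    return found

def lca(species_clades, species):
    """Smallest species-tree clade containing all given species."""
    return min((c for c in species_clades if species <= c), key=len)

def count_duplications(gene_tree, species_tree):
    """Count gene-tree nodes that map to the same clade as a child."""
    species_clades = clades(species_tree)
    duplications = 0
    def visit(node):
        nonlocal duplications
        if isinstance(node, str):
            return lca(species_clades, frozenset([node]))
        child_maps = [visit(child) for child in node]
        here = lca(species_clades, frozenset().union(*child_maps))
        if here in child_maps:
            duplications += 1
        return here
    visit(gene_tree)
    return duplications

def gtp_cost(gene_trees, species_tree):
    """Total duplications needed to reconcile all gene trees."""
    return sum(count_duplications(g, species_tree) for g in gene_trees)

def best_species_tree(gene_trees, candidates):
    """Pick the candidate species tree with the lowest duplication cost."""
    return min(candidates, key=lambda s: gtp_cost(gene_trees, s))
```

For genome-scale inputs the set of candidate species trees cannot be enumerated, which is why run time has been the limiting factor for this criterion.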

6.
Phylogenetic trees from multiple genes can be obtained in two fundamentally different ways. In one, gene sequences are concatenated into a super-gene alignment, which is then analyzed to generate the species tree. In the other, phylogenies are inferred separately from each gene, and a consensus of these gene phylogenies is used to represent the species tree. Here, we have compared these two approaches by means of computer simulation, using 448 parameter sets, including evolutionary rate, sequence length, base composition, and transition/transversion rate bias. In these simulations, we emphasized a worst-case scenario analysis in which 100 replicate datasets for each evolutionary parameter set (gene) were generated, and the replicate dataset that produced a tree topology showing the largest number of phylogenetic errors was selected to represent that parameter set. Both randomly selected and worst-case replicates were utilized to compare the consensus and concatenation approaches primarily using the neighbor-joining (NJ) method. We find that the concatenation approach yields more accurate trees, even when the sequences concatenated have evolved with very different substitution patterns and no attempts are made to accommodate these differences while inferring phylogenies. These results appear to hold true for parsimony and likelihood methods as well. The concatenation approach shows >95% accuracy with only 10 genes. However, this gain in accuracy is sometimes accompanied by reinforcement of certain systematic biases, resulting in spuriously high bootstrap support for incorrect partitions, whether we employ site, gene, or a combined bootstrap resampling approach. Therefore, it will be prudent to report the number of individual genes supporting an inferred clade in the concatenated sequence tree, in addition to the bootstrap support.
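
The concatenation step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' simulation code: it assumes each gene alignment is a dictionary from taxon name to an aligned sequence string, and it pads taxa missing from a gene with `?`. The function name `concatenate` and the padding symbol are assumptions made for the example.

```python
# Illustrative sketch only; data layout and names are assumptions.

def concatenate(alignments, missing="?"):
    """Join per-gene alignments (dict: taxon -> aligned sequence) by taxon."""
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    supermatrix = {taxon: "" for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one gene alignment must share a length")
        width = lengths.pop()
        for taxon in taxa:
            # Taxa absent from this gene receive a block of missing-data symbols.
            supermatrix[taxon] += aln.get(taxon, missing * width)
    return supermatrix
```

The resulting super-gene alignment would then be passed to a single tree-building analysis, whereas the consensus approach would build one tree per input alignment and summarize them afterwards.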

7.
8.
The series Staphyliniformia is one of the mega-diverse groups of Coleoptera, but the relationships among the main families are still poorly understood. In this paper we address the interrelationships of staphyliniform groups, with special emphasis on Hydrophiloidea and Hydraenidae, based on partial sequences of the ribosomal genes 18S rDNA and 28S rDNA. Sequence data were analysed with parsimony and Bayesian posterior probabilities, in an attempt to overcome the likely effect of some branches longer than the 95% cumulative probability of the estimated normal distribution of the path lengths of the species. The inter-family relationships in the trees obtained with both methods were in general poorly supported, although most of the results based on the sequence data are in good agreement with morphological studies. None of our analyses supported a close relationship between Hydraenidae and Hydrophiloidea, contrary to the traditional view but in agreement with recent morphological investigations. Hydraenidae form a clade with Ptiliidae and Scydmaenidae in the tree obtained with Bayesian probabilities, but are placed as a basal group of Staphyliniformia (with Silphidae as a subordinate group) in the parsimony tree. Based on the analysed data with a limited set of outgroups, Scarabaeoidea are nested within Staphyliniformia. However, this needs further support. Hydrophiloidea s.str., Sphaeridiinae, Histeroidea (Histeridae + Sphaeritidae), and all staphylinoid families included are confirmed as monophyletic, with the exception of Hydraenidae in the parsimony tree. Spercheidae are not a basal group within Hydrophiloidea, as has been previously suggested, but are included in a polytomy with other Hydrophilidae in the Bayesian analyses, or placed as its sistergroup (with the inclusion of Epimetopidae) in the parsimony tree. Helophorus is placed at the base of Hydrophiloidea in the parsimony tree. The monophyly of Hydrophiloidea s.l. (including the histeroid families) and Staphylinoidea could not be confirmed by the analysed data. Some results, such as a placement of Silphidae as a subordinate group of Hydraenidae (parsimony tree), or a sistergroup relationship between Ptiliidae and Scydmaenidae, appear unlikely from a morphological point of view.

9.
Recent computational advances provide novel opportunities to infer species trees based on multiple independent loci. Thus, single gene trees no longer need suffice as proxies for species phylogenies. Several methods have been developed to deal with the challenges posed by incomplete and stochastic lineage sorting. In this study, we employed four Bayesian methods to infer the phylogeny of a clade of 11 recently diverged oriole species within the genus Icterus. We obtained well-resolved and mostly congruent phylogenies using a set of seven unlinked nuclear intron loci and sampling multiple individuals per species. Most notably, Bayesian concordance analysis generally agreed well with concatenation; the two methods agreed fully on eight of nine nodes. The coalescent-based method BEAST further supported six of these eight nodes. The fourth method used, BEST, failed to converge despite exhaustive efforts to optimize the tree search. Overall, the results obtained by new species tree methods and concatenation generally corroborate our findings from previous analyses and data sets. However, we found striking disagreement between mitochondrial and nuclear DNA involving relationships within the northern oriole group. Our results highlight the danger of reliance on mtDNA alone for phylogenetic inference. We demonstrate that in spite of low variability and incomplete lineage sorting, multiple nuclear loci can produce largely congruent phylogenies based on multiple species tree methods, even for very closely-related species.

10.
DupTree is a new software program for inferring rooted species trees from collections of gene trees using the gene tree parsimony approach. The program implements a novel algorithm that significantly improves upon the run time of standard search heuristics for gene tree parsimony, and enables the first truly genome-scale phylogenetic analyses. In addition, DupTree allows users to examine alternate rootings and to weight the reconciliation costs for gene trees. DupTree is an open source project written in C++. Availability: DupTree for Mac OS X, Windows, and Linux along with a sample dataset and an on-line manual are available at http://genome.cs.iastate.edu/CBL/DupTree

11.
While most bark beetles attack only dead or weakened trees, many species in the genus Dendroctonus have the ability to kill healthy conifers through mass attack of the host tree, and can exhibit devastating outbreaks. Other species in this group are able to successfully colonize trees in small numbers without killing the host. We reconstruct the evolution of these ecological and life history traits, first classifying the extant Dendroctonus species by attack type (mass or few), outbreaks (yes or no), host genus (Pinus and others), location of attacks on the tree (bole, base, etc.), whether the host is killed (yes or no), and if the larvae are gregarious or have individual galleries (yes or no). We then estimated a molecular phylogeny for a data set of cytochrome oxidase I sequences sampled from nearly all Dendroctonus species, and used this phylogeny to reconstruct the ancestral state at various nodes on the tree, employing maximum parsimony, maximum likelihood, and Bayesian methods. Our reconstructions suggest that extant Dendroctonus species likely evolved from an ancestor that killed host pines through mass attack of the bole, had individual larvae, and exhibited outbreaks. The ability to colonize a host tree in small numbers (as well as gregarious larvae and attacks at the tree base) apparently evolved later, possibly as two separate events in different clades. It is likely that tree mortality and outbreaks have been continuing features of the interaction between conifers and Dendroctonus bark beetles.

12.
Choosing among alternative trees of multigene families (total citations: 4; self-citations: 0; citations by others: 4)
Estimation of gene trees is the first step in testing alternative hypotheses about the evolution of multigene families. The standard practice for inferring gene family history is to construct trees that meet some objective criteria based on the fit of the character state changes (nucleotide or amino acid changes) to the gene tree. Unfortunately, analysis of character state data can be misleading. In addition, this approach ignores information about the relationships of the species from which the genes have been sampled. In this paper I explore using statistics of fit between the character data and gene trees and the reconciliation of the gene and species trees for choosing among alternative evolutionary hypotheses of gene families. In particular, I advocate a two-pronged strategy for choosing among alternative gene trees. First, the character data are used to define a set of acceptable gene trees (i.e., trees that are not significantly different from the minimum length tree). Next, the set of acceptable gene trees is reconciled with a known species tree, and the gene tree requiring the fewest number of gene duplications and losses is adopted as the best estimate of evolutionary history. The approach is illustrated using three gene families: BMP, EGR, and LDH.

13.
14.
To improve our understanding of phylogenetic relationships within the anamorphic genus Septoria, three molecular data sets representing 2,417 bp of nuclear and mitochondrial genes were evaluated. Separate gene analyses and combined analyses were performed using first the maximum parsimony criterion and second a Bayesian framework. The homogeneity of data partitions was evaluated via a combination of homogeneity partition tests and tree topology incongruence tests before conducting combined analyses. A final re-evaluation of incongruence using partitioned Bremer support was performed on the combined tree, which corroborated the previous estimates. After the attributes of each separate data set were examined, simple explanations were advocated as the causes of the significant incongruences detected. The analysis of multiple gene partitions showed unprecedented phylogenetic resolution within the genus Septoria that supported the results from previously published single gene phylogenies. Specifically, we have delimited distinct but closely related species representing monophyletic groups that frequently correlated with their respective host families. Conversely, the occurrence of well-supported groups including closely related but distinct molecular taxa sampled on unrelated host-plants allowed us to reject, in these particular cases, the co-evolutionary concept expected between a parasite and its host and to discuss alternative evolutionary models recently proposed for these pathogens.

15.
We considered the contribution of two mitochondrial and two nuclear data sets for the phylogenetic reconstruction of 22 species of seed beetles in the genus Curculio (Coleoptera: Curculionidae). A phylogenetic tree from representatives found on various hosts was inferred from a combined data set of mitochondrial DNA cytochrome oxidase subunit I, mitochondrial cytochrome b, nuclear elongation factor 1alpha, and nuclear phosphoglycerate mutase, used for the first time as a molecular marker. Separate parsimony analyses of each data set showed that individual gene trees were mainly congruent and often complementary in the support of clades, but the analysis was complicated by failure of PCR amplification of nuclear genes for many taxa and hence missing data entries. When the four gene partitions were combined in a simultaneous analysis despite the missing data, this increased the resolution and taxonomic coverage compared to the individual source trees. Alternative approaches of combining the information via supertree methodology produced a comparatively less resolved tree, and hence seem inferior to combining data matrices even in cases where numerous taxa are missing. The molecular data suggest a classification of the European species into two species groups that are in accordance with morphological characteristics, but the data do not support any of the previously recognised American species groups.

16.
Liu L, Pearl DK. Systematic Biology, 2007, 56(3): 504-514
The desire to infer the evolutionary history of a group of species should be more viable now that a considerable amount of multilocus molecular data is available. However, the current molecular phylogenetic paradigm still reconstructs gene trees to represent the species tree. Further, commonly used methods of combining data, such as the concatenation method, are known to be inconsistent in some circumstances. In this paper, we propose a Bayesian hierarchical model to estimate the phylogeny of a group of species using multiple estimated gene tree distributions, such as those that arise in a Bayesian analysis of DNA sequence data. Our model employs substitution models used in traditional phylogenetics but also uses coalescent theory to explain genealogical signals from species trees to gene trees and from gene trees to sequence data, thereby forming a complete stochastic model to estimate gene trees, species trees, ancestral population sizes, and species divergence times simultaneously. Our model is founded on the assumption that gene trees, even of unlinked loci, are correlated due to being derived from a single species tree and therefore should be estimated jointly. We apply the method to two multilocus data sets of DNA sequences. The estimates of the species tree topology and divergence times appear to be robust to the prior of the population size, whereas the estimates of effective population sizes are sensitive to the prior used in the analysis. These analyses also suggest that the model is superior to the concatenation method in fitting these data sets and thus provides a more realistic assessment of the variability in the distribution of the species tree that may have produced the molecular information at hand. Future improvements of our model and algorithm should include consideration of other factors that can cause discordance of gene trees and species trees, such as horizontal transfer or gene duplication.

17.
Gene family evolution is determined by microevolutionary processes (e.g., point mutations) and macroevolutionary processes (e.g., gene duplication and loss), yet macroevolutionary considerations are rarely incorporated into gene phylogeny reconstruction methods. We present a dynamic program to find the most parsimonious gene family tree with respect to a macroevolutionary optimization criterion, the weighted sum of the number of gene duplications and losses. The existence of a polynomial delay algorithm for duplication/loss phylogeny reconstruction stands in contrast to most formulations of phylogeny reconstruction, which are NP-complete. We next extend this result to obtain a two-phase method for gene tree reconstruction that takes both micro- and macroevolution into account. In the first phase, a gene tree is constructed from sequence data, using any of the previously known algorithms for gene phylogeny construction. In the second phase, the tree is refined by rearranging regions of the tree that do not have strong support in the sequence data to minimize the duplication/loss cost. Components of the tree with strong support are left intact. This hybrid approach incorporates both micro- and macroevolutionary considerations, yet its computational requirements are modest in practice because the two-phase approach constrains the search space. Our hybrid algorithm can also be used to resolve nonbinary nodes in a multifurcating gene tree. We have implemented these algorithms in a software tool, NOTUNG 2.0, that can be used as a unified framework for gene tree reconstruction or as an exploratory analysis tool that can be applied post hoc to any rooted tree with bootstrap values. The NOTUNG 2.0 graphical user interface can be used to visualize alternate duplication/loss histories, root trees according to duplication and loss parsimony, manipulate and annotate gene trees, and estimate gene duplication times. It also offers a command line option that enables high-throughput analysis of a large number of trees.
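
The optimization criterion named in this abstract, a weighted sum of duplications and losses, can be illustrated with a short sketch. It is not NOTUNG's implementation or interface, and it covers only the scoring of a fixed gene tree, not the rearrangement phase. Assumptions made for the example: rooted binary trees as nested tuples, gene leaves labelled with species names, LCA mapping to place gene nodes on the species tree, losses counted only at or below the species-tree node to which the gene-tree root maps, and the hypothetical parameter names `dup_weight` and `loss_weight`.

```python
# Illustrative sketch only; representation, names and weights are assumptions.

def leaf_set(tree):
    """Return the set of species names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def clades(species_tree):
    """List every node of the species tree as a frozenset of species."""
    found = []
    def walk(node):
        found.append(leaf_set(node))
        if not isinstance(node, str):
            for child in node:
                walk(child)
    walk(species_tree)
    return found

def dl_cost(gene_tree, species_tree, dup_weight=1.0, loss_weight=1.0):
    """Return (duplications, losses, weighted cost) under LCA mapping."""
    species_clades = clades(species_tree)

    def lca(species):
        return min((c for c in species_clades if species <= c), key=len)

    def depth(clade):
        # Number of species-tree nodes strictly above this clade.
        return sum(1 for c in species_clades if clade < c)

    dups = 0
    losses = 0

    def visit(node):
        nonlocal dups, losses
        if isinstance(node, str):
            return lca(frozenset([node]))
        child_maps = [visit(child) for child in node]
        here = lca(frozenset().union(*child_maps))
        is_dup = here in child_maps
        if is_dup:
            dups += 1
        for mapped in child_maps:
            # Species-tree nodes skipped between parent and child imply losses.
            losses += depth(mapped) - depth(here) - (0 if is_dup else 1)
        return here

    visit(gene_tree)
    return dups, losses, dup_weight * dups + loss_weight * losses
```

A refinement step in the spirit of the second phase would evaluate this cost for each alternative arrangement of the weakly supported regions and keep the cheapest one.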

18.
Species complexes undergoing rapid radiation present a challenge in molecular systematics because of the possibility that ancestral polymorphism is retained in component gene trees. Coalescent theory has demonstrated that gene trees often fail to match lineage trees when taxon divergence times are less than the ancestral effective population sizes. Suggestions to increase the number of loci and the number of individuals per taxon have been proposed; however, phylogenetic methods to adequately analyze these data in a coalescent framework are scarce. We compare two approaches to estimating lineage (species) trees using multiple individuals and multiple loci: the commonly used partitioned Bayesian analysis of concatenated sequences and a modification of a newly developed hierarchical Bayesian method (BEST) that simultaneously estimates gene trees and species trees from multilocus data. We test these approaches on a phylogeny of rapidly radiating species wherein divergence times are likely to be smaller than effective population sizes, and incomplete lineage sorting is known, in the rodent genus Thomomys. We use seven independent noncoding nuclear sequence loci (total approximately 4300 bp) and between 1 and 12 individuals per taxon to construct a phylogenetic hypothesis for eight Thomomys species. The majority-rule consensus tree from the partitioned concatenated analysis included 14 strongly supported bipartitions, corroborating monophyletic species status of five of the eight named species. The BEST tree strongly supported only the split between the two subgenera and showed very low support for any other clade. Comparison of both lineage trees to individual gene trees revealed that the concatenation method appears to ignore conflicting signals among gene trees, whereas the BEST tree considers conflicting signals and downweights support for those nodes. Bayes factor analysis of posterior tree distributions from both analyses strongly favors the model underlying the BEST analysis. This comparison underscores the risks of overreliance on results from concatenation, and of ignoring the properties of coalescence, especially in cases of recent, rapid radiations.

19.

Background

Most studies inferring species phylogenies use sequences from single copy genes or sets of orthologs culled from gene families. For taxa such as plants, with very high levels of gene duplication in their nuclear genomes, this has limited the exploitation of nuclear sequences for phylogenetic studies, such as those available in large EST libraries. One rarely used method of inference, gene tree parsimony, can infer species trees from gene families undergoing duplication and loss, but its performance has not been evaluated at a phylogenomic scale for EST data in plants.

Results

A gene tree parsimony analysis based on EST data was undertaken for six angiosperm model species and Pinus, an outgroup. Although a large fraction of the tentative consensus sequences obtained from the TIGR database of ESTs was assembled into homologous clusters too small to be phylogenetically informative, some 557 clusters contained promising levels of information. Based on maximum likelihood estimates of the gene trees obtained from these clusters, gene tree parsimony correctly inferred the accepted species tree with strong statistical support. A slight variant of this species tree was obtained when maximum parsimony was used to infer the individual gene trees instead.

Conclusion

Despite the complexity of the EST data and the relatively small fraction eventually used in inferring a species tree, the gene tree parsimony method performed well in the face of very high apparent rates of duplication.

20.
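This paper applies 12S rRNA gene sequence analysis to the phylogenetic relationships of several important spider groups, in order to verify and supplement the conclusions of traditional taxonomic studies and to explore the applicability of 12S rRNA gene sequence analysis to spider phylogenetics. The molecular phylogenetic tree constructed from the third domain of the 12S rRNA gene supports the following conclusions: (1) the orb weavers (i.e., Deinopoidea and Araneoidea) are not monophyletic; (2) coelotine spiders are more closely related to amaurobiids than to agelenids; (3) Uroctea and Oecobius are not closely related; (4) the cribellate spiders are a polyphyletic group; (5) the third-domain fragment of the 12S rRNA gene is an effective genetic marker for inferring phylogenetic relationships among closely related families and genera.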
