Similar articles
 20 similar articles found (search time: 31 ms)
1.
The main limiting factor in Bayesian MCMC analysis of phylogeny is typically the efficiency with which topology proposals sample tree space. Here we evaluate the performance of seven different proposal mechanisms, including most of those used in current Bayesian phylogenetics software. We sampled 12 empirical nucleotide data sets--ranging in size from 27 to 71 taxa and from 378 to 2,520 sites--under difficult conditions: short runs, no Metropolis-coupling, and an oversimplified substitution model producing difficult tree spaces (Jukes Cantor with equal site rates). Convergence was assessed by comparison to reference samples obtained from multiple Metropolis-coupled runs. We find that proposals producing topology changes as a side effect of branch length changes (LOCAL and Continuous Change) consistently perform worse than those involving stochastic branch rearrangements (nearest neighbor interchange, subtree pruning and regrafting, tree bisection and reconnection, or subtree swapping). Among the latter, moves that use an extension mechanism to mix local with more distant rearrangements show better overall performance than those involving only local or only random rearrangements. Moves with only local rearrangements tend to mix well but have long burn-in periods, whereas moves with random rearrangements often show the reverse pattern. Combinations of moves tend to perform better than single moves. The time to convergence can be shortened considerably by starting with a good tree, but this comes at the cost of compromising convergence diagnostics based on overdispersed starting points. Our results have important implications for developers of Bayesian MCMC implementations and for the large group of users of Bayesian phylogenetics software.

2.
Principal components analysis (PCA) and hierarchical clustering are two of the most heavily used techniques for analyzing the differences between nucleic acid sequence samples taken from a given environment. They have led to many insights regarding the structure of microbial communities. We have developed two new complementary methods that leverage how this microbial community data sits on a phylogenetic tree. Edge principal components analysis enables the detection of important differences between samples that contain closely related taxa. Each principal component axis is a collection of signed weights on the edges of the phylogenetic tree, and these weights are easily visualized by a suitable thickening and coloring of the edges. Squash clustering outputs a (rooted) clustering tree in which each internal node corresponds to an appropriate “average” of the original samples at the leaves below the node. Moreover, the length of an edge is a suitably defined distance between the averaged samples associated with the two incident nodes, rather than the less interpretable average of distances produced by UPGMA, the most widely used hierarchical clustering method in this context. We present these methods and illustrate their use with data from the human microbiome.

3.
The parsimony score of a character on a tree equals the number of state changes required to fit that character onto the tree. We show that for unordered, reversible characters this score equals the number of tree rearrangements required to fit the tree onto the character. We discuss implications of this connection for the debate over the use of consensus trees or total evidence and show how it provides a link between incongruence of characters and recombination.
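The parsimony score described in this abstract can be illustrated with a minimal Python sketch of the standard Fitch procedure for an unordered character on a rooted binary tree. The tree encoding (nested 2-tuples with string leaf names), the function name `fitch_score`, and the example data are assumptions made for illustration; they are not taken from the paper.

```python
def fitch_score(tree, states):
    """Return (candidate_state_set, changes) for a rooted binary tree.

    tree   -- a leaf name (str) or a 2-tuple of subtrees (hypothetical encoding)
    states -- dict mapping each leaf name to its observed character state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_changes = fitch_score(tree[0], states)
    right_set, right_changes = fitch_score(tree[1], states)
    common = left_set & right_set
    if common:
        # The children can share a state: no extra change at this node.
        return common, left_changes + right_changes
    # The children disagree: one state change is needed on this node's edges.
    return left_set | right_set, left_changes + right_changes + 1


# Illustrative data only: five taxa scored for a binary character.
tree = ((("A", "B"), "C"), ("D", "E"))
states = {"A": "0", "B": "1", "C": "0", "D": "1", "E": "1"}
_, score = fitch_score(tree, states)
print(score)
```

The sketch only counts state changes; it does not attempt the tree-rearrangement count that the abstract shows to be equal to this score.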

4.
The problem of reconstructing the duplication history of a set of tandemly repeated sequences was first introduced by Fitch (1977). Many recent studies deal with this problem, showing the validity of the unequal recombination model proposed by Fitch, describing numerous inference algorithms, and exploring the combinatorial properties of these new mathematical objects, which are duplication trees. In this paper, we deal with the topological rearrangement of these trees. Classical rearrangements used in phylogeny (NNI, SPR, TBR, ...) cannot be applied directly to duplication trees. We show that restricting the neighborhood defined by the SPR (Subtree Pruning and Regrafting) rearrangement to valid duplication trees allows exploring the whole duplication tree space. We use these restricted rearrangements in a local search method which improves an initial tree via successive rearrangements. This method is applied to the optimization of parsimony and minimum evolution criteria. We show through simulations that this method improves on all existing programs for both reconstructing the topology of the true tree and recovering its duplication events. We apply this approach to tandemly repeated human Zinc finger genes and observe that a much better duplication tree is obtained by our method than using any other program.

5.
The phylogeny of the thrushes (Aves: Turdus) has been difficult to reconstruct due to short internal branches and lack of node support for certain parts of the tree. Reconstructing the biogeographic history of this group is further complicated by the fact that current implementations of biogeographic methods, such as dispersal-vicariance analysis (DIVA; Ronquist, 1997), require a fully resolved tree. Here, we apply a Bayesian approach to dispersal-vicariance analysis that accounts for phylogenetic uncertainty and allows a more accurate analysis of the biogeographic history of lineages. Specifically, ancestral area reconstructions can be presented as marginal distributions, thus displaying the underlying topological uncertainty. Moreover, if there are multiple optimal solutions for a single node on a certain tree, integrating over the posterior distribution of trees often reveals a preference for a narrower set of solutions. We find that despite the uncertainty in tree topology, ancestral area reconstructions indicate that the Turdus clade originated in the eastern Palearctic during the Late Miocene. This was followed by an early dispersal to Africa from where a worldwide radiation took place. The uncertainty in tree topology and short branch lengths seems to indicate that this radiation took place within a limited time span during the Late Pliocene. The results support the role of Africa as a probable source area for intercontinental dispersals as suggested for other passerine groups, including basal diversification within the songbird tree.

6.
MOTIVATION: Aligning multiple proteins based on sequence information alone is challenging if sequence identity is low or there is a significant degree of structural divergence. We present a novel algorithm (SATCHMO) that is designed to address this challenge. SATCHMO simultaneously constructs a tree and a set of multiple sequence alignments, one for each internal node of the tree. The alignment at a given node contains all sequences within its sub-tree, and predicts which positions in those sequences are alignable and which are not. Aligned regions therefore typically get shorter on a path from a leaf to the root as sequences diverge in structure. Current methods either regard all positions as alignable (e.g. ClustalW), or align only those positions believed to be homologous across all sequences (e.g. profile HMM methods); by contrast SATCHMO makes different predictions of alignable regions in different subgroups. SATCHMO generates profile hidden Markov models at each node; these are used to determine branching order, to align sequences and to predict structurally alignable regions. RESULTS: In experiments on the BAliBASE benchmark alignment database, SATCHMO is shown to perform comparably to ClustalW and the UCSC SAM HMM software. Results using SATCHMO to identify protein domains are demonstrated on potassium channels, with implications for the mechanism by which tumor necrosis factor alpha affects potassium current. AVAILABILITY: The software is available for download from http://www.drive5.com/lobster/index.htm

7.
Previous molecular phylogeny algorithms mainly rely on multi-sequence alignments of cautiously selected characteristic sequences, and are thus not directly appropriate for whole genome phylogeny, where events such as rearrangements make full-length alignments impossible. We introduce here the concept of Complete Information Set (CIS) and its measurement implementation as evolution distance without reference to sizes. As a proof test of the method, the 16S rRNA sequences of 22 completely sequenced Bacteria and Archaea species are used to reconstruct a phylogenetic tree, which is generally consistent with the commonly accepted one. Based on whole genomes, our further efforts yield a highly robust whole genome phylogenetic tree, supporting separate monophyletic clusters of species with similar phenotype as well as the early evolution of thermophilic Bacteria and late divergence of Eukarya. The purpose of this work is not to contradict or confirm previous phylogeny standards but rather to bring a brand-new algorithm and tool to the phylogeny research community. The software to estimate the sequence distance and the materials used in this study are available upon request to the corresponding author.

8.
TreeGraph assists in producing complex ready‐to‐publish figures of phylogenetic trees. The TGF format used by the program automates formatting of several different statistical support value types (confidence estimates) per tree node. Moreover, internal text and graphical labels are automatically arranged at the nodes as are annotations for clades or groups of terminals. TreeGraph imports NEXUS trees and related file formats. Beyond common tree edit operations, simultaneous pruning of subtrees (simplification of the tree to higher order clades) and saving of subtrees is possible. TreeGraph exports to the standard vector graphics formats Scalable Vector Graphics and PostScript.

9.
Phylogenetic reconstruction based on sequence variation in the internal transcribed spacer (ITS) region of rDNA was used to investigate the evolutionary dynamics of homomorphic self-incompatibility in Linanthus section Leptosiphon (Polemoniaceae), a group of annual plant species. Hand-pollination experiments revealed that five species were self-incompatible and four were self-compatible. Optimization of breeding systems onto the tree resulting from maximum-likelihood analysis, with no assumptions made about the ancestral condition, indicated that self-incompatibility has been lost four times in this section. An alternative tree rearrangement conforming to the hypothesis of three losses of self-incompatibility did not have a significantly lower likelihood than the maximum-likelihood tree as determined by a paired-sites test, but all rearrangements resulting in fewer than three losses were statistically rejected. Linanthus bicolor, a selfing species, was found to be polyphyletic, with populations from different geographic regions occurring in three well-supported clades. Morphological similarity in these distinct lineages is likely to have resulted from convergent evolution of traits associated with self-fertilization. Selection for reproductive assurance is hypothesized to have played an important role in the recurrent transformations from self-incompatibility to selfing in this group of annual species.

10.
Dating the origin of Placentalia has been a contentious issue for biologists and paleontologists. Although it is likely that crown‐group placentals originated in the Late Cretaceous, nearly all molecular clock estimates point to a deeper Cretaceous origin. An approach with the potential to reconcile this discrepancy could be the application of a morphological clock. This would permit the direct incorporation of fossil data in node dating, and would break long internal branches of the tree, so leading to improved estimates of node ages. Here, we use a large morphological dataset and the tip‐calibration approach of MrBayes. We find that the estimated date for the origin of crown mammals is much older, ~130–145 million years ago (Ma), than fossil and molecular clock data (~80–90 Ma). Our results suggest that tip calibration may result in estimated dates that are more ancient than those obtained from other sources of data. This can be partially overcome by constraining the ages of internal nodes on the tree; however, when this was applied to our dataset, the estimated dates were still substantially more ancient than expected. We recommend that results obtained using tip calibration, and possibly morphological dating more generally, should be treated with caution.

11.
GRIMM: genome rearrangements web server
SUMMARY: Genome Rearrangements In Man and Mouse (GRIMM) is a tool for analyzing rearrangements of gene orders in pairs of unichromosomal and multichromosomal genomes, with either signed or unsigned gene data. Although there are several programs for analyzing rearrangements in unichromosomal genomes, this is the first to analyze rearrangements in multichromosomal genomes. GRIMM also provides a new algorithm for analyzing comparative maps for which gene directions are unknown. AVAILABILITY: A web server, with instructions and sample data, is available at http://www-cse.ucsd.edu/groups/bioinformatics/GRIMM.

12.
Shi G, Peng MC, Jiang T. PLoS ONE, 2011, 6(6): e20892
The identification of orthologous genes shared by multiple genomes plays an important role in evolutionary studies and gene functional analyses. Based on a recently developed accurate tool, called MSOAR 2.0, for ortholog assignment between a pair of closely related genomes based on genome rearrangement, we present a new system MultiMSOAR 2.0, to identify ortholog groups among multiple genomes in this paper. In the system, we construct gene families for all the genomes using sequence similarity search and clustering, run MSOAR 2.0 for all pairs of genomes to obtain the pairwise orthology relationship, and partition each gene family into a set of disjoint sets of orthologous genes (called super ortholog groups or SOGs) such that each SOG contains at most one gene from each genome. For each such SOG, we label the leaves of the species tree using 1 or 0 to indicate if the SOG contains a gene from the corresponding species or not. The resulting tree is called a tree of ortholog groups (or TOGs). We then label the internal nodes of each TOG based on the parsimony principle and some biological constraints. Ortholog groups are finally identified from each fully labeled TOG. In comparison with a popular tool MultiParanoid on simulated data, MultiMSOAR 2.0 shows significantly higher prediction accuracy. It also outperforms MultiParanoid, the Roundup multi-ortholog repository and the Ensembl ortholog database in real data experiments using gene symbols as a validation tool. In addition to ortholog group identification, MultiMSOAR 2.0 also provides information about gene births, duplications and losses in evolution, which may be of independent biological interest. Our experiments on simulated data demonstrate that MultiMSOAR 2.0 is able to infer these evolutionary events much more accurately than a well-known software tool Notung. The software MultiMSOAR 2.0 is available to the public for free.

13.
Roguing and replanting is a widely adopted control strategy for infectious diseases in orchards. Little is known about the effect of this type of management on the dynamics of the infectious disease. In this paper we analyze a structured population model for the dynamics of an S-I-R type epidemic under roguing and replanting management. The model is structured with respect to the total number of infections and the number of post-infectious infections on a tree. Trees are assumed to be rogued, and replaced by uninfected trees, when the total number of infections on the tree reaches a threshold value. Stability analysis and numerical exploration of the model show that for specific parameter combinations the internal equilibrium can become unstable and large amplitude periodic fluctuations arise. Several hypotheses on the mechanism causing the destabilisation of the steady-state are considered. The mechanism leading to the large amplitude fluctuations is identified and biologically interpreted. Received 2 September 1994

14.
15.
The imbalance of a node in a phylogenetic tree can be defined in terms of the relative numbers of species (or higher taxa) on the branches that originate at the node. Empirically, imbalance also turns out to depend on the absolute total number of species on the branches: in a sample of large trees, nodes with more descendent species tend to be more unbalanced. Subsidiary analyses suggest that this pattern is not a result of errors in tree estimation. Instead, the increase in imbalance with species is consistent with a cumulative effect of differences in diversification rates between branches. [Equal-rates Markov model; imbalance; phylogeny shape; proportional-to-distinguishable-arrangements model.]
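To make the idea of node imbalance concrete, the following Python sketch computes one simple normalized measure for each internal node of a rooted binary tree. The abstract does not state a formula, so the measure used here (how far the larger daughter clade sits between its smallest and largest possible sizes) is an assumption chosen for illustration, as are the tree encoding and the function names.

```python
from math import ceil


def leaf_count(tree):
    """Number of leaves below a node; a leaf is a str, an internal node a 2-tuple."""
    if isinstance(tree, str):
        return 1
    return leaf_count(tree[0]) + leaf_count(tree[1])


def node_imbalances(tree):
    """Return a list of (n_leaves, imbalance) for internal nodes with >= 4 leaves.

    Imbalance is 0.0 for the most even split and 1.0 for the most uneven one.
    Nodes with 2 or 3 leaves are skipped because only one split is possible.
    """
    if isinstance(tree, str):
        return []
    left, right = leaf_count(tree[0]), leaf_count(tree[1])
    n = left + right
    results = []
    if n >= 4:
        larger = max(left, right)
        smallest, largest = ceil(n / 2), n - 1  # bounds on the larger clade
        results.append((n, (larger - smallest) / (largest - smallest)))
    return results + node_imbalances(tree[0]) + node_imbalances(tree[1])


# Illustrative trees only: a fully unbalanced and a fully balanced shape.
caterpillar = (((("A", "B"), "C"), "D"), "E")
balanced = (("A", "B"), ("C", "D"))
print(node_imbalances(caterpillar))
print(node_imbalances(balanced))
```

Pairing each imbalance value with the node's leaf count is what would allow the relationship described in the abstract (imbalance against number of descendent species) to be examined on a real sample of trees.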

16.
Chromosome rearrangements, especially chromosomal deletions, have been exploited as important resources for functional analysis of genomes. To facilitate this analysis, we applied a previously developed method for chromosome splitting for the direct deletion of a designed internal or terminal chromosomal region carrying many nonessential genes in haploid Saccharomyces cerevisiae. The method, polymerase chain reaction (PCR)-mediated chromosomal deletion (PCD), consists of a two-step PCR and one transformation per deletion event. In this paper, we show that the PCD method efficiently deletes internal regions in a single transformation. Of the six chromosomal regions targeted for deletion by this method, five regions (16 to 38 kb in length) containing 10 to 19 nonessential genes were successfully eliminated at high efficiency. The one targeted region on chromosome XIII that was not deleted was subsequently found to contain sequences essential for yeast growth. While 14 individual genes in this region have been reported to be nonessential, synthetic lethal interactions may occur among these nonessential genes. Phenotypic analysis showed that four deletion strains still exhibited normal growth while possible synthetic growth defects were observed in another strain harboring a 19-gene deletion on chromosome XV. These results demonstrate that the PCD method is a useful tool for deleting genes and for analyzing their functions in defined chromosomal regions.

17.
Parasitological research is often contingent on the knowledge of the phylogeny/genealogy of the studied group. Although molecular phylogenetics has proved to be a powerful tool in such investigations, its application in the traditional fashion, based on a tree inference from the primary nucleotide sequences may, in many cases, be insufficient or even improper. These limitations are due to a number of factors, such as a scarcity/ambiguity of phylogenetic information in the sequences, an intricacy of gene relationships at low phylogenetic levels, or a lack of criteria when deciding among several competing coevolutionary scenarios. With respect to the importance of a precise and reliable phylogenetic background in many biological studies, attempts are being made to extend molecular phylogenetics with a variety of new data sources and methodologies. In this review, selected approaches potentially applicable to parasitological research are presented and their advantages as well as drawbacks are discussed. These issues include the usage of idiosyncratic markers (unique features with presumably low probability of homoplasy), such as insertion of mobile elements, gene rearrangements and secondary structure features; the problem of ancestral polymorphism and reticulate relationships at low phylogenetic levels; and the utility of a molecular clock to facilitate discrimination among alternative scenarios in host-parasite coevolution.

18.
When isolated but reproductively compatible populations expand geographically and meet, simulations predict asymmetric introgression of neutral loci from a local to invading taxon. Genetic introgression may affect phylogenetic reconstruction by obscuring topology and divergence estimates. We combined phylogenetic analysis of sequences from one mtDNA and 12 nuDNA loci with analysis of gene flow among 5 species of Pacific Locustella warblers to test for presence of genetic introgression and its effects on tree topology and divergence estimates. Our data showed that nuDNA introgression was substantial and asymmetrical among all members of superspecies groups whereas mtDNA showed no introgression except a single species pair where the invader's mtDNA was swept by mtDNA of the local species. This introgressive sweep of mtDNA had the opposite direction of the nuDNA introgression and resulted in the paraphyly of the local species' mtDNA haplotypes with respect to those of the invader. The multilocus nuDNA species tree resolved all inter- and intraspecific relationships despite substantial introgression. However, the node ages on the species tree may be underestimated as suggested by the differences in node age estimates based on non-introgressing mtDNA and introgressing nuDNA. In turn, the introgressive sweep and strong purifying selection appear to elongate internal branches in the mtDNA gene tree.

19.
Based on Hovenkamp’s ideas on historical biogeography, we present a method for analysis of taxon history, spatial analysis of vicariance, which uses observed distributions as data, thus requiring neither predefined areas nor assumptions of hierarchical relations between areas. The method is based on identifying sister nodes with disjunct (allopatric/vicariant) distributions. To do this across the tree, internal nodes are assigned distributions (as the sum of the distributions of the descendant nodes). When distributions are less than ideal, ignoring the distribution of the problematic node(s) when assigning a distribution to their ancestors may allow us to consider additional sister nodes (i.e. those resulting from splits basal to the problematic node) as having disjunct distributions. The optimality criterion seeks to find the best (possibly weighted) compromise between the maximum possible number of disjunct sister nodes and the minimum number of eliminated distributions. The method can also take overlap into account. The methodology presented is implemented in VIP, a computer program available at http://www.zmuc.dk/public/phylogeny/vip . © The Willi Hennig Society 2011.
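The core step described in this abstract, assigning each internal node the sum of its descendants' distributions and then checking whether sister nodes are disjunct, can be sketched in Python as follows. This is not the VIP implementation: the representation of a distribution as a set of occupied grid-cell identifiers, the tree encoding, and the function name `count_disjunct_sisters` are assumptions for illustration, and the sketch omits both the elimination of problematic distributions and the handling of overlap.

```python
def count_disjunct_sisters(tree, ranges):
    """Return (distribution, disjunct_pairs) for a rooted binary tree.

    tree   -- a leaf name (str) or a 2-tuple of subtrees (hypothetical encoding)
    ranges -- dict mapping each leaf name to a set of occupied grid cells
    """
    if isinstance(tree, str):
        return set(ranges[tree]), 0
    left_dist, left_pairs = count_disjunct_sisters(tree[0], ranges)
    right_dist, right_pairs = count_disjunct_sisters(tree[1], ranges)
    # Sister nodes are disjunct when they share no occupied cell.
    disjunct = 1 if left_dist.isdisjoint(right_dist) else 0
    # The ancestor's distribution is the union of its descendants' distributions.
    return left_dist | right_dist, left_pairs + right_pairs + disjunct


# Illustrative data only: four taxa with ranges given as grid-cell numbers.
tree = (("A", "B"), ("C", "D"))
ranges = {"A": {1, 2}, "B": {3, 4}, "C": {5}, "D": {5, 6}}
distribution, pairs = count_disjunct_sisters(tree, ranges)
print(sorted(distribution), pairs)
```

In this toy case, A and B are disjunct and so are the two clades at the root, while C and D overlap in cell 5. Deciding which distributions to ignore so that more sister pairs become disjunct is the optimization the abstract describes and is not attempted here.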

20.