Similar articles
Found 20 similar articles (search time: 31 ms)
1.
We have developed a new method for reconstructing phylogenetic trees called random local neighbor-joining (RLNJ). Our method differs from the neighbor-joining (NJ) method of Saitou and Nei and affords a more thorough sampling of solution space by randomly searching for a local pair of neighbors in each step. Results from analyzing yeast data with the RLNJ method show that the chance of obtaining a smaller S value (sum of branch lengths) than the NJ method grows as more taxa are analyzed, and that many individual RLNJ runs usually generate more than one topology with small S values. Computer simulation shows that the RLNJ method can significantly improve the probability of recovering the correct topology by providing more than one candidate topology. In addition, computer simulation shows that, when using the RLNJ method, the proportion of correct topologies (P(C)) increases as the number of different topologies decreases and as the proportion of the "most frequent topology" increases. Thus, the number of different topologies and the proportion of the "most frequent topology" can be used as auxiliary criteria to evaluate the reliability of a phylogenetic tree.

2.
The computationally challenging problem of reconstructing the phylogeny of a set of contemporary data, such as DNA sequences or morphological attributes, was treated by an extended version of the neighbor-joining (NJ) algorithm. The original NJ algorithm provides a single-tree topology, after a cascade of greedy pairing decisions that tries to simultaneously optimize the minimum evolution and the least squares criteria. Given that some sub-trees are more stable than others, and that the minimum evolution tree may not be achieved by the original NJ algorithm, we propose a multi-neighbor-joining (MNJ) algorithm capable of performing multiple pairing decisions at each level of the tree reconstruction, keeping various partial solutions along the recursive execution of the NJ algorithm. The main advantages of the new reconstruction procedure are: 1) as is the case for the original NJ algorithm, the MNJ algorithm is still a low-cost reconstruction method; 2) a further investigation of the alternative topologies may reveal stable and unstable sub-trees; 3) the chance of achieving the minimum evolution tree is greater; 4) tree topologies with very similar performances will be simultaneously presented at the output. When there are multiple unrooted tree topologies to be compared, a visualization tool is also proposed, using a radial layout to uniformly distribute the branches with the help of well-known metaheuristics used in computer science.
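
The pairing decision described in this abstract can be illustrated with the standard neighbor-joining Q criterion. The following is a minimal sketch, not the MNJ implementation itself: the function names, the `keep` parameter, and the example distance matrix are assumptions made for illustration. It ranks all candidate pairs and retains the few best-scoring ones instead of only the single best pair.

```python
from itertools import combinations


def rank_pairs(dist):
    """Return all taxon pairs sorted by the neighbor-joining Q criterion.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k);
    the pair with the smallest Q is the one standard NJ would join.
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    scored = [
        ((n - 2) * dist[i][j] - row_sums[i] - row_sums[j], i, j)
        for i, j in combinations(range(n), 2)
    ]
    return sorted(scored)


def candidate_pairs(dist, keep=2):
    """Hypothetical helper: keep the `keep` best pairs rather than one."""
    return rank_pairs(dist)[:keep]


# Illustrative five-taxon distance matrix (made-up values).
example = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(candidate_pairs(example))  # [(-50, 0, 1), (-48, 3, 4)]
```

Each retained pair would seed its own partial solution; how many to keep at each level, and how to prune them later, is a design choice this sketch leaves open.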

3.
A stepwise algorithm for finding minimum evolution trees (cited 7 times in total: 6 self-citations, 1 citation by others)
A stepwise algorithm for reconstructing minimum evolution (ME) trees from evolutionary distance data is proposed. In each step, a taxon that potentially has a neighbor (another taxon connected to it with a single interior node) is first chosen and then its true neighbor is searched for iteratively. For m taxa, at most (m-1)!/2 trees are examined and the tree with the minimum sum of branch lengths (S) is chosen as the final tree. This algorithm provides simple strategies for restricting the tree space searched and allows us to implement efficient ways of dynamically computing the ordinary least squares estimates of S for the topologies examined. Using computer simulation, we found that the efficiency of the ME method in recovering the correct tree is similar to that of the neighbor-joining method (Saitou and Nei 1987). A more exhaustive search is unlikely to improve the efficiency of the ME method in finding the correct tree because the correct tree is almost always included in the tree space searched with this stepwise algorithm. The new algorithm finds trees for which S values may not be significantly different from that of the ME tree if the correct tree contains very small interior branches or if the pairwise distance estimates have large sampling errors. These topologies form a set of plausible alternatives to the ME tree and can be compared with each other using statistical tests based on the minimum evolution principle. The new algorithm makes it possible to use the ME method for large data sets.

4.
The minimum-evolution (ME) method of phylogenetic inference is based on the assumption that the tree with the smallest sum of branch length estimates is most likely to be the true one. In the past this assumption has been used without mathematical proof. Here we present the theoretical basis of this method by showing that the expectation of the sum of branch length estimates for the true tree is smallest among all possible trees, provided that the evolutionary distances used are statistically unbiased and that the branch lengths are estimated by the ordinary least-squares method. We also present simple mathematical formulas for computing branch length estimates and their standard errors for any unrooted bifurcating tree, with the least-squares approach. As a numerical example, we have analyzed mtDNA sequence data obtained by Vigilant et al. and have found the ME tree for 95 human and 1 chimpanzee (outgroup) sequences. The tree was somewhat different from the neighbor-joining tree constructed by Tamura and Nei, but there was no statistically significant difference between them.
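
For four taxa the minimum-evolution criterion can be worked through in closed form. The sketch below is an illustration under stated assumptions, not the authors' general formulas: it assumes a 4 x 4 symmetric distance matrix and uses the ordinary least-squares result that, for the unrooted tree grouping (i, j) against (k, l), the sum of branch lengths is S = (d_ij + d_kl)/2 + (d_ik + d_il + d_jk + d_jl)/4. The function names and the example matrix are hypothetical.

```python
def quartet_tree_length(d, pair_a, pair_b):
    """OLS sum of branch lengths S for the quartet tree (pair_a | pair_b)."""
    (i, j), (k, l) = pair_a, pair_b
    within = d[i][j] + d[k][l]                       # distances inside each cherry
    between = d[i][k] + d[i][l] + d[j][k] + d[j][l]  # distances across the split
    return within / 2 + between / 4


def minimum_evolution_quartet(d):
    """Score the three possible quartet topologies and return the smallest S."""
    topologies = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
    scored = [(quartet_tree_length(d, a, b), (a, b)) for a, b in topologies]
    return min(scored)


# Additive distances from a tree ((0,1),(2,3)) with terminal branches
# 1, 2, 3, 4 and an interior branch of 5 (true S = 15).
d = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]
print(minimum_evolution_quartet(d))  # (15.0, ((0, 1), (2, 3)))
```

With these error-free distances the true topology has the smallest S (15.0 versus 17.5 for the two alternatives), which is the behavior the abstract proves in expectation for unbiased distance estimates.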

5.
In phylogenetic inference by maximum-parsimony (MP), minimum-evolution (ME), and maximum-likelihood (ML) methods, it is customary to conduct extensive heuristic searches of MP, ME, and ML trees, examining a large number of different topologies. However, these extensive searches tend to give incorrect tree topologies. Here we show by extensive computer simulation that when the number of nucleotide sequences (m) is large and the number of nucleotides used (n) is relatively small, the simple MP or ML tree search algorithms such as the stepwise addition (SA) plus nearest neighbor interchange (NNI) search and the SA plus subtree pruning regrafting (SPR) search are as efficient as the extensive search algorithms such as the SA plus tree bisection-reconnection (TBR) search in inferring the true tree. In the case of ME methods, the simple neighbor-joining (NJ) algorithm is as efficient as or more efficient than the extensive NJ+TBR search. We show that when ME methods are used, the simple p distance generally gives better results in phylogenetic inference than more complicated distance measures such as the Hasegawa-Kishino-Yano (HKY) distance, even when nucleotide substitution follows the HKY model. When ML methods are used, the simple Jukes-Cantor (JC) model of phylogenetic inference generally shows a better performance than the HKY model even if the likelihood value for the HKY model is much higher than that for the JC model. This indicates that, at least in the present case, selecting a substitution model by using the likelihood ratio test or the AIC index is not appropriate. When n is small relative to m and the extent of sequence divergence is high, the NJ method with p distance often shows a better performance than ML methods with the JC model. However, when the level of sequence divergence is low, this is not the case.

6.
If a gene tree is to be judiciously used for inferring the histories of closely related taxa, (1) its topology must be sufficiently resolved and robust that noteworthy phylogenetic patterns can be confidently documented, and (2) sampling of species, populations, and pertinent biological variation must be sufficiently broad that otherwise misleading sources of genetic variation can be detected. These principles are illustrated by the complex gene tree of Neochlamisus leaf beetles that I reconstructed using 90,000 bp of cytochrome oxidase I (COI) and 16S mitochondrial DNA (mtDNA) sequences from over 100 specimens. Cytochrome oxidase I haplotypes varied up to 25.1% within Neochlamisus and up to 11.1% within the gibbosus species group, while exhibiting very low A + T bias for insect mtDNA (63%), low transition saturation, and conservative patterns of amino acid variation. 16S exhibited lower sequence divergences and greater A + T bias and transition saturation than COI, and substitutions were more constrained in stems than in loops. Comparisons with an earlier study of Ophraella leaf beetles highlighted conservative and labile elements of molecular evolution across genes and taxa. Cytochrome oxidase I parsimony and neighbor-joining analyses strongly supported a robust mtDNA genealogy that revealed the monophyly of Neochlamisus and of the gibbosus species group. Phylogeographic relationships suggested that the eastern U.S. gibbosus group derives from southwestern velutinus group ancestors. Haplotypes from individual velutinus group species clustered monophyletically, as expected. However, haplotypes from each of several gibbosus group taxa were polyphyletically distributed, appearing in divergent parts of the tree. 16S provided a less-resolved gibbosus group topology that was congruent with the COI tree and corroborated patterns of mitochondrial polyphyly. 
By subsampling haplotypes corresponding to particular species, populations, and ecological variants of gibbosus group taxa, I demonstrate that recovered topologies and genetic distances vary egregiously according to sampling regime. This study thus documents the potentially dire consequences of inadequate sampling when inferring the evolutionary history of closely related and mitochondrially polyphyletic taxa.

7.
In the reconstruction of a large phylogenetic tree, the most difficult part is usually the problem of how to explore the topology space to find the optimal topology. We have developed a "divide-and-conquer" heuristic algorithm in which an initial neighbor-joining (NJ) tree is divided into subtrees at internal branches having bootstrap values higher than a threshold. The topology search is then conducted by using the maximum-likelihood method to reevaluate all branches with a bootstrap value lower than the threshold while keeping the other branches intact. Extensive simulation showed that our simple method, the neighbor-joining maximum-likelihood (NJML) method, is highly efficient in improving NJ trees. Furthermore, the performance of the NJML method is nearly equal to or better than existing time-consuming heuristic maximum-likelihood methods. Our method is suitable for reconstructing relatively large molecular phylogenetic trees (number of taxa ≥ 16).

8.
SUMMARY: LumberJack is a phylogenetic tool intended to serve two purposes: to facilitate sampling treespace to find likely tree topologies quickly, and to map phylogenetic signal onto regions of an alignment in a revealing way. LumberJack creates non-random jackknifed alignments by progressively sliding a window of omission along the alignment. A neighbor-joining tree is built from the full alignment and from each jackknifed alignment, and then the likelihood for each topology (given the original full alignment) is calculated. To determine whether any of the topologies generated is significantly more likely than the others, Kishino-Hasegawa, Shimodaira-Hasegawa and ELW tests are implemented. Availability and SUPPLEMENTARY INFORMATION: http://www.plantbio.uga.edu/~russell/software.html
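
The "window of omission" idea can be sketched in a few lines. This is not LumberJack's code: the function name, the dictionary representation of the alignment, and the `window` and `step` parameters are assumptions chosen for illustration. Each generated alignment drops one contiguous block of columns.

```python
def sliding_jackknife(alignment, window, step):
    """Yield (start, reduced_alignment) pairs.

    alignment maps sequence names to equal-length aligned strings.
    Each reduced alignment omits columns [start, start + window).
    """
    length = len(next(iter(alignment.values())))
    for start in range(0, length - window + 1, step):
        reduced = {
            name: seq[:start] + seq[start + window:]
            for name, seq in alignment.items()
        }
        yield start, reduced


# Toy alignment with made-up sequences.
toy = {"taxon_a": "ACGTAC", "taxon_b": "ACGTTT"}
for start, reduced in sliding_jackknife(toy, window=2, step=2):
    print(start, reduced)
```

In a workflow like the one described, a tree would be built from each reduced alignment and then scored against the full alignment; those steps are outside this sketch.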

9.
Abstract: Morphological characters from sabethine mosquitoes were coded from larvae, pupae and adults, and life-stage partitions were evaluated to determine the contribution of each to the topology of a combined cladogram. Initial tests failed to find congruence between characters partitioned by life stage. However, when components from the combined analysis were tested using reduced taxon sets, a high degree of concordance between partitions was observed. A procedure for assessing individual life-stage contribution is employed, in which exhaustive searches are used to explore all possible arrangements for each of the selected components. Seven of the 10 components examined were able to recover the combined topology with a reduced taxon set. Congruent arrangements of taxa were typically observed for two or more life stages, although partitioned data were less resolved and frequently included aberrant topologies (those not supported by other partitioned or combined reduced taxon tree sets). In addition, none of the partitioned data sets gave robust results for all tests, suggesting that studies which emphasize character data from single life stages may support misleading arrangements of taxa. One component on the combined cladogram was not supported by any of the life-stage partitions when analysed separately. These results are complementary to the “total evidence” approach, and demonstrate that partitions of data are useful for examining suites of characters which may cause some components of the “total

10.
D-H Kim, D Heber, D W Still. Génome, 2004, 47(1): 102-111
The taxonomy of Echinacea is based on morphological characters and has varied depending on the monographer. The genus consists of either nine species and four varieties or four species and eight varieties. We have used amplified fragment length polymorphisms (AFLP) to assess genetic diversity and phenetic relationships among nine species and three varieties of Echinacea (sensu McGregor). A total of 1086 fragments, of which approximately 90% were polymorphic among Echinacea taxa, were generated from six primer combinations. Nei and Li's genetic distance coefficient and the neighbor-joining algorithm were employed to construct a phenetic tree. Genetic distance results indicate that all Echinacea species are closely related, and the average pairwise distance between populations was approximately three times the intrapopulation distances. The topology of the neighbor-joining tree strongly supports two major clades, one containing Echinacea purpurea, Echinacea sanguinea, and Echinacea simulata and the other containing the remainder of the Echinacea taxa (sensu McGregor). The species composition within the clades differs between our AFLP data and the morphometric treatment offered by Binns and colleagues. We also discuss the suitability of AFLP in determining phylogenetic relationships.
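
A band-sharing coefficient of the Nei and Li type is often computed from presence/absence fragment data as the similarity 2 n_xy / (n_x + n_y), where n_x and n_y are the numbers of fragments in each sample and n_xy is the number shared. The sketch below takes 1 minus that similarity as the distance; this particular form, the function name, and the example band sets are assumptions, since the abstract does not state the exact formula used.

```python
def band_sharing_distance(bands_x, bands_y):
    """Distance 1 - 2*n_xy / (n_x + n_y) from two sets of scored fragments."""
    total = len(bands_x) + len(bands_y)
    if total == 0:
        raise ValueError("at least one sample must contain a fragment")
    shared = len(bands_x & bands_y)
    return 1 - 2 * shared / total


# Fragment identifiers are arbitrary labels for illustration.
sample_1 = {1, 2, 3, 4}
sample_2 = {3, 4, 5, 6}
print(band_sharing_distance(sample_1, sample_2))  # 0.5
```

A matrix of such pairwise distances is the kind of input a neighbor-joining program expects.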

11.
A set of experiments based on simulation and analysis found that using the parsimony algorithm for ancestral state estimation can benefit from increased sampling of terminal taxa. Estimation at the base of small clades showed strong sensitivity to tree topology and number of descendent tips. These effects were largely driven by the creation and negation of ambiguity across a topology. Root state and internal state estimation showed similar behavior. We conclude that increased taxon sampling density is generally advisable, and attention to topological effects may be advisable in evaluating the confidence placed in state estimation. We also explore the factors affecting ancestral state estimation and conjecture that as taxa are added to a tree, the total amount of information for root state estimation depends on the tree topology and distance to root state of added taxa. For a pure-birth model tree, we conjecture that the addition of N taxa increases root state information in proportion to log(N).
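
How ambiguity arises and disappears in parsimony-based state estimation can be seen with the bottom-up pass of Fitch's algorithm. The sketch below is a generic illustration, not the procedure used in the study: trees are assumed to be nested tuples of tip names, and the function name and tip states are hypothetical. It returns the set of equally parsimonious states at the root and the number of changes required.

```python
def fitch_root(tree, states):
    """Bottom-up Fitch pass on a binary tree given as nested tuples.

    Returns (state_set, changes) for the root of `tree`.
    A root set with more than one state is an ambiguous estimate.
    """
    if isinstance(tree, str):          # a tip: its observed state
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_root(left, states)
    right_set, right_changes = fitch_root(right, states)
    common = left_set & right_set
    if common:                         # children agree: no extra change
        return common, left_changes + right_changes
    # children disagree: keep both options and count one change
    return left_set | right_set, left_changes + right_changes + 1


tree = (("A", "B"), ("C", "D"))
print(fitch_root(tree, {"A": 0, "B": 0, "C": 1, "D": 1}))  # ({0, 1}, 1)
print(fitch_root(tree, {"A": 0, "B": 0, "C": 1, "D": 0}))  # ({0}, 1)
```

Changing the state of a single tip (D in the example) turns an ambiguous root into a resolved one, which is the kind of topology- and sampling-dependent effect the abstract describes. Unambiguous estimates at interior nodes would additionally need the top-down pass, which is omitted here.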

12.
Yu Y, Degnan JH, Nakhleh L. PLoS genetics, 2012, 8(4): e1002660
Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa.

13.
Interrelationships of the tapeworms (Platyhelminthes: Cestoda) were examined by use of small (SSU) and large (LSU) subunit ribosomal DNA sequences and morphological characters. Fifty new complete SSU sequences were added to 21 sequences previously determined, and 71 new LSU (D1-D3) sequences were determined for the complementary set of taxa representing each of the major lineages of cestodes as currently understood. New sequences were determined for three amphilinidean taxa, but were removed from both alignments due to their excessively high degree of divergence from other cestode sequences. A morphological character matrix coded for supraspecific taxa was constructed by the modification of matrices from recently published studies. Maximum-parsimony (MP) analyses were performed on the LSU, SSU, LSU+SSU, and morphological data partitions, and minimum-evolution (ME) analyses utilizing a general time reversible model of nucleotide substitution including estimates of among-site rate heterogeneity were performed on the molecular data partitions. Resulting topologies were rooted at the node separating the Gyrocotylidea from the Eucestoda. The LSU data were found to be more informative than the SSU data and were more consistent with inferences from morphology, although nodal support was generally weak for most basal nodes. One class of transitions was found to be saturated for comparisons between the most distantly related taxa (gyrocotylideans vs cyclophyllideans and tetrabothriideans). Differences in the topologies resulting from MP and ME analyses were not statistically significant. Nonstrobilate orders formed the basal lineages of trees resulting from analysis of LSU data and morphology. Difossate orders were basal to tetrafossate orders, the latter of which formed a strongly supported clade. A clade including the orders Cyclophyllidea, Nippotaeniidea, and Tetrabothriidea was supported by all data partitions and methods of analysis. 
Paraphyly of the orders Pseudophyllidea, Tetraphyllidea, and Trypanorhyncha was consistent among the molecular data partitions. Inferences are made regarding a monozoic (nonsegmented) origin of the Eucestoda, as represented by the Caryophyllidea, and regarding the stepwise evolution of the strobilate and acetabulate/tetrafossate conditions.

14.
The concordance of gene trees and species trees is reconsidered in detail, allowing for samples of arbitrary size to be taken from the species. A sense of concordance for gene tree and species tree topologies is clarified, such that if the "collapsed gene tree" produced by a gene tree has the same topology as the species tree, the gene tree is said to be topologically concordant with the species tree. The term speciodendric is introduced to refer to genes whose trees are topologically concordant with species trees. For a given three-species topology, probabilities of each of the three possible collapsed gene tree topologies are given, as are probabilities of monophyletic concordance and concordance in the sense of N. Takahata (1989), Genetics 122, 957-966. Increasing the sample size is found to increase the probability of topological concordance, but a limit exists on how much the topological concordance probability can be increased. Suggested sample sizes beyond which this probability can be increased only minimally are given. The results are discussed in terms of implications for molecular studies of phylogenetics and speciation.
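
For orientation, the classical special case with one sampled lineage per species has a simple closed form: for three species whose internal branch has length T in coalescent units, the probability that the gene tree matches the species tree is 1 - (2/3) e^(-T). The sketch below covers only this one-lineage case, not the arbitrary-sample-size results of the abstract; the function name is hypothetical, and T is assumed to be measured as generations divided by twice the effective population size for a diploid population.

```python
import math


def concordance_probability(t):
    """P(gene tree topology = species tree topology), three species,
    one lineage each, internal branch length t in coalescent units."""
    if t < 0:
        raise ValueError("branch length must be non-negative")
    return 1 - (2 / 3) * math.exp(-t)


for t in (0.0, 0.5, 1.0, 2.0, 5.0):
    print(t, round(concordance_probability(t), 4))
```

At T = 0 the three topologies are equally likely (probability 1/3 each), and the concordance probability approaches 1 as the internal branch lengthens.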

15.
The phylogenetic positioning of the non-pathogenic genus Spiromastix in the Onygenales was studied based on large subunit rDNA (LSU rDNA) partial sequences (ca. 570 bp). Four Spiromastix species and 28 representative taxa of the Onygenales were newly sequenced. Phylogenetic trees were constructed by the neighbor-joining (NJ) method and evaluated by the maximum parsimony (MP) method with the data of 13 taxa retrieved from DNA databases. Spiromastix and the dimorphic systemic pathogens Ajellomyces and Paracoccidioides appear to be a monophyletic group with 74% bootstrap probability (BP) in the NJ tree constructed with the representative taxa of the Onygenales. The tree topology was concordant with the NJ tree based on SSU rDNA sequences of our previous work and corresponded to the classification system of the Onygenales by Currah (1985) and its minor modification by Udagawa (1997) with the exception of the classification of the Onygenaceae. The Onygenaceae sensu Udagawa may still be polyphyletic, since three independent lineages were recognized. The taxa forming helicoid peridial appendages were localized to two clades on the tree. The topology of the NJ tree constructed with Spiromastix and its close relatives suggested that the helicoid peridial appendages were apomorphic and acquired independently in the two clades of the Onygenales.

16.
The bootstrapping method of determining confidence in the topology of phylogenetic trees has been applied to electrophoretic protein data for two groups of amphibians: salamanders of two North American genera (Aneides and Plethodon) of the tribe Plethodontini and Holarctic hylid frogs. Some current methods of phylogenetic reconstruction for electrophoretic protein data have been evaluated by comparing the trees obtained from molecular data sets with available morphological data. Molecular data on the phylogenetic relationships of Aneides and Plethodon, data obtained from electrophoretic and immunological studies, indicate that Aneides probably was derived from western Plethodon subsequent to the separation of eastern and western Plethodon. Thus Plethodon very likely is a paraphyletic genus. The extremely low rate of morphological evolution in Plethodon compared with that in Aneides causes difficulty in indicating their evolutionary relationships taxonomically because there are no synapomorphic morphological characters that define either eastern or western Plethodon, whereas there are several for the genus Aneides. Thus molecular data alone probably indicate the evolutionary relationships of the species in these genera. Highton and Larson's (1979) arrangement of species of Plethodon into eight species groups is supported. The topologies of the unweighted pair-group method using arithmetic means (UPGMA) and distance Wagner trees were compared with independent morphological and molecular data on the relationships of the 28 plethodonine species. It was found that UPGMA trees indicate relationships that are more in agreement with other information than are those provided by distance Wagner trees. The use of the bootstrap technique indicates that the topologies of UPGMA trees are better supported statistically than are the topologies of distance Wagner trees. 
Moreover, different addition criteria produce a variety of distance Wagner trees with different topologies, each with several groupings that are not supported statistically. It is concluded that considerable caution should be used in interpreting the topology of distance Wagner trees. Very similar results were obtained with a second data set on 30 taxa of Holarctic hylid frogs. Trees obtained by the neighbor-joining method are more in agreement with UPGMA phenograms and other data, so this method of phylogenetic reconstruction may be useful to systematists not willing to assume constant rates of evolution. (ABSTRACT TRUNCATED AT 400 WORDS)

17.
Characters derived from advertisement calls, morphology, allozymes, and the sequences of the small subunit of the mitochondrial ribosomal gene (12S) and the cytochrome oxidase I (COI) mitochondrial gene were used to estimate the phylogeny of frogs of the Physalaemus pustulosus group (Leptodactylidae). The combinability of these data partitions was assessed in several ways: measures of phylogenetic signal, character support for trees, congruence of tree topologies, compatibility of data partitions with suboptimal trees, and homogeneity of data partitions. Combined parsimony analysis of all data equally weighted yielded the same tree as the 12S partition analyzed under parsimony and maximum likelihood. The COI, allozyme, and morphology partitions were generally congruent and compatible with the tree derived from combined data. The call data were significantly different from all other partitions, whether considered in terms of tree topology alone, partition homogeneity, or compatibility of data with trees derived from other partitions. The lack of effect of the call data on the topology of the combined tree is probably due to the small number of call characters. The general incongruence of the call data with other data partitions is consistent with the idea that the advertisement calls of this group of frogs are under strong sexual selection.

18.
In the field of phylogenetics and comparative genomics, it is important to establish orthologous relationships when comparing homologous sequences. Because orthologs and paralogs differ only slightly in sequence, paralogs are easily mistaken for orthologs. For this reason, several methods based on evolutionary distance, phylogeny and BLAST have tried to detect orthologs with more precision. Depending on their algorithmic implementations, each of these methods sometimes has increased false negative or false positive rates. Here, we developed a novel algorithm for orthology detection that uses a distance method based on the phylogenetic criterion of minimum evolution. Our algorithm assumes that sets of sequences exhibiting orthologous relationships are evolutionarily less costly than sets that include one or more paralogous relationships. Calculation of evolutionary cost requires the reconstruction of a neighbor-joining (NJ) tree, but calculations are unaffected by the topology of any given NJ tree. Unlike tree reconciliation, our algorithm appears free from the problem of incorrect topologies of species and gene trees. The reliability of the algorithm was tested in a comparative analysis with two other orthology detection methods using 95 manually curated KOG datasets and 21 experimentally verified EXProt datasets. Sensitivity and specificity estimates indicate that the concept of minimum evolution could be valuable for the detection of orthologs.

19.
Phylogeny reconstruction is a difficult computational problem, because the number of possible solutions increases with the number of included taxa. For example, for only 14 taxa, there are more than seven trillion possible unrooted phylogenetic trees. For this reason, phylogenetic inference methods commonly use clustering algorithms (e.g., the neighbor-joining method) or heuristic search strategies to minimize the amount of time spent evaluating nonoptimal trees. Even heuristic searches can be painfully slow, especially when computationally intensive optimality criteria such as maximum likelihood are used. I describe here a different approach to heuristic searching (using a genetic algorithm) that can tremendously reduce the time required for maximum-likelihood phylogenetic inference, especially for data sets involving large numbers of taxa. Genetic algorithms are simulations of natural selection in which individuals are encoded solutions to the problem of interest. Here, labeled phylogenetic trees are the individuals, and differential reproduction is effected by allowing the number of offspring produced by each individual to be proportional to that individual's rank likelihood score. Natural selection increases the average likelihood in the evolving population of phylogenetic trees, and the genetic algorithm is allowed to proceed until the likelihood of the best individual ceases to improve over time. An example is presented involving rbcL sequence data for 55 taxa of green plants. The genetic algorithm described here required only 6% of the computational effort required by a conventional heuristic search using tree bisection/reconnection (TBR) branch swapping to obtain the same maximum-likelihood topology.
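
The growth in the number of candidate trees can be checked directly with the standard counting formulas for fully bifurcating, labeled trees: (2n-5)!! unrooted trees and (2n-3)!! rooted trees for n taxa. The sketch below is only a worked calculation; the function names are hypothetical. Under these formulas, the "more than seven trillion" figure quoted in the abstract corresponds to the rooted count for 14 taxa (equivalently, the unrooted count for 15 taxa), while the unrooted count for 14 taxa is about 3.2 x 10^11.

```python
def odd_double_factorial(k):
    """Product 1 * 3 * 5 * ... * k for an odd k >= 1."""
    result = 1
    for factor in range(3, k + 1, 2):
        result *= factor
    return result


def unrooted_trees(n):
    """Number of unrooted bifurcating labeled trees for n >= 3 taxa."""
    return odd_double_factorial(2 * n - 5)


def rooted_trees(n):
    """Number of rooted bifurcating labeled trees for n >= 2 taxa."""
    return odd_double_factorial(2 * n - 3)


print(unrooted_trees(14))  # 316234143225
print(rooted_trees(14))    # 7905853580625
```

Either way the count grows super-exponentially with n, which is why exhaustive evaluation is replaced by clustering or heuristic search.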

20.
Determining the phylogenetic relationships among the major lines of angiosperms is a long-standing problem, yet the uncertainty as to the phylogenetic affinity of these lines persists. While a number of studies have suggested that the ANITA (Amborella-Nymphaeales-Illiciales-Trimeniales-Aristolochiales) grade is basal within angiosperms, studies of complete chloroplast genome sequences also suggested an alternative tree, wherein the line leading to the grasses branches first among the angiosperms. To improve taxon sampling in the existing chloroplast genome data, we sequenced the chloroplast genome of the monocot Acorus calamus. We generated a concatenated alignment (89,436 positions for 15 taxa), encompassing almost all sequences usable for phylogeny reconstruction within spermatophytes. The data still contain support for both the ANITA-basal and grasses-basal hypotheses. Using simulations we can show that were the ANITA-basal hypothesis true, parsimony (and distance-based methods with many models) would be expected to fail to recover it. The self-evident explanation for this failure appears to be a long-branch attraction (LBA) between the clade of grasses and the out-group. However, this LBA cannot explain the discrepancies observed between tree topology recovered using the maximum likelihood (ML) method and the topologies recovered using the parsimony and distance-based methods when grasses are deleted. Furthermore, the fact that neither maximum parsimony nor distance methods consistently recover the ML tree, when according to the simulations they would be expected to, when the out-group (Pinus) is deleted, suggests that either the generating tree is not correct or the best symmetric model is misspecified (or both). We demonstrate that the tree recovered under ML is extremely sensitive to model specification and that the best symmetric model is misspecified. Hence, we remain agnostic regarding phylogenetic relationships among basal angiosperm lineages.
