Similar literature
Found 20 similar documents (search time: 31 ms)
1.
The codon-degeneracy model (CDM) predicts relative frequencies of substitution for any set of homologous protein-coding DNA sequences based on patterns of nucleotide degeneracy, codon composition, and the assumption of selective neutrality. However, at present, the CDM is reliant on outside estimates of transition bias. A new method by which the power of the CDM can be used to find a synonymous transition bias that is optimal for any given phylogenetic tree topology is presented. An example is illustrated that utilizes optimized transition biases to generate CDM GF-scores for every possible phylogenetic tree for pocket gophers of the genus Orthogeomys. The resulting distribution of CDM GF-scores is compared and contrasted with the results of maximum parsimony and maximum likelihood methods. Although convergence on a single tree topology by the CDM and another method indicates greater support for that particular tree, the value of CDM GF-score as the sole optimality criterion for phylogeny reconstruction remains to be determined. It is clear, however, that the a priori estimation of an optimum transition bias from codon composition has a direct application to differentiating between alternative trees. Received: 13 October 1999 / Accepted: 28 April 2000

2.
Phylogenetic tree reconstruction requires construction of a multiple sequence alignment (MSA) from sequences. Computationally, it is difficult to achieve an optimal MSA for many sequences. Moreover, even if an optimal MSA is obtained, it may not be the true MSA that reflects the evolutionary history of the underlying sequences. Therefore, errors can be introduced during MSA construction, which in turn affect the subsequent phylogenetic tree construction. In order to circumvent this issue, we extend the application of the k-tuple distance to phylogenetic tree reconstruction. The k-tuple distance between two sequences is the sum of the differences in frequency, over all possible tuples of length k, between the sequences and can be estimated without MSAs. It has been traditionally used to build a fast ‘guide tree’ to assist the construction of MSAs. Using 1470 simulated sets of sequences generated under different evolutionary scenarios, together with neighbor-joining and BioNJ trees, we compared the performance of the k-tuple distance with four commonly used distance estimators: Jukes–Cantor, Kimura, F84 and Tamura–Nei. These four distance estimators fall into the category of model-based distance estimators, as each of them takes account of a specific substitution model in order to compute the distance between a pair of already aligned sequences. Results show that trees constructed from the k-tuple distance are more accurate than those from other distances most of the time; when the divergence between underlying sequences is high, the tree accuracy could be twice or higher using the k-tuple distance than other estimators. Furthermore, as the k-tuple distance avoids the need for constructing an MSA, it can save a tremendous amount of time for phylogenetic tree reconstructions when the data include a large number of sequences.
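The k-tuple distance described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say whether counts or relative frequencies are compared, nor whether differences are absolute or squared; the version below assumes absolute differences of relative frequencies over a DNA alphabet. The function names `ktuple_frequencies` and `ktuple_distance` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product


def ktuple_frequencies(seq, k):
    """Return the relative frequency of each overlapping k-tuple in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def ktuple_distance(seq_a, seq_b, k=3, alphabet="ACGT"):
    """Sum |freq_a - freq_b| over all possible k-tuples (assumed form).

    No alignment is needed: each sequence is summarized independently.
    Tuples containing characters outside `alphabet` are not summed over.
    """
    fa = ktuple_frequencies(seq_a, k)
    fb = ktuple_frequencies(seq_b, k)
    return sum(
        abs(fa.get("".join(t), 0.0) - fb.get("".join(t), 0.0))
        for t in product(alphabet, repeat=k)
    )
```

A matrix of such pairwise values could then be passed to any distance-based tree builder, such as neighbor joining, without constructing an MSA first.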

3.
Zuckerkandl and Pauling (1962, "Horizons in Biochemistry," pp. 189-225, Academic Press, New York) first noticed that the degree of sequence similarity between the proteins of different species could be used to estimate their phylogenetic relationship. Since then models have been developed to improve the accuracy of phylogenetic inferences based on amino acid or DNA sequences. Most of these models were designed to yield distance measures that are linear with time, on average. The reliability of phylogenetic reconstruction, however, depends on the variance of the distance measure in addition to its expectation. In this paper we show how the method of generalized least squares can be used to combine data types, each most informative at different points in time, into a single distance measure. This measure reconstructs phylogenies more accurately than existing non-likelihood distance measures. We illustrate the approach for a two-rate mutation model and demonstrate that its application provides more accurate phylogenetic reconstruction than do currently available analytical distance measures.
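The idea of combining several distance estimates by generalized least squares can be written in its textbook form. This is the standard GLS estimator of a common mean, shown as an illustration; the abstract does not give the paper's equations, so the symbols $\mathbf{d}$, $V$ and $\tau$ below are assumptions.

```latex
% Assumed setup: m estimates d_1, ..., d_m of the same divergence \tau,
% each unbiased, with (known or estimated) covariance matrix V.
\[
  \hat{\tau}_{\mathrm{GLS}}
  = \frac{\mathbf{1}^{\top} V^{-1} \mathbf{d}}{\mathbf{1}^{\top} V^{-1} \mathbf{1}},
  \qquad
  \operatorname{Var}\bigl(\hat{\tau}_{\mathrm{GLS}}\bigr)
  = \frac{1}{\mathbf{1}^{\top} V^{-1} \mathbf{1}} .
\]
% If V is diagonal with entries \sigma_i^2, this reduces to the
% inverse-variance weighted mean:
\[
  \hat{\tau}_{\mathrm{GLS}}
  = \frac{\sum_{i=1}^{m} d_i / \sigma_i^{2}}{\sum_{i=1}^{m} 1 / \sigma_i^{2}} .
\]
```

Under this form, a data type whose variance is small at a given divergence receives most of the weight there, which matches the abstract's point that each data type is most informative at different points in time.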

4.
Do phylogenies and branch lengths based on mitochondrial DNA (mtDNA) provide a reasonable approximation to those based on multiple nuclear loci? In the present study, we show widespread discordance between phylogenies based on mtDNA (two genes) and nuclear DNA (nucDNA; six loci) in a phylogenetic analysis of the turtle family Emydidae. We also find an unusual type of discordance involving the unexpected homogeneity of mtDNA sequences across species within genera. Of the 36 clades in the combined nucDNA phylogeny, 24 are contradicted by the mtDNA phylogeny, and six are strongly contested by each data set. Two genera (Graptemys, Pseudemys) show remarkably low mtDNA divergence among species, whereas the combined nuclear data show deep divergences and (for Pseudemys) strongly supported clades. These latter results suggest that the mitochondrial data alone are highly misleading about the rate of speciation in these genera and also about the species status of endangered Graptemys and Pseudemys species. In addition, despite a strongly supported phylogeny from the combined nuclear genes, we find extensive discordance between this tree and individual nuclear gene trees. Overall, the results obtained illustrate the potential dangers of making inferences about phylogeny, speciation, divergence times, and conservation from mtDNA data alone (or even from single nuclear genes), and suggest the benefits of using large numbers of unlinked nuclear loci. © 2010 The Linnean Society of London, Biological Journal of the Linnean Society, 2010, 99, 445–461.

5.
Ancestral state reconstruction is a method used to study the evolutionary trajectories of quantitative characters on phylogenies. Although efficient methods for univariate ancestral state reconstruction under a Brownian motion model have been described for at least 25 years, to date no generalization has been described to allow more complex evolutionary models, such as multivariate trait evolution, non‐Brownian models, missing data, and within‐species variation. Furthermore, even for simple univariate Brownian motion models, most phylogenetic comparative R packages compute ancestral states via inefficient tree rerooting and full tree traversals at each tree node, making ancestral state reconstruction extremely time‐consuming for large phylogenies. Here, a computationally efficient method for fast maximum likelihood ancestral state reconstruction of continuous characters is described. The algorithm has linear complexity relative to the number of species and outperforms the fastest existing R implementations by several orders of magnitude. The described algorithm is capable of performing ancestral state reconstruction on a 1,000,000‐species phylogeny in fewer than 2 s using a standard laptop, whereas the next fastest R implementation would take several days to complete. The method is generalizable to more complex evolutionary models, such as phylogenetic regression, within‐species variation, non‐Brownian evolutionary models, and multivariate trait evolution. Because this method enables fast repeated computations on phylogenies of virtually any size, implementation of the described algorithm can drastically alleviate the computational burden of many otherwise prohibitively time‐consuming tasks requiring reconstruction of ancestral states, such as phylogenetic imputation of missing data, bootstrapping procedures, Expectation‐Maximization algorithms, and Bayesian estimation. 
The described ancestral state reconstruction algorithm is implemented in the Rphylopars functions anc.recon and phylopars.

6.
Considerable progress has been made recently in phylogenetic reconstruction in a number of groups of organisms. This progress coincides with two major advances in systematics: new sources have been found for potentially informative characters (i.e., molecular data) and (more importantly) new approaches have been developed for extracting historical information from old or new characters (i.e., Hennigian phylogenetic systematics or cladistics). The basic assumptions of cladistics (the existence and splitting of lineages marked by discrete, heritable, and independent characters, transformation of which occurs at a rate slower than divergence of lineages) are discussed and defended. Molecular characters are potentially greater in quantity than (and usually independent of) more traditional morphological characters, yet their great simplicity (i.e., fewer potential character states; problems with determining homology), and difficulty of sufficient sampling (particularly from fossils) can lead to special difficulties. Expectations of the phylogenetic behavior of different types of data are investigated from a theoretical standpoint, based primarily on variation in the central parameter λ (branch length in terms of expected number of character changes per segment of a tree), which also leads to possibilities for character and character state weighting. Also considered are prospects for representing diverse yet clearly monophyletic clades in larger-scale cladistic analyses, e.g., the exemplar method vs. “compartmentalization” (a new approach involving substituting an inferred “archetype” for a large clade accepted as monophyletic based on previous analyses). It is concluded that parsimony is to be preferred for synthetic, “total evidence” analyses because it appears to be a robust method, is applicable to all types of data, and has an explicit and interpretable evolutionary basis. © 1994 Wiley-Liss, Inc.

7.
A new phylogenetic comparative method is proposed, based on mapping two continuous characters on a tree to generate data pairs for regression or correlation analysis, which resolves problems of multiple character reconstructions, phylogenetic dependence, and asynchronous responses (evolutionary lags). Data pairs are formed in two ways (tree‐down and tree‐up) by matching corresponding changes, Δx and Δy. Delayed responses (Δy occurring later in the tree than Δx) are penalized by weighting pairs using nodal or branch‐length distance between Δx and Δy; immediate (same‐node) responses are given maximum weight. All combinations of character reconstructions (or a random sample thereof) are used to find the observed range of the weighted coefficient of correlation r (or weighted slope b). This range is used as the test statistic, and the null distribution is generated by randomly reallocating changes (Δx and Δy) in the topology. Unlike randomization of terminal values, this procedure complies with Generalized Monte Carlo requirements while saving considerable computation time. Phylogenetic dependence is avoided by randomization without data transformations, yielding acceptable type‐I error rates and statistical power. We show that ignoring delayed responses can lead to falsely nonsignificant results. Issues that arise from considering delayed responses based on optimization are discussed.

8.
9.
Phylogenetic inference under the pure drift model   Cited by: 1 (self-citations: 1; citations by others: 0)
When pairwise genetic distances are used for phylogenetic reconstruction, it is usually assumed that the genetic distance between two taxa contains information about the time after the two taxa diverged. As a result, upon an appropriate transformation if necessary, the distance usually can be fitted to a linear model such that it is expressed as the sum of lengths of all branches that connect the two taxa in a given phylogeny. This kind of distance is referred to as "additive distance." For a phylogenetic tree exclusively driven by random genetic drift, genetic distances related to coancestry coefficients (θ_XY) between any two taxa are more suitable. However, these distances are fundamentally different from the additive distance in that coancestry does not contain any information about the time after two taxa split from a common ancestral population; instead, it reflects the time before the two taxa diverged. In other words, the magnitude of θ_XY provides information about how long the two taxa share the same evolutionary pathways. The fundamental difference between the two kinds of distances has led to a different algorithm of evaluating phylogenetic trees when θ_XY and related distance measures are used. Here we present the new algorithm using the ordinary-least-squares approach but fitting to a different linear model. This treatment allows genetic variation within a taxon to be included in the model. Monte Carlo simulation for a rooted phylogeny of four taxa has verified the efficacy and consistency of the new method. Application of the method to human populations was demonstrated.
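The contrast between the two linear models can be sketched in LaTeX. The additive model follows directly from the abstract's wording. The coancestry model is one plausible reading of "how long the two taxa share the same evolutionary pathways" and is an assumption, not the paper's stated equation. The symbols $b_k$, $a_{XY,k}$, $s_{XY,k}$ and $f$ are introduced here for illustration only.

```latex
% Additive distance: sum of the lengths b_k of the branches on the path
% connecting taxa X and Y (a_{XY,k} = 1 if branch k is on that path, else 0).
\[
  d_{XY} = \sum_{k} a_{XY,k}\, b_k ,
  \qquad
  \min_{b}\; \sum_{X<Y} \Bigl( d_{XY} - \sum_{k} a_{XY,k}\, b_k \Bigr)^{2} .
\]
% Assumed coancestry analogue: a (possibly transformed) coancestry measure
% f(\theta_{XY}) is fitted to the branches SHARED by X and Y, i.e. those on
% the path from the root to their most recent common ancestor
% (s_{XY,k} = 1 if branch k is on that shared path, else 0).
\[
  f(\theta_{XY}) = \sum_{k} s_{XY,k}\, b_k ,
  \qquad
  \min_{b}\; \sum_{X \le Y} \Bigl( f(\theta_{XY}) - \sum_{k} s_{XY,k}\, b_k \Bigr)^{2} .
\]
```

In this reading the case $X = Y$ is well defined, because the shared path is then the whole path from the root to $X$. That would be consistent with the abstract's remark that within-taxon variation can be included in the model.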

10.
The development and application of an assay method for papaverine in whole blood is reported. A single, simple extraction procedure at pH 10.0 using chloroform–n-hexane (2:3) as the solvent results in pure extracts which can be chromatographed without further purification. Chromatography is performed on a nitrile-bonded phase, using n-hexane–dichloromethane–acetonitrile–propylamine (50:25:25:0.1) as mobile phase. This method is characterized by a between-day precision of 4% at the 200 ng/ml level and a detection limit of 5 ng/ml, and was successfully applied in a pharmacokinetic study.

11.
T-REX (tree and reticulogram reconstruction) is an application to reconstruct phylogenetic trees and reticulation networks from distance matrices. The application includes a number of tree fitting methods like NJ, UNJ or ADDTREE which have been very popular in phylogenetic analysis. At the same time, the software comprises several new methods of phylogenetic analysis such as: tree reconstruction using weights, tree inference from incomplete distance matrices or modeling a reticulation network for a collection of objects or species. T-REX also allows the user to visualize obtained tree or network structures using Hierarchical, Radial or Axial types of tree drawing and manipulate them interactively. AVAILABILITY: T-REX is a freeware package available online at: http://www.fas.umontreal.ca/biol/casgrain/en/labo/t-rex

12.
A comparison of ribosomal internal transcribed spacer 1 (ITS1) elements of digenetic trematodes (Platyhelminthes) including unidentified digeneans isolated from Cyathura carinata (Crustacea: Isopoda) revealed DNA sequence similarities at more than half of the spacer at its 3′ end. Primary sequence similarity was shown to be associated with secondary structure conservation, which suggested that similarity is due to identity by descent and not chance. Using an analysis of apomorphies, the sequence data were shown to produce a distinct phylogenetic signal. This was confirmed by the consistency of results of different tree reconstruction methods such as distance approaches, maximum parsimony, and maximum likelihood. Morphological evidence additionally supported the phylogenetic tree based on ITS1 data and the inferred phylogenetic position of the unidentified digeneans of C. carinata met the expectations from known trematode life-cycle patterns. Although ribosomal ITS1 elements are generally believed to be too variable for phylogenetic analysis above the species or genus level, the overall consistency of the results of this study strongly suggests that this is not the case in digenetic trematodes. Here, 3′ end ITS1 sequence data seem to provide a valuable tool for elucidating phylogenetic relationships of a broad range of phylogenetically distinct taxa. Received: 20 October 1997 / Accepted: 24 March 1998

13.
Various factors, including taxon density, sampling error, convergence, and heterogeneity of evolutionary rates, can potentially lead to incongruence between phylogenetic trees based on different genomes. Particularly at the generic level and below, chloroplast capture resulting from hybridization may distort organismal relationships in phylogenetic analyses based on the chloroplast genome, or genes included therein. However, the extent of such discord between chloroplast DNA (cpDNA) trees and those trees based on nuclear genes has rarely been assessed. We therefore used sequences of the internal transcribed spacer regions (ITS-1 and ITS-2) of nuclear ribosomal DNA (rDNA) to reconstruct phylogenetic relationships among members of the Heuchera group of genera (Saxifragaceae). The Heuchera group presents an important model for the analysis of chloroplast capture and its impact on phylogenetic reconstruction because hybridization is well documented within genera (e.g., Heuchera), and intergeneric hybrids involving six of the nine genera have been reported. An earlier study provided a well-resolved phylogenetic hypothesis for the Heuchera group based on cpDNA restriction-site variation. However, trees based on ITS sequences are discordant with the cpDNA-based tree. Evidence from both morphology and nuclear-encoded allozymes is consistent with the ITS trees, rather than the cpDNA tree, and several points of phylogenetic discord can clearly be attributed to chloroplast capture. Comparison of the organellar and ITS trees also raises the strong likelihood that ancient events of chloroplast capture occurred between lineages during the early diversification of the Heuchera group. 
Thus, despite the many advantages and widespread use of cpDNA data in phylogeny reconstruction, comparison of relationships based on cpDNA and ITS sequences for the Heuchera group underscores the need for caution in the use of organellar variation for retrieving phylogeny at lower taxonomic levels, particularly in groups noted for hybridization.

14.
Sequences from nuclear mitochondrial pseudogenes (numts) that originated by transfer of genetic information from mitochondria to the nucleus offer a unique opportunity to compare different regimes of molecular evolution. Analyzing a 1621-nt-long numt of the rRNA specifying mitochondrial DNA residing on human chromosome 3 and its corresponding mitochondrial gene in 18 anthropoid primates, we were able to retrace about 40 MY of primate rDNA evolutionary history. The results illustrate strengths and weaknesses of mtDNA data sets in reconstructing and dating the phylogenetic history of primates. We were able to show the following. In contrast to numt-DNA, (1) the nucleotide composition of mtDNA changed dramatically in the different primate lineages. This is assumed to lead to significant misinterpretations of the mitochondrial evolutionary history. (2) Due to the nucleotide compositional plasticity of primate mtDNA, the phylogenetic reconstruction combining mitochondrial and nuclear sequences is unlikely to yield reliable information for either tree topologies or branch lengths. This is because a major part of the underlying sequence evolution model, the nucleotide composition, is undergoing dramatic change in different mitochondrial lineages. We propose that this problem is also expressed in the occasional unexpected long branches leading to the “common ancestor” of orthologous numt sequences of different primate taxa. (3) The heterogeneous and lineage-specific evolution of mitochondrial sequences in primates renders molecular dating based on primate mtDNA problematic, whereas the numt sequences provide a much more reliable base for dating. [Reviewing Editor: Dr. Rafael Zardoya]

15.
Phylogenetic analysis is currently used worldwide for taxonomic classification and identification of microorganisms. However, despite the countless trees that have been reconstructed and published in recent decades, so far, no user-friendly compilation of recommendations to standardize the data analysis and tree reconstruction process has been published. Consequently, this standard operating procedure for phylogenetic inference (SOPPI) offers a helping hand for working through the process from sampling in the field to phylogenetic tree reconstruction and publication. It is not meant to be authoritative or comprehensive, but should help to make phylogenetic inference and diversity analysis more reliable and comparable between different laboratories. It is mainly focused on using the ribosomal RNA as a universal phylogenetic marker, but the principles and recommendations can be applied to any valid marker gene. Feedback and suggestions from the scientific community are welcome in order to improve these guidelines further. Any updates will be made available on the SILVA webpage at http://www.arb-silva.de/projects/soppi.

16.
Summary: The existence of two families of genes coding for hexameric glutamate dehydrogenases has been deduced from the alignment of 21 primary sequences and the determination of the percentages of similarity between each pair of proteins. Each family could also be characterized by specific motifs. One family (Family I) was composed of gdh genes from six eubacteria and six lower eukaryotes (the primitive protozoan Giardia lamblia, the green alga Chlorella sorokiniana, and several fungi and yeasts). The other one (Family II) was composed of gdh genes from two eubacteria, two archaebacteria, and five higher eukaryotes (vertebrates). Reconstruction of phylogenetic trees using several parsimony and distance methods confirmed the existence of these two families. Therefore, these results reinforced our previously proposed hypothesis that two close but already different gdh genes were present in the last common ancestor to the three Ur-kingdoms (eubacteria, archaebacteria, and eukaryotes). The branching order of the different species of Family I was found to be the same whatever the method of tree reconstruction, although it varied slightly according to the region analyzed. Similarly, the topological positions of eubacteria and eukaryotes of Family II were independent of the method used. However, the branching of the two archaebacteria in Family II appeared to be unexpected: (1) the thermoacidophilic Sulfolobus solfataricus was found clustered with the two eubacteria of this family both in parsimony and distance trees, a situation not predicted by either one of the contradictory trees recently proposed; and (2) the branching of the halophilic Halobacterium salinarium varied according to the method of tree construction: it was closer to the eubacteria in the maximum parsimony tree and to eukaryotes in distance trees. Therefore, whatever the actual position of the halophilic species, archaebacteria did not appear to be monophyletic in these gdh gene trees.
This result questions the firmness of the presently accepted interpretation of previous protein trees which were supposed to root unambiguously the universal tree of life and place the archaebacteria in this tree. Offprint requests to: B. Labedan

17.
A graphical method for detecting recombination in phylogenetic data sets   Cited by: 9 (self-citations: 3; citations by others: 6)
Current phylogenetic tree reconstruction methods assume that there is a single underlying tree topology for all sites along the sequence. The presence of mosaic sequences due to recombination violates this assumption and will cause phylogenetic methods to give misleading results due to the imposition of a single tree topology on all sites. The detection of mosaic sequences caused by recombination is therefore an important first step in phylogenetic analysis. A graphical method for the detection of recombination, based on the least squares method of phylogenetic estimation, is presented here. This method locates putative recombination breakpoints by moving a window along the sequence. The performance of the method is assessed by simulation and by its application to a real data set.
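The window-scanning idea can be illustrated with a deliberately simplified Python sketch. It is not the least-squares method of the abstract: it swaps in a much cruder signal, namely which other sequence is closest to a query within each window, measured by the proportion of differing sites. The function names, the window and step sizes, and the assumption of an ungapped alignment of equal-length strings are all hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nearest_by_window(alignment, query, window=100, step=20):
    """Slide a window along the alignment and report, for each window,
    the name of the sequence closest to `query`.

    `alignment` maps sequence names to aligned, ungapped strings.
    A change in the reported name between neighbouring windows marks a
    region where a recombination breakpoint might be looked for.
    """
    length = len(alignment[query])
    others = [name for name in alignment if name != query]
    result = []
    for start in range(0, length - window + 1, step):
        end = start + window
        q = alignment[query][start:end]
        best = min(others, key=lambda n: p_distance(q, alignment[n][start:end]))
        result.append((start, best))
    return result
```

Plotting the per-window result against the window start position gives a simple graphical view of where the closest relative of the query changes along the sequence.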

18.
The assessment of phylogenetic network reconstruction methods requires the ability to compare phylogenetic networks. This is the second in a series of papers devoted to the analysis and comparison of metrics for tree-child time consistent phylogenetic networks on the same set of taxa. In this paper, we generalize to phylogenetic networks two metrics that have already been introduced in the literature for phylogenetic trees: the nodal distance and the triplets distance. We prove that they are metrics on any class of tree-child time consistent phylogenetic networks on the same set of taxa, as well as some basic properties for them. To prove these results, we introduce a reduction/expansion procedure that can be used not only to establish properties of tree-child time consistent phylogenetic networks by induction, but also to generate all tree-child time consistent phylogenetic networks with a given number of leaves.

19.
Phenotypic behavior of a group of organisms can be studied using a range of molecular evolutionary tools that help to determine evolutionary relationships. Traditionally, a gene or a set of gene sequences was used for generating phylogenetic trees. Incomplete evolutionary information in a few selected genes causes problems in phylogenetic tree construction. Whole genomes are used as a remedy. Now, the task is to identify suitable parameters to extract the hidden information from whole genome sequences that truly represent evolutionary information. In this study we explored a random anchor (a stretch of 100 nucleotides) based approach (ABWGP) for finding the distance between any two genomes, and used the distance estimates to compute evolutionary trees. A number of strains and species of Mycobacteria were used for this study. Anchor-derived parameters, such as cumulative normalized score, anchor order and indels, were computed in a pair-wise manner, and the scores were used to compute distance/phylogenetic trees. The strength of branching was determined by bootstrap analysis. The terminal branches are clearly discernible using the distance estimates described here. In general, different measures gave similar trees except the trees based on indels. Overall, the tree topology reflected the known biology of the organisms. This was also true for different strains of Escherichia coli. A new whole genome-based approach has been described here for studying evolutionary relationships among bacterial strains and species.

20.
Supertree methods construct trees on a set of taxa (species) by combining many smaller trees on overlapping subsets of the entire set of taxa. A ‘quartet’ is an unrooted tree over four taxa; hence quartet-based supertree methods combine many four-taxon unrooted trees into a single, coherent tree over the complete set of taxa. Quartet-based phylogeny reconstruction methods have been receiving considerable attention in recent years. An accurate and efficient quartet-based method might be competitive with the current best phylogenetic tree reconstruction methods (such as maximum likelihood or Bayesian MCMC analyses), without being as computationally intensive. In this paper, we present a novel and highly accurate quartet-based phylogenetic tree reconstruction method. We performed an extensive experimental study to evaluate the accuracy and scalability of our approach on both simulated and biological datasets.
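To make the notion of a quartet concrete, here is a small Python sketch that enumerates every four-taxon subset and picks one of the three possible unrooted topologies for each. The abstract does not say how its quartets are obtained or amalgamated; the sketch assumes a pairwise distance matrix is available and applies the four-point condition, which is a standard technique swapped in for illustration. The function names are hypothetical.

```python
from itertools import combinations


def quartet_topology(d, a, b, c, e):
    """Choose the split ab|ce, ac|be or ae|bc with the smallest sum of
    within-pair distances (four-point condition).

    `d` is a nested dict of pairwise distances, d[x][y].
    """
    options = {
        ((a, b), (c, e)): d[a][b] + d[c][e],
        ((a, c), (b, e)): d[a][c] + d[b][e],
        ((a, e), (b, c)): d[a][e] + d[b][c],
    }
    return min(options, key=options.get)


def all_quartets(d):
    """Return one inferred topology per four-taxon subset of the taxa in d."""
    taxa = sorted(d)
    return [quartet_topology(d, *q) for q in combinations(taxa, 4)]
```

For n taxa this yields n choose 4 quartets, so a practical supertree method would need a strategy for sampling or weighting them. The step that merges the quartets into one tree is the substance of such methods and is not sketched here.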
