1.

Phylogenetic inference in Rafflesiales: the influence of rate heterogeneity and horizontal gene transfer. Cited by: 1 (self-citations: 0; citations by others: 1)

Daniel L Nickrent Albert Blarer Yin-Long Qiu Romina Vidal-Russell Frank E Anderson 《BMC evolutionary biology》2004,4(1):40

Background

The phylogenetic relationships among the holoparasites of Rafflesiales have remained enigmatic for over a century. Recent molecular phylogenetic studies using the mitochondrial matR gene placed Rafflesia, Rhizanthes and Sapria (Rafflesiaceae s. str.) in the angiosperm order Malpighiales and Mitrastema (Mitrastemonaceae) in Ericales. These phylogenetic studies did not, however, sample two additional groups traditionally classified within Rafflesiales (Apodanthaceae and Cytinaceae). Here we provide molecular phylogenetic evidence using DNA sequence data from mitochondrial and nuclear genes for representatives of all genera in Rafflesiales.

2.

TOPO6: a nuclear single-copy gene for plant phylogenetic inference

Frank R. Blattner 《Plant Systematics and Evolution》2016,302(2):239-244

3.

Site-specific time heterogeneity of the substitution process and its impact on phylogenetic inference

Béatrice Roure Hervé Philippe 《BMC evolutionary biology》2011,11(1):17

Background

Model violations constitute the major limitation in inferring accurate phylogenies. Characterizing properties of the data that are not being correctly handled by current models is therefore of prime importance. One of the properties of protein evolution is the variation of the relative rate of substitutions across sites and over time, the latter being the phenomenon called heterotachy. Its effect on phylogenetic inference has recently received considerable attention, which has led to the development of new models of sequence evolution. However, thus far the focus has been on the quantitative heterogeneity of the evolutionary process, thereby overlooking more qualitative variations.

4.

On the correlation between composition and site-specific evolutionary rate: implications for phylogenetic inference

Gowri-Shankar V Rattray M 《Molecular biology and evolution》2006,23(2):352-364

Model-based phylogenetic reconstruction methods traditionally assume homogeneity of nucleotide frequencies among sequence sites and lineages. Yet, heterogeneity in base composition is a characteristic shared by most biological sequences. Compositional variation in time, reflected in the compositional biases among contemporary sequences, has already been extensively studied, and its detrimental effects on phylogenetic estimates are known. However, fewer studies have focused on the effects of spatial compositional heterogeneity within genes. We show here that different sites in an alignment do not always share a unique compositional pattern, and we provide examples where nucleotide frequency trends are correlated with the site-specific rate of evolution in RNA genes. Spatial compositional heterogeneity is shown to affect the estimation of evolutionary parameters. With standard phylogenetic methods, estimates of equilibrium frequencies are found to be biased towards the composition observed at fast-evolving sites. Conversely, the ancestral composition estimates of some time-heterogeneous but spatially homogeneous methods are found to be biased towards frequencies observed at invariant and slow-evolving sites. The latter finding challenges the result of a previous study arguing against a hyperthermophilic last universal ancestor from the low apparent G + C content of its rRNA sequences. We propose a new model to account for compositional variation across sites. A Gaussian process prior is used to allow for a smooth change in composition with evolutionary rate. The model has been implemented in the phylogenetic inference software PHASE, and Bayesian methods can be used to obtain the model parameters. The results suggest that this model can accurately capture the observed trends in present-day RNA sequences.

5.

Towards an inclusive philosophy for phylogenetic inference

Faith DP Trueman JW 《Systematic biology》2001,50(3):331-350

We defend and expand on our earlier proposal for an inclusive philosophical framework for phylogenetics, based on an interpretation of Popperian corroboration that is decoupled from the popular falsificationist interpretation of Popperian philosophy. Any phylogenetic inference method can provide Popperian "evidence" or "test statements" based on the method's goodness-of-fit values for different tree hypotheses. Corroboration, or the severity of that test, requires that the evidence is improbable without the hypothesis, given only background knowledge that includes elements of chance. This framework contrasts with attempted Popperian justifications for cladistic parsimony--in which evidence is the data, background knowledge is restricted to descent with modification, and "corroboration," as a by-product of nonfalsification, is to be measured by cladistic parsimony. Recognition that cladistic "corroboration" reflects only goodness-of-fit, not corroboration/severity, makes it clear that standard cladistic prohibitions, such as restrictions on the evolutionary models to be included in "background knowledge," have no philosophical status. The capacity to assess Popperian corroboration neither justifies nor excludes any phylogenetic method, but it does provide a framework in phylogenetics for learning from errors--cases where apparent good evidence is probable even without the hypothesis. We explore these issues in the context of corroboration assessments applied to likelihood methods and to a new form of parsimony. These different forms of evidence and corroboration assessment point also to a new way to combine evidence--not at the level of overall fit, but at the level of overall corroboration/severity. We conclude that progress in an inclusive phylogenetics will be well served by the rejection of cladistic philosophy.

6.

Designing and optimizing comparative anchor primers for comparative gene mapping and phylogenetic inference

Murphy WJ O'Brien SJ 《Nature protocols》2007,2(11):3022-3030

Here we describe protocols for designing, optimizing and implementing conserved anchor primers for use in genome mapping or phylogenetic applications, with particular emphasis on homologous gene sequences among mammals. The increasing number of whole genome sequences in public databases makes this approach applicable across a wide range of taxa. Genome sequences from representatives of two or more divergent subclades within a taxonomic group of interest are used to identify candidate local alignments (i.e., exons, exons spanning introns or conserved 5'- or 3'-untranslated regions) that contain sequences with appropriate variability for the chosen downstream application. PCR primers are designed to maximize amplification success across a broad range of taxa, and are optimized under a touchdown thermocycling protocol. Based on the initial optimization results, primers are selected for application in a diverse sampling of species, or for mapping the genome of a target species of interest. We discuss factors that have to be considered for experimental design of broad-scope phylogenetic studies. With this protocol, primers can be designed, optimized and implemented within as little as 1-2 weeks.

7.

Sample size for a phylogenetic inference. Cited by: 1 (self-citations: 0; citations by others: 1)

G A Churchill A von Haeseler W C Navidi 《Molecular biology and evolution》1992,9(4):753-769

The objective of this work is to describe sample-size calculations for the inference of a nonzero central branch length in an unrooted four-species phylogeny. Attention is restricted to independent binary characters, such as might be obtained from an alignment of the purine-pyrimidine sequences of a nucleic acid molecule. A statistical test based on a multinomial model for character-state configurations is described. The importance of including invariable sites in models for sequence change is demonstrated, and their effect on sample size is quantified. The methods are applied to a four-species alignment of small-subunit rRNA sequences derived from two archaebacteria, a eubacterium and a eukaryote. We conclude that the information in these sequences is not sufficient to resolve the branching order of this tree. Estimates of the number of aligned nucleotide positions required to provide a reasonably powerful test are given.
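
The binary purine-pyrimidine characters mentioned in this abstract can be illustrated with a minimal sketch. It covers only the data-preparation step (recoding and tallying character-state configurations for four taxa); it is not the authors' multinomial test. The function names and the toy alignment are hypothetical, and pooling complementary patterns by polarising on the first taxon is an assumption made here for compactness.

```python
from collections import Counter

# Purines (A, G) map to R; pyrimidines (C, T, U) map to Y.
RY = {"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"}


def ry_recode(seq):
    """Recode a nucleotide sequence as R/Y; any other symbol becomes '?'."""
    return "".join(RY.get(base, "?") for base in seq.upper())


def pattern_counts(alignment):
    """Count binary column patterns across the aligned sequences.

    Each column is written relative to the first taxon ('0' = same state
    as taxon 1, '1' = different), so RRYY and YYRR both count as '0011'.
    Columns containing an unrecognised symbol are skipped.
    """
    recoded = [ry_recode(seq) for seq in alignment]
    counts = Counter()
    for column in zip(*recoded):
        if "?" in column:
            continue
        counts["".join("0" if c == column[0] else "1" for c in column)] += 1
    return counts


# Toy four-taxon alignment (made up for illustration).
toy = ["AAGC", "GACT", "CTGC", "TTAT"]
print(pattern_counts(toy))
```

For four taxa this yields at most eight pooled configurations: one constant, four singleton, and three that split the taxa two against two.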

8.

Testing for treeness: lateral gene transfer, phylogenetic inference, and model selection

Joel D. Velasco Elliott Sober 《Biology & philosophy》2010,25(4):675-687

A phylogeny that allows for lateral gene transfer (LGT) can be thought of as a strictly branching tree (all of whose branches are vertical) to which lateral branches have been added. Given that the goal of phylogenetics is to depict evolutionary history, we should look for the best supported phylogenetic network and not restrict ourselves to considering trees. However, the obvious extensions of popular tree-based methods such as maximum parsimony and maximum likelihood face a serious problem: if we judge networks by fit to data alone, networks that have lateral branches will always fit the data at least as well as any network that restricts itself to vertical branches. This is analogous to the well-studied problem of overfitting data in the curve-fitting problem. Analogous problems often have analogous solutions and we propose to treat network inference as a case of model selection and use the Akaike Information Criterion (AIC). Strictly tree-like networks are more parsimonious than those that postulate lateral as well as vertical branches. This leads to the conclusion that we should not always infer LGT events whenever it would improve our fit-to-data, but should do so only when the improved fit is larger than the penalty for adding extra lateral branches.
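
The trade-off described in this abstract can be sketched with the standard AIC formula, AIC = 2k - 2 ln L, where k is the number of free parameters and L the maximised likelihood. The log-likelihoods, the parameter counts, and the assumption that a lateral branch adds two parameters are invented for illustration and do not come from the paper.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def prefer_network(tree_lnl, tree_k, net_lnl, net_k):
    """Return True only if the network has the strictly lower AIC."""
    return aic(net_lnl, net_k) < aic(tree_lnl, tree_k)


# Hypothetical values: the lateral branch is assumed to add 2 parameters.
tree_lnl, tree_k = -1000.0, 10
print(prefer_network(tree_lnl, tree_k, -999.5, 12))  # small gain in fit
print(prefer_network(tree_lnl, tree_k, -995.0, 12))  # large gain in fit
```

With these made-up numbers the tree has AIC 2020. The first network scores 2023, so the tree is retained; the second scores 2014, so the lateral branch is accepted. In general the network wins only when its gain in log-likelihood exceeds the number of parameters it adds.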

9.

Polytomies and Bayesian phylogenetic inference. Cited by: 16 (self-citations: 0; citations by others: 16)

Lewis PO Holder MT Holsinger KE 《Systematic biology》2005,54(2):241-253

Bayesian phylogenetic analyses are now very popular in systematics and molecular evolution because they allow the use of much more realistic models than currently possible with maximum likelihood methods. There are, however, a growing number of examples in which large Bayesian posterior clade probabilities are associated with very short branch lengths and low values for non-Bayesian measures of support such as nonparametric bootstrapping. For the four-taxon case when the true tree is the star phylogeny, Bayesian analyses become increasingly unpredictable in their preference for one of the three possible resolved tree topologies as data set size increases. This leads to the prediction that hard (or near-hard) polytomies in nature will cause unpredictable behavior in Bayesian analyses, with arbitrary resolutions of the polytomy receiving very high posterior probabilities in some cases. We present a simple solution to this problem involving a reversible-jump Markov chain Monte Carlo (MCMC) algorithm that allows exploration of all of tree space, including unresolved tree topologies with one or more polytomies. The reversible-jump MCMC approach allows prior distributions to place some weight on less-resolved tree topologies, which eliminates misleadingly high posteriors associated with arbitrary resolutions of hard polytomies. Fortunately, assigning some prior probability to polytomous tree topologies does not appear to come with a significant cost in terms of the ability to assess the level of support for edges that do exist in the true tree. Methods are discussed for applying arbitrary prior distributions to tree topologies of varying resolution, and an empirical example showing evidence of polytomies is analyzed and discussed.

10.

Increased taxon sampling is advantageous for phylogenetic inference. Cited by: 1 (self-citations: 0; citations by others: 1)

Pollock DD Zwickl DJ McGuire JA Hillis DM 《Systematic biology》2002,51(4):664-671

11.

Guided tree topology proposals for Bayesian phylogenetic inference

Höhna S Drummond AJ 《Systematic biology》2012,61(1):1-11

Increasingly, large data sets pose a challenge for computationally intensive phylogenetic methods such as Bayesian Markov chain Monte Carlo (MCMC). Here, we investigate the performance of common MCMC proposal distributions in terms of median and variance of run time to convergence on 11 data sets. We introduce two new Metropolized Gibbs Samplers for moving through "tree space." MCMC simulation using these new proposals shows faster average run time and dramatically improved predictability in performance, with a 20-fold reduction in the variance of the time to estimate the posterior distribution to a given accuracy. We also introduce conditional clade probabilities and demonstrate that they provide a superior means of approximating tree topology posterior probabilities from samples recorded during MCMC.

12.

A structural EM algorithm for phylogenetic inference.

Nir Friedman Matan Ninio Itsik Pe'er Tal Pupko 《Journal of computational biology》2002,9(2):331-353

A central task in the study of molecular evolution is the reconstruction of a phylogenetic tree from sequences of current-day taxa. The most established approach to tree reconstruction is maximum likelihood (ML) analysis. Unfortunately, searching for the maximum likelihood phylogenetic tree is computationally prohibitive for large data sets. In this paper, we describe a new algorithm that uses Structural Expectation Maximization (EM) for learning maximum likelihood phylogenetic trees. This algorithm is similar to the standard EM method for edge-length estimation, except that during iterations of the Structural EM algorithm the topology is improved as well as the edge length. Our algorithm performs iterations of two steps. In the E-step, we use the current tree topology and edge lengths to compute expected sufficient statistics, which summarize the data. In the M-Step, we search for a topology that maximizes the likelihood with respect to these expected sufficient statistics. We show that searching for better topologies inside the M-step can be done efficiently, as opposed to standard methods for topology search. We prove that each iteration of this procedure increases the likelihood of the topology, and thus the procedure must converge. This convergence point, however, can be a suboptimal one. To escape from such "local optima," we further enhance our basic EM procedure by incorporating moves in the flavor of simulated annealing. We evaluate these new algorithms on both synthetic and real sequence data and show that for protein sequences even our basic algorithm finds more plausible trees than existing methods for searching maximum likelihood phylogenies. Furthermore, our algorithms are dramatically faster than such methods, enabling, for the first time, phylogenetic analysis of large protein data sets in the maximum likelihood framework.

13.

On reduced amino acid alphabets for phylogenetic inference. Cited by: 1 (self-citations: 0; citations by others: 1)

Susko E Roger AJ 《Molecular biology and evolution》2007,24(9):2139-2150

We investigate the use of Markov models of evolution for reduced amino acid alphabets or bins of amino acids. The use of reduced amino acid alphabets can ameliorate effects of model misspecification and saturation. We present algorithms for two different ways of automating the construction of bins: minimizing criteria based on properties of rate matrices and minimizing criteria based on properties of alignments. By simulation, we show that in the absence of model misspecification, the loss of information due to binning is insubstantial, and the use of Markov models at the binned level is almost as effective as the more appropriate missing data approach. By applying these approaches to real data sets where compositional heterogeneity and/or saturation appear to be causing biased tree estimation, we find that binning can improve topological estimation in practice.
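
To make the idea of binning concrete, the sketch below recodes a protein sequence into a reduced alphabet. The abstract describes bins constructed automatically from rate matrices or alignments; that procedure is not reproduced here. Instead, a fixed six-group Dayhoff-style scheme is assumed purely for illustration, and the names `DAYHOFF6` and `make_recoder` are hypothetical.

```python
# Assumed fixed bins (Dayhoff-style six-state recoding), not the
# automatically constructed bins described in the abstract.
DAYHOFF6 = ["AGPST", "C", "DENQ", "FWY", "HKR", "ILMV"]


def make_recoder(bins):
    """Build a function that maps each amino acid to its bin index."""
    table = {aa: str(i) for i, group in enumerate(bins) for aa in group}

    def recode(seq):
        # Gaps and any unrecognised symbols are written as '-'.
        return "".join(table.get(aa, "-") for aa in seq.upper())

    return recode


recode = make_recoder(DAYHOFF6)
print(recode("MKTAYC"))
```

Because `make_recoder` takes the bins as an argument, any other partition of the 20 amino acids can be swapped in without changing the recoding logic.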

14.

Base-compositional heterogeneity in the RAG1 locus among didelphid marsupials: implications for phylogenetic inference and the evolution of GC content

Gruber KF Voss RS Jansa SA 《Systematic biology》2007,56(1):83-96

Although theoretical studies have suggested that base-compositional heterogeneity can adversely affect phylogenetic reconstruction, only a few empirical examples of this phenomenon, mostly among ancient lineages (with divergence dates > 100 Mya), have been reported. In the course of our phylogenetic research on the New World marsupial family Didelphidae, we sequenced 2790 bp of the RAG1 exon from exemplar species of most extant genera. Phylogenetic analysis of these sequences recovered an anomalous node consisting of two clades previously shown to be distantly related based on analyses of other molecular data. These two clades show significantly increased GC content at RAG1 third codon positions, and the resulting convergence in base composition is strong enough to overwhelm phylogenetic signal from other genes (and morphology) in most analyses of concatenated datasets. This base-compositional convergence occurred relatively recently (over tens rather than hundreds of millions of years), and the affected gene region is still in a state of evolutionary disequilibrium. Both mutation rate and substitution rate are higher in GC-rich didelphid taxa, observations consistent with RAG1 sequences having experienced a higher rate of recombination in the convergent lineages.
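
GC content at third codon positions, the quantity this abstract focuses on, can be computed with a few lines. This is a minimal sketch that assumes the input is an in-frame coding sequence starting at codon position 1; the function name `gc3` and the example sequence are hypothetical.

```python
def gc3(cds):
    """Fraction of G or C among unambiguous third-codon-position bases.

    Assumes `cds` is in frame, i.e. its first base is codon position 1.
    """
    thirds = [base for base in cds.upper()[2::3] if base in "ACGT"]
    if not thirds:
        raise ValueError("no unambiguous third-position bases")
    return sum(base in "GC" for base in thirds) / len(thirds)


# Codons ATG GCC AAA TTC have third positions G, C, A, C.
print(gc3("ATGGCCAAATTC"))
```

Comparing this value across taxa is one simple way to spot the kind of compositional convergence described above before running a phylogenetic analysis.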

15.

Empirical evaluation of a prior for Bayesian phylogenetic inference

Yang Z 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》2008,363(1512):4031-4039

The Bayesian method of phylogenetic inference often produces high posterior probabilities (PPs) for trees or clades, even when the trees are clearly incorrect. The problem appears to be mainly due to large sizes of molecular datasets and to the large-sample properties of Bayesian model selection and its sensitivity to the prior when several of the models under comparison are nearly equally correct (or nearly equally wrong) and are of the same dimension. A previous suggestion to alleviate the problem is to let the internal branch lengths in the tree become increasingly small in the prior with the increase in the data size so that the bifurcating trees are increasingly star-like. In particular, if the internal branch lengths are assigned the exponential prior, the prior mean $\mu_0$ should approach zero faster than $1/\sqrt{n}$ but more slowly than $1/n$, where $n$ is the sequence length. This paper examines the usefulness of this data size-dependent prior using a dataset of the mitochondrial protein-coding genes from the baleen whales, with the prior mean fixed at $\mu_0 = 0.1\,n^{-2/3}$. In this dataset, phylogeny reconstruction is sensitive to the assumed evolutionary model, species sampling and the type of data (DNA or protein sequences), but Bayesian inference using the default prior attaches high PPs for conflicting phylogenetic relationships. The data size-dependent prior alleviates the problem to some extent, giving weaker support for unstable relationships. This prior may be useful in reducing apparent conflicts in the results of Bayesian analysis or in making the method less sensitive to model violations.
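
The rate condition on the prior mean can be checked numerically. The sketch below evaluates the prior mean of 0.1 times n to the power -2/3 given in the abstract and prints two diagnostic products: n times the mean, which should grow with n (decay slower than 1/n), and the square root of n times the mean, which should shrink (decay faster than one over the square root of n). The function name and the chosen sequence lengths are hypothetical.

```python
def prior_mean(n, scale=0.1, power=-2.0 / 3.0):
    """Data-size-dependent prior mean: scale * n**power for sequence length n."""
    return scale * n ** power


for n in (100, 10_000, 1_000_000):
    mu0 = prior_mean(n)
    # n * mu0 should increase and sqrt(n) * mu0 should decrease with n.
    print(n, mu0, n * mu0, n ** 0.5 * mu0)
```

Note that the condition is asymptotic: at any single finite n the prior mean need not lie strictly between 1/n and one over the square root of n, which is why the check looks at trends across n.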

16.

Bryophyte-specific primers for retrieving plastid genes suitable for phylogenetic inference

Chang Y Graham SW 《American journal of botany》2011,98(5):e109-e113

17.

Assessment of phylogenomic and orthology approaches for phylogenetic inference

Dutilh BE van Noort V van der Heijden RT Boekhout T Snel B Huynen MA 《Bioinformatics (Oxford, England)》2007,23(7):815-824

MOTIVATION: Phylogenomics integrates the vast amount of phylogenetic information contained in complete genome sequences, and is rapidly becoming the standard for reliably inferring species phylogenies. There are, however, fundamental differences between the ways in which phylogenomic approaches like gene content, superalignment, superdistance and supertree integrate the phylogenetic information from separate orthologous groups. Furthermore, they all depend on the method by which the orthologous groups are initially determined. Here, we systematically compare these four phylogenomic approaches, in parallel with three approaches for large-scale orthology determination: pairwise orthology, cluster orthology and tree-based orthology. RESULTS: Including various phylogenetic methods, we apply a total of 54 fully automated phylogenomic procedures to the fungi, the eukaryotic clade with the largest number of sequenced genomes, for which we retrieved a gold-standard phylogeny from the literature. Phylogenomic trees based on gene content show, relative to the other methods, a bias in the tree topology that parallels convergence in lifestyle among the species compared, indicating convergence in gene content. CONCLUSIONS: Complete genomes are no guarantee for good or even consistent phylogenies. However, the large amounts of data in genomes enable us to carefully select the data most suitable for phylogenomic inference. In terms of performance, the superalignment approach, combined with restrictive orthology, is the most successful in recovering a fungal phylogeny that agrees with current taxonomic views, and allows us to obtain a high-resolution phylogeny. We provide solid support for what has grown to be a common practice in phylogenomics during its advance in recent years. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

18.

Construction of linear invariants in phylogenetic inference.

Y X Fu W H Li 《Mathematical biosciences》1992,109(2):201-228

An analytical method is presented for constructing linear invariants. All linear invariants of a k-species tree can be derived from those of (k-1)-species trees using this method. The new method is simpler than that of Cavender, which relies on numerical computations. Moreover, the new method provides a convenient tool to study the relationships between linear invariants of the same tree or of different trees. All linear invariants of trees of up to five species are derived in this study. For four species, there are 16 independent linear invariants for each of the three possible unrooted trees, 14 of which are shared by two unrooted trees and 12 of these are shared by all three unrooted trees; the last types of linear invariants can be used to construct tests on the assumptions about nucleotide substitutions. The number of linear invariants for a tree is found to increase rapidly with the number of species.

19.

Effect of genetic convergence on phylogenetic inference

Christin PA Besnard G Edwards EJ Salamin N 《Molecular phylogenetics and evolution》2012,62(3):921-927

Phylogenetic reconstructions are a major component of many studies in evolutionary biology, but their accuracy can be reduced under certain conditions. Recent studies showed that the convergent evolution of some phenotypes resulted from recurrent amino acid substitutions in genes belonging to distant lineages. It has been suggested that these convergent substitutions could bias phylogenetic reconstruction toward grouping convergent phenotypes together, but such an effect has never been appropriately tested. We used computer simulations to determine the effect of convergent substitutions on the accuracy of phylogenetic inference. We show that, in some realistic conditions, even a relatively small proportion of convergent codons can strongly bias phylogenetic reconstruction, especially when amino acid sequences are used as characters. The strength of this bias does not depend on the reconstruction method but varies as a function of how much divergence had occurred among the lineages prior to any episodes of convergent substitutions. While the occurrence of this bias is difficult to predict, the risk of spurious groupings is strongly decreased by considering only 3rd codon positions, which are less subject to selection, as long as saturation problems are not present. Therefore, we recommend that, whenever possible, topologies obtained with amino acid sequences and 3rd codon positions be compared to identify potential phylogenetic biases and avoid evolutionarily misleading conclusions.

20.

MRBAYES: Bayesian inference of phylogenetic trees. Cited by: 108 (self-citations: 0; citations by others: 108)

Huelsenbeck JP Ronquist F 《Bioinformatics (Oxford, England)》2001,17(8):754-755

SUMMARY: The program MRBAYES performs Bayesian inference of phylogeny using a variant of Markov chain Monte Carlo. AVAILABILITY: MRBAYES, including the source code, documentation, sample data files, and an executable, is available at http://brahms.biology.rochester.edu/software.html.