Found 20 similar articles (search time: 0 ms)
1.
JOSEPH FELSENSTEIN 《Biological journal of the Linnean Society. Linnean Society of London》2000,16(3):183-196
The statistical framework of maximum likelihood estimation is used to examine character weighting in inferring phylogenies. A simple probabilistic model of evolution is used, in which each character evolves independently among two states, and different lineages evolve independently. When different characters have different known probabilities of change, all sufficiently small, the proper maximum likelihood method of estimating phylogenies is a weighted parsimony method in which the weights are logarithmically related to the rates of change. When rates of change are taken extremely small, the weights become more equal and unweighted parsimony methods are obtained.
When it is known that a few characters have very high rates of change and the rest very low rates, but it is not known which characters are the ones having the high rates, the maximum likelihood criterion supports use of compatibility methods. By varying the fraction of characters believed to have high rates of change one obtains a 'threshold method' whose behavior depends on the value of a parameter. By altering this parameter the method changes smoothly from being a parsimony method to being a compatibility method. This provides us with a spectrum of intermediates between these methods. These intermediate methods may be of use in analysing real data.
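The claim that logarithmic weights "become more equal" as rates shrink can be illustrated numerically. The sketch below assumes the weight of a character is -ln(p), where p is its probability of change; that specific form is an assumption consistent with the abstract, not a formula quoted from it, and the function names are hypothetical.

```python
import math

def parsimony_weights(change_probs):
    """Weight each character by -ln(p), p being its probability of change.
    The -ln(p) form is an assumption made for this illustration."""
    return [-math.log(p) for p in change_probs]

def weight_ratio(p_slow, p_fast):
    """Ratio of the slow character's weight to the fast character's weight."""
    w_slow, w_fast = parsimony_weights([p_slow, p_fast])
    return w_slow / w_fast

if __name__ == "__main__":
    # The fast character is always 10x more likely to change than the slow one.
    # As both probabilities shrink together, the weight ratio drifts toward 1,
    # i.e. toward unweighted parsimony.
    for p_fast in (1e-1, 1e-3, 1e-6, 1e-12):
        print(p_fast, weight_ratio(p_fast / 10, p_fast))
```

With a fixed tenfold difference in rates, the ratio equals 1 + ln(10) / (-ln(p_fast)), so it decreases toward 1 as p_fast approaches zero.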
2.
Maximum likelihood outperforms maximum parsimony even when evolutionary rates are heterotachous (Total citations: 4; self-citations: 0; citations by others: 4)
Heterotachy occurs when the relative evolutionary rates among sites are not the same across lineages. Sequence alignments are likely to exhibit heterotachy with varying severity because the intensity of purifying selection and adaptive forces at a given amino acid or DNA sequence position is unlikely to be the same in different species. In a recent study, the influence of heterotachy on the performance of different phylogenetic methods was examined using computer simulation for a four-species phylogeny. Maximum parsimony (MP) was reported to generally outperform maximum likelihood (ML). However, our comparisons of MP and ML methods using the methods and evaluation criteria employed in that study, but considering the possible range of proportions of sites involved in heterotachy, contradict their findings and indicate that, in fact, ML is significantly superior to MP even under heterotachy.
3.
Phylogenetic analysis using parsimony and likelihood methods (Total citations: 1; self-citations: 0; citations by others: 1)
Ziheng Yang 《Journal of molecular evolution》1996,42(2):294-307
The assumptions underlying the maximum-parsimony (MP) method of phylogenetic tree reconstruction were intuitively examined by studying the way the method works. Computer simulations were performed to corroborate the intuitive examination. Parsimony appears to involve very stringent assumptions concerning the process of sequence evolution, such as constancy of substitution rates between nucleotides, constancy of rates across nucleotide sites, and equal branch lengths in the tree. For practical data analysis, the requirement of equal branch lengths means similar substitution rates among lineages (the existence of an approximate molecular clock), relatively long interior branches, and also few species in the data. However, a small amount of evolution is neither a necessary nor a sufficient requirement of the method. The difficulties involved in the application of current statistical estimation theory to tree reconstruction were discussed, and it was suggested that the approach proposed by Felsenstein (1981, J. Mol. Evol. 17: 368–376) for topology estimation, as well as its many variations and extensions, differs fundamentally from the maximum likelihood estimation of a conventional statistical parameter. Evidence was presented showing that the Felsenstein approach does not share the asymptotic efficiency of the maximum likelihood estimator of a statistical parameter. Computer simulations were performed to study the probability that MP recovers the true tree under a hierarchy of models of nucleotide substitution; its performance relative to the likelihood method was especially noted. The results appeared to support the intuitive examination of the assumptions underlying MP. When a simple model of nucleotide substitution was assumed to generate data, the probability that MP recovers the true topology could be as high as, or even higher than, that for the likelihood method. When the assumed model became more complex and realistic, e.g., when substitution rates were allowed to differ between nucleotides or across sites, the probability that MP recovers the true topology, and especially its performance relative to that of the likelihood method, generally deteriorated. As the complexity of the process of nucleotide substitution in real sequences is well recognized, the likelihood method appears preferable to parsimony. However, the development of a statistical methodology for the efficient estimation of the tree topology remains a difficult open problem.
4.
Summary: A large amount of information is contained within the phylogenetic relationships between species. In addition to their branching patterns it is also possible to examine other aspects of the biology of the species. The influence that deleterious selection might have is determined here. The likelihood of different phylogenies in the presence of selection is explored to determine the properties of such a likelihood surface. The calculation of likelihoods for a phylogeny in the presence and absence of selection permits the application of a likelihood ratio test to search for selection. It is shown that even a single selected site can have a strong effect on the likelihood. The method is illustrated with an example from Drosophila melanogaster and suggests that deleterious selection may be acting on transposable elements.
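A likelihood ratio test of the kind mentioned in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the two models are nested, the alternative (with selection) has exactly one extra free parameter, and the usual chi-square approximation with one degree of freedom applies. The function name and the log-likelihood values are hypothetical and are not taken from the paper.

```python
import math

def likelihood_ratio_test(lnL_null, lnL_alt):
    """Return (statistic, p_value) for two nested models.

    Assumes one extra free parameter in the alternative model, so the
    statistic is compared with a chi-square distribution with 1 degree
    of freedom. For 1 d.f., P(X > x) = erfc(sqrt(x / 2)).
    """
    statistic = max(2.0 * (lnL_alt - lnL_null), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

if __name__ == "__main__":
    # Hypothetical log-likelihoods: without selection vs. with selection.
    stat, p = likelihood_ratio_test(lnL_null=-1002.0, lnL_alt=-1000.0)
    print(f"2*delta(lnL) = {stat:.2f}, p = {p:.4f}")
```

If the selection parameter is constrained to a boundary under the null hypothesis, the plain chi-square reference distribution may be too conservative, so the p-value here should be read as approximate.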
5.
Reconstructing a tree of life by inferring evolutionary history is an important focus of evolutionary biology. Phylogenetic reconstructions also provide useful information for a range of scientific disciplines such as botany, zoology, phylogeography, archaeology and biological anthropology. Until the development of protein and DNA sequencing techniques in the 1960s and 1970s, phylogenetic reconstructions were based on fossil records and comparative morphological/physiological analyses. Since then, progress in molecular phylogenetics has compensated for some of the shortcomings of phenotype-based comparisons. Comparisons at the molecular level increase the accuracy of phylogenetic inference because there is no environmental influence on DNA/peptide sequences and evaluation of sequence similarity is not subjective. While the number of morphological/physiological characters that are sufficiently conserved for phylogenetic inference is limited, molecular data provide a large number of datapoints and enable comparisons from diverse taxa. Over the last 20 years, developments in molecular phylogenetics have greatly contributed to our understanding of plant evolutionary relationships. Regions in the plant nuclear and organellar genomes that are optimal for phylogenetic inference have been determined and recent advances in DNA sequencing techniques have enabled comparisons at the whole genome level. Sequences from the nuclear and organellar genomes of thousands of plant species are readily available in public databases, enabling researchers without access to molecular biology tools to investigate phylogenetic relationships by sequence comparisons using the appropriate nucleotide substitution models and tree building algorithms. In the present review, the statistical models and algorithms used to reconstruct phylogenetic trees are introduced and advances in the exploration and utilization of plant genomes for molecular phylogenetic analyses are discussed.
6.
Evolutionary trees from DNA sequences: A maximum likelihood approach (Total citations: 129; self-citations: 0; citations by others: 129)
Joseph Felsenstein 《Journal of molecular evolution》1981,17(6):368-376
Summary: The application of maximum likelihood techniques to the estimation of evolutionary trees from nucleic acid sequence data is discussed. A computationally feasible method for finding such maximum likelihood estimates is developed, and a computer program is available. This method has advantages over the traditional parsimony algorithms, which can give misleading results if rates of evolution differ in different lineages. It also allows the testing of hypotheses about the constancy of evolutionary rates by likelihood ratio tests, and gives a rough indication of the error of the estimate of the tree.
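To make the idea of a computationally feasible tree likelihood concrete, here is a minimal sketch of a recursive "pruning" style computation for a single alignment site. It is not the paper's program: it assumes the Jukes-Cantor (JC69) substitution model with equal base frequencies, branch lengths in expected substitutions per site, and a hypothetical nested-tuple tree encoding chosen only for this example.

```python
import math

BASES = "ACGT"

def jc69_prob(same, t):
    """JC69 transition probability over a branch of length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e

def partial_likelihoods(node):
    """Return [L_A, L_C, L_G, L_T] for the subtree rooted at node.

    A leaf is a one-letter string such as "A". An internal node is a tuple
    (left_child, left_branch_length, right_child, right_branch_length).
    """
    if isinstance(node, str):
        return [1.0 if base == node else 0.0 for base in BASES]
    left, t_left, right, t_right = node
    result = [1.0] * 4
    for child, t in ((left, t_left), (right, t_right)):
        child_partials = partial_likelihoods(child)
        for i in range(4):
            # Sum over the child's possible states j, given parent state i.
            result[i] *= sum(
                jc69_prob(i == j, t) * child_partials[j] for j in range(4)
            )
    return result

def site_likelihood(tree):
    """Likelihood of one site: average root partials over equal base frequencies."""
    return sum(0.25 * value for value in partial_likelihoods(tree))

if __name__ == "__main__":
    # Hypothetical three-taxon tree for a single site: ((A, C), A).
    tree = (("A", 0.1, "C", 0.1), 0.05, "A", 0.2)
    print(site_likelihood(tree))
```

The likelihood of a whole alignment would be the product of such site likelihoods (in practice, a sum of logs), and a maximum likelihood search would then optimise branch lengths and compare topologies.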
7.
The Bryaceae are a large cosmopolitan moss family including genera of significant morphological and taxonomic complexity. Phylogenetic relationships within the Bryaceae were reconstructed based on DNA sequence data from all three genomic compartments. In addition, maximum parsimony and Bayesian inference were employed to reconstruct ancestral character states of 38 morphological plus four habitat characters and eight insertion/deletion events. The recovered phylogenetic patterns are generally in accord with previous phylogenies based on chloroplast DNA sequence data and three major clades are identified. The first clade comprises Bryum bornholmense, B. rubens, B. caespiticium, and Plagiobryum. This corroborates the hypothesis suggested by previous studies that several Bryum species are more closely related to Plagiobryum than to the core Bryum species. The second clade includes Acidodontium, Anomobryum, and Haplodontium, while the third clade contains the core Bryum species plus Imbribryum. Within the latter clade, B. subapiculatum and B. tenuisetum form the sister clade to Imbribryum. Reconstructions of ancestral character states under maximum parsimony and Bayesian inference suggest fourteen morphological synapomorphies for the ingroup and synapomorphies are detected for most clades within the ingroup. Maximum parsimony and Bayesian reconstructions of ancestral character states are mostly congruent although Bayesian inference shows that the posterior probability of ancestral character states may decrease dramatically when node support is taken into account. Bayesian inference also indicates that reconstructions may be ambiguous at internal nodes for highly polymorphic characters.
8.
Background and Aims
For 84 years, botanists have relied on calculating the highest common factor for series of haploid chromosome numbers to arrive at a so-called basic number, x. This was done without consistent (reproducible) reference to species relationships and frequencies of different numbers in a clade. Likelihood models that treat polyploidy, chromosome fusion and fission as events with particular probabilities now allow reconstruction of ancestral chromosome numbers in an explicit framework. We have used a modelling approach to reconstruct chromosome number change in the large monocot family Araceae and to test earlier hypotheses about basic numbers in the family.
Methods
Using a maximum likelihood approach and chromosome counts for 26% of the 3300 species of Araceae and representative numbers for each of the other 13 families of Alismatales, polyploidization events and single chromosome changes were inferred on a genus-level phylogenetic tree for 113 of the 117 genera of Araceae.
Key Results
The previously inferred basic numbers x = 14 and x = 7 are rejected. Instead, maximum likelihood optimization revealed an ancestral haploid chromosome number of n = 16, Bayesian inference of n = 18. Chromosome fusion (loss) is the predominant inferred event, whereas polyploidization events occurred less frequently and mainly towards the tips of the tree.
Conclusions
The bias towards low basic numbers (x) introduced by the algebraic approach to inferring chromosome number changes, prevalent among botanists, may have contributed to an unrealistic picture of ancestral chromosome numbers in many plant clades. The availability of robust quantitative methods for reconstructing ancestral chromosome numbers on molecular phylogenetic trees (with or without branch length information), with confidence statistics, makes the calculation of x an obsolete approach, at least when applied to large clades.
9.
10.
Understanding gene duplication and gene structure evolution are fundamental goals of molecular evolutionary biology. A previous study by Babenko et al. (2004. Prevalence of intron gain over intron loss in the evolution of paralogous gene families. Nucleic Acids Res. 32:3724-3733) employed Dollo parsimony to infer spliceosomal intron losses and gains in paralogous gene families and concluded that there was a general excess of gains over losses. This result contrasts with patterns in orthologous genes, in which most lineages show an excess of intron losses over gains, suggesting the possibility of fundamentally different modes of intron evolution between orthologous and paralogous genes. We further studied the data and found a low level of intron position conservation with outgroups, and this led to problems with using Dollo parsimony to analyze the data. Statistical reanalysis of the data suggests, instead, that intron losses have outnumbered intron gains in paralogous gene families.
11.
A maximum likelihood approach to two-dimensional crystals (Total citations: 1; self-citations: 0; citations by others: 1)
Maximum likelihood (ML) processing of transmission electron microscopy images of protein particles can produce reconstructions of superior resolution due to a reduced reference bias. We have investigated a ML processing approach to images centered on the unit cells of two-dimensional (2D) crystal images. The implemented software makes use of the predictive lattice node tracking in the MRC software, which is used to window particle stacks. These are then noise-whitened and subjected to ML processing. Resulting ML maps are translated into amplitudes and phases for further processing within the 2dx software package. Compared with ML processing for randomly oriented single particles, the required computational costs are greatly reduced as the 2D crystals restrict the parameter search space. The software was applied to images of negatively stained or frozen hydrated 2D crystals of different crystal order. We find that the ML algorithm is not free from reference bias, even though its sensitivity to noise correlation is lower than for pure cross-correlation alignment. Compared with crystallographic processing, the newly developed software yields better resolution for 2D crystal images of lower crystal quality, and it performs equally well for well-ordered crystal images.
12.
Stamatakis A Ott M 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》2008,363(1512):3977-3984
The continuous accumulation of sequence data, for example, due to novel wet-laboratory techniques such as pyrosequencing, coupled with the increasing popularity of multi-gene phylogenies and emerging multi-core processor architectures that face problems of cache congestion, poses new challenges with respect to the efficient computation of the phylogenetic maximum-likelihood (ML) function. Here, we propose two approaches that can significantly speed up likelihood computations that typically represent over 95 per cent of the computational effort conducted by current ML or Bayesian inference programs. Initially, we present a method and an appropriate data structure to efficiently compute the likelihood score on 'gappy' multi-gene alignments. By 'gappy' we denote sampling-induced gaps owing to missing sequences in individual genes (partitions), i.e. not real alignment gaps. A first proof-of-concept implementation in RAxML indicates that this approach can accelerate inferences on large and gappy alignments by approximately one order of magnitude. Moreover, we present insights and initial performance results on multi-core architectures obtained during the transition from an OpenMP-based to a Pthreads-based fine-grained parallelization of the ML function.
13.
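A phylogenetic study of myxomycetes of the order Physarida based on rDNA ITS sequences (Total citations: 1; self-citations: 0; citations by others: 1)
Physarida is the largest order in the class Myxogastria, and studies of its phylogenetic relationships have so far relied on morphological characters. To explore the phylogeny of Physarida, and of Myxogastria more broadly, at the molecular level, the rDNA ITS regions of 8 species in 5 genera of Physarida were amplified with universal myxomycete rDNA ITS primers and sequenced. Together with the myxomycete rDNA ITS sequences already available in GenBank, these were used to construct phylogenetic trees by Bayesian inference (BI) and maximum likelihood (ML). The results show that the rDNA ITS regions of different Physarida species differ markedly in base composition and length: lengths range from 777 to 1,445 bp, and G+C content ranges from 53.4 to 61.9 mol%. Physarida and Stemonitida cluster into two distinct branches. Within the Physarida branch, Physaraceae and Didymiaceae each form their own clade, which supports the morphological view that the two families are distinguished by whether the capillitium is calcareous. The Didymiaceae clade, made up of multiple Didymium squamulosum specimens of different geographic origin, splits further into three branches, confirming that this morphospecies is a complex of distinct biological species that are geographically widespread, differ in reproductive compatibility, and show considerable genetic variation.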
14.
15.
16.
The evolution of the diverse flora in the Lower Volga Valley (LVV) (southwest Russia) is complex due to the composite geomorphology and tectonic history of the Caspian Sea and adjacent areas. In the absence of phylogenetic studies and temporal information, we implemented a maximum likelihood (ML) approach and stochastic character mapping reconstruction aiming at recovering historical signals from species occurrence data. A taxon-area matrix of 13 floristic areas and 1018 extant species was constructed and analyzed with RAxML and Mesquite. Additionally, we simulated scenarios with numbers of hypothetical extinct taxa from an unknown palaeoflora that occupied the areas before the dramatic transgression and regression events that have occurred from the Pleistocene to the present day. The flora occurring strictly along the river valley and delta appear to be younger than that of adjacent steppes and desert-like regions, regardless of the chronology of transgression and regression events that led to the geomorphological formation of the LVV. This result is also supported when hypothetical extinct taxa are included in the analyses. The history of each species was inferred by using a stochastic character mapping reconstruction method as implemented in Mesquite. Individual histories appear to be independent from one another and have been shaped by repeated dispersal and extinction events. These reconstructions provide testable hypotheses for more in-depth investigations of their population structure and dynamics.
17.
Molecular evolution of pteridophytes and their relationship to seed plants: Evidence from complete 18S rRNA gene sequences (Total citations: 1; self-citations: 0; citations by others: 1)
Complete 18S ribosomal RNA sequence data from representatives of all extant pteridophyte lineages together with RNA sequences from different seed plants were used to infer a molecular phylogeny of vascular plants that included all major land plant lineages. The molecular data indicate that lycopsids are monophyletic and are the earliest diverging group within the vascular land plants, whereas Psilotum nudum is more closely related to the seed plants than to other pteridophyte lineages. The phylogenetic trees based on maximum likelihood, parsimony and distance analyses show substantial agreement with the evolutionary relationships of land plants as interpreted from the fossil record.
18.
We evaluate the performance of maximum likelihood (ML) analysis of allele frequency data in a linear array of populations. The parameters are a mutation rate and either the dispersal rate in a stepping stone model or a dispersal rate and a scale parameter in a geometric dispersal model. An approximate procedure known as maximum product of approximate conditional (PAC) likelihood is found to perform as well as ML. Mis-specification biases may occur because the importance sampling algorithm is formally defined in terms of mutation and migration rates scaled by the total size of the population, and this size may differ widely in the statistical model and in reality. As could be expected, ML generally performs well when the statistical model is correctly specified. Otherwise, mutation rate estimates are much closer to mutation probability scaled by number of demes in the statistical model than scaled by number of demes in reality when mutation probability is high and dispersal is most limited. This mis-specification bias actually has practical benefits. However, opposite results are found in opposite conditions. Migration rate estimates show roughly similar trends, but they may not always be easily interpreted as low-bias estimates of dispersal rate under any scaling. Estimation of the dispersal scale parameter is also affected by mis-specification of the number of demes, and the different biases compensate each other in such a way that good estimation of the so-called neighborhood size (or more precisely the product of population density and mean-squared parent-offspring dispersal distance) is achieved. Results congruent with these findings are found in an application to a damselfly data set.
19.
20.
AKIFUMI S. TANABE 《Molecular ecology resources》2007,7(6):962-964
The application of different substitution models to each gene (a.k.a. mixed model) should be considered in model-based phylogenetic analysis of multigene sequences. However, a single molecular evolution model is still usually applied. There are no computer programs able to conduct model selection for multiple loci at the same time, though several recently developed types of software for phylogenetic inference can handle mixed models. Here, I have developed computer software named ‘kakusan’ that enables us to solve the above problems. Major running steps are briefly described, and an analysis of results with kakusan is compared to that obtained with other programs.