Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Based on mitochondrial DNA (mtDNA) sequence data from a wide range of primate species, branching order in the evolution of primates was inferred by the maximum likelihood method of Felsenstein without assuming rate constancy among lineages. Bootstrap probabilities for being the maximum likelihood tree topology among alternatives were estimated without performing a maximum likelihood estimation for each resampled data set. Variation in the evolutionary rate among lineages was examined for the maximum likelihood tree by a method developed by Kishino and Hasegawa. From these analyses it appears that the transition rate of mtDNA evolution in the lemur has been extremely low, only about 1/10 that in other primate lines, whereas the transversion rate does not differ significantly from that of other primates. Furthermore, the transition rate in catarrhines, except the gibbon, is higher than those in the tarsier and in platyrrhines, and the transition rate in the gibbon is lower than those in other catarrhines. Branching dates in primate evolution were estimated by a molecular clock analysis of mtDNA, taking into account the rate of variation among different lines, and the results were compared with those estimated from nuclear DNA. Under the most likely model, where the evolutionary rate of mtDNA has been uniform within a great apes/human clade, human/chimpanzee clustering is preferred to the alternative branching orders among human, chimpanzee, and gorilla.

2.
Three Markov models (Dayhoff, Proportional and Poisson models; Hasegawa et al., 1992a) for amino acid substitution during evolution were used for maximum likelihood analyses of proteins coded for in mitochondrial DNA in estimating a phylogenetic tree among human, bovine and murids (mouse and rat) with chicken as an outgroup. It turned out that the Dayhoff model is the most appropriate model among the alternatives in approximating the amino acid substitutions of proteins coded for in mitochondrial DNA. In spite of the presence of the complete sequence data of mitochondrial genomes, we could not resolve the trichotomy among human, bovine and murids, probably because the time length separating two branching events among these three lines was short and because chicken is too distant from mammals to be used as an outgroup. It was suggested that the average substitution rate of amino acids coded for in mitochondrial DNA is lower along the bovine line than those along the human or murid lines. Advantages of amino acid sequence analysis over nucleotide sequence analysis in phylogenetic study were discussed.
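
As an illustration of the simplest of the three models named above, the following minimal sketch computes substitution probabilities under a Poisson model, assuming its standard equal-rates formulation (every amino acid changes to each of the other 19 at the same rate). The function names and the example distance are hypothetical and are not taken from the cited work; the Dayhoff and Proportional models would need empirical rate matrices that are not reproduced here.

```python
import math

N_STATES = 20  # number of amino acids


def poisson_probs(d):
    """Return (P_same, P_diff) after d expected substitutions per site.

    P_same is the probability that a site shows the same amino acid,
    P_diff the probability of one specific different amino acid.
    Assumes equal rates between all pairs of amino acids.
    """
    decay = math.exp(-N_STATES / (N_STATES - 1) * d)
    p_same = 1 / N_STATES + (N_STATES - 1) / N_STATES * decay
    p_diff = 1 / N_STATES - decay / N_STATES
    return p_same, p_diff


def poisson_distance(p):
    """Invert the model: distance from the proportion p of differing sites."""
    return -(N_STATES - 1) / N_STATES * math.log(1 - N_STATES / (N_STATES - 1) * p)


# Hypothetical example: 0.3 expected substitutions per site.
p_same, p_diff = poisson_probs(0.3)
observed_difference = (N_STATES - 1) * p_diff
recovered_distance = poisson_distance(observed_difference)
```

The round trip from distance to expected difference and back is a quick consistency check on the two formulas.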

3.
The efficiency of obtaining the correct tree by the maximum likelihood method (Felsenstein 1981) for inferring trees from DNA sequence data was compared with that of distance methods. It was shown that the maximum likelihood method is superior to distance methods in efficiency, particularly when the evolutionary rate differs among lineages.

4.
By using complete sequence data of mitochondrial DNAs, three Markov models (Dayhoff, Proportional, and Poisson models) for amino acid substitutions during evolution were applied in maximum likelihood analyses of mitochondrially encoded proteins to estimate a phylogenetic tree depicting human, cow, whale, and murids (mouse and rat), with chicken, frog, and carp as outgroups. A cow/whale clade was confirmed with a more than 99.8% confidence level by any of the three models, but the branching order among human, murids, and the cow/whale clade remained uncertain. It turned out that the Dayhoff model is by far the most appropriate model among the alternatives in approximating the amino acid substitutions of mitochondrially encoded proteins, which is consistent with a previous analysis of a more limited data set. It was shown that the substitution rate of mitochondrially encoded proteins has increased in the order of fishes, amphibians, birds, and mammals and that the rate in mammals is at least six times, probably an order of magnitude, higher than that in fishes. The higher evolutionary rate in birds and mammals than in amphibians and fishes was attributed to relaxation of selective constraints operating on proteins in warm-blooded vertebrates and to the high mutation rate of bird and mammalian mitochondrial DNAs. Offprint requests to: M. Hasegawa

5.
A new method is presented for inferring evolutionary trees using nucleotide sequence data. The birth-death process is used as a model of speciation and extinction to specify the prior distribution of phylogenies and branching times. Nucleotide substitution is modeled by a continuous-time Markov process. Parameters of the branching model and the substitution model are estimated by maximum likelihood. The posterior probabilities of different phylogenies are calculated and the phylogeny with the highest posterior probability is chosen as the best estimate of the evolutionary relationship among species. We refer to this as the maximum posterior probability (MAP) tree. The posterior probability provides a natural measure of the reliability of the estimated phylogeny. Two example data sets are analyzed to infer the phylogenetic relationship of human, chimpanzee, gorilla, and orangutan. The best trees estimated by the new method are the same as those from the maximum likelihood analysis of separate topologies, but the posterior probabilities are quite different from the bootstrap proportions. The results of the method are found to be insensitive to changes in the rate parameter of the branching process. Correspondence to: Z. Yang
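
The final step described above, turning a prior and a likelihood for each candidate phylogeny into posterior probabilities and picking the MAP tree, can be sketched as follows. This is only an illustration of Bayes' rule over a finite set of topologies: the function name and all numbers are hypothetical, and the sketch assumes the per-topology log-likelihoods (already integrated over branching times) are supplied by some other procedure.

```python
import math


def tree_posteriors(log_prior, log_lik):
    """Posterior probability of each topology from log prior and log likelihood.

    Both arguments are dicts keyed by topology label. Subtracting the
    maximum before exponentiating avoids numerical underflow.
    """
    log_joint = {t: log_prior[t] + log_lik[t] for t in log_prior}
    largest = max(log_joint.values())
    weights = {t: math.exp(v - largest) for t, v in log_joint.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


# Hypothetical values for three rooted topologies with equal priors.
log_prior = {"((H,C),G)": math.log(1 / 3),
             "((H,G),C)": math.log(1 / 3),
             "((C,G),H)": math.log(1 / 3)}
log_lik = {"((H,C),G)": -1000.0, "((H,G),C)": -1002.0, "((C,G),H)": -1003.0}

posteriors = tree_posteriors(log_prior, log_lik)
map_tree = max(posteriors, key=posteriors.get)  # topology with highest posterior
```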

6.
A maximum likelihood method for inferring protein phylogeny was developed. It is based on a Markov model that takes into account the unequal transition probabilities among pairs of amino acids and does not assume constancy of rate among different lineages. Therefore, this method is expected to be powerful in inferring phylogeny among distantly related proteins, either orthologous or paralogous, where the evolutionary rate may deviate from constancy. Not only amino acid substitutions but also insertion/deletion events during evolution were incorporated into the Markov model. A simple method for estimating a bootstrap probability for the maximum likelihood tree among alternatives without performing a maximum likelihood estimation for each resampled data set was developed. These methods were applied to amino acid sequence data of a photosynthetic membrane protein, psbA, from photosystem II, and the phylogeny of this protein was discussed in relation to the origin of chloroplasts.
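
One way to estimate a bootstrap probability without re-running the maximum likelihood estimation is to resample the per-site log-likelihoods that were already computed for each candidate topology, an approach commonly called RELL (resampling of estimated log-likelihoods). The sketch below assumes this is the kind of shortcut the abstract refers to; the function name and the input format are hypothetical.

```python
import random


def rell_bootstrap(site_lnl, n_reps=1000, seed=0):
    """Approximate bootstrap probabilities by resampling site log-likelihoods.

    site_lnl maps each topology label to a list of per-site log-likelihoods,
    all of the same length and computed once at that topology's ML estimates.
    Returns the fraction of replicates in which each topology scores highest.
    """
    rng = random.Random(seed)
    names = list(site_lnl)
    n_sites = len(site_lnl[names[0]])
    wins = {name: 0 for name in names}
    for _ in range(n_reps):
        # Draw sites with replacement; the same sites are used for every tree.
        sample = [rng.randrange(n_sites) for _ in range(n_sites)]
        totals = {name: sum(site_lnl[name][i] for i in sample) for name in names}
        best = max(names, key=lambda name: totals[name])
        wins[best] += 1
    return {name: wins[name] / n_reps for name in names}
```

The saving comes from the inner loop: each replicate only sums stored numbers instead of re-optimizing branch lengths.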

7.
A maximum likelihood framework for estimating site-specific substitution rates is presented that does not require any prior assumptions about the rate distribution. We show that, when the branching pattern of the underlying tree is known, the analysis of pairs of positions is sufficient to estimate site-specific rates. In the absence of a known topology, we introduce an iterative procedure to estimate simultaneously the branching pattern, the branch lengths, and site-specific substitution rates. Simulations show that the evolutionary rate of fast-evolving sites can be reliably inferred and that the accuracy of rate estimates depends mainly on the number of sequences in the data set. Thus, large sets of aligned sequences are necessary for reliable site-specific rate estimates. The method is applied to the complete mitochondrial DNA sequence of 53 humans, providing a complete picture of the site-specific substitution rates in human mitochondrial DNA.

8.
A mathematical theory for computing the probabilities of various nucleotide configurations among related species is developed, and the probability of obtaining the correct tree (topology) from nucleotide sequence data is evaluated using models of evolutionary trees that are close to the tree of mitochondrial DNAs from human, chimpanzee, gorilla, orangutan, and gibbon. Special attention is given to the number of nucleotides required to resolve the branching order among the three most closely related organisms (human, chimpanzee, and gorilla). If the extent of DNA divergence is close to that obtained by Brown et al. for mitochondrial DNA and if sequence data are available only for the three most closely related organisms, the number of nucleotides (m*) required to obtain the correct tree with a probability of 95% is about 4700. If sequence data for two outgroup species (orangutan and gibbon) are available, m* becomes about 2600–2700 when the transformed distance, distance-Wagner, maximum parsimony, or compatibility method is used. In the unweighted pair-group method, m* is not affected by the availability of data from outgroup species. When these five different tree-making methods, as well as Fitch and Margoliash's method, are applied to the mitochondrial DNA data (1834 bp) obtained by Brown et al. and by Hixson and Brown, they all give the same phylogenetic tree, in which human and chimpanzee are most closely related. However, the trees considered here are gene trees, and to obtain the correct species tree, sequence data for several independent loci must be used.

9.
A statistical test of phylogenies estimated from sequence data   (total citations: 4; self-citations: 0; citations by others: 4)
A simple approach to testing the significance of the branching order, estimated from protein or DNA sequence data, of three taxa is proposed. The branching order is inferred by the transformed-distance method, under the assumption that one or two outgroups are available, and the branch lengths are estimated by the least-squares method. The inferred branching order is considered significant if the estimated internodal distance is significantly greater than zero. To test this, a formula for the variance of the internodal distance has been developed. The statistical test proposed has been checked by computer simulation. The same test also applies to the case of four taxa with no outgroup, if one considers an unrooted tree. Formulas for the variances of internodal distances have also been developed for the case of five taxa. Conditions are given under which it is more efficient to add the sequence of a fifth taxon than to do 25% more nucleotide sequencing in each of the original four. A method is presented for combining analyses of disparate data to get a single P value. Finally, the test, applied to the human-chimpanzee-gorilla problem, shows that the issue is not yet resolved.
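
The decision rule described above, whether an estimated internodal distance is significantly greater than zero, can be sketched as a one-sided test. This sketch assumes the estimate is approximately normally distributed and takes the variance as an input, since the abstract's variance formula is not reproduced here; the function name and the numbers are hypothetical.

```python
import math


def internodal_test(distance, variance):
    """One-sided test of H0: internodal distance = 0 against distance > 0.

    Returns (z, p), where z is the estimate divided by its standard error
    and p is the upper-tail probability of a standard normal at z.
    """
    z = distance / math.sqrt(variance)
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p


# Hypothetical estimate: distance 0.02 with variance 0.0001 (standard error 0.01).
z, p = internodal_test(0.02, 0.0001)
significant = p < 0.05
```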

10.
Statistical methods for computing the standard errors of the branching points of an evolutionary tree are developed. These methods are for trees determined by the unweighted pair-group method (UPGMA) and reconstructed from molecular data such as amino acid sequences, nucleotide sequences, restriction-sites data, and electrophoretic distances. They were applied to data for the human, chimpanzee, gorilla, orangutan, and gibbon species. Among the four different sets of data used, DNA sequences for an 895-nucleotide segment of mitochondrial DNA (Brown et al. 1982) gave the most reliable tree, whereas electrophoretic data (Bruce and Ayala 1979) gave the least reliable one. The DNA sequence data suggested that the chimpanzee is the closest and that the gorilla is the next closest to the human species. The orangutan and gibbon are more distantly related to man than is the gorilla. This topology of the tree is in agreement with that for the tree obtained from chromosomal studies and DNA-hybridization experiments. However, the difference between the branching point for the human and the chimpanzee species and that for the gorilla species and the human-chimpanzee group is not statistically significant. In addition to this analysis, various factors that affect the accuracy of an estimated tree are discussed.
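
For readers unfamiliar with UPGMA, the clustering step that produces the branching points can be sketched as below. It illustrates the tree-building procedure only, not the standard-error formulas the abstract describes. The function name is hypothetical, and the distance matrix is made up for illustration; it is not the data of Brown et al. or of Bruce and Ayala.

```python
from itertools import combinations


def upgma(names, matrix):
    """Cluster taxa by UPGMA from a symmetric distance matrix.

    Returns (tree, merges): tree is a nested tuple, and merges lists
    (left, right, height) in the order the clusters were joined.
    """
    # Active clusters: id -> (subtree, number of taxa it contains).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {(i, j): matrix[i][j] for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    merges = []
    while len(clusters) > 1:
        # Join the closest pair of active clusters.
        (a, b), d_ab = min(dist.items(), key=lambda item: item[1])
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        merges.append((tree_a, tree_b, d_ab / 2))  # branching point at half the distance
        # Distance from the new cluster to each remaining one is the
        # size-weighted average of the two old distances.
        new_dist = {}
        for c in clusters:
            d_ac = dist[(min(a, c), max(a, c))]
            d_bc = dist[(min(b, c), max(b, c))]
            new_dist[(c, next_id)] = (n_a * d_ac + n_b * d_bc) / (n_a + n_b)
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)
        clusters[next_id] = ((tree_a, tree_b), n_a + n_b)
        next_id += 1
    (tree, _), = clusters.values()
    return tree, merges


# Made-up ultrametric distances for four taxa (illustration only).
taxa = ["human", "chimpanzee", "gorilla", "orangutan"]
distances = [[0, 2, 4, 8],
             [2, 0, 4, 8],
             [4, 4, 0, 8],
             [8, 8, 8, 0]]
tree, merges = upgma(taxa, distances)
```

The heights recorded in `merges` are the branching points whose standard errors the abstract's methods are designed to estimate.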

11.
Phylogenetic analyses of ribosomal RNA sequences have played an important role in the study of early evolution of life. However, Loomis and Smith suggested that the ribosomal RNA tree is sometimes misleading, especially when G+C content differs widely among lineages, and that a protein tree from amino acid sequences may be more reliable. In this study, we analyzed amino acid sequence data of elongation factor-1 by a maximum likelihood method to clarify branching orders in the early evolution of eukaryotes. Contrary to Sogin et al.'s tree of small-subunit ribosomal RNA, a protozoan species, Entamoeba histolytica, that lacks mitochondria was shown to have diverged from the line leading to eukaryotes with mitochondria before the latter separated into several kingdoms. This indicates that Entamoeba is a living relic of the earliest phase of eukaryotic evolution before the symbiosis of protomitochondria occurred. Furthermore, this suggests that, among eukaryotic kingdoms with mitochondria, Fungi is the closest relative of Animalia, and that a cellular slime mold, Dictyostelium discoideum, had not diverged from the line leading to Plantae-Fungi-Animalia before these three kingdoms separated. Offprint requests to: M. Hasegawa

12.
An important issue in the phylogenetic analysis of nucleotide sequence data using the maximum likelihood (ML) method is the underlying evolutionary model employed. We consider the problem of simultaneously estimating the tree topology and the parameters in the underlying substitution model and of obtaining estimates of the standard errors of these parameter estimates. Given a fixed tree topology and corresponding set of branch lengths, the ML estimates of standard evolutionary model parameters are asymptotically efficient, in the sense that their joint distribution is asymptotically normal with the variance–covariance matrix given by the inverse of the Fisher information matrix. We propose a new estimate of this conditional variance based on estimation of the expected information using a Monte Carlo sampling (MCS) method. Simulations are used to compare this conditional variance estimate to the standard technique of using the observed information under a variety of experimental conditions. In the case in which one wishes to estimate simultaneously the tree and parameters, we provide a bootstrapping approach that can be used in conjunction with the MCS method to estimate the unconditional standard error. The methods developed are applied to a real data set consisting of 30 papillomavirus sequences. This overall method is easily incorporated into standard bootstrapping procedures to allow for proper variance estimation.

13.
The amino acid sequences of the largest subunits of the RNA polymerases I, II, and III from eukaryotes were compared with those of archaebacterial and eubacterial homologs, and their evolutionary relationships were analyzed in detail by a recently developed tree-making method, the likelihood method of protein phylogeny, as well as by the neighbor-joining method and the parsimony method, together with bootstrap analyses. It was shown that the best tree topologies predicted by the first two methods are identical, whereas the last one predicts a distinct tree. The maximum likelihood tree revealed that, after the separation from archaebacteria, the three eukaryotic RNA polymerases diverged from an ancestral precursor in the eukaryotic lineage. This result is contrasted with the published result showing multiple origins for the three eukaryotic polymerases. It was shown that eukaryotic RNA polymerase I evolved much more rapidly than RNA polymerases II and III: The N-terminal half of RNA polymerase I shows an extraordinarily high evolutionary rate, possibly due to relaxed functional constraints. In contrast, the evolutionary rate of archaebacterial RNA polymerase is remarkably limited. In addition, including the second largest subunit of the RNA polymerase, a detailed analysis for the branching pattern of the three major groups of archaebacteria was carried out by the maximum likelihood method. It was shown that the three major groups of archaebacteria are likely to form a single cluster; that is, archaebacteria are likely to be monophyletic as originally proposed by Woese and his colleagues.

14.
A maximum likelihood method for independently estimating the relative rate of substitution at different nucleotide sites is presented. With this method, the evolution of DNA sequences can be analyzed without assuming a specific distribution of rates among sites. To investigate the pattern of correlation of rates among sites, the method was applied to a data set consisting of the protein-coding regions of the mitochondrial genome from 10 vertebrate species. Rates appear to be strongly correlated at distances up to 40 codons apart. Furthermore, there appears to be some higher order correlation of sites approximately 75 codons apart. The method of site-by-site estimation of the rate of substitution may also be applied to examine other aspects of rate variation along a DNA sequence and to assess the difference in the support of a tree along the sequence.

15.
Conventional phylogenetic tree estimation methods assume that all sites in a DNA multiple alignment have the same evolutionary history. This assumption is violated in data sets from certain bacteria and viruses due to recombination, a process that leads to the creation of mosaic sequences from different strains and, if undetected, causes systematic errors in phylogenetic tree estimation. In the current work, a hidden Markov model (HMM) is employed to detect recombination events in multiple alignments of DNA sequences. The emission probabilities in a given state are determined by the branching order (topology) and the branch lengths of the respective phylogenetic tree, while the transition probabilities depend on the global recombination probability. The present study improves on an earlier heuristic parameter optimization scheme and shows how the branch lengths and the recombination probability can be optimized in a maximum likelihood sense by applying the expectation maximization (EM) algorithm. The novel algorithm is tested on a synthetic benchmark problem and is found to clearly outperform the earlier heuristic approach. The paper concludes with an application of this scheme to a DNA sequence alignment of the argF gene from four Neisseria strains, where a likely recombination event is clearly detected.

16.
Time of the deepest root for polymorphism in human mitochondrial DNA   (total citations: 7; self-citations: 0; citations by others: 7)
A molecular clock analysis was carried out on the nucleotide sequences of parts of the major noncoding region of mitochondrial DNA (mtDNA) from the major geographic populations of humans. Dates of branchings in the mtDNA tree among humans were estimated with an improved maximum likelihood method. Two species of chimpanzees were used as an outgroup, and the mtDNA clock was calibrated by assuming that the chimpanzee/human split occurred 4 million years ago, following our earlier works. A model of homogeneous evolution among sites does not fit well with the data even within hypervariable segments, and hence an additional parameter that represents a proportion of variable sites was introduced. Taking account of this heterogeneity among sites, the date for the deepest root of the mtDNA tree among humans was estimated to be 280,000 ± 50,000 years old (± 1 SE), although there remains uncertainty about the constancy of the evolutionary rate among lineages. The evolutionary rate of the most rapidly evolving sites in mtDNA was estimated to be more than 100 times greater than that of a nuclear pseudogene.
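
The calibration idea above can be illustrated with the simplest possible clock: a known split time fixes the rate, and the rate converts another divergence into a date. The sketch below is a deliberate simplification that assumes a single constant rate and no among-site heterogeneity, unlike the maximum likelihood method the abstract describes. The function name and the distances are hypothetical; only the 4-million-year calibration comes from the abstract.

```python
def clock_date(d_target, d_calibration, t_calibration):
    """Date a divergence by scaling against a calibration point.

    d_calibration is the pairwise distance (substitutions per site) across
    the calibration split, t_calibration its assumed age in years, and
    d_target the pairwise distance across the node to be dated.
    """
    # Two lineages accumulate change since the split, hence the factor 2.
    rate = d_calibration / (2 * t_calibration)  # per site, per year, per lineage
    return d_target / (2 * rate)


# Hypothetical distances; the 4-million-year calibration is from the abstract.
estimated_age = clock_date(d_target=0.01, d_calibration=0.1, t_calibration=4e6)
```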

17.
Cataloging the very large number of undescribed species of insects could be greatly accelerated by automated DNA-based approaches, but procedures for large-scale species discovery from sequence data are currently lacking. Here, we use mitochondrial DNA variation to delimit species in a poorly known beetle radiation in the genus Rivacindela from arid Australia. Among 468 individuals sampled from 65 sites and multiple morphologically distinguishable types, sequence variation in three mtDNA genes (cytochrome oxidase subunit 1, cytochrome b, 16S ribosomal RNA) was strongly partitioned between 46 or 47 putative species identified with quantitative methods of species recognition based on fixed unique ("diagnostic") characters. The boundaries between groups were also recognizable from a striking increase in branching rate in clock-constrained calibrated trees. Models of stochastic lineage growth (Yule models) were combined with coalescence theory to develop a new likelihood method that determines the point of transition from species-level (speciation and extinction) to population-level (coalescence) evolutionary processes. Fitting the location of the switches from speciation to coalescent nodes on the ultrametric tree of Rivacindela produced a transition in branching rate occurring at 0.43 Mya, leading to an estimate of 48 putative species (confidence interval for the threshold ranging from 47 to 51 clusters within 2 logL units). Entities delimited in this way exhibited biological properties of traditionally defined species, showing coherence of geographic ranges, broad congruence with morphologically recognized species, and levels of sequence divergence typical for closely related species of insects. The finding of discontinuous evolutionary groupings that are readily apparent in patterns of sequence variation permits largely automated species delineation from DNA surveys of local communities as a scaffold for taxonomy in this poorly known insect group.

18.
MOTIVATION: A large, high-quality database of homologous sequence alignments with good estimates of their corresponding phylogenetic trees will be a valuable resource to those studying phylogenetics. It will allow researchers to compare current and new models of sequence evolution across a large variety of sequences. The large quantity of data may provide inspiration for new models and methodology to study sequence evolution and may allow general statements about the relative effect of different molecular processes on evolution. RESULTS: The Pandit 7.6 database contains 4341 families of sequences derived from the seed alignments of the Pfam database of amino acid alignments of families of homologous protein domains (Bateman et al., 2002). Each family in Pandit includes an alignment of amino acid sequences that matches the corresponding Pfam family seed alignment, an alignment of DNA sequences that contain the coding sequence of the Pfam alignment when they can be recovered (overall, 82.9% of sequences taken from Pfam) and the alignment of amino acid sequences restricted to only those sequences for which a DNA sequence could be recovered. Each of the alignments has an estimate of the phylogenetic tree associated with it. The tree topologies were obtained using the neighbor joining method based on maximum likelihood estimates of the evolutionary distances, with branch lengths then calculated using a standard maximum likelihood approach.

19.
The maximum likelihood (ML) method for constructing phylogenetic trees (both rooted and unrooted trees) from DNA sequence data was studied. Although there is some theoretical problem in the comparison of ML values conditional for each topology, it is possible to make a heuristic argument to justify the method. Based on this argument, a new algorithm for estimating the ML tree is presented. It is shown that under the assumption of a constant rate of evolution, the ML method and UPGMA always give the same rooted tree for the case of three operational taxonomic units (OTUs). This also seems to hold approximately for the case with four OTUs. When we consider unrooted trees with the assumption of a varying rate of nucleotide substitution, the efficiency of the ML method in obtaining the correct tree is similar to those of the maximum parsimony method and distance methods. The ML method was applied to Brown et al.'s data, and the tree topology obtained was the same as that found by the maximum parsimony method, but it was different from those obtained by distance methods.

20.
We have carried out a phylogenetic study of the evolution of the VP1 gene sequence from different serological types and subtypes of foot-and-mouth disease virus (FMDV). The maximum-likelihood method developed by Hasegawa and co-workers (Hasegawa et al. 1985) for the estimation of evolutionary parameters and branching dates has been used to decide between alternative models of evolution: constant versus variable rates. The results obtained indicate that a constant rate model, i.e., a molecular clock, seems to be the most plausible one. However, additional information suggests the possibility that the appearance of serotype CS has been accompanied by an episode of rapid evolution (Villaverde et al. 1991). We discuss the possibility that this evolution of RNA viruses was due to episodic positive Darwinian selection, which would have helped the new variant to escape the immunogenic pressure from the hosts. Offprint requests to: A. Moya

