Found 20 similar articles (search time: 0 ms)
1.
Examining rates and patterns of nucleotide substitution in plants (total citations: 19; self-citations: 0; citations by others: 19)
Muse SV 《Plant molecular biology》2000,42(1):25-43
Driven by rapid improvements in affordable computing power and by the even faster accumulation of genomic data, the statistical analysis of molecular sequence data has become an active area of interdisciplinary research. Maximum likelihood methods have become mainstream because of their desirable properties and, more importantly, their potential for providing statistically sound solutions in complex data analysis settings. In this chapter, a review of recent literature focusing on rates and patterns of nucleotide substitution in the nuclear, chloroplast, and mitochondrial genomes of plants demonstrates the power and flexibility of these new methods. The emerging picture of the nucleotide substitution process in plants is a complex one. Evolutionary rates are seen to be quite variable, both among genes and among plant lineages. However, there are hints, particularly in the chloroplast, that individual factors can have important effects on many genes simultaneously.
2.
Phylogenetic tree reconstruction frequently assumes the homogeneity of the substitution process over the whole tree. To test this assumption statistically, we propose a test based on the sample covariance matrix of the set of substitution rate matrices estimated from pairwise sequence comparison. The sample covariance matrix is condensed into a one-dimensional test statistic $\Delta = \sum_i \ln(1 + \delta_i)$, where $\delta_i$ are the eigenvalues of the sample covariance matrix. The test does not assume a specific mutational model. It analyses the variation in the estimated rate matrices. The distribution of this test statistic is determined by simulations based on the phylogeny estimated from the data. We study the power of the test under various scenarios and apply the test to X chromosome and mtDNA primate sequence data. Finally, we demonstrate how to include rate variation in the test.
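The statistic described in this abstract can be sketched in a few lines. The following is only an illustration, not the authors' implementation: it assumes each estimated rate matrix has already been flattened into a vector of equal length, and it uses the identity that the sum of ln(1 + delta_i) over the eigenvalues of a symmetric matrix S equals ln det(I + S), so no eigenvalue solver is needed. The function names `sample_covariance` and `delta_statistic` are hypothetical.

```python
import math


def sample_covariance(vectors):
    """Unbiased sample covariance matrix of a list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[k] for v in vectors) / n for k in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for v in vectors:
        for a in range(dim):
            for b in range(dim):
                cov[a][b] += (v[a] - mean[a]) * (v[b] - mean[b])
    return [[c / (n - 1) for c in row] for row in cov]


def delta_statistic(cov):
    """Return sum_i ln(1 + delta_i), computed as ln det(I + cov).

    I + cov is symmetric positive definite whenever cov is a covariance
    matrix, so a Cholesky factorisation yields the log-determinant.
    """
    dim = len(cov)
    m = [[cov[a][b] + (1.0 if a == b else 0.0) for b in range(dim)]
         for a in range(dim)]
    chol = [[0.0] * dim for _ in range(dim)]
    log_det = 0.0
    for a in range(dim):
        for b in range(a + 1):
            s = m[a][b] - sum(chol[a][k] * chol[b][k] for k in range(b))
            if a == b:
                chol[a][a] = math.sqrt(s)
                log_det += 2.0 * math.log(chol[a][a])
            else:
                chol[a][b] = s / chol[b][b]
    return log_det
```

In the abstract's procedure the observed value would then be compared against a null distribution obtained by simulation on the estimated phylogeny; that step is not shown here.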
3.
A new method for calculating evolutionary substitution rates (total citations: 39; self-citations: 0; citations by others: 39)
Cecilia Lanave, Giuliano Preparata, Cecilia Sacone, Gabriella Serio 《Journal of molecular evolution》1984,20(1):86-93
Summary: In this paper we present a new method for analysing molecular evolution in homologous genes based on a general stationary Markov process. The elaborate statistical analysis necessary to apply the method effectively has been performed using Monte Carlo techniques. We have applied our method to the silent third position of the codon of the five mitochondrial genes coding for identified proteins of four mammalian species (rat, mouse, cow and man). We found that the method applies satisfactorily to the three former species, while the last appears to be outside the scope of the present approach. The method allows one to calculate the evolutionarily effective silent substitution rate ($v_s$) for mitochondrial genes, which in the species mentioned above is $1.4 \times 10^{-8}$ nucleotide substitutions per site per year. We have also determined the divergence time ratios for the pairs mouse-cow/rat-mouse and rat-cow/rat-mouse. In both cases this value is approximately 1.4.
4.
Klaere S, Gesell T, von Haeseler A 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》2008,363(1512):4041-4047
We introduce another view of sequence evolution. Contrary to other approaches, we model the substitution process in two steps. First we assume (arbitrary) scaled branch lengths on a given phylogenetic tree. Second we allocate a Poisson distributed number of substitutions on the branches. The probability of placing a mutation on a branch is proportional to its relative branch length. More importantly, the action of a single mutation on an alignment column is described by a doubly stochastic matrix, the so-called one-step mutation matrix. This matrix leads to analytical formulae for the posterior probability distribution of the number of substitutions for an alignment column.
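The two-step allocation described in this abstract can be illustrated with a short simulation. This is a sketch under stated assumptions, not the authors' code: the Poisson mean is taken to be the total tree length (the abstract does not specify it), the Poisson draw uses Knuth's multiplication method, and the names `poisson_draw` and `allocate_substitutions` are hypothetical.

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw from a Poisson distribution (Knuth's method, fine for small means)."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def allocate_substitutions(branch_lengths, rng):
    """Place a Poisson number of substitutions on the branches of a tree.

    branch_lengths maps a branch name to its (scaled) length. Each
    substitution lands on a branch with probability proportional to that
    branch's share of the total length.
    ASSUMPTION: the Poisson mean equals the total tree length.
    """
    total = sum(branch_lengths.values())
    n_substitutions = poisson_draw(total, rng)
    names = list(branch_lengths)
    weights = [branch_lengths[name] / total for name in names]
    counts = {name: 0 for name in names}
    for name in rng.choices(names, weights=weights, k=n_substitutions):
        counts[name] += 1
    return counts
```

For example, `allocate_substitutions({"a": 0.1, "b": 0.3, "c": 0.6}, random.Random(1))` returns a dictionary of per-branch counts; over many replicates, branch `c` should receive about 60% of the substitutions.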
5.
A. P. MØLLER, J. ERRITZØE, F. KARADAS, T. A. MOUSSEAU 《Journal of evolutionary biology》2010,23(10):2132-2142
Extreme environmental perturbations are rare, but may have important evolutionary consequences. Responses to current perturbations may provide important information about the ability of living organisms to cope with similar conditions in the evolutionary past. Radioactive contamination from Chernobyl constitutes one such extreme perturbation, with significant but highly variable impact on local population density and mutation rates of different species of animals and plants. We explicitly tested the hypothesis that species with strong impacts of radiation on abundance were those with high rates of historical mutation accumulation as reflected by cytochrome b mitochondrial DNA base‐pair substitution rates during past environmental perturbations. Using a dataset of 32 species of birds, we show higher historical mitochondrial substitution rates in species with the strongest negative impact of local levels of radiation on local population density. These effects were robust to different estimates of impact of radiation on abundance, weighting of estimates of abundance by sample size, statistical control for similarity in the response among species because of common phylogenetic descent, and effects of population size and longevity. Therefore, species that respond strongly to the impact of radiation from Chernobyl are also the species that in the past have been most susceptible to factors that have caused high substitution rates in mitochondrial DNA.
6.
Statistical methods for detecting molecular adaptation (total citations: 2; self-citations: 0; citations by others: 2)
The past few years have seen the development of powerful statistical methods for detecting adaptive molecular evolution. These methods compare synonymous and nonsynonymous substitution rates in protein-coding genes, and regard a nonsynonymous rate elevated above the synonymous rate as evidence for Darwinian selection. Numerous cases of molecular adaptation are being identified in various systems from viruses to humans. Although previous analyses averaging rates over sites and time have little power, recent methods designed to detect positive selection at individual sites and lineages have been successful. Here, we summarize recent statistical methods for detecting molecular adaptation, and discuss their limitations and possible improvements.
7.
An analysis of determinants of amino acids substitution rates in bacterial proteins (total citations: 14; self-citations: 0; citations by others: 14)
The variation of amino acid substitution rates in proteins depends on several variables. Among these, the protein's expression level, functional category, essentiality, or metabolic costs of its amino acid residues may play an important role. However, the relative importance of each variable has not yet been evaluated in comparative analyses. To this aim, we made regression analyses combining data available on these variables and on evolutionary rates, in two well-documented model bacteria, Escherichia coli and Bacillus subtilis. In both bacteria, the level of expression of the protein in the cell was by far the most important driving force constraining the amino acid substitution rate. Subsequent inclusion in the analysis of the other variables added little further information. Furthermore, when the rates of synonymous substitutions were included in the analysis of the E. coli data, only the variable expression levels remained statistically significant. The rate of nonsynonymous substitution was shown to correlate with expression levels independently of the rate of synonymous substitution. These results suggest an important direct influence of expression levels, or at least codon usage bias for translation optimization, on the rates of nonsynonymous substitutions in bacteria. They also indicate that when a control for this variable is included, essentiality plays no significant role in the rate of protein evolution in bacteria, as is the case in eukaryotes.
8.
Genetic sequence data typically exhibit variability in substitution rates across sites. In practice, there is often too little variation to fit a different rate for each site in the alignment, but the distribution of rates across sites may not be well modeled using simple parametric families. Mixtures of different distributions can capture more complex patterns of rate variation, but are often parameter-rich and difficult to fit. We present a simple hierarchical model in which a baseline rate distribution, such as a gamma distribution, is discretized into several categories, the quantiles of which are estimated using a discretized beta distribution. Although this approach involves adding only two extra parameters to a standard distribution, a wide range of rate distributions can be captured. Using simulated data, we demonstrate that a "beta-" model can reproduce the moments of the rate distribution more accurately than the distribution used to simulate the data, even when the baseline rate distribution is misspecified. Using hepatitis C virus and mammalian mitochondrial sequences, we show that a beta- model can fit as well or better than a model with multiple discrete rate categories, and compares favorably with a model which fits a separate rate category to each site. We also demonstrate this discretization scheme in the context of codon models specifically aimed at identifying individual sites undergoing adaptive or purifying evolution.
9.
Soo Hyung Eo, J. Andrew DeWoody 《Proceedings. Biological sciences / The Royal Society》2010,277(1700):3587-3592
Rates of biological diversification should ultimately correspond to rates of genome evolution. Recent studies have compared diversification rates with phylogenetic branch lengths, but incomplete phylogenies hamper such analyses for many taxa. Herein, we use pairwise comparisons of confamilial sauropsid (bird and reptile) mitochondrial DNA (mtDNA) genome sequences to estimate substitution rates. These molecular evolutionary rates are considered in light of the age and species richness of each taxonomic family, using a random-walk speciation–extinction process to estimate rates of diversification. We find the molecular clock ticks at disparate rates in different families and at different genes. For example, evolutionary rates are relatively fast in snakes and lizards, intermediate in crocodilians and slow in turtles and birds. There was also rate variation across genes, where non-synonymous substitution rates were fastest at ATP8 and slowest at CO3. Family-by-gene interactions were significant, indicating that local clocks vary substantially among sauropsids. Most importantly, we find evidence that mitochondrial genome evolutionary rates are positively correlated with speciation rates and with contemporary species richness. Nuclear sequences are poorly represented among reptiles, but the correlation between rates of molecular evolution and species diversification also extends to 18 avian nuclear genes we tested. Thus, the nuclear data buttress our mtDNA findings.
10.
We develop a new model for studying the molecular evolution of protein-coding DNA sequences. In contrast to existing models, we incorporate the potential for site-to-site heterogeneity of both synonymous and nonsynonymous substitution rates. We demonstrate that within-gene heterogeneity of synonymous substitution rates appears to be common. Using the new family of models, we investigate the utility of a variety of new statistical inference procedures, and we pay particular attention to issues surrounding the detection of sites undergoing positive selection. We discuss how failure to model synonymous rate variation in the model can lead to misidentification of sites as positively selected.
11.
Mark Stoneking 《Evolutionary anthropology》1993,2(2):60-73
The study of recent human evolution, or the origin of modern humans, is currently dominated by two theories. The recent African origin hypothesis holds that there was a single origin of modern humans in Africa about 100,000 years ago, after which these humans dispersed throughout the rest of the world, mixing little or not at all with nonmodern populations. The multiregional evolution hypothesis holds that there was no single origin of modern humans but, instead, that the mutations and other traits that led to modern humans were spread in concert throughout the old world by gene flow, leading to genetic continuity among old world populations during the past million years. Although both of these theories are based on observations stemming from the fossil record, much discussion and controversy during the past six years has focused on the application and interpretation of studies of DNA variation, particularly mitochondrial DNA (mtDNA). The past year, especially, has brought new data, interpretations, and controversies. Indeed, I initially resisted writing this review, on the grounds that new information would be likely to render it obsolete by the time it was published. However, now that the dust is starting to settle, it seems timely to review various investigations and interpretations and where they are likely to lead. While the focus of this review is the mtDNA story, brief mention is made of studies of nuclear DNA variation (both autosomal and Y-chromosome DNA) and the implications of the genetic data with regard to the fossil record and our understanding of recent human evolution.
12.
IQPNNI: moving fast through tree space and stopping in time (total citations: 12; self-citations: 0; citations by others: 12)
An efficient tree reconstruction method (IQPNNI) is introduced to reconstruct a phylogenetic tree based on DNA or amino acid sequence data. Our approach combines various fast algorithms to generate a list of potential candidate trees. The key ingredient is the definition of so-called important quartets (IQs), which allow the computation of an intermediate tree in $O(n^2)$ time for n sequences. The resulting tree is then further optimized by applying the nearest neighbor interchange (NNI) operation. Subsequently a random fraction of the sequences is deleted from the best tree found so far. The deleted sequences are then re-inserted in the smaller tree using the important quartet puzzling (IQP) algorithm. These steps are repeated several times and the best tree, with respect to the likelihood criterion, is considered as the inferred phylogenetic tree. Moreover, we suggest a rule which indicates when to stop the search. Simulations show that IQPNNI gives a slightly better accuracy than other programs tested. Moreover, we applied the approach to 218 small subunit rRNA sequences and 500 rbcL sequences. We found trees with higher likelihood compared to the results by others. A program to reconstruct DNA or amino acid based phylogenetic trees is available online (http://www.bi.uni-duesseldorf.de/software/iqpnni).
13.
We derive an expectation maximization algorithm for maximum-likelihood training of substitution rate matrices from multiple sequence alignments. The algorithm can be used to train hidden substitution models, where the structural context of a residue is treated as a hidden variable that can evolve over time. We used the algorithm to train hidden substitution matrices on protein alignments in the Pfam database. Measuring the accuracy of multiple alignment algorithms with reference to BAliBASE (a database of structural reference alignments), our substitution matrices consistently outperform the PAM series, with the improvement steadily increasing as up to four hidden site classes are added. We discuss several applications of this algorithm in bioinformatics.
14.
The Japanese quail (Coturnix japonica; JQ) is one of the domesticated fowl species of Japan. To provide DNA sequence information for examination of its phylogenetic position in the order Galliformes, the complete sequence of the JQ mitochondria was determined. Sequence analysis revealed that the JQ mitochondrial genome is a circular DNA of 16 697 basepairs (bp), which is smaller than the chicken mitochondrial DNA of 16 775 bp, but the genomic structure of JQ mitochondria was the same as that of the chicken. The sequence homologies of all mitochondrial genes, including those for 12S and 16S ribosomal RNA (rRNA), between Japanese quail and chicken ranged from 78.0 to 89.9%. Because the sequences of NADH dehydrogenase subunit 2 and cytochrome b genes had been reported in five species [Phasianus colchicus (ring-neck pheasant: RP), Gallus gallus domesticus (chicken: CH), Perdix perdix (grey partridge: GP), Bambusicola thoracia (Chinese bamboo partridge: CP), and Aythya americana (redhead: RH)], the concatenated nucleotide sequences (2184 bp) and amino acid sequences of these two genes were used in a phylogenetic analysis of JQ against these five species using a maximum likelihood (ML) method. Analyses using the first and second bases of the codons, and the third base of the codons, indicated a phylogenetic tree of [RH, (RP, GP), (JQ, (CH, CP))]. A phylogenetic tree of [RH, JQ, (RP, GP), (CH, CP)] was determined using amino acid sequences. Because the local bootstrap values for the JQ branch in these trees are not high, additional sequence is necessary for construction of a reliable tree.
15.
Piontkivska H 《Molecular phylogenetics and evolution》2004,31(3):865-873
Choice of a substitution model is a crucial step in the maximum likelihood (ML) method of phylogenetic inference, and investigators tend to prefer complex mathematical models to simple ones. However, when complex models with many parameters are used, the extent of noise in statistical inferences increases, and thus complex models may not produce the true topology with a higher probability than simple ones. This problem was studied using computer simulation. When the number of nucleotides used was relatively large (1000 bp), the HKY+Gamma model showed smaller d(T) (topological distance between the inferred and the true trees) than the JC and Kimura models. In the cases of shorter sequences (300 bp), a simpler model and search algorithm, such as the JC model and SA+NNI search, were found to be as efficient as more complicated searches and models in terms of topological distances, although the topologies obtained under the HKY+Gamma model had the highest likelihood values. The performance of the relatively simple search algorithm SA+NNI was found to be essentially the same as that of the more extensive SA+TBR search under all models studied. Similarly to the conclusions reached by Takahashi and Nei [Mol. Biol. Evol. 17 (2000) 1251], our results indicate that simple models can be as efficient as complex models, and that use of complex models does not necessarily give more reliable trees compared with simple models.
16.
Summary: A method of estimating the number of nucleotide substitutions from amino acid sequence data is developed by using Dayhoff's mutation probability matrix. This method takes into account the effect of nonrandom amino acid substitutions and gives an estimate which is similar to the value obtained by Fitch's counting method, but larger than the estimate obtained under the assumption of random substitutions (Jukes and Cantor's formula). Computer simulations based on Dayhoff's mutation probability matrix have suggested that Jukes and Holmquist's method of estimating the number of nucleotide substitutions gives an overestimate when amino acid substitution is not random and the variance of the estimate is generally very large. It is also shown that when the number of nucleotide substitutions is small, this method tends to give an overestimate even when amino acid substitution is purely at random.
17.
E. C. Holmes 《Journal of molecular evolution》1991,33(3):209-215
Summary: In an attempt to resolve some points of branching order in the phylogeny of the eutherian mammals, a phylogenetic analysis of 26 nuclear and 6 mitochondrial genes was undertaken using a maximum likelihood method on a constant rate stochastic model of molecular evolution. Seventeen of the nuclear genes gave a primates/artiodactyls grouping highest support whereas three of the mitochondrial genes found a rodents/artiodactyls grouping to be best supported. The primates/rodents grouping was never the best supported. On the assumption that rodents are indeed an outgroup to primates and artiodactyls and that the latter taxa diverged 70 million years ago, an estimation was made, for each gene, of the time of divergence of the rodent lineage. In most cases such estimates were beyond the limits set by present interpretations of the paleontological record as were many estimates of the divergence time of mouse and rat. These results suggest that, although there is locus variation, the divergent position of the rodent lineage may be an artifact of an elevated rate of nucleotide substitution in this order.
18.
Maximum likelihood phylogeny reconstruction methods are widely used in uncovering and assessing the evolutionary history and relationships of natural systems. However, several simplifying assumptions commonly made in this analysis limit the explanatory power of the results obtained. We present an algorithm that performs the phylogenetic analysis without making the common assumptions for sequence data from at least three leaf nodes in a star phylogeny. In particular, the underlying nucleotide substitution model does not have to be reversible and may include neighbor-dependent processes like the CpG methylation deamination process (CpG-effect). The base composition of the sequences at the external nodes and that of the ancestral sequence may differ from each other, and they do not have to be stationary state distributions of the corresponding substitution model. The algorithm is able to reconstruct the ancestral base composition and accurately estimate substitution frequencies in the branches of the star phylogeny. Extensive tests on simulated data validate the very favorable performance of the algorithm. As an application we present the analysis of aligned genomic sequences from human, mouse, and dog. Different substitution patterns can be observed in the three lineages.
19.
Using the ratio of nonsynonymous to synonymous nucleotide substitution rates (Ka/Ks) is a common approach for detecting positive selection. However, calculation of this ratio over a whole gene combines amino acid sites that may be under positive selection with those that are highly conserved. We introduce a new covarion‐based method to sample only the sites potentially under selective pressure. Using ancestral sequence reconstruction over a phylogenetic tree coupled with calculation of Ka/Ks ratios, positive selection is better detected by this simple covarion‐based approach than it is using a whole gene analysis or a windowing analysis. This is demonstrated on a synthetic dataset and is tested on primate leptin, which indicates a previously undetected round of positive selection in the branch leading to Gorilla gorilla.
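As a rough illustration of the Ka/Ks ratio itself (not of the covarion-based sampling this abstract proposes), the sketch below computes the ratio from already-counted differences and sites, applying the Jukes-Cantor correction to each proportion in the style of the Nei-Gojobori approach. Counting synonymous and nonsynonymous sites from codons is assumed to have been done elsewhere; the function names and the example counts are hypothetical.

```python
import math


def jukes_cantor_correction(p):
    """Convert an observed proportion of differences into a JC69 distance."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks_ratio(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Ka/Ks from counts of differences and sites (counts are inputs here)."""
    ka = jukes_cantor_correction(nonsyn_diffs / nonsyn_sites)
    ks = jukes_cantor_correction(syn_diffs / syn_sites)
    if ks == 0.0:
        raise ValueError("Ks is zero; the ratio is undefined")
    return ka / ks


# Hypothetical counts: 30 nonsynonymous differences over 300 sites,
# 5 synonymous differences over 100 sites.
ratio = ka_ks_ratio(30, 300, 5, 100)
```

A ratio above 1 is conventionally read as a signal of positive selection, a ratio near 1 as neutrality, and a ratio below 1 as purifying selection. As the abstract notes, averaging over a whole gene can mask selected sites among conserved ones.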
20.
Karro JE, Peifer M, Hardison RC, Kollmann M, von Grünberg HH 《Molecular biology and evolution》2008,25(2):362-374
The distribution of guanine and cytosine nucleotides throughout a genome, or the GC content, is associated with numerous features in mammals; understanding the pattern and evolutionary history of GC content is crucial to our efforts to annotate the genome. The local GC content is decaying toward an equilibrium point, but the causes and rates of this decay, as well as the value of the equilibrium point, remain topics of debate. By comparing the results of 2 methods for estimating local substitution rates, we identify 620 Mb of the human genome in which the rates of the various types of nucleotide substitutions are the same on both strands. These strand-symmetric regions show an exponential decay of local GC content at a pace determined by local substitution rates. DNA segments subjected to higher rates experience disproportionately accelerated decay and are AT rich, whereas segments subjected to lower rates decay more slowly and are GC rich. Although we are unable to draw any conclusions about causal factors, the results support the hypothesis proposed by Khelifi A, Meunier J, Duret L, and Mouchiroud D (2006. GC content evolution of the human and mouse genomes: insights from the study of processed pseudogenes in regions of different recombination rates. J Mol Evol. 62:745-752.) that the isochore structure has been reshaped over time. If rate variation were a determining factor, then the current isochore structure of mammalian genomes could result from the local differences in substitution rates. We predict that under current conditions strand-symmetric portions of the human genome will stabilize at an average GC content of 30% (considerably less than the current 42%), thus confirming that the human genome has not yet reached equilibrium.
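The exponential decay toward equilibrium described in this abstract can be written as GC(t) = GC_eq + (GC_0 - GC_eq) * exp(-r * t). The sketch below evaluates that curve using the two figures the abstract reports (a current average of 42% and a predicted equilibrium of 30%). The decay rates and the time unit are hypothetical placeholders, not values from the paper.

```python
import math


def gc_content(t, gc_start, gc_equilibrium, rate):
    """GC fraction after time t under simple exponential relaxation.

    rate is a hypothetical local decay constant per unit time; the abstract
    only states that the pace is set by local substitution rates.
    """
    return gc_equilibrium + (gc_start - gc_equilibrium) * math.exp(-rate * t)


GC_NOW = 0.42          # current average reported in the abstract
GC_EQUILIBRIUM = 0.30  # predicted equilibrium reported in the abstract

# Two illustrative segments: one with a low rate, one with a high rate.
slow_segment = gc_content(10.0, GC_NOW, GC_EQUILIBRIUM, rate=0.01)
fast_segment = gc_content(10.0, GC_NOW, GC_EQUILIBRIUM, rate=0.10)
```

Under this form, the segment with the higher rate sits closer to the equilibrium value after the same elapsed time, which matches the abstract's observation that faster-evolving segments are AT rich.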