Similar literature
 20 similar documents found; search took 15 ms
1.
A central goal of computational biology is the prediction of phenotype from DNA and protein sequence data. Recent models of sequence change use in silico prediction systems to incorporate the effects of phenotype on evolutionary rates. These models have been designed for analyzing sequence data from different species and have been accompanied by statistical techniques for estimating model parameters when the incorporation of phenotype induces dependent change among sequence positions. A difficulty with these efforts to link phenotype and interspecific evolution is that evolution occurs within populations, and parameters of interspecific models should have population genetic interpretations. We show, with two examples, how population genetic interpretations can be assigned to evolutionary models. The first example considers the impact of RNA secondary structure on sequence change, and the second reflects the tendency for protein tertiary structure to influence nonsynonymous substitution rates. We argue that statistical fit to data should not be the sole criterion for assessing models of sequence change. A good interspecific model should also yield a clear and biologically plausible population genetic interpretation.

2.
Markovian models of protein evolution that relax the assumption of independent change among codons are considered. With this comparatively realistic framework, an evolutionary rate at a site can depend both on the state of the site and on the states of surrounding sites. By allowing a relatively general dependence structure among sites, models of evolution can reflect attributes of tertiary structure. To quantify the impact of protein structure on protein evolution, we analyze protein-coding DNA sequence pairs with an evolutionary model that incorporates effects of solvent accessibility and pairwise interactions among amino acid residues. By explicitly considering the relationship between nonsynonymous substitution rates and protein structure, this approach can lead to refined detection and characterization of positive selection. Analyses of simulated sequence pairs indicate that parameters in this evolutionary model can be well estimated. Analyses of lysozyme c and annexin V sequence pairs yield the biologically reasonable result that amino acid replacement rates are higher when the replacements lead to energetically favorable proteins than when they destabilize the proteins. Although the focus here is evolutionary dependence among codons that is associated with protein structure, the statistical approach is quite general and could be applied to diverse cases of evolutionary dependence where surrogates for sequence fitness can be measured or modeled.

3.
With its theoretical basis firmly established in molecular evolutionary and population genetics, comparative DNA and protein sequence analysis plays a central role in reconstructing the evolutionary histories of species and multigene families, estimating rates of molecular evolution, and inferring the nature and extent of selective forces shaping the evolution of genes and genomes. The scope of these investigations has now expanded greatly owing to the development of high-throughput sequencing techniques and novel statistical and computational methods. These methods require easy-to-use computer programs. One such effort has been to produce Molecular Evolutionary Genetics Analysis (MEGA) software, with its focus on facilitating the exploration and analysis of DNA and protein sequence variation from an evolutionary perspective. Currently in its third major release, MEGA3 contains facilities for automatic and manual sequence alignment, web-based mining of databases, inference of phylogenetic trees, estimation of evolutionary distances and testing of evolutionary hypotheses. This paper provides an overview of the statistical methods, computational tools, and visual exploration modules for data input and the results obtainable in MEGA.

4.
Evidence is mounting that mutation rates are sufficiently high for deleterious alleles to be a major evolutionary force affecting the evolution of sex, the maintenance of genetic variation, and many other evolutionary phenomena. Though point estimates of mutation rates are improving, we remain largely ignorant of the biological factors affecting these rates at the individual level. Of special importance is the possibility that mutation rates are condition-dependent with low-condition individuals experiencing more mutation. Theory predicts that such condition dependence would dramatically increase the rate at which populations adapt to new environments and the extent to which populations suffer from mutation load. Despite its importance, there has been little study of this phenomenon in multicellular organisms. Here, we examine whether DNA repair processes are condition-dependent in Drosophila melanogaster. In this species, damaged DNA in sperm can be repaired by maternal repair processes after fertilization. We exposed high- and low-condition females to sperm containing damaged DNA and then assessed the frequency of lethal mutations on paternally derived X chromosomes transmitted by these females. The rate of lethal mutations transmitted by low-condition females was 30% greater than that of high-condition females, indicating reduced repair capacity of low-condition females. A separate experiment provided no support for an alternative hypothesis based on sperm selection.

5.
Microsatellite DNA sequences mutate at rates several orders of magnitude higher than that of the bulk of DNA. Such high rates mean that spontaneous mutations that form new-length variants can realistically be seen in pedigree analysis. Data on observed mutation events from various organisms are now accumulating, allowing inferences on DNA sequence evolution to be made through an unusually direct approach. Here I discuss and integrate microsatellite mutation data in an evolutionary context. A striking feature of the mutation process is that it seems highly heterogeneous, with distinct differences between species, repeat types, loci and alleles. Age and sex also affect the mutation rate. Within genomes at equilibrium, the microsatellite-length distribution is a delicate balance between biased mutation processes and point mutations acting towards the decay of repetitive DNA. Indeed, simple repeats do not evolve simply.

6.
Although probabilistic models of genotype (e.g., DNA sequence) evolution have been greatly elaborated, less attention has been paid to the effect of phenotype on the evolution of the genotype. Here we propose an evolutionary model and a Bayesian inference procedure that are aimed at filling this gap. In the model, RNA secondary structure links genotype and phenotype by treating the approximate free energy of a sequence folded into a secondary structure as a surrogate for fitness. The underlying idea is that a nucleotide substitution resulting in a more stable secondary structure should have a higher rate than a substitution that yields a less stable secondary structure. This free energy approach incorporates evolutionary dependencies among sequence positions beyond those that are reflected simply by jointly modeling change at paired positions in an RNA helix. Although there is not a formal requirement with this approach that secondary structure be known and nearly invariant over evolutionary time, computational considerations make these assumptions attractive and they have been adopted in a software program that permits statistical analysis of multiple homologous sequences that are related via a known phylogenetic tree topology. Analyses of 5S ribosomal RNA sequences are presented to illustrate and quantify the strong impact that RNA secondary structure has on substitution rates. Analyses on simulated sequences show that the new inference procedure has reasonable statistical properties. Potential applications of this procedure, including improved ancestral sequence inference and location of functionally interesting sites, are discussed.
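
The idea in the abstract above, that a substitution toward a more stable fold should occur at a higher rate, can be illustrated with a minimal sketch. The exponential form, the function name `relative_rate`, and the parameter `sensitivity` are assumptions made for illustration only; the abstract does not state the functional form the authors use, and the free energies here are supplied by hand rather than computed by a folding program.

```python
import math


def relative_rate(base_rate, energy_before, energy_after, sensitivity):
    """Hypothetical substitution rate scaled by the change in folding free energy.

    Lower free energy means a more stable structure, so a substitution that
    lowers the energy (energy_after < energy_before) gets a rate above
    base_rate when sensitivity > 0. The exponential form is an assumption.
    """
    return base_rate * math.exp(sensitivity * (energy_before - energy_after))


# A stabilizing substitution (-10 -> -12) versus a destabilizing one (-10 -> -8).
stabilizing = relative_rate(1.0, -10.0, -12.0, sensitivity=0.5)
destabilizing = relative_rate(1.0, -10.0, -8.0, sensitivity=0.5)
```

With `sensitivity` set to zero the structure has no effect and every substitution keeps its base rate, which is one way to read the independent-sites special case.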

7.
MOTIVATION: Microbial genomes undergo evolutionary processes such as gene family expansion and contraction, variable rates and patterns of sequence substitution and lateral genetic transfer. Simulation tools are essential for both the generation of data under different evolutionary models and the validation of analytical methods on such data. However, meaningful investigation of phenomena such as lateral genetic transfer requires the simultaneous consideration of many underlying evolutionary processes. RESULTS: We have developed EvolSimulator, a software package that combines non-stationary sequence and gene family evolution together with models of lateral genetic transfer, within a customizable birth-death model of speciation and extinction. Here, we examine simulated data sets generated with EvolSimulator using existing statistical techniques from the evolutionary literature, showing in detail each component of the simulation strategy. AVAILABILITY: Source code, manual and other information are freely available at www.bioinformatics.org.au/evolsim. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

8.
9.
M Kimura. Génome, 1989, 31(1):24-31
The main tenet of the neutral theory is that the great majority of evolutionary changes at the molecular level are caused not by Darwinian selection but by random fixation of selectively neutral (or very nearly neutral) alleles through random sampling drift under continued mutation pressure. The theory also asserts that the majority of protein and DNA polymorphisms are selectively neutral, and that they are maintained in the species by mutational input balanced by random extinction rather than by "balancing selection." The neutral theory is based on simple assumptions. This enabled us to develop mathematical theories (using the diffusion equation method) that can treat these phenomena in quantitative terms and that permit theory to be tested against actual observations. Although the neutral theory has been severely criticized by the neo-Darwinian establishment, supporting evidence has accumulated over the last 20 years. In particular, the recent burst of DNA sequence data helped to strengthen the theory a great deal. I believe that the neutral theory triggered reexamination of the traditional "synthetic theory of evolution." In this paper, I review the present status of the neutral theory, including discussions of such topics as "molecular evolutionary clock," very high evolutionary rates observed in RNA viruses, a deviant coding system found in Mycoplasma together with the concept of mutation-driven neutral evolution, and the origin of life. I also present a worldview based on the conception of what I call "survival of the luckiest."
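
The link between neutral fixation and the molecular clock mentioned in the abstract above can be made concrete with the standard textbook calculation. The symbols are assumptions introduced here, not taken from the abstract: $N$ is the size of a diploid population, $\mu$ the neutral mutation rate per gene copy per generation, and $k$ the substitution rate per generation.

```latex
% New neutral mutations entering the population each generation: 2N\mu.
% Fixation probability of a single new neutral mutation: 1/(2N).
k \;=\; \underbrace{2N\mu}_{\text{new mutations}} \times \underbrace{\frac{1}{2N}}_{\text{fixation probability}} \;=\; \mu
```

Under these assumptions the population size cancels, so the substitution rate depends only on the neutral mutation rate, which is why a constant neutral mutation rate would yield a regular clock.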

10.
We present a stochastic sequence evolution model to obtain alignments and estimate mutation rates between two homologous sequences. The model allows two possible evolutionary behaviors along a DNA sequence in order to determine conserved regions and take its heterogeneity into account. In our model, the sequence is divided into slow and fast evolution regions. The boundaries between these sections are not known. It is our aim to detect them. The evolution model is based on a fragment insertion and deletion process working on fast regions only and on a substitution process working on fast and slow regions with different rates. This model induces a pair hidden Markov structure at the level of alignments, thus making efficient statistical alignment algorithms possible. We propose two complementary estimation methods, namely, a Gibbs sampler for Bayesian estimation and a stochastic version of the EM algorithm for maximum likelihood estimation. Both algorithms involve the sampling of alignments. We propose a partial alignment sampler, which is computationally less expensive than the typical whole alignment sampler. We show the convergence of the two estimation algorithms when used with this partial sampler. Our algorithms provide consistent estimates for the mutation rates and plausible alignments and sequence segmentations on both simulated and real data.

11.
The evolutionary potential of a gene is constrained not only by the amino acid sequence of its product, but by its DNA sequence as well. The topology of the genetic code is such that half of the amino acids exhibit synonymous codons that can reach different subsets of amino acids from each other through single mutation. Thus, synonymous DNA sequences should access different regions of the protein sequence space through a limited number of mutations, and this may deeply influence the evolution of natural proteins. Here, we demonstrate that this feature can be of value for manipulating protein evolvability. We designed an algorithm that, starting from an input gene, constructs a synonymous sequence that systematically includes the codons with the most different evolutionary perspectives; i.e., codons that maximize accessibility to amino acids previously unreachable from the template by point mutation. A synonymous version of a bacterial antibiotic resistance gene was computed and synthesized. When concurrently submitted to identical directed evolution protocols, both the wild type and the recoded sequence led to the isolation of specific, advantageous phenotypic variants. Simulations based on a mutation isolated only from the synthetic gene libraries were conducted to assess the impact of sub-functional selective constraints, such as codon usage, on natural adaptation. Our data demonstrate that rational design of synonymous synthetic genes stands as an affordable improvement to any directed evolution protocol. We show that using two synonymous DNA sequences improves the overall yield of the procedure by increasing the diversity of mutants generated. These results provide conclusive evidence that synonymous coding sequences do experience different areas of the corresponding protein adaptive landscape, and that a sequence's codon usage effectively constrains the evolution of the encoded protein.
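
The claim in the abstract above, that synonymous codons can reach different amino acids through a single point mutation, can be checked directly against the standard genetic code. The following is a minimal sketch, not the authors' recoding algorithm; the function name `reachable_amino_acids` and the choice of two serine codons as the example are assumptions for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reachable_amino_acids(codon):
    """Amino acids one point mutation away, excluding the codon's own amino acid and stops."""
    own = CODON_TABLE[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            amino_acid = CODON_TABLE[mutant]
            if amino_acid != own and amino_acid != "*":
                reachable.add(amino_acid)
    return reachable


# TCT and AGT both encode serine, yet they neighbor different amino acids.
from_tct = reachable_amino_acids("TCT")
from_agt = reachable_amino_acids("AGT")
only_from_agt = from_agt - from_tct
```

Here `only_from_agt` holds the amino acids that become accessible by point mutation only after recoding TCT as AGT, which is the kind of gain in accessibility the abstract describes.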

12.
The Molecular Evolutionary Genetics Analysis (MEGA) software is a desktop application designed for comparative analysis of homologous gene sequences either from multigene families or from different species with a special emphasis on inferring evolutionary relationships and patterns of DNA and protein evolution. In addition to the tools for statistical analysis of data, MEGA provides many convenient facilities for the assembly of sequence data sets from files or web-based repositories, and it includes tools for visual presentation of the results obtained in the form of interactive phylogenetic trees and evolutionary distance matrices. Here we discuss the motivation, design principles and priorities that have shaped the development of MEGA. We also discuss how MEGA might evolve in the future to assist researchers in their growing need to analyze large data sets using new computational methods.

13.
An evolutionary model for maximum likelihood alignment of DNA sequences   Total citations: 16; self-citations: 0; citations by others: 16
Summary: Most algorithms for the alignment of biological sequences are not derived from an evolutionary model. Consequently, these alignment algorithms lack a strong statistical basis. A maximum likelihood method for the alignment of two DNA sequences is presented. This method is based upon a statistical model of DNA sequence evolution for which we have obtained explicit transition probabilities. The evolutionary model can also be used as the basis of procedures that estimate the evolutionary parameters relevant to a pair of unaligned DNA sequences. A parameter-estimation approach which takes into account all possible alignments between two sequences is introduced; the danger of estimating evolutionary parameters from a single alignment is discussed.

14.
Evolution at high mutation rates is affected by at least six processes: mutation-selection balance, error catastrophes, Muller's Ratchet, robustness, compensatory evolution, and clonal interference. Including all of these processes in a tractable, analytical model is difficult, but they can be captured in simulations that utilize realistic genotype-phenotype-fitness maps, as done here by modeling RNA folding. Subjecting finite, asexual populations to a range of mutation rates revealed simple criteria that predict when particular evolutionary processes are important. Populations were initiated with a genotype encoding the most fit phenotype. When purifying selection was strong relative to mutation, the initial genotype was replaced by one more mutationally robust, and the maximally fit phenotype was maintained in a mutation-selection balance where the deleterious mutation rate determined mean fitness. With weaker purifying selection, the most fit genotypes were lost. Although loss of the best genotype was ongoing and might have led to a progressive fitness decline, continual compensatory evolution led to an approximate fitness equilibration. Per total genomic mutation rate, mean fitness was similar for strong and weak purifying selection. These results represent a first step at separating interactions between evolutionary processes at high mutation rate, but additional theory is needed to interpret some outcomes.
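
The statement above that the deleterious mutation rate determines mean fitness at mutation-selection balance echoes a classical result for large asexual populations. The sketch below states that textbook result, not the simulation model of the paper. The symbols are assumptions: $U$ is the genomic deleterious mutation rate per generation, $\bar{w}$ the equilibrium mean fitness relative to the mutation-free class, and new mutations per offspring are taken to be Poisson distributed.

```latex
% Probability that an offspring carries no new deleterious mutation:
P(\text{no new mutation}) = e^{-U}
% At equilibrium the mutation-free class must replace itself exactly, which gives
\bar{w} = e^{-U}
```

Under these assumptions the selection coefficients do not appear in $\bar{w}$, so mean fitness depends on how often deleterious mutations arise and not on how harmful each one is.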

15.
Within-patient HIV populations evolve rapidly because of a high mutation rate, short generation time, and strong positive selection pressures. Previous studies have identified "consistent patterns" of viral sequence evolution. Just before HIV infection progresses to AIDS, evolution seems to slow markedly, and the genetic diversity of the viral population drops. This evolutionary slowdown could be caused either by a reduction in the average viral replication rate or because selection pressures weaken with the collapse of the immune system. The former hypothesis (which we denote "cellular exhaustion") predicts a simultaneous reduction in both synonymous and nonsynonymous evolution, whereas the latter hypothesis (denoted "immune relaxation") predicts that only nonsynonymous evolution will slow. In this paper, we present a set of statistical procedures for distinguishing between these alternative hypotheses using DNA sequences sampled over the course of infection. The first component is a new method for estimating evolutionary rates that takes advantage of the temporal information in longitudinal DNA sequence samples. Second, we develop a set of probability models for the analysis of evolutionary rates in HIV populations in vivo. Application of these models to both synonymous and nonsynonymous evolution affords a comparison of the cellular-exhaustion and immune-relaxation hypotheses. We apply the procedures to longitudinal data sets in which sequences of the env gene were sampled over the entire course of infection. Our analyses (1) statistically confirm that an evolutionary slowdown occurs late in infection, (2) strongly support the immune-relaxation hypothesis, and (3) indicate that the cessation of nonsynonymous evolution is associated with disease progression.

16.
Molecular evolutionary clock and the neutral theory   Total citations: 6; self-citations: 0; citations by others: 6
Summary: From the standpoint of the neutral theory of molecular evolution, it is expected that a universally valid and exact molecular evolutionary clock would exist if, for a given molecule, the mutation rate for neutral alleles per year were exactly equal among all organisms at all times. Any deviation from the equality of neutral mutation rate per year makes the molecular clock less exact. Such deviation may be due to two causes: one is the change of the mutation rate per year (such as due to change of generation span), and the other is the alteration of the selective constraint of each molecule (due to change of internal molecular environment). A statistical method was developed to investigate the equality of evolutionary rates among lineages. This was used to analyze protein data to demonstrate that these two causes are actually at work in molecular evolution. It was emphasized that departures from exact clockwise progression of molecular evolution by no means invalidate the neutral theory. It was pointed out that experimental studies should be done to settle the issue of whether the mutation rate for nucleotide change is more constant per year or per generation among organisms whose generation spans are very different.

17.
A low rate of simultaneous double-nucleotide mutations in primates   Total citations: 1; self-citations: 0; citations by others: 1
The occurrence of double-nucleotide (doublet) mutations is contrary to the normal assumption that point mutations affect single nucleotides. Here we develop a new method for estimating the doublet mutation rate and apply it to more than a megabase of human-chimpanzee-baboon genomic DNA alignments and more than a million human single-nucleotide polymorphisms. The new method accounts for the effect of regional variation in evolutionary rates, which may be a confounding factor in previous estimates of the doublet mutation rate. Furthermore we determine sequence context effects by using sequence comparisons over a variety of lineage lengths. This approach yields a new estimate of the doublet mutation rate of 0.3% of the singleton rate, indicating that doublet mutations are far rarer than previously thought. Our results suggest that doublet mutations are unlikely to have caused the correlation between synonymous and nonsynonymous substitution rates in mammals, and also show that regional variation and sequence context effects play an important role in primate DNA sequence evolution.

18.
Natural selection processes tune genomes at the edge of the chaos imposed by mutation and drift, allowing an enduring exploration of fitter genetic networks within the constraints imposed by self-organization and the interactions of genotype and phenotype. Alternatively, evolution can be viewed from thermodynamic, kinetic or cybernetic perspectives. Regardless of insight, there is need to understand structure-function relationships at the molecular and holistic evolutionary levels. Strategies are here described that analyze genetic variation in time and trace the evolution of nucleic acid structure. Nucleic acid scanning techniques were used to measure sequence divergence and provide a direct inference of genome-wide mutation rate. This was tested for the first time in vegetatively propagating plants. The method is general and was also used in a study of mutational patterns in phytopathogenic fungi, showing there was a link between sequence and structural diversification of ribosomal gene spacers. In order to determine if this was a general phenomenon, the origin and diversification of nucleic acid secondary structure was traced using a cladistic method capable of producing rooted phylogenetic trees. Phylogenies reconstructed from primary and secondary RNA structure were congruent at all taxonomical levels, providing evidence of a strong link between phenotype and genotype favoring thermodynamic stability and dissipation of Gibbs free energy. Overall results suggest that thermodynamic principles are important driving forces of the evolutionary processes of the living world.

19.
Exobiology, the study of the origin, evolution and distribution of life (including life on earth) within the context of cosmic evolution, is being given a remarkable boost by genome sequencing projects, which are now making the evolutionary histories of protein families routinely available. These histories comprise a multiple alignment for their protein sequences and the corresponding DNA sequences, an evolutionary tree showing the pedigree of these sequences, and reconstructed ancestral sequences for each node in the tree. In a post-genomic world having genomic sequences from an unlimited number of organisms, these histories will be used to connect structure, chemical reactivity, and physiological function to these families. This paper describes several “post-genomic” tools that exploit these evolutionary histories. They can be used to confirm or deny long distance homology between two protein families, identify proteins within a family that have new functions, and identify specific in vitro properties of the protein that are important for its physiological role. Evolution-based data structures for organizing large sequence databases are also described.

20.
Variation and change in mitochondrial DNA (mtDNA) is often assumed to conform to a constant mutation rate equilibrium neutral model of molecular evolution. Recent evidence, however, indicates that the assumptions underlying this model are frequently violated. The mitochondrial genome may be subject to the same suite of forces known to be acting in the nuclear genome, including hitchhiking and selection, as well as forces that do not affect nuclear variation. Wherever possible, evolutionary studies involving mtDNA should incorporate statistical tests to investigate the forces shaping sequence variation and evolution.
