Similar articles
 20 similar articles found (search time: 31 ms)
1.
How are model protein structures distributed in sequence space?   Cited by: 6 (self-citations: 0, citations by others: 6)
The sequence-to-structure map for all uniquely folding sequences of short hydrophobic polar (HP) model proteins on a square lattice is analyzed to investigate aspects considered relevant to evolution. By ranking structures by their frequencies, few very frequent and many rare structures are found. The distribution can be empirically described by a generalized Zipf's law. All structures are relatively compact, yet the most compact ones are rare. Most sequences folding to the same structure belong to "neutral nets." These graphs in sequence space are connected by point mutations and centered around prototype sequences, which tolerate the largest number (up to 55%) of neutral mutations. Profiles have been derived from these homologous sequences. Frequent structures conserve hydrophobic cores only, while rare ones are sensitive to surface mutations as well. Shape space covering, i.e., the ability to transform any structure into most others with few point mutations, is very unlikely. It is concluded that many characteristic features of the sequence-to-structure map of real proteins, such as the dominance of few folds, can be explained by the simple HP model. In analogy to protein families, nets are dense and well separated in sequence space. Potential implications for better understanding the evolution of proteins and applications to improving database searches are discussed.
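As a rough illustration of the HP lattice model mentioned in this abstract, the sketch below scores one conformation on a square lattice. The abstract does not state the energy function; the code assumes the commonly used convention of -1 for every pair of H residues that are lattice neighbors but not adjacent along the chain. The function name `hp_energy` and the coordinate representation are hypothetical choices made for this example.

```python
def hp_energy(sequence, coords):
    """Energy of an HP chain on a square lattice.

    Assumed convention (not taken from the abstract): each pair of H
    residues on neighboring lattice sites that are not consecutive in
    the chain contributes -1.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbors")

    index_at = {xy: i for i, xy in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so every contact is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbor)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
line = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(hp_energy("HPPH", square), hp_energy("HPPH", line))
```

Under this convention, a sequence folds uniquely when exactly one conformation attains the minimum energy, which is the set of sequences the abstract refers to.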

2.
The evolution and adaptation of molecular populations is constrained by the diversity accessible through mutational processes. RNA is a paradigmatic example of a biopolymer where genotype (sequence) and phenotype (approximated by the secondary structure fold) are identified in a single molecule. The extreme redundancy of the genotype-phenotype map leads to large ensembles of RNA sequences that fold into the same secondary structure and can be connected through single-point mutations. These ensembles define neutral networks of phenotypes in sequence space. Here we analyze the topological properties of neutral networks formed by 12-nucleotide RNA sequences, obtained through the exhaustive folding of sequence space. A total of 4^12 sequences fragment into 645 subnetworks that correspond to 57 different secondary structures. The topological analysis reveals that each subnetwork is far from being random: it has a degree distribution with a well-defined average and a small dispersion, a high clustering coefficient, and an average shortest path between nodes close to its minimum possible value, i.e. the Hamming distance between sequences. RNA neutral networks are assortative due to the correlation in the composition of neighboring sequences, a feature that together with the symmetries inherent to the folding process explains the existence of communities. Several topological relationships can be analytically derived by considering structural restrictions and generic properties of the folding process. The average degree of these phenotypic networks grows logarithmically with their size, such that abundant phenotypes have the additional advantage of being more robust to mutations. This property prevents fragmentation of neutral networks and thus enhances the navigability of sequence space. In summary, RNA neutral networks show unique topological properties, unknown to other networks previously described.
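The notions of single-point-mutation neighbors, Hamming distance, and node degree in a neutral network can be sketched as follows. This is only an illustration: the `fold` argument is a hypothetical placeholder for any function mapping a sequence to a phenotype (the abstract relies on real secondary-structure folding, which is not reproduced here), and the toy phenotype in the usage lines is invented for the example.

```python
ALPHABET = "ACGU"


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def point_mutants(seq):
    """All sequences at Hamming distance 1 from seq (3 per position)."""
    return [
        seq[:i] + base + seq[i + 1:]
        for i in range(len(seq))
        for base in ALPHABET
        if base != seq[i]
    ]


def neutral_degree(seq, fold):
    """Degree of seq in its neutral network: the number of point
    mutants with the same phenotype under the supplied fold function."""
    target = fold(seq)
    return sum(1 for mutant in point_mutants(seq) if fold(mutant) == target)


# Toy phenotype (hypothetical): whether the sequence contains a G.
def contains_g(s):
    return "G" in s


print(len(point_mutants("GAAA")), neutral_degree("GAAA", contains_g))
```

A 12-nucleotide sequence has 36 point mutants, so the degree of any node in the networks described above is at most 36.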

3.
We simulate the evolution of model protein sequences subject to mutations. A mutation is considered neutral if it conserves (1) the structure of the ground state, (2) its thermodynamic stability and (3) its kinetic accessibility. All other mutations are considered lethal and are rejected. We adopt a lattice model, amenable to a reliable solution of the protein folding problem. We prove the existence of extended neutral networks in sequence space: sequences can evolve until their similarity with the starting point is almost the same as for random sequences. Furthermore, we find that the rate of neutral mutations has a broad distribution in sequence space. Due to this fact, the substitution process is overdispersed (the ratio between variance and mean is larger than 1). This result is in contrast with the simplest model of neutral evolution, which assumes a Poisson process for substitutions, and in qualitative agreement with the biological data.
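The overdispersion criterion in this abstract is the ratio of variance to mean of substitution counts, often called the index of dispersion; a Poisson process gives a value of 1. A minimal sketch follows, assuming the counts are substitutions observed in independent lineages and using the sample variance. The function name and the example counts are hypothetical and are not data from the paper.

```python
from statistics import mean, variance


def index_of_dispersion(counts):
    """Variance-to-mean ratio of substitution counts.

    Values above 1 indicate overdispersion relative to a Poisson
    process. Uses the sample variance (n - 1 denominator).
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the ratio is undefined")
    return variance(counts) / m


# Invented counts for illustration only.
even_counts = [2, 2, 2, 2]      # no variation between lineages
bursty_counts = [0, 0, 0, 8]    # same mean, highly variable
print(index_of_dispersion(even_counts), index_of_dispersion(bursty_counts))
```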

4.
In RNA fitness landscapes with interconnected networks of neutral mutations, neutral precursor mutations can play an important role in facilitating the accessibility of epistatic adaptive mutant combinations. I use an exhaustively surveyed fitness landscape model based on short sequence RNA genotypes (and their secondary structure phenotypes) to calculate the minimum rate at which mutants initially appearing as neutral are incorporated into an adaptive evolutionary walk. I show, first, that incorporating neutral mutations significantly increases the number of point mutations in a given evolutionary walk when compared to estimates from previous adaptive walk models. Second, incorporating neutral mutants into such a walk significantly increases the final fitness encountered on that walk; indeed, evolutionary walks including neutral steps often reach the global optimum in this model. Third, and perhaps most importantly, evolutionary paths of this kind are often extremely winding in their nature and have the potential to undergo multiple mutations at a given sequence position within a single walk; the potential of these winding paths to mislead phylogenetic reconstruction is briefly considered.

5.

Background

The neutral theory of Motoo Kimura stipulates that evolution is mostly driven by neutral mutations. However, adaptive pressure eventually leads to changes in phenotype that involve non-neutral mutations. The relation between neutrality and adaptation has been studied in the context of RNA before, and here we further study transitional mutations in the context of degenerate (plastic) RNA sequences and genetic assimilation. We propose quasineutral mutations, i.e. mutations which preserve an element of the phenotype set, as minimal mutations and study their properties. We also propose a general probabilistic interpretation of genetic assimilation and specialize it to the Boltzmann ensemble of RNA sequences.

Results

We show that degenerate sequences, i.e. sequences with more than one structure at the MFE level, have the highest evolvability among all sequences and are central to evolutionary innovation. Degenerate sequences also tend to cluster together in the sequence space. The selective pressure in an evolutionary simulation causes the population to move towards regions with more degenerate sequences, i.e. regions at the intersection of different neutral networks, and this causes the number of such sequences to increase well beyond the average percentage of degenerate sequences in the sequence space. We also observe that evolution by quasineutral mutations tends to conserve the number of base pairs in structures and thereby maintains structural integrity even in the presence of pressure to the contrary.

Conclusions

We conclude that degenerate RNA sequences play a major role in evolutionary adaptation.

6.
Watson J, Geard N, Wiles J. Bio Systems, 2004, 76(1-3): 239-248
Genetic regulation is often viewed as a complex system whose properties emerge from the interaction of regulatory genes. One major paradigm for studying the complex dynamics of gene regulation uses directed graphs to explore structure, behaviour and evolvability. Mutation operators used in such studies typically involve the insertion and deletion of nodes, and the insertion, deletion and rewiring of links at the network level. These network-level mutational operators are sufficient to allow the statistical analysis of network structure, but impose limitations on the way networks are evolved. There are a wide variety of mutations in DNA sequences that have yet to be analysed for their network-level effects. By modelling an artificial genome at the level of nucleotide sequences and mapping it to a regulatory network, biologically grounded mutation operators can be mapped to network-level mutations. This paper analyses five such sequence level mutations (single-point mutation, transposition, inversion, deletion and gene duplication) for their effects at the network level. Using analytic and simulation techniques, we show that it is rarely the case that nodes and links are cleanly added or deleted, with even the simplest point mutation causing a wide variety of network-level modifications. As expected, the vast majority of simple (single-point) mutations are neutral, resulting in a neutral plateau from which a range of functional behaviours can be reached. By analysing the effects of sequence-level mutations at the network level of gene regulation, we aim to stimulate more careful consideration of mutation operators in gene regulation models than has previously been given.

7.
RNA viruses and retroviruses fix substitutions approximately 1 million-fold faster than their hosts. This diversification could represent an inevitable drift under purifying selection, the majority of substitutions being phenotypically neutral. The alternative is to suppose that most fixed mutations are beneficial to the virus, allowing it to keep ahead of the host and/or host population. Here, relative sequence diversification of different proteins encoded by viral genomes is found to be linear. The examples encompass a wide variety of retroviruses and RNA viruses. The smoothness of relative divergence spans quasispeciation following clonal infection, to variation among different isolates of the same virus, to viruses from different species or those associated with different diseases, indicating that the majority of fixed mutations likely reflects drift. This held for both mammalian and plant viruses, indicating that adaptive immunity does not necessarily shape the relative accumulation of amino acid substitutions. When compared to that of their hosts, RNA virus evolution appears conservative. Received: 16 November 1999 / Accepted: 10 March 2000

8.
RNA secondary-structure folding algorithms predict the existence of connected networks of RNA sequences with identical secondary structures. Fitness landscapes that are based on the mapping between RNA sequence and RNA secondary structure hence have many neutral paths. A neutral walk on these fitness landscapes gives access to a virtually unlimited number of secondary structures that are a single point mutation from the neutral path. This shows that neutral evolution explores phenotype space and can play a role in adaptation. Received: 23 December 1995 / Accepted: 17 March 1996

9.
Robustness and evolvability are highly intertwined properties of biological systems. The relationship between these properties determines how biological systems are able to withstand mutations and show variation in response to them. Computational studies have explored the relationship between these two properties using neutral networks of RNA sequences (genotype) and their secondary structures (phenotype) as a model system. However, these studies have assumed every mutation to a sequence to be equally likely; the differences in the likelihood of the occurrence of various mutations, and the consequences of the probabilistic nature of the mutations in such a system, have previously been ignored. Associating probabilities with mutations essentially results in the weighting of genotype space. We here perform a comparative analysis of weighted and unweighted neutral networks of RNA sequences, and subsequently explore the relationship between robustness and evolvability. We show that assuming an equal likelihood for all mutations (as in an unweighted network) underestimates robustness and overestimates evolvability of a system. In spite of discarding this assumption, we observe that a negative correlation between sequence (genotype) robustness and sequence evolvability persists, and also that structure (phenotype) robustness promotes structure evolvability, as observed in earlier studies using unweighted networks. We also study the effects of base composition bias on robustness and evolvability. In particular, we explore the association between robustness and evolvability in a sequence space that is AU-rich (sequences with an AU content of 80% or higher), compared to a normal (unbiased) sequence space. We find that the evolvability of both sequences and structures in an AU-rich space is lower than in the normal space, and robustness is higher. We also observe that AU-rich populations evolving on neutral networks of phenotypes can access less phenotypic variation than normal populations evolving on neutral networks.

10.
Bloom JD, Raval A, Wilke CO. Genetics, 2007, 175(1): 255-266
Naturally evolving proteins gradually accumulate mutations while continuing to fold to stable structures. This process of neutral evolution is an important mode of genetic change and forms the basis for the molecular clock. We present a mathematical theory that predicts the number of accumulated mutations, the index of dispersion, and the distribution of stabilities in an evolving protein population from knowledge of the stability effects (ΔΔG values) for single mutations. Our theory quantitatively describes how neutral evolution leads to marginally stable proteins and provides formulas for calculating how fluctuations in stability can overdisperse the molecular clock. It also shows that the structural influences on the rate of sequence evolution observed in earlier simulations can be calculated using just the single-mutation ΔΔG values. We consider both the case when the product of the population size and mutation rate is small and the case when this product is large, and show that in the latter case the proteins evolve excess mutational robustness that is manifested by extra stability and an increase in the rate of sequence evolution. All our theoretical predictions are confirmed by simulations with lattice proteins. Our work provides a mathematical foundation for understanding how protein biophysics shapes the process of evolution.

11.
Viral evolution remains a main obstacle to the effectiveness of antiviral treatments. The ability to predict this evolution will help in the early detection of drug-resistant strains and will potentially facilitate the design of more efficient antiviral treatments. Various tools have been utilized in genome studies to achieve this goal. One of these tools is machine learning, which facilitates the study of structure-activity relationships, secondary and tertiary structure evolution prediction, and sequence error correction. This work proposes a novel machine learning technique for the prediction of the possible point mutations that appear on alignments of primary RNA sequence structure. It predicts the genotype of each nucleotide in the RNA sequence, and proves that a nucleotide in an RNA sequence changes based on the other nucleotides in the sequence. A neural network technique is utilized to predict new strains, then a rough set theory based algorithm is introduced to extract these point mutation patterns. This algorithm is applied to a number of aligned RNA isolates time-series species of the Newcastle virus. Two different data sets from two sources are used in the validation of these techniques. The results show that the accuracy of this technique in predicting the nucleotides in the new generation is as high as 75%. The mutation rules are visualized for the analysis of the correlation between different nucleotides in the same RNA sequence.

12.
13.
Most models of quasi-species evolution predict that populations will evolve to occupy areas of sequence space with the greatest concentration of neutral sequences, thus minimizing the deleterious mutation rate and creating mutationally 'robust' genomes. In contrast, empirical studies of the principal model of quasi-species evolution, RNA viruses, suggest that the effects of deleterious mutations are more severe than in similar DNA-based microbes. We demonstrate that populations divided into discrete patches connected by dispersal may favour genotypes where the deleterious effect of non-neutral mutations is maximized. This effect is especially strong in the absence of back mutation and when the amount of time spent in hosts prior to dispersal is intermediate. Our results indicate that RNA viruses that produce acute infections initiated by a small number of virions are expected to evolve fragile genetic architectures when compared with other RNA viruses.

14.
15.
Conservation of residue interactions in a family of Ca-binding proteins   Cited by: 1 (self-citations: 0, citations by others: 1)
In the TNC family of Ca-binding proteins (calmodulin, parvalbumin, intestinal calcium binding protein and troponin C) approximately 70 well-conserved amino acid sequences and six crystal structures are known. We find a clear correlation between residue contacts in the structures and residue conservation in the sequences: residues with strong sidechain-sidechain contacts in the three-dimensional structure tend to be more conserved in the sequence. This is one way to quantify the intuitive notion of the importance of sidechain interactions for maintaining protein three-dimensional structure in evolution and may usefully be taken into account in planning point mutations in protein engineering.

16.
The quasispecies model of RNA virus evolution differs from those formulated in conventional population genetics in that neutral mutations do not lead to genetic drift of the population, and natural selection acts on the mutant distribution as a whole rather than on individual variants. By computer simulation, we show that this model could be inappropriate for many RNA viruses because the neutral sequence space may be too large to allow the formation of a quasispecies distribution. This view is supported by our analysis of gene sequences from vesicular stomatitis virus, which is considered a prototype RNA virus quasispecies. Our results are relevant to the evolution of RNA systems in general.

17.
Messenger RNA sequences often have to preserve functional secondary structure elements in addition to coding for proteins. We present a statistical analysis of retroviral mRNA which supports the hypothesis that the natural genetic code is adapted to such complementary coding. These sequences are still able to explore efficiently the space of possible proteins by point mutations. This is borne out by the observation that, in stem regions of retroviral mRNA foldings, silent mutations on one strand are preferentially accompanied by conservative mutations on the other. Distances between amino acids based on physicochemical properties are used to quantify the conservation of protein function under the constraint of maintained RNA secondary structure. We find that preservation of RNA secondary structure by compensatory mutations is evolutionarily compatible with the efficient search for new variants on the protein level. Received: 4 June 1999 / Accepted: 12 October 1999

18.
Intrapatient evolution of human immunodeficiency virus type 1 (HIV-1) is driven by the adaptive immune system resulting in rapid change of HIV-1 proteins. When cytotoxic CD8+ T cells or neutralizing antibodies target a new epitope, the virus often escapes via nonsynonymous mutations that impair recognition. Synonymous mutations do not affect this interplay and are often assumed to be neutral. We test this assumption by tracking synonymous mutations in longitudinal intrapatient data from the C2-V5 part of the env gene. We find that most synonymous variants are lost even though they often reach high frequencies in the viral population, suggesting a cost to the virus. Using published data from SHAPE (selective 2′-hydroxyl acylation analyzed by primer extension) assays, we find that synonymous mutations that disrupt base pairs in RNA stems flanking the variable loops of gp120 are more likely to be lost than other synonymous changes: these RNA hairpins might be important for HIV-1. Computational modeling indicates that, to be consistent with the data, a large fraction of synonymous mutations in this genomic region need to be deleterious with a cost on the order of 0.002 per day. This weak selection against synonymous substitutions does not result in a strong pattern of conservation in cross-sectional data but slows down the rate of evolution considerably. Our findings are consistent with the notion that large-scale patterns of RNA structure are functionally relevant, whereas the precise base pairing pattern is not.

19.
Ferrada E, Wagner A. Biophysical Journal, 2012, 102(8): 1916-1925
The relationship between the genotype (sequence) and the phenotype (structure) of macromolecules affects their ability to evolve new structures and functions. We here compare the genotype space organization of proteins and RNA molecules to identify differences that may affect this ability. To this end, we computationally study the genotype-phenotype relationship for short RNA and lattice proteins of a reduced monomer alphabet size, to make exhaustive analysis and direct comparison of their genotype spaces feasible. We find that many fewer protein molecules than RNA molecules fold, but they fold into many more structures than RNA. In consequence, protein phenotypes have smaller genotype networks whose member genotypes tend to be more similar than for RNA phenotypes. Neighborhoods in sequence space of a given radius around an RNA molecule contain more novel structures than for protein molecules. We compare this property to evidence from natural RNA and protein molecules, and conclude that RNA genotype space may be more conducive to the evolution of new structure phenotypes.

20.

Background  

RNAmute is an interactive Java application which, given an RNA sequence, calculates the secondary structure of all single point mutations and organizes them into categories according to their similarity to the predicted structure of the wild type. The secondary structure predictions are performed using the Vienna RNA package. A more efficient implementation of RNAmute is needed, however, to extend from the case of single point mutations to the general case of multiple point mutations, which may often be desired for computational predictions alongside mutagenesis experiments. But analyzing multiple point mutations, a process that requires traversing all possible mutations, becomes highly expensive since the running time is O(n^m) for a sequence of length n with m-point mutations. Using Vienna's RNAsubopt, we present a method that selects only those mutations, based on stability considerations, which are likely to be conformationally rearranging. The approach is best examined using the dot plot representation for RNA secondary structure.
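To make the cost of exhaustive traversal concrete, the sketch below counts and enumerates all m-point mutants of an RNA sequence. With a four-letter alphabet there are C(n, m) · 3^m such mutants, which grows as O(n^m) for fixed m. This illustrates the brute-force baseline only; it is not RNAmute's algorithm, and the function names are hypothetical.

```python
from itertools import combinations, product
from math import comb

ALPHABET = "ACGU"


def count_m_point_mutants(n, m):
    """Number of sequences differing from a length-n sequence at
    exactly m positions: choose the positions, then one of the three
    other bases at each."""
    return comb(n, m) * (len(ALPHABET) - 1) ** m


def m_point_mutants(seq, m):
    """Yield every sequence that differs from seq at exactly m positions."""
    for positions in combinations(range(len(seq)), m):
        # For each chosen position, list the bases that differ from the original.
        choices = [[b for b in ALPHABET if b != seq[p]] for p in positions]
        for replacement in product(*choices):
            mutant = list(seq)
            for p, base in zip(positions, replacement):
                mutant[p] = base
            yield "".join(mutant)


print(count_m_point_mutants(4, 2), len(list(m_point_mutants("ACGU", 2))))
```

Each enumerated mutant would additionally require a structure prediction in a full analysis, which is why pruning candidates before folding, as the abstract proposes, matters for larger m.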
