Similar articles
 20 similar articles found (search time: 15 ms)
1.
Sumedha, Martin OC, Wagner A. Bio Systems, 2007, 90(2): 475-485
RNA secondary structure is an important computational model to understand how genetic variation maps into phenotypic (structural) variation. Evolutionary innovation in RNA structures is facilitated by neutral networks, large connected sets of RNA sequences that fold into the same structure. Our work extends and deepens previous studies on neutral networks. First, we show that even the 1-mutant neighborhood of a given sequence (genotype) G0 with structure (phenotype) P contains many structural variants that are not close to P. This holds for biological and generic RNA sequences alike. Second, we analyze the relation between new structures in the 1-neighborhoods of genotypes Gk that are only a moderate Hamming distance k away from G0, and the structure of G0 itself, both for biological and for generic RNA structures. Third, we analyze the relation between mutational robustness of a sequence and the distances of structural variants near this sequence. Our findings underscore the role of neutral networks in evolutionary innovation, and the role that high robustness can play in diminishing the potential for such innovation.
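
The 1-mutant neighborhood and the Hamming distance used in this abstract can be illustrated with a short sketch. The function names and the example sequence below are assumptions made for illustration; they are not taken from the cited paper, and no folding step is included.

```python
# Illustrative sketch only: names and the example sequence are hypothetical.
RNA_ALPHABET = "ACGU"


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def one_mutant_neighbors(seq: str, alphabet: str = RNA_ALPHABET) -> list[str]:
    """All sequences that differ from seq by exactly one point mutation."""
    neighbors = []
    for i, base in enumerate(seq):
        for alt in alphabet:
            if alt != base:
                neighbors.append(seq[:i] + alt + seq[i + 1:])
    return neighbors


g0 = "GGGAAACCC"                      # hypothetical genotype G0
neighborhood = one_mutant_neighbors(g0)
```

A sequence of length n over a four-letter alphabet has 3n such neighbors. A study of the kind described would fold each neighbor and compare its structure with P; that step depends on the folding software used and is left out here.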

2.
The evolution and adaptation of molecular populations is constrained by the diversity accessible through mutational processes. RNA is a paradigmatic example of a biopolymer where genotype (sequence) and phenotype (approximated by the secondary structure fold) are identified in a single molecule. The extreme redundancy of the genotype-phenotype map leads to large ensembles of RNA sequences that fold into the same secondary structure and can be connected through single-point mutations. These ensembles define neutral networks of phenotypes in sequence space. Here we analyze the topological properties of neutral networks formed by 12-nucleotide RNA sequences, obtained through the exhaustive folding of sequence space. A total of 4^12 sequences fragment into 645 subnetworks that correspond to 57 different secondary structures. The topological analysis reveals that each subnetwork is far from being random: it has a degree distribution with a well-defined average and a small dispersion, a high clustering coefficient, and an average shortest path between nodes close to its minimum possible value, i.e. the Hamming distance between sequences. RNA neutral networks are assortative due to the correlation in the composition of neighboring sequences, a feature that, together with the symmetries inherent to the folding process, explains the existence of communities. Several topological relationships can be analytically derived from structural restrictions and generic properties of the folding process. The average degree of these phenotypic networks grows logarithmically with their size, such that abundant phenotypes have the additional advantage of being more robust to mutations. This property prevents fragmentation of neutral networks and thus enhances the navigability of sequence space. In summary, RNA neutral networks show unique topological properties, unknown in other networks previously described.
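
The fragmentation of sequence space into neutral subnetworks can be sketched as a connected-component search. In the sketch below, toy_phenotype is a hypothetical stand-in for a real folding algorithm (it only checks whether the first and last bases could pair), and the sequence length of 3 is chosen to keep the enumeration tiny; neither is taken from the cited study.

```python
from collections import deque
from itertools import product

ALPHABET = "ACGU"
# Watson-Crick and GU wobble pairs, used only by the toy phenotype below.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_phenotype(seq):
    """Hypothetical stand-in for folding: can the two end bases pair?"""
    return "paired" if (seq[0], seq[-1]) in PAIRS else "open"


def neutral_components(length, phenotype):
    """Split sequence space into connected sets of equal-phenotype sequences.

    Two sequences are linked when they differ by a single point mutation
    and map to the same phenotype.
    """
    seen = set()
    components = []
    for letters in product(ALPHABET, repeat=length):
        start = "".join(letters)
        if start in seen:
            continue
        target = phenotype(start)
        seen.add(start)
        members = [start]
        queue = deque([start])
        while queue:                              # breadth-first search
            current = queue.popleft()
            for i, base in enumerate(current):
                for alt in ALPHABET:
                    if alt == base:
                        continue
                    nxt = current[:i] + alt + current[i + 1:]
                    if nxt not in seen and phenotype(nxt) == target:
                        seen.add(nxt)
                        members.append(nxt)
                        queue.append(nxt)
        components.append((target, members))
    return components


components = neutral_components(3, toy_phenotype)
```

Even with this toy map, one phenotype can split into several subnetworks that no chain of neutral point mutations joins, which is the kind of fragmentation the abstract quantifies for real folds.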

3.
Random graph theory is used to model and analyse the relationship between sequences and secondary structures of RNA molecules, which are understood as mappings from sequence space into shape space. These maps are non-invertible since there are always many orders of magnitude more sequences than structures. Sequences folding into identical structures form neutral networks. A neutral network is embedded in the set of sequences that are compatible with the given structure. Networks are modeled as graphs and constructed by random choice of vertices from the space of compatible sequences. The theory characterizes neutral networks by the mean fraction of neutral neighbors (λ). The networks are connected and percolate sequence space if the fraction of neutral nearest neighbors exceeds a threshold value (λ>λ*). Below threshold (λ<λ*), the networks are partitioned into a largest “giant” component and several smaller components. Structures are classified as “common” or “rare” according to the sizes of their pre-images, i.e. according to the fractions of sequences folding into them. The neutral networks of any pair of two different common structures almost touch each other, and, as expressed by the conjecture of shape space covering, sequences folding into almost all common structures can be found in a small ball around an arbitrary location in sequence space. The results from random graph theory are compared to data obtained by folding large samples of RNA sequences. Differences are explained in terms of specific features of RNA molecular structures. Dedicated to Professor Manfred Eigen

4.
RNA secondary-structure folding algorithms predict the existence of connected networks of RNA sequences with identical secondary structures. Fitness landscapes that are based on the mapping between RNA sequence and RNA secondary structure hence have many neutral paths. A neutral walk on these fitness landscapes gives access to a virtually unlimited number of secondary structures that are a single point mutation from the neutral path. This shows that neutral evolution explores phenotype space and can play a role in adaptation. Received: 23 December 1995 / Accepted: 17 March 1996

5.
Folding of RNA sequences into secondary structures is viewed as a map that assigns a uniquely defined base pairing pattern to every sequence. The mapping is non-invertible since many sequences fold into the same minimum free energy (secondary) structure or shape. The pre-images of this map, called neutral networks, are uniquely associated with the shapes and vice versa. Random graph theory is used to construct networks in sequence space which are suitable models for neutral networks. The theory of molecular quasispecies has been applied to replication and mutation on single-peak fitness landscapes. This concept is extended by considering evolution on degenerate multi-peak landscapes which originate from neutral networks by assuming that one particular shape is fitter than all the others. On such a single-shape landscape the superior fitness value is assigned to all sequences belonging to the master shape. All other shapes are lumped together and their fitness values are averaged in a way that is reminiscent of mean field theory. Replication and mutation on neutral networks are modeled by phenomenological rate equations as well as by a stochastic birth-and-death model. In analogy to the error threshold in sequence space the phenotypic error threshold separates two scenarios: (i) a stationary (fittest) master shape surrounded by closely related shapes and (ii) populations drifting through shape space by a diffusion-like process. The error classes of the quasispecies model are replaced by distance classes between the master shape and the other structures. Analytical results are derived for single-shape landscapes, in particular, simple expressions are obtained for the mean fraction of master shapes in a population and for phenotypic error thresholds. The analytical results are complemented by data obtained from computer simulation of the underlying stochastic processes. 
The predictions of the phenomenological approach on the single-shape landscape are very well reproduced by replication and mutation kinetics of tRNA(phe). Simulation of the stochastic process at a resolution of individual distance classes yields data which are in excellent agreement with the results derived from the birth-and-death model.

6.
This paper studies local connectivity of neutral networks of RNA secondary and pseudoknot structures. A neutral network denotes the set of RNA sequences that fold into a particular structure. It is called locally connected if, in the limit of long sequences, the distance of any two of its sequences scales with their distance in the n-cube. One main result of this paper is the threshold probability for local connectivity of neutral networks, considered as random subgraphs of n-cubes. Furthermore, we analyze local connectivity for finite sequence length and different alphabets. We show that it is closely related to the existence of specific paths within the neutral network. We put our theoretical results into context with algorithms for folding into minimum free energy RNA secondary and pseudoknot structures. Finally, we relate our structural findings to dynamics by discussing the role of local connectivity in the context of neutral evolution.

7.
Robustness and evolvability are highly intertwined properties of biological systems. The relationship between these properties determines how biological systems are able to withstand mutations and show variation in response to them. Computational studies have explored the relationship between these two properties using neutral networks of RNA sequences (genotype) and their secondary structures (phenotype) as a model system. However, these studies have assumed every mutation to a sequence to be equally likely; the differences in the likelihood of the occurrence of various mutations, and the consequences of the probabilistic nature of mutations in such a system, have previously been ignored. Associating probabilities with mutations essentially results in the weighting of genotype space. We here perform a comparative analysis of weighted and unweighted neutral networks of RNA sequences, and subsequently explore the relationship between robustness and evolvability. We show that assuming an equal likelihood for all mutations (as in an unweighted network) underestimates robustness and overestimates evolvability of a system. Even after discarding this assumption, we observe that a negative correlation between sequence (genotype) robustness and sequence evolvability persists, and also that structure (phenotype) robustness promotes structure evolvability, as observed in earlier studies using unweighted networks. We also study the effects of base composition bias on robustness and evolvability. In particular, we explore the association between robustness and evolvability in a sequence space that is AU-rich (sequences with an AU content of 80% or higher), compared to a normal (unbiased) sequence space. We find that the evolvability of both sequences and structures in an AU-rich space is lower than in the normal space, and the robustness is higher.
We also observe that AU-rich populations evolving on neutral networks of phenotypes can access less phenotypic variation than normal populations evolving on neutral networks.

8.

Background

Interactions between genes and their products give rise to complex circuits known as gene regulatory networks (GRN) that enable cells to process information and respond to external stimuli. Several processes important for life depend on accurate and context-specific regulation of gene expression. One example is the cell cycle, which can be analyzed through its GRN: deregulation can lead to cancer in animals, while directed regulation could be applied in biotechnological processes using yeast. One approach to studying the robustness of a GRN is through the neutral space. In this paper, we explore the neutral space of a Schizosaccharomyces pombe (fission yeast) cell cycle network through an evolution strategy to generate a neutral graph, composed of Boolean regulatory networks that share the same state sequences of the fission yeast cell cycle.

Results

Simulations showed that, in the generated neutral graph, the functional networks that are not in the wildtype connected component generally have a Hamming distance of more than 3 from the wildtype, and of more than 10 from the other disconnected functional networks. Significant differences were found between the functional networks in the connected component of the wildtype network and the rest of the network, not only at the topological level but also at the state space level, where significant differences in the distribution of the basin of attraction for the G1 fixed point were found for deterministic updating schemes.

Conclusions

In general, functional networks in the connected component of the wildtype network can undergo no more than 3 mutations before they reach a point of no return, where the networks leave the connected component of the wildtype. The proposed method to construct a neutral graph is general and can be used to explore the neutral space of other biologically interesting networks, and also to formulate new biological hypotheses by studying the functional networks in the connected component of the wildtype network.

9.
How are model protein structures distributed in sequence space?   Total citations: 6 (self-citations: 0, citations by others: 6)
The sequence-to-structure map for all uniquely folding sequences of short hydrophobic polar (HP) model proteins on a square lattice is analyzed to investigate aspects considered relevant to evolution. By ranking structures by their frequencies, few very frequent and many rare structures are found. The distribution can be empirically described by a generalized Zipf's law. All structures are relatively compact, yet the most compact ones are rare. Most sequences folding to the same structure belong to "neutral nets." These graphs in sequence space are connected by point mutations and centered around prototype sequences, which tolerate the largest number (up to 55%) of neutral mutations. Profiles have been derived from these homologous sequences. Frequent structures conserve hydrophobic cores only while rare ones are sensitive to surface mutations as well. Shape space covering, i.e., the ability to transform any structure into most others with few point mutations, is very unlikely. It is concluded that many characteristic features of the sequence-to-structure map of real proteins, such as the dominance of few folds, can be explained by the simple HP model. In analogy to protein families, nets are dense and well separated in sequence space. Potential implications for better understanding the evolution of proteins and applications to improving database searches are discussed.
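
A minimal sketch of how a conformation is scored in an HP lattice model may help here. It assumes the common convention of -1 per contact between two H residues that are lattice neighbors but not chain neighbors; the abstract does not state the energy function, so this convention, the function name hp_energy and the example chain are assumptions.

```python
def hp_energy(sequence, coords):
    """Energy of an HP chain on the square lattice.

    Assumed convention: -1 for every pair of H residues that sit on
    adjacent lattice sites without being adjacent along the chain.
    """
    if len(sequence) != len(coords):
        raise ValueError("one coordinate per residue is required")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        if abs(x0 - x1) + abs(y0 - y1) != 1:
            raise ValueError("consecutive residues must be lattice neighbors")

    position = {xy: i for i, xy in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Looking only right and up counts each neighboring pair once.
        for dx, dy in ((1, 0), (0, 1)):
            j = position.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


# Hypothetical 4-residue chain folded into a square, and the same chain extended.
folded = hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)])
extended = hp_energy("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)])
```

Under this convention a sequence folds uniquely when exactly one self-avoiding conformation attains the lowest energy. Checking that requires enumerating all conformations of the chain, which is omitted here.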

10.
Protein structures can be encoded into binary sequences (Gabarro-Arpa et al., Comput Chem 2000;24:693-698); these are used to define a Hamming distance in conformational space: the distance between two different molecular conformations is the number of different bits in their sequences. Each bit in the sequence arises from a partition of conformational space in two halves. Thus, the information encoded in the binary sequences is also used to characterize the regions of conformational space visited by the system. We apply this distance and its associated geometric structures to the clustering and analysis of conformations sampled during a 4-ns molecular dynamics simulation of the HIV-1 integrase catalytic core. The cluster analysis of the simulation shows a division of the trajectory into two segments of 2.6 and 1.4 ns length, which are qualitatively different: the data point to the fact that equilibration is only reached at the end of the first segment. The Hamming distance is also compared to the r.m.s. deviation measure. The analysis of the cases studied so far shows that under the same conditions the two measures behave quite differently, and that the Hamming distance appears to be more robust than the r.m.s. deviation.

11.
Molecular evolution may be considered as a walk in a multidimensional fitness landscape, where the fitness at each point is associated with features such as the function, stability, and survivability of these molecules. We present a simple model for the evolution of protein sequences on a landscape with a precisely defined fitness function. We use simple lattice models to represent protein structures, with the ability of a protein sequence to fold into the structure with lowest energy, quantified as the foldability, representing the fitness of the sequence. The foldability of the sequence is characterized based on the spin glass model of protein folding. We consider evolution as a walk in this foldability landscape and study the nature of the landscape and the resulting dynamics. Selective pressure is explicitly included in this model in the form of a minimum foldability requirement. We find that different native structures are not evenly distributed in interaction space, with similar structures and structures with similar optimal foldabilities clustered together. Evolving proteins marginally fulfill the selective criteria of foldability. As the selective pressure is increased, evolutionary trajectories become increasingly confined to “neutral networks,” where the sequence and the interactions can be significantly changed while a constant structure is maintained. © 1997 John Wiley & Sons, Inc. Biopoly 42: 427–438, 1997

12.
Evolutionary networks in the formatted protein sequence space.   Total citations: 4 (self-citations: 0, citations by others: 4)
In our recent work, a new approach to establishing sequence relatedness, by walking through the protein sequence space, was introduced. The sequence space is built from 20-amino-acid-long fragments of proteins from a very large collection of fully sequenced prokaryotic genomes. The fragments, which are the points in the space, are connected if they are closely related (high sequence identity). The connected fragments form a variety of networks of sequence kinship. In this research the networks in the formatted sequence space and their topology are analyzed. For lower identity thresholds a huge network of complex structure is formed, involving up to 10% of the points of the space. When the threshold is increased, the major network splits into a set of smaller clusters with a wide diversity of sizes and topologies. Such "evolutionary networks" may serve as a powerful sequence annotation tool that allows one to reveal fine details in the evolutionary history of proteins.

13.
Knowledge-based potentials can be used to decide whether an amino acid sequence is likely to fold into a prescribed native protein structure. We use this idea to survey the sequence-structure relations in protein space. In particular, we test the following two propositions which were found to be important for efficient evolution: the sequences folding into a particular native fold form extensive neutral networks that percolate through sequence space. The neutral networks of any two native folds approach each other to within a few point mutations. Computer simulations using two very different potential functions, M. Sippl's PROSA pair potential and a neural network based potential, are used to verify these claims.

14.
15.
A computer model of evolutionary optimization   Total citations: 3 (self-citations: 0, citations by others: 3)
Molecular evolution is viewed as a typical combinatorial optimization problem. We analyse a chemical reaction model which considers RNA replication including correct copying and point mutations together with hydrolytic degradation and the dilution flux of a flow reactor. The corresponding stochastic reaction network is implemented on a computer in order to investigate some basic features of evolutionary optimization dynamics. Characteristic features of real molecular systems are mimicked by folding binary sequences into unknotted two-dimensional structures. Selective values are derived from these molecular 'phenotypes' by an evaluation procedure which assigns numerical values to different elements of the secondary structure. The fitness function obtained thereby contains nontrivial long-range interactions which are typical for real systems. The fitness landscape also reveals quite involved and bizarre local topologies which we consider also representative of polynucleotide replication in actually occurring systems. Optimization operates on an ensemble of sequences via mutation and natural selection. The strategy observed in the simulation experiments is fairly general and resembles closely a heuristic widely applied in operations research areas. Despite the relative smallness of the system--we study 2000 molecules of chain length v = 70 in a typical simulation experiment--features typical for the evolution of real populations are observed as there are error thresholds for replication, evolutionary steps and quasistationary sequence distributions. The relative importance of selectively neutral or almost neutral variants is discussed quantitatively. Four characteristic ensemble properties, entropy of the distribution, ensemble correlation, mean Hamming distance and diversity of the population, are computed and checked for their sensitivity in recording major optimization events during the simulation.
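
Three of the ensemble properties named in this abstract can be sketched for a small population of binary sequences. The definitions below (Shannon entropy of the sequence frequencies, mean pairwise Hamming distance, and number of distinct sequences) are common textbook choices and are assumptions here; the paper's exact definitions, including that of the ensemble correlation, are not given in the abstract.

```python
import math
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def shannon_entropy(population):
    """Entropy (natural log) of the frequency distribution of sequences."""
    n = len(population)
    counts = Counter(population)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def mean_hamming_distance(population):
    """Average Hamming distance over all unordered pairs of molecules."""
    pairs = list(combinations(population, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def diversity(population):
    """Number of distinct sequences present in the population."""
    return len(set(population))


# Hypothetical toy population; the simulation described uses 2000 molecules.
population = ["0000", "0000", "0011", "1111"]
entropy = shannon_entropy(population)
mean_distance = mean_hamming_distance(population)
distinct = diversity(population)
```

Tracking such quantities over simulated time is one way to see a population collapse onto a new master sequence after an optimization event, since all three drop when a single variant takes over.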

16.
Baoqiang Cao, Ron Elber. Proteins, 2010, 78(4): 985-1003
We investigate small sequence adjustments (of one or a few amino acids) that induce large conformational transitions between distinct and stable folds of proteins. Such transitions are intriguing from evolutionary and protein‐design perspectives. They make it possible to search for ancient protein structures or to design protein switches that flip between folds and functions. A network of sequence flow between protein folds is computed for representative structures of the Protein Data Bank. The computed network is dense; on average, each structure is connected to tens of other folds. Proteins that attract sequences from a higher than expected number of neighboring folds are more likely to be enzymes and to have an alpha/beta fold. The large number of connections between folds may reflect the need of enzymes to adjust their structures for alternative substrates. The network of the Cro family is discussed, and we speculate that capacity is an important factor (but not the only one) that determines protein evolution. The experimentally observed flip from all alpha to alpha + beta fold is examined by the network tools. A kinetic model for the transition of sequences between the folds (with only protein stability in mind) is proposed. Proteins 2010. © 2009 Wiley‐Liss, Inc.

17.
Sean Burke, Ron Elber. Proteins, 2012, 80(2): 463-470
Exhaustive enumeration of sequences and folds is conducted for a simple lattice model of conformations, sequences, and energies. Examination of all foldable sequences and their nearest connected neighbors (sequences that differ by no more than a point mutation) illustrates the following: (i) There exists an unusually large number of sequences that fold into a few structures (super‐folds). The same observation was made experimentally and computationally using stochastic sampling and exhaustive enumeration of related models. (ii) There exist only a few large networks of connected sequences that are not restricted to one fold. These networks cover a significant fraction of fold spaces (super‐networks). (iii) There exist barriers in sequence space that prevent foldable sequences of the same structure from “connecting” through a series of single point mutations (super‐barrier), even in the presence of the sequence connection between folds. While there is ample experimental evidence for the existence of super‐folds, evidence for a super‐network is just starting to emerge. The prediction of a sequence barrier is an intriguing characteristic of sequence space, suggesting that the overall sequence space may be disconnected. The implications and limitations of these observations for evolution of protein structures are discussed. Proteins 2012. © 2011 Wiley Periodicals, Inc.

18.
Experimental studies have shown that some proteins exist in two alternative native-state conformations. It has been proposed that such bi-stable proteins can potentially function as evolutionary bridges at the interface between two neutral networks of protein sequences that fold uniquely into the two different native conformations. Under adaptive conflict scenarios, bi-stable proteins may be of particular advantage if they simultaneously provide two beneficial biological functions. However, computational models that simulate protein structure evolution do not yet recognize the importance of bi-stability. Here we use a biophysical model to analyze sequence space to identify bi-stable or multi-stable proteins with two or more equally stable native-state structures. The inclusion of such proteins enhances phenotype connectivity between neutral networks in sequence space. Consideration of the sequence space neighborhood of bridge proteins revealed that bi-stability decreases gradually with each mutation that takes the sequence further away from an exactly bi-stable protein. With relaxed selection pressures, we found that bi-stable proteins in our model are highly successful under simulated adaptive conflict. Inspired by these model predictions, we developed a method to identify real proteins in the PDB with bridge-like properties, and have verified a clear bi-stability gradient for a series of mutants studied by Alexander et al. (Proc Nat Acad Sci USA 2009, 106:21149–21154) that connect two sequences that fold uniquely into two different native structures via a bridge-like intermediate mutant sequence. Based on these findings, new testable predictions for future studies on protein bi-stability and evolution are discussed.

19.
Protein structures are much more conserved than sequences during evolution. Based on this observation, we investigate the consequences of structural conservation on protein evolution. We study seven of the most studied protein folds, determining that an extended neutral network in sequence space is associated with each of them. Within our model, neutral evolution leads to a non-Poissonian substitution process, due to the broad distribution of connectivities in neutral networks. The observation that the substitution process has non-Poissonian statistics has been used to argue against the original Kimura neutral theory, while our model shows that this is a generic property of neutral evolution with structural conservation. Our model also predicts that the substitution rate can strongly fluctuate from one branch to another of the evolutionary tree. The average sequence similarity within a neutral network is close to the threshold of randomness, as observed for families of sequences sharing the same fold. Nevertheless, some positions are more difficult to mutate than others. We compare such structurally conserved positions to positions conserved in protein evolution, suggesting that our model can be a valuable tool to distinguish structural from functional conservation in databases of protein families. These results indicate that a synergy between database analysis and structurally based computational studies can increase our understanding of protein evolution.

20.
It is commonly believed that similarities between the sequences of two proteins imply similarities between their structures. Sequence alignments reliably recognize pairs of proteins of similar structures provided that the percentage sequence identity between their two sequences is sufficiently high. This distinction, however, is statistically less reliable when the percentage sequence identity is lower than 30%, and little is known then about the detailed relationship between the two measures of similarity. Here, we investigate the inverse correlation between structural similarity and sequence similarity on 12 protein structure families. We define the structure similarity between two proteins as the cRMS distance between their structures. The sequence similarity for a pair of proteins is measured as the mean distance between the sequences in the subsets of sequence space compatible with their structures. We obtain an approximation of the sequence space compatible with a protein by designing a collection of protein sequences both stable and specific to the structure of that protein. Using these measures of sequence and structure similarities, we find that structural changes within a protein family are linearly related to changes in sequence similarity.
