Similar Literature
Found 20 similar articles; search took 15 ms.
1.
The diversity and importance of the role played by RNAs in the regulation and development of the cell are now well-known and well-documented. This broad range of functions is achieved through specific structures that have been (presumably) optimized through evolution. State-of-the-art methods, such as McCaskill's algorithm, use a statistical mechanics framework based on the computation of the partition function over the canonical ensemble of all possible secondary structures on a given sequence. Although secondary structure predictions from thermodynamics-based algorithms are not as accurate as methods employing comparative genomics, the former methods are the only available tools to investigate novel RNAs, such as the many RNAs of unknown function recently reported by the ENCODE consortium. In this paper, we generalize the McCaskill partition function algorithm to sum over the grand canonical ensemble of all secondary structures of all mutants of the given sequence. Specifically, our new program, RNAmutants, simultaneously computes for each integer k the minimum free energy structure MFE(k) and the partition function Z(k) over all secondary structures of all k-point mutants, even allowing the user to specify certain positions required not to mutate and certain positions required to base-pair or remain unpaired. This technically important extension allows us to study the resilience of an RNA molecule to pointwise mutations. By computing the mutation profile of a sequence, a novel graphical representation of the mutational tendency of nucleotide positions, we analyze the deleterious nature of mutating specific nucleotide positions or groups of positions. We have successfully applied RNAmutants to investigate deleterious mutations (mutations that radically modify the secondary structure) in the Hepatitis C virus cis-acting replication element and to evaluate the evolutionary pressure applied on different regions of the HIV trans-activation response element. 
In particular, we show qualitative agreement between published Hepatitis C and HIV experimental mutagenesis studies and our analysis of deleterious mutations using RNAmutants. Our work also predicts other deleterious mutations, which could be verified experimentally. Finally, we provide evidence that the 3' UTR of the GB RNA virus C has been optimized to preserve evolutionarily conserved stem regions from a deleterious effect of pointwise mutations. We hope that there will be long-term potential applications of RNAmutants in de novo RNA design and drug design against RNA viruses. This work also suggests potential applications for large-scale exploration of the RNA sequence-structure network. Binary distributions are available at http://RNAmutants.csail.mit.edu/.
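The quantity Z(k) described in this abstract can be illustrated with a small brute-force sketch. This is not the RNAmutants algorithm, which uses dynamic programming over mutants and a realistic energy model. The sketch assumes a toy energy model (every base pair has the same Boltzmann weight, and hairpin loops need at least three unpaired bases) and simply enumerates all k-point mutants of a short sequence. The names `partition_function`, `k_mutants` and `mutant_partition_functions` are hypothetical.

```python
from itertools import combinations, product
from math import exp

# Watson-Crick and wobble pairs allowed in this toy model.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3          # minimum number of unpaired bases enclosed by a pair
PAIR_WEIGHT = exp(1)  # assumed Boltzmann weight per pair (E = -1, RT = 1)


def partition_function(seq, pair_weight=PAIR_WEIGHT):
    """Sum of Boltzmann weights over all non-crossing structures of seq."""
    n = len(seq)
    # Z[i][j] covers the half-open interval seq[i:j]; short intervals have
    # only the open (unpaired) structure, so they start at 1.
    Z = [[1.0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            total = Z[i][j - 1]  # last base seq[j-1] is unpaired
            for k in range(i, j - 1 - MIN_LOOP):
                if (seq[k], seq[j - 1]) in PAIRS:  # seq[k] pairs with seq[j-1]
                    total += Z[i][k] * pair_weight * Z[k + 1][j - 1]
            Z[i][j] = total
    return Z[0][n]


def k_mutants(seq, k):
    """Yield every sequence at Hamming distance exactly k from seq."""
    for positions in combinations(range(len(seq)), k):
        choices = [[b for b in "ACGU" if b != seq[p]] for p in positions]
        for subs in product(*choices):
            mutant = list(seq)
            for p, b in zip(positions, subs):
                mutant[p] = b
            yield "".join(mutant)


def mutant_partition_functions(seq, max_k):
    """Return [Z(0), ..., Z(max_k)], summing over all k-point mutants."""
    return [
        sum(partition_function(m) for m in k_mutants(seq, k))
        for k in range(max_k + 1)
    ]


if __name__ == "__main__":
    print(mutant_partition_functions("GAAAC", 2))
```

The enumeration grows as C(n, k) * 3^k mutants, so this brute-force version is only practical for very short sequences and small k.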

2.
Programs for RNA mutational analysis that are structure-based and rely on secondary structure prediction have been developed and expanded in the past several years. They can be used for a variety of purposes, such as suggesting point mutations that will alter RNA virus replication or translation initiation, investigating the effect of deleterious and compensatory mutations in allosteric ribozymes and riboswitches, computing an optimal path of mutations to get from one ribozyme fold to another, or analyzing regulatory RNA sequences by their mutational profile. This review describes three different freeware programs (RNAMute, RDMAS and RNAmutants) that have been developed for such purposes. RNAMute and RDMAS in principle perform energy-minimization prediction using available software such as RNAfold from the Vienna RNA package or Zuker's Mfold, while RNAmutants provides an efficient method using essential ingredients from energy-minimization prediction. Both RNAMute, in its extended version that uses RNAsubopt from the Vienna RNA package, and the RNAmutants software are able to predict multiple-point mutations using developed methodologies, while RDMAS is currently restricted to single-point mutations. The strength of RNAMute in its extended version is the ability to predict a small number of point mutations accurately. RNAmutants is well suited to large-scale simulations involving the calculation of all k-mutants of a given RNA sequence, where k can be a large integer.

3.
Functional effects of different mutations are known to combine to the total effect in highly nontrivial ways. For the trait under evolutionary selection ('fitness'), measured values over all possible combinations of a set of mutations yield a fitness landscape that determines which mutational states can be reached from a given initial genotype. Understanding the accessibility properties of fitness landscapes is conceptually important in answering questions about the predictability and repeatability of evolutionary adaptation. Here we theoretically investigate accessibility of the globally optimal state on a wide variety of model landscapes, including landscapes with tunable ruggedness as well as neutral 'holey' landscapes. We define a mutational pathway to be accessible if it contains the minimal number of mutations required to reach the target genotype, and if fitness increases in each mutational step. Under this definition accessibility is high, in the sense that at least one accessible pathway exists with a substantial probability that approaches unity as the dimensionality of the fitness landscape (set by the number of mutational loci) becomes large. At the same time the number of alternative accessible pathways grows without bounds. We test the model predictions against an empirical 8-locus fitness landscape obtained for the filamentous fungus Aspergillus niger. By analyzing subgraphs of the full landscape containing different subsets of mutations, we are able to probe the mutational distance scale in the empirical data. The predicted effect of high accessibility is supported by the empirical data and is very robust, which we argue reflects the generic topology of sequence spaces. 
Together with the restrictive assumptions that lie in our definition of accessibility, this implies that the globally optimal configuration should be accessible to genome-wide evolution, but the repeatability of evolutionary trajectories is limited owing to the presence of a large number of alternative mutational pathways.
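The accessibility definition used in this abstract (shortest mutational paths along which fitness increases at every step) can be sketched for a binary genotype space. The sketch below rests on assumptions not taken from the paper: genotypes are encoded as bitmasks, the start is the all-zeros genotype, the target is the all-ones genotype, and `house_of_cards` is a hypothetical helper that draws uncorrelated random fitness values and forces the target to be the global optimum.

```python
import random


def count_accessible_paths(fitness, n_loci):
    """Count shortest paths from 0...0 to 1...1 with strictly rising fitness.

    fitness is a list indexed by genotype bitmask (length 2 ** n_loci).
    """
    target = (1 << n_loci) - 1
    paths = [0] * (1 << n_loci)
    paths[0] = 1
    # Clearing one set bit always gives a numerically smaller mask, so every
    # predecessor has already been processed when we reach a mask.
    for mask in range(1, 1 << n_loci):
        for b in range(n_loci):
            if mask & (1 << b):
                prev = mask ^ (1 << b)
                if fitness[mask] > fitness[prev]:
                    paths[mask] += paths[prev]
    return paths[target]


def house_of_cards(n_loci, seed=0):
    """Uncorrelated random landscape whose all-ones genotype is the optimum."""
    rng = random.Random(seed)
    fitness = [rng.random() for _ in range(1 << n_loci)]
    fitness[-1] = 1.0  # random() is always below 1.0, so this is the maximum
    return fitness


if __name__ == "__main__":
    n = 8  # same number of loci as the empirical landscape in the abstract
    landscape = house_of_cards(n, seed=42)
    print(count_accessible_paths(landscape, n))
```

On a purely additive landscape every one of the n! shortest paths is accessible, which gives a convenient sanity check for the counting routine.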

4.
5.
The correspondence between protein sequences and structures, or sequence-structure map, relates to fundamental aspects of structural, evolutionary and synthetic biology. The specifics of the mapping, such as the fraction of accessible sequences and structures, or the sequences' ability to fold fast, are dictated by the type of interactions between the monomers that compose the sequences. The set of possible interactions between monomers is encapsulated by the potential energy function. In this study, I explore the impact of the relative forces of the potential on the architecture of the sequence-structure map. My observations rely on simple exact models of proteins and random samples of the space of potential energy functions of binary alphabets. I adopt a graph perspective and study the distribution of viable sequences and the structures they produce, as networks of sequences connected by point mutations. I observe that the relative proportion of attractive, neutral and repulsive forces defines types of potentials that induce sequence-structure maps of vastly different architectures. I characterize the properties underlying these differences and relate them to the structure of the potential. Among these properties are the expected number and relative distribution of sequences associated with specific structures and the diversity of structures as a function of sequence divergence. I study the types of binary potentials observed in natural amino acids and show that there is a strong bias towards only some types of potentials, a bias that seems to characterize the folding code of natural proteins. I discuss implications of these observations for the architecture of the sequence-structure map of natural proteins, the construction of random libraries of peptides, and the early evolution of the natural amino acid alphabet.

6.
In RNA fitness landscapes with interconnected networks of neutral mutations, neutral precursor mutations can play an important role in facilitating the accessibility of epistatic adaptive mutant combinations. I use an exhaustively surveyed fitness landscape model based on short sequence RNA genotypes (and their secondary structure phenotypes) to calculate the minimum rate at which mutants initially appearing as neutral are incorporated into an adaptive evolutionary walk. I show first that incorporating neutral mutations significantly increases the number of point mutations in a given evolutionary walk when compared to estimates from previous adaptive walk models. Second, incorporating neutral mutants into such a walk significantly increases the final fitness encountered on that walk - indeed, evolutionary walks including neutral steps often reach the global optimum in this model. Third, and perhaps most importantly, evolutionary paths of this kind are often extremely winding in their nature and have the potential to undergo multiple mutations at a given sequence position within a single walk; the potential of these winding paths to mislead phylogenetic reconstruction is briefly considered.

7.
The evolution and adaptation of molecular populations is constrained by the diversity accessible through mutational processes. RNA is a paradigmatic example of a biopolymer where genotype (sequence) and phenotype (approximated by the secondary structure fold) are identified in a single molecule. The extreme redundancy of the genotype-phenotype map leads to large ensembles of RNA sequences that fold into the same secondary structure and can be connected through single-point mutations. These ensembles define neutral networks of phenotypes in sequence space. Here we analyze the topological properties of neutral networks formed by 12-nucleotide RNA sequences, obtained through the exhaustive folding of sequence space. A total of 4^12 sequences fragment into 645 subnetworks that correspond to 57 different secondary structures. The topological analysis reveals that each subnetwork is far from being random: it has a degree distribution with a well-defined average and a small dispersion, a high clustering coefficient, and an average shortest path between nodes close to its minimum possible value, i.e. the Hamming distance between sequences. RNA neutral networks are assortative due to the correlation in the composition of neighboring sequences, a feature that together with the symmetries inherent to the folding process explains the existence of communities. Several topological relationships can be analytically derived by considering structural restrictions and generic properties of the folding process. The average degree of these phenotypic networks grows logarithmically with their size, such that abundant phenotypes have the additional advantage of being more robust to mutations. This property prevents fragmentation of neutral networks and thus enhances the navigability of sequence space. In summary, RNA neutral networks show unique topological properties, unknown to other networks previously described.
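The fragmentation of sequence space into neutral subnetworks described in this abstract amounts to finding connected components among sequences that share a structure and differ by a single point mutation. The sketch below is a minimal illustration under stated assumptions: the sequence-to-structure map is supplied as a plain dictionary (the paper obtains it by exhaustive folding, which is not reproduced here), and the function names and the toy map are hypothetical.

```python
from collections import deque

ALPHABET = "ACGU"


def one_mutant_neighbors(seq):
    """Yield all sequences that differ from seq at exactly one position."""
    for i, base in enumerate(seq):
        for b in ALPHABET:
            if b != base:
                yield seq[:i] + b + seq[i + 1:]


def neutral_subnetworks(structure_of):
    """Split a sequence-to-structure map into connected neutral subnetworks.

    Returns a list of (structure, [sequences]) pairs, one per component.
    """
    seen = set()
    components = []
    for start in structure_of:
        if start in seen:
            continue
        seen.add(start)
        component = [start]
        queue = deque([start])
        while queue:  # breadth-first search restricted to one structure
            current = queue.popleft()
            for nb in one_mutant_neighbors(current):
                if nb not in seen and structure_of.get(nb) == structure_of[start]:
                    seen.add(nb)
                    component.append(nb)
                    queue.append(nb)
        components.append((structure_of[start], component))
    return components


if __name__ == "__main__":
    # Toy map with made-up structure labels, for illustration only.
    toy = {"AA": "x", "AC": "x", "GG": "x", "CC": "y"}
    for structure, members in neutral_subnetworks(toy):
        print(structure, members)
    print(len(ALPHABET) ** 12)  # size of the 12-nucleotide sequence space
```

In the toy map, structure "x" splits into two subnetworks because "GG" is two mutations away from both "AA" and "AC", which mirrors how one structure can correspond to several disconnected subnetworks.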

8.
Fitness landscapes of protein and RNA molecules can be studied experimentally using high-throughput techniques to measure the functional effects of numerous combinations of mutations. The rugged topography of these molecular fitness landscapes is important for understanding and predicting natural and experimental evolution. Mutational effects are also dependent upon environmental conditions, but the effects of environmental changes on fitness landscapes remain poorly understood. Here, we investigate the changes to the fitness landscape of a catalytic RNA molecule while changing a single environmental variable that is critical for RNA structure and function. Using high-throughput sequencing of in vitro selections, we mapped a fitness landscape of the Azoarcus group I ribozyme under eight different concentrations of magnesium ions (1–48 mM MgCl2). The data revealed the magnesium dependence of 16,384 mutational neighbors, and from this, we investigated the magnesium-induced changes to the topography of the fitness landscape. The results showed that increasing magnesium concentration improved the relative fitness of sequences at higher mutational distances while also reducing the ruggedness of the mutational trajectories on the landscape. As a result, as magnesium concentration was increased, simulated populations evolved toward higher fitness faster. Curve-fitting of the magnesium dependence of individual ribozymes demonstrated that deep sequencing of in vitro reactions can be used to evaluate the structural stability of thousands of sequences in parallel. Overall, the results highlight how environmental changes that stabilize structures can also alter the ruggedness of fitness landscapes and alter evolutionary processes.

9.
Conventional population genetics considers the evolution of a limited number of genotypes corresponding to phenotypes with different fitness. As model phenotypes, in particular RNA secondary structure, have become computationally tractable, however, it has become apparent that the context-dependent effect of mutations and the many-to-one nature inherent in these genotype-phenotype maps can have fundamental evolutionary consequences. It has previously been demonstrated that populations of genotypes evolving on the neutral networks corresponding to all genotypes with the same secondary structure only through neutral mutations can evolve mutational robustness [E. van Nimwegen, J.P. Crutchfield, M. Huynen, Neutral evolution of mutational robustness, Proc. Natl. Acad. Sci. USA 96(17), 9716-9720 (1999)], by concentrating the population on regions of high neutrality. Introducing recombination, we demonstrate, by numerically calculating the stationary distribution of an infinite population on ensembles of random neutral networks, that mutational robustness is significantly enhanced and, further, that the magnitude of this enhancement is sensitive to details of the neutral network topology. Through the simulation of finite populations of genotypes evolving on random neutral networks and a scaled-down microRNA neutral network, we show that even in finite populations recombination will still act to focus the population on regions of locally high neutrality.

10.
Viroids are plant subviral pathogens whose genomes are constituted by a single-stranded and covalently closed small RNA molecule that does not encode any protein. Despite this genomic simplicity, they are capable of inducing devastating symptoms in susceptible plants. Most of the 29 described viroid species fold into a rodlike or quasi-rodlike structure, whereas a few of them fold as branched structures. The shape of these RNA structures is perhaps one of the most characteristic properties of viroids and sometimes is considered their only phenotype. Here we use RNA thermodynamic secondary structure prediction algorithms to compare the mutational robustness of all viroid species. After characterizing the statistical properties of the distribution of mutational effects on structure stability and the width of the neutral neighborhood for each viroid species, we show an evolutionary trend toward increased structural robustness during viroid radiation, giving support to the adaptive value of robustness. Differences in robustness between the two viroid families can be explained by the greater fragility of branched structures compared with the rodlike ones. We also show that genomic redundancy can contribute to the robustness of these simple RNA genomes.

11.
A statistical reference for RNA secondary structures with minimum free energies is computed by folding large ensembles of random RNA sequences. Four nucleotide alphabets are used: two binary alphabets, AU and GC, the biophysical AUGC and the synthetic GCXK alphabet. RNA secondary structures are made of structural elements, such as stacks, loops, joints, and free ends. Statistical properties of these elements are computed for small RNA molecules of chain lengths up to 100. The results of RNA structure statistics depend strongly on the particular alphabet chosen. The statistical reference is compared with the data derived from natural RNA molecules with similar base frequencies. Secondary structures are represented as trees. Tree editing provides a quantitative measure for the distance d_t between two structures. We compute a structure density surface as the conditional probability of two structures having distance t given that their sequences have distance h. This surface indicates that the vast majority of possible minimum free energy secondary structures occur within a fairly small neighborhood of any typical (random) sequence. Correlation lengths for secondary structures in their tree representations are computed from probability densities. They are appropriate measures for the complexity of the sequence-structure relation. The correlation length also provides a quantitative estimate for the mean sensitivity of structures to point mutations. © 1993 John Wiley & Sons, Inc.

12.
Landscapes exhibiting multiple secondary structures arise in natural RNA molecules that modulate gene expression, protein synthesis, and viral. We report herein that high-throughput chemical experiments can isolate an RNA’s multiple alternative secondary structures as they are stabilized by systematic mutagenesis (mutate-and-map, M2) and that a computational algorithm, REEFFIT, enables unbiased reconstruction of these states’ structures and populations. In an in silico benchmark on non-coding RNAs with complex landscapes, M2-REEFFIT recovers 95% of RNA helices present with at least 25% population while maintaining a low false discovery rate (10%) and conservative error estimates. In experimental benchmarks, M2-REEFFIT recovers the structure landscapes of a 35-nt MedLoop hairpin, a 110-nt 16S rRNA four-way junction with an excited state, a 25-nt bistable hairpin, and a 112-nt three-state adenine riboswitch with its expression platform, molecules whose characterization previously required expert mutational analysis and specialized NMR or chemical mapping experiments. With this validation, M2-REEFFIT enabled tests of whether artificial RNA sequences might exhibit complex landscapes in the absence of explicit design. An artificial flavin mononucleotide riboswitch and a randomly generated RNA sequence are found to interconvert between three or more states, including structures for which there was no design, but that could be stabilized through mutations. These results highlight the likely pervasiveness of rich landscapes with multiple secondary structures in both natural and artificial RNAs and demonstrate an automated chemical/computational route for their empirical characterization.

13.
Sumedha, Martin OC, Wagner A. Bio Systems, 2007, 90(2): 475-485
RNA secondary structure is an important computational model to understand how genetic variation maps into phenotypic (structural) variation. Evolutionary innovation in RNA structures is facilitated by neutral networks, large connected sets of RNA sequences that fold into the same structure. Our work extends and deepens previous studies on neutral networks. First, we show that even the 1-mutant neighborhood of a given sequence (genotype) G0 with structure (phenotype) P contains many structural variants that are not close to P. This holds for biological and generic RNA sequences alike. Second, we analyze the relation between new structures in the 1-neighborhoods of genotypes Gk that are only a moderate Hamming distance k away from G0, and the structure of G0 itself, both for biological and for generic RNA structures. Third, we analyze the relation between mutational robustness of a sequence and the distances of structural variants near this sequence. Our findings underscore the role of neutral networks in evolutionary innovation, and the role that high robustness can play in diminishing the potential for such innovation.

14.
RNA secondary-structure folding algorithms predict the existence of connected networks of RNA sequences with identical secondary structures. Fitness landscapes that are based on the mapping between RNA sequence and RNA secondary structure hence have many neutral paths. A neutral walk on these fitness landscapes gives access to a virtually unlimited number of secondary structures that are a single point mutation from the neutral path. This shows that neutral evolution explores phenotype space and can play a role in adaptation. Received: 23 December 1995 / Accepted: 17 March 1996

15.
What changes occur when a natural protein that had been under low mutation rates is subjected to a neutral drift at high mutational loads, thus generating genetically diverse (polymorphic) gene ensembles that all maintain the protein's original function and structure? To address this question we subjected large populations of TEM-1 beta-lactamase to a prolonged neutral drift, applying high mutation rates and purifying selection to maintain TEM-1's existing penicillinase activity. Purging of deleterious mutations and enrichment of beneficial ones maintained the sequence of these ensembles closer to TEM-1's family consensus and inferred ancestor. In particular, back-to-consensus/ancestor mutations that increase TEM-1's kinetic and thermodynamic stability were enriched. These acted as global suppressors and enabled the tolerance of a broad range of deleterious mutations, thus further increasing the genetic diversity of the drifting populations. The probability of a new function emerging (cefotaxime degradation) was also substantially increased in these ensembles owing to the presence of many gene variants carrying the global suppressors. Our findings indicate the unique features of large, polymorphic neutral ensembles generated under high mutational loads and prompt the speculation that the progenitors of today's proteins may have evolved under high mutational loads. The results also suggest that predictable back-to-consensus/ancestor changes can be used in the laboratory to generate highly diverse and evolvable gene libraries.

16.
An important goal in molecular biology is to understand functional changes upon single-point mutations in proteins. Doing so through a detailed characterization of structure spaces and underlying energy landscapes is desirable but continues to challenge methods based on Molecular Dynamics. In this paper we propose a novel algorithm, SIfTER, which is based instead on stochastic optimization to circumvent the computational challenge of exploring the breadth of a protein’s structure space. SIfTER is a data-driven evolutionary algorithm, leveraging experimentally-available structures of wildtype and variant sequences of a protein to define a reduced search space from where to efficiently draw samples corresponding to novel structures not directly observed in the wet laboratory. The main advantage of SIfTER is its ability to rapidly generate conformational ensembles, thus allowing mapping and juxtaposing landscapes of variant sequences and relating observed differences to functional changes. We apply SIfTER to variant sequences of the H-Ras catalytic domain, due to the prominent role of the Ras protein in signaling pathways that control cell proliferation, its well-studied conformational switching, and abundance of documented mutations in several human tumors. Many Ras mutations are oncogenic, but detailed energy landscapes have not been reported until now. Analysis of SIfTER-computed energy landscapes for the wildtype and two oncogenic variants, G12V and Q61L, suggests that these mutations cause constitutive activation through two different mechanisms. G12V directly affects binding specificity while leaving the energy landscape largely unchanged, whereas Q61L has pronounced, starker effects on the landscape. An implementation of SIfTER is made available at http://www.cs.gmu.edu/~ashehu/?q=OurTools. 
We believe SIfTER is useful to the community to answer the question of how sequence mutations affect the function of a protein, when there is an abundance of experimental structures that can be exploited to reconstruct an energy landscape that would be computationally impractical to do via Molecular Dynamics.

17.
Thermodynamic stability and mutational robustness of secondary structure are critical to the function and evolutionary longevity of RNA molecules. We hypothesize that natural and artificial selection for functional molecules favors the formation of structures that are stable to both thermal and mutational perturbation. There is little direct evidence, however, that functional RNA molecules have been selected for their stability. Here we use thermodynamic secondary structure prediction algorithms to compare the thermal and mutational robustness of over 1000 naturally and artificially evolved molecules. Although we find evidence for the evolution of both types of stability in both sets of molecules, the naturally evolved functional RNA molecules were significantly more stable than those selected in vitro, and artificially evolved catalysts (ribozymes) were more stable than artificially evolved binding species (aptamers). The thermostability of RNA molecules bred in the laboratory is probably not constrained by a lack of suitable variation in the sequence pool but, rather, by intrinsic biases in the selection process.

18.
19.
Ribonucleic acid (RNA) enjoys increasing interest in molecular biology; despite this interest, fundamental algorithms are lacking, e.g. for identifying local motifs. Like proteins, RNA molecules have a distinctive structure. Therefore, in addition to sequence information, structure plays an important part in assessing the similarity of RNAs. Furthermore, common sequence-structure features in two or several RNA molecules are often only spatially local, where possibly large parts of the molecules are dissimilar. Consequently, we address the problem of comparing RNA molecules by computing an optimal local alignment with respect to sequence and structure information. While local alignment is superior to global alignment for identifying local similarities, no general local sequence-structure alignment algorithms are currently known. We suggest a new general definition of locality for sequence-structure alignments that is biologically motivated and efficiently tractable. To show the former, we discuss locality of RNA and prove that the defined locality means connectivity by atomic and non-atomic bonds. To show the latter, we present an efficient algorithm for the newly defined pairwise local sequence-structure alignment (lssa) problem for RNA. For molecules of lengths n and m, the algorithm has a worst-case time complexity of O(n^2 × m^2 × max(n, m)) and a space complexity of only O(n × m). An implementation of our algorithm is available at http://www.bio.inf.uni-jena.de. Its runtime is competitive with global sequence-structure alignment.

20.
Several recent theoretical studies of the genetics of adaptation have focused on the mutational landscape model, which considers evolution on rugged fitness landscapes (i.e., ones having many local optima). Adaptation in this model is characterized by several simple results. Here I ask whether these results also hold on correlated fitness landscapes, which are smoother than those considered in the mutational landscape model. In particular, I study the genetics of adaptation in the block model, a tunably rugged model of fitness landscapes. Considering the scenario in which adaptation begins from a high fitness wild-type DNA sequence, I use extreme value theory and computer simulations to study both single adaptive steps and entire adaptive walks. I show that all previous results characterizing single steps in adaptation in the mutational landscape model hold at least approximately on correlated landscapes in the block model; many entire-walk results, however, do not.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  Beijing ICP No. 09084417