Similar articles
 Found 20 similar articles; search took 31 ms
1.

Background  

RNAmute is an interactive Java application that, given an RNA sequence, calculates the secondary structure of all single point mutations and organizes them into categories according to their similarity to the predicted wild-type structure. The secondary structure predictions are performed using the Vienna RNA package. Several alternatives are used for the categorization of single point mutations: Vienna's RNAdistance based on the dot-bracket representation, as well as tree edit distance and the second eigenvalue of the Laplacian matrix, both based on Shapiro's coarse-grained tree graph representation.
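The enumeration and comparison steps described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not RNAmute's implementation: the function names are hypothetical, the structures passed to `base_pair_distance` would in practice come from a folding program such as the Vienna RNA package, and the base-pair distance used here is a simple stand-in for RNAdistance, not a reimplementation of it.

```python
from typing import Iterator, Set, Tuple

NUCLEOTIDES = "ACGU"


def single_point_mutants(seq: str) -> Iterator[Tuple[str, str]]:
    """Yield (label, mutant) for every single point mutation of seq."""
    for i, wild in enumerate(seq):
        for nt in NUCLEOTIDES:
            if nt != wild:
                # Label as wild type, 1-based position, mutant (e.g. "G1A").
                yield f"{wild}{i + 1}{nt}", seq[:i] + nt + seq[i + 1:]


def base_pairs(dot_bracket: str) -> Set[Tuple[int, int]]:
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def base_pair_distance(a: str, b: str) -> int:
    """Count base pairs present in exactly one of the two structures."""
    return len(base_pairs(a) ^ base_pairs(b))
```

A sequence of length n has 3n single point mutants, so the wild type only needs to be compared against a linear number of predicted structures.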

2.
The analysis of the relationship between sequences and structures (i.e., how mutations affect structures and, reciprocally, how structures influence mutations) is essential to decipher the principles driving molecular evolution, to infer the origins of genetic diseases, and to develop bioengineering applications such as the design of artificial molecules. Because their structures can be predicted from the sequence data alone, RNA molecules provide a good framework to study this sequence-structure relationship. We recently introduced a suite of algorithms called RNAmutants which allows a complete exploration of RNA sequence-structure maps in polynomial time and space. Formally, RNAmutants takes an input sequence (or seed), computes the Boltzmann-weighted ensembles of mutants with exactly k mutations, and samples mutations from these ensembles. However, this approach suffers from major limitations. Indeed, since the Boltzmann probabilities of the mutations depend on the free energy of the structures, RNAmutants has difficulty sampling mutant sequences with low G+C content. In this article, we introduce an unbiased adaptive sampling algorithm that enables RNAmutants to sample regions of the mutational landscape poorly covered by classical algorithms. We applied these methods to sample mutations with low G+C content. These adaptive sampling techniques can be easily adapted to explore other regions of the sequence and structural landscapes which are difficult to sample. Importantly, these algorithms come at a minimal computational cost. We demonstrate the insights offered by these techniques on studies of complete RNA sequence-structure maps of sizes up to 40 nucleotides. Our results indicate that the G+C content has a strong influence on the size and shape of the evolutionarily accessible sequence and structural spaces. In particular, we show that low G+C content favors the appearance of internal loops and thus possibly the synthesis of tertiary structure motifs. On the other hand, high G+C content significantly reduces the size of the evolutionarily accessible mutational landscapes.
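The sampling bias mentioned in this abstract follows from how Boltzmann weights behave: a lower free energy gives an exponentially larger weight. The sketch below illustrates that effect only. The energies are made-up illustrative values, not computed by any folding program, and the function names are hypothetical; it does not reproduce the adaptive sampling algorithm of the paper.

```python
import math
from typing import Dict

# Gas constant in kcal/(mol*K) times 37 degrees Celsius in kelvin.
RT = 0.0019872 * 310.15


def gc_content(seq: str) -> float:
    """Fraction of G and C nucleotides in seq."""
    return sum(nt in "GC" for nt in seq) / len(seq)


def boltzmann_probabilities(energies: Dict[str, float]) -> Dict[str, float]:
    """Map each sequence to exp(-E/RT) / Z for the given energies (kcal/mol)."""
    weights = {name: math.exp(-e / RT) for name, e in energies.items()}
    z = sum(weights.values())
    return {name: w / z for name, w in weights.items()}


# Hypothetical energies: the G+C-rich mutant is assumed to fold more stably.
toy_energies = {"GGGCAAAGCCC": -5.0, "GAGCAAAGCUC": -2.0}
toy_probs = boltzmann_probabilities(toy_energies)
```

Under these assumed energies the G+C-rich sequence takes almost all of the probability mass, which is why a plain Boltzmann sampler rarely visits low G+C mutants.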

3.
Vienna RNA secondary structure server   Total citations: 1 (self-citations: 0; citations by others: 1)
The Vienna RNA secondary structure server provides a web interface to the most frequently used functions of the Vienna RNA software package for the analysis of RNA secondary structures. It currently offers prediction of secondary structure from a single sequence, prediction of the consensus secondary structure for a set of aligned sequences and the design of sequences that will fold into a predefined structure. All three services can be accessed via the Vienna RNA web server at http://rna.tbi.univie.ac.at/.

4.
The diversity and importance of the role played by RNAs in the regulation and development of the cell are now well-known and well-documented. This broad range of functions is achieved through specific structures that have been (presumably) optimized through evolution. State-of-the-art methods, such as McCaskill's algorithm, use a statistical mechanics framework based on the computation of the partition function over the canonical ensemble of all possible secondary structures on a given sequence. Although secondary structure predictions from thermodynamics-based algorithms are not as accurate as methods employing comparative genomics, the former methods are the only available tools to investigate novel RNAs, such as the many RNAs of unknown function recently reported by the ENCODE consortium. In this paper, we generalize the McCaskill partition function algorithm to sum over the grand canonical ensemble of all secondary structures of all mutants of the given sequence. Specifically, our new program, RNAmutants, simultaneously computes for each integer k the minimum free energy structure MFE(k) and the partition function Z(k) over all secondary structures of all k-point mutants, even allowing the user to specify certain positions required not to mutate and certain positions required to base-pair or remain unpaired. This technically important extension allows us to study the resilience of an RNA molecule to pointwise mutations. By computing the mutation profile of a sequence, a novel graphical representation of the mutational tendency of nucleotide positions, we analyze the deleterious nature of mutating specific nucleotide positions or groups of positions. We have successfully applied RNAmutants to investigate deleterious mutations (mutations that radically modify the secondary structure) in the Hepatitis C virus cis-acting replication element and to evaluate the evolutionary pressure applied on different regions of the HIV trans-activation response element. 
In particular, we show qualitative agreement between published Hepatitis C and HIV experimental mutagenesis studies and our analysis of deleterious mutations using RNAmutants. Our work also predicts other deleterious mutations, which could be verified experimentally. Finally, we provide evidence that the 3' UTR of the GB RNA virus C has been optimized to preserve evolutionarily conserved stem regions from a deleterious effect of pointwise mutations. We hope that there will be long-term potential applications of RNAmutants in de novo RNA design and drug design against RNA viruses. This work also suggests potential applications for large-scale exploration of the RNA sequence-structure network. Binary distributions are available at http://RNAmutants.csail.mit.edu/.
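The quantities Z(k) and MFE(k) named in this abstract can be written out as follows. The abstract does not give the formulas, so the notation is an assumption made here for illustration: s is the input sequence, d_H is the Hamming distance, \mathcal{S}(s') is the set of secondary structures compatible with s', E(s', S) is the free energy of structure S on sequence s', R is the gas constant and T the temperature.

```latex
% Partition function over all structures of all k-point mutants of s
Z(k) \;=\; \sum_{\substack{s' \\ d_H(s, s') = k}} \;\sum_{S \in \mathcal{S}(s')} \exp\!\left(-\frac{E(s', S)}{RT}\right)

% Minimum free energy over the same set
\mathrm{MFE}(k) \;=\; \min_{\substack{s' \\ d_H(s, s') = k}} \;\min_{S \in \mathcal{S}(s')} E(s', S)
```

For k = 0 the outer sum contains only s itself, so Z(0) reduces to the usual McCaskill partition function of the input sequence.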

5.
Deleterious mutation prediction in the secondary structure of RNAs   Total citations: 1 (self-citations: 0; citations by others: 1)
Barash D. Nucleic Acids Research, 2003, 31(22): 6578-6584

6.
Algorithms for prediction of RNA secondary structure (the set of base pairs that form when an RNA molecule folds) are valuable to biologists who aim to understand RNA structure and function. Improving the accuracy and efficiency of prediction methods is an ongoing challenge, particularly for pseudoknotted secondary structures, in which base pairs overlap. This challenge is biologically important, since pseudoknotted structures play essential roles in functions of many RNA molecules, such as splicing and ribosomal frameshifting. State-of-the-art methods, which are based on free energy minimization, have high run-time complexity (typically Θ(n^5) or worse), and can handle (minimize over) only limited types of pseudoknotted structures. We propose a new approach for prediction of pseudoknotted structures, motivated by the hypothesis that RNA structures fold hierarchically, with pseudoknot-free (non-overlapping) base pairs forming first, and pseudoknots forming later so as to minimize energy relative to the folded pseudoknot-free structure. Our HFold algorithm uses two-phase energy minimization to predict hierarchically formed secondary structures in O(n^3) time, matching the complexity of the best algorithms for pseudoknot-free secondary structure prediction via energy minimization. Our algorithm can handle a wide range of biological structures, including kissing hairpins and nested kissing hairpins, which have previously required Θ(n^6) time.
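The notion of "overlapping" base pairs used in this abstract has a simple combinatorial form: two pairs (i, j) and (k, l) cross when i < k < j < l. The following sketch checks a list of base pairs for such crossings. It only illustrates the definition of a pseudoknot; the function names are hypothetical and it is unrelated to how HFold itself works.

```python
from itertools import combinations
from typing import Iterable, List, Tuple

Pair = Tuple[int, int]


def crossing_pairs(pairs: Iterable[Pair]) -> List[Tuple[Pair, Pair]]:
    """Return every two base pairs (i, j), (k, l) with i < k < j < l.

    Each pair is assumed to be given as (i, j) with i < j.
    """
    found = []
    # Sorting guarantees the first pair of each combination opens first.
    for first, second in combinations(sorted(pairs), 2):
        (i, j), (k, l) = first, second
        if i < k < j < l:
            found.append((first, second))
    return found


def is_pseudoknotted(pairs: Iterable[Pair]) -> bool:
    """True if at least two base pairs cross."""
    return bool(crossing_pairs(pairs))
```

Nested pairs such as (0, 5) and (1, 4) do not cross, whereas (0, 4) and (2, 6) do, so only the second structure is pseudoknotted.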

7.
8.
We developed a method for deep mutational scanning of antibody complementarity-determining regions (CDRs) that can determine in parallel the effect of every possible single amino acid CDR substitution on antigen binding. The method uses libraries of full length IgGs containing more than 1000 CDR point mutations displayed on mammalian cells, sorted by flow cytometry into subpopulations based on antigen affinity and analyzed by massively parallel pyrosequencing. Higher, lower and neutral affinity mutations are identified by their enrichment or depletion in the FACS subpopulations. We applied this method to a humanized version of the anti-epidermal growth factor receptor antibody cetuximab, generated a near comprehensive data set for 1060 point mutations that recapitulates previously determined structural and mutational data for these CDRs and identified 67 point mutations that increase affinity. The large-scale, comprehensive sequence-function data sets generated by this method should have broad utility for engineering properties such as antibody affinity and specificity and may advance theoretical understanding of antibody-antigen recognition.

9.

Background  

RNAmute is an interactive Java application which, given an RNA sequence, calculates the secondary structure of all single point mutations and organizes them into categories according to their similarity to the predicted structure of the wild type. The secondary structure predictions are performed using the Vienna RNA package. A more efficient implementation of RNAmute is needed, however, to extend from the case of single point mutations to the general case of multiple point mutations, which may often be desired for computational predictions alongside mutagenesis experiments. But analyzing multiple point mutations, a process that requires traversing all possible mutations, becomes highly expensive since the running time is O(n^m) for a sequence of length n with m-point mutations. Using Vienna's RNAsubopt, we present a method that selects only those mutations, based on stability considerations, which are likely to be conformationally rearranging. The approach is best examined using the dot plot representation for RNA secondary structure.
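The O(n^m) cost quoted in this abstract comes from counting the mutants: choosing m positions out of n and one of three alternative nucleotides at each gives C(n, m) · 3^m sequences, which grows like n^m for fixed m. A minimal sketch of the count and of the brute-force enumeration follows; the function names are hypothetical, and this is the exhaustive traversal that the paper aims to avoid, not its selection method.

```python
from itertools import combinations, product
from math import comb
from typing import Iterator


def count_m_point_mutants(n: int, m: int) -> int:
    """Number of sequences differing from a length-n RNA at exactly m positions."""
    return comb(n, m) * 3 ** m


def m_point_mutants(seq: str, m: int) -> Iterator[str]:
    """Yield every sequence that differs from seq at exactly m positions."""
    for positions in combinations(range(len(seq)), m):
        # At each chosen position, any nucleotide except the wild type is allowed.
        choices = [[nt for nt in "ACGU" if nt != seq[p]] for p in positions]
        for replacement in product(*choices):
            mutant = list(seq)
            for p, nt in zip(positions, replacement):
                mutant[p] = nt
            yield "".join(mutant)
```

Each of these mutants would still have to be folded, so the total cost is the count above multiplied by the cost of one structure prediction.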

10.
The mutational landscape model is a theoretical model describing sequence evolution in natural populations. However, recent experimental work has begun to test its predictions in laboratory populations of microbes. Several of these studies have focused on testing the prediction that the effects of beneficial mutations should be roughly exponentially distributed. The prediction appears to be borne out by most of these studies, at least qualitatively. Another study showed that a modified version of the model was able to predict, with reasonable accuracy, which of a ranked set of beneficial alleles will be fixed next. Although it remains to be seen whether the mutational landscape model adequately describes adaptation in organisms other than microbes, together these studies suggest that adaptive evolution has surprisingly general properties that can be successfully captured by theoretical models.

11.
Commonly used RNA folding programs compute the minimum free energy structure of a sequence under the pseudoknot exclusion constraint. They are based on Zuker's algorithm, which runs in time O(n^3). Recently, it has been claimed that RNA folding can be achieved in average time O(n^2) using a sparsification technique. A proof of quadratic time complexity was based on the assumption that computational RNA folding obeys the "polymer-zeta property". Several variants of sparse RNA folding algorithms were later developed. Here, we present our own version, which is readily applicable to existing RNA folding programs, as it is extremely simple and does not require any new data structure. We applied it to the widely used Vienna RNAfold program, to create sibRNAfold, the first public sparsified version of a standard RNA folding program. To gain a better understanding of the time complexity of sparsified RNA folding in general, we carried out a thorough run-time analysis with synthetic random sequences, both in the context of energy minimization and base pairing maximization. Contrary to previous claims, the asymptotic time complexity of a sparsified RNA folding algorithm using standard energy parameters remains O(n^3) under a wide variety of conditions. Consistent with our run-time analysis, we found that RNA folding does not obey the "polymer-zeta property" as claimed previously. Yet, a basic version of a sparsified RNA folding algorithm provides a 15- to 50-fold speed gain. Surprisingly, the same sparsification technique has a different effect when applied to base pairing optimization. There, its asymptotic running time complexity appears to be either quadratic or cubic depending on the base composition. The code used in this work is available at: .
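The base pairing maximization problem mentioned in this abstract has a classical cubic-time dynamic program, often attributed to Nussinov. The sketch below shows that unsparsified baseline under stated assumptions: Watson-Crick and G-U wobble pairs are allowed, a hairpin must enclose at least `min_loop` unpaired bases, and the function name is hypothetical. It is neither sibRNAfold nor the sparsified variant studied in the paper.

```python
def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs, using an O(n^3) dynamic program."""
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j] inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j stays unpaired
            # case 2: j pairs with some k, leaving min_loop bases in between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]
```

The three nested loops over i, j and k are the source of the cubic running time; sparsification techniques try to skip values of k that cannot improve the optimum.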

12.
The folding process defines three‐dimensional protein structures from their amino acid chains. A protein's structure determines its activity and properties; thus knowing such a conformation at the atomic level is essential for both basic and applied studies of protein function and dynamics. However, the acquisition of such structures by experimental methods is slow and expensive, and current computational methods mostly depend on previously known structures to determine new ones. Here we present a new software package called GSAFold that applies the generalized simulated annealing (GSA) algorithm to ab initio protein structure prediction. The GSA is a stochastic search algorithm employed in energy minimization and used in global optimization problems, especially those that depend on long‐range interactions, such as gravity models and conformation optimization of small molecules. This new implementation applies, for the first time in ab initio protein structure prediction, an analytical inverse for the Visitation function of GSA. It also employs the broadly used NAMD Molecular Dynamics package to carry out energy calculations, allowing the user to select different force fields and parameterizations. Moreover, the software also allows the execution of several simulations simultaneously. Applications that depend on protein structures include rational drug design and structure‐based protein function prediction. Applying GSAFold to a test peptide, it was possible to predict the structure of mastoparan‐X to a root mean square deviation of 3.00 Å. Proteins 2012; © 2012 Wiley Periodicals, Inc.

13.
As one of the earliest problems in computational biology, the RNA secondary structure prediction problem (sometimes referred to as "RNA folding") has attracted attention again, thanks to the recent discoveries of many novel non-coding RNA molecules. The two common approaches to this problem are de novo prediction of RNA secondary structure based on energy minimization and the consensus folding approach (computing the common secondary structure for a set of unaligned RNA sequences). Consensus folding algorithms work well when the correct seed alignment is part of the input to the problem. However, seed alignment itself is a challenging problem for diverged RNA families. In this paper, we propose a novel framework to predict the common secondary structure for unaligned RNA sequences. By matching putative stacks in RNA sequences, we make use of both primary sequence information and thermodynamic stability for prediction at the same time. We show that our method can predict the correct common RNA secondary structures even when we are given only a limited number of unaligned RNA sequences, and it outperforms current algorithms in sensitivity and accuracy.

14.
RNA pseudoknot prediction in energy-based models.   Total citations: 11 (self-citations: 0; citations by others: 11)
RNA molecules are sequences of nucleotides that serve as more than mere intermediaries between DNA and proteins, e.g., as catalytic molecules. Computational prediction of RNA secondary structure is among the few structure prediction problems that can be solved satisfactorily in polynomial time. Most work has been done to predict structures that do not contain pseudoknots. Allowing pseudoknots introduces modeling and computational problems. In this paper we consider the problem of predicting RNA secondary structures with pseudoknots based on free energy minimization. We first give a brief comparison of energy-based methods for predicting RNA secondary structures with pseudoknots. We then prove that the general problem of predicting RNA secondary structures containing pseudoknots is NP-complete for a large class of reasonable models of pseudoknots.

15.
The comprehensive sequence determinants of binding affinity for type I cohesin toward dockerin from Clostridium thermocellum and Clostridium cellulolyticum were evaluated using deep mutational scanning coupled to yeast surface display. We measured the relative binding affinity to dockerin for 2970 and 2778 single point mutants of C. thermocellum and C. cellulolyticum, respectively, representing over 96% of all possible single point mutants. The interface ΔΔG for each variant was reconstructed from sequencing counts and compared with three independent experimental methods. This reconstruction results in a narrow dynamic range of −0.8 to 0.5 kcal/mol. The computational software packages FoldX and Rosetta were used to predict mutations that disrupt binding by more than 0.4 kcal/mol. The area under the receiver operating characteristic curve was 0.82 for FoldX and 0.77 for Rosetta, showing reasonable agreement between predictions and experimental results. Destabilizing mutations to core and rim positions were predicted with higher accuracy than support positions. This benchmark dataset may be useful for developing new computational tools for predicting the mutational effect on binding affinities for protein–protein interactions. Experimental considerations to improve precision and range of the reconstruction method are discussed. Proteins 2016; 84:1914–1928. © 2016 Wiley Periodicals, Inc.

16.
We investigate the relationship between the average fitness decay due to single mutations and the strength of epistatic interactions in genetic sequences. We observe that epistatic interactions between mutations are correlated with the average fitness decay, both in RNA secondary structure prediction and in digital organisms replicating in silico. This correlation implies that, during adaptation, epistasis and average mutational effect cannot be optimized independently. In experiments with RNA sequences evolving on a neutral network, the selective pressure to decrease the mutational load then leads to a reduction in the number of sequences with strong antagonistic interactions between deleterious mutations in the population.

17.
Single-nucleotide polymorphisms (SNPs) are often linked to critical phenotypes such as diseases or responses to vaccines, medications and environmental factors. However, the specific molecular mechanism by which a causal SNP acts is usually not obvious. Changes in RNA secondary structure emerge as a possible explanation, necessitating the development of methods to measure the impact of single-nucleotide variation on RNA structure. Despite the recognition of the importance of considering the changes in the Boltzmann ensemble of RNA conformers in this context, a formal method for performing such a comparison directly was lacking. Here, we solved this problem and designed an efficient method to compute the relative entropy between the Boltzmann ensembles of the native and a mutant structure. On the basis of this theoretical progress, we developed a software tool, remuRNA, and investigated examples of its application. Comparing the impact of common SNPs naturally occurring in populations with the impact of random point mutations, we found that structural changes introduced by common SNPs are smaller than those introduced by random point mutations. This suggests a natural selection against mutations that significantly change RNA structure and demonstrates, surprisingly, that randomly inserted point mutations provide an inadequate estimate of the effects of random mutations. Subsequently, we applied remuRNA to determine which of the disease-associated non-coding SNPs are potentially related to RNA structural changes.
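The relative entropy named in this abstract is the Kullback-Leibler divergence between two probability distributions over structures. The sketch below computes it by brute force for small, explicitly enumerated ensembles. This is an assumption-laden illustration of the quantity only: the function name and the toy probabilities are hypothetical, and remuRNA is described as computing the value efficiently rather than by enumerating structures.

```python
import math
from typing import Dict


def relative_entropy(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Kullback-Leibler divergence D(p || q) in nats.

    p and q map structures (e.g. dot-bracket strings) to probabilities.
    """
    total = 0.0
    for structure, p_s in p.items():
        if p_s == 0.0:
            continue  # terms with zero probability contribute nothing
        q_s = q.get(structure, 0.0)
        if q_s == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_s * math.log(p_s / q_s)
    return total


# Toy ensembles over two structures; the probabilities are made up.
wild_type = {"((...))": 0.5, ".......": 0.5}
mutant = {"((...))": 0.25, ".......": 0.75}
divergence = relative_entropy(wild_type, mutant)
```

The value is zero when the two ensembles are identical and grows as the mutation shifts probability mass between structures.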

18.
Metcalf DG, Law PB, DeGrado WF. Proteins, 2007, 67(2): 375-384
We present a molecular modeling protocol that selects modeled protein structures based on experimental mutagenesis results. The computed effect of a point mutation should be consistent with its experimental effect for correct models; mutations that do not affect protein stability and function should not affect the computed energy of a correct model while destabilizing mutations should have unfavorable computed energies. On the other hand, an incorrect model will likely display computed energies that are inconsistent with experimental results. We added terms to our energy function which penalize models that are inconsistent with experimental results. This creates a selective advantage for models that are consistent with experimental results in the Monte Carlo simulated annealing protocol we use to search conformational space. We calibrated our protocol to predict the structure of transmembrane helix dimers using glycophorin A as a model system. Inclusion of mutational data in this protocol compensates for the limitations of our force field and the limitations of our conformational search. We demonstrate an application of this structure prediction protocol by modeling the transmembrane region of the BNIP3 apoptosis factor.

19.

Background

A detailed understanding of an RNA's correct secondary and tertiary structure is crucial to understanding its function and mechanism in the cell. Free energy minimization with energy parameters based on the nearest-neighbor model and comparative analysis are the primary methods for predicting an RNA's secondary structure from its sequence. Version 3.1 of Mfold has been available since 1999. This version contains an expanded sequence dependence of energy parameters and the ability to incorporate coaxial stacking into free energy calculations. We test Mfold 3.1 by performing the largest and most phylogenetically diverse comparison of rRNA and tRNA structures predicted by comparative analysis and Mfold, and we use the results of our tests on 16S and 23S rRNA sequences to assess the improvement between Mfold 2.3 and Mfold 3.1.

Results

The average prediction accuracy for a 16S or 23S rRNA sequence with Mfold 3.1 is 41%, while the prediction accuracies for the majority of 16S and 23S rRNA structures tested are between 20% and 60%, with some having less than 20% prediction accuracy. The average prediction accuracy was 71% for 5S rRNA and 69% for tRNA. The majority of the 5S rRNA and tRNA sequences have prediction accuracies greater than 60%. The prediction accuracy of 16S rRNA base-pairs decreases exponentially as the number of nucleotides intervening between the 5' and 3' halves of the base-pair increases.

Conclusion

Our analysis indicates that the current set of nearest-neighbor energy parameters in conjunction with the Mfold folding algorithm is unable to consistently and reliably predict an RNA's correct secondary structure. For 16S or 23S rRNA structure prediction, Mfold 3.1 offers little improvement over Mfold 2.3. However, the nearest-neighbor energy parameters do work well for shorter RNA sequences such as tRNA or 5S rRNA, or for larger rRNAs when the contact distance between the base-pairs is less than 100 nucleotides.

20.
Scales in RNA, based on geometrical considerations, can be exploited for the analysis and prediction of RNA structures. By using spectral decomposition, geometric information that relates to a given RNA fold can be reduced to a single positive scalar number, the second eigenvalue of the Laplacian matrix corresponding to the tree-graph representation of the RNA secondary structure. Along with the free energy of the structure, being the most important scalar number in the prediction of RNA folding by energy minimization methods, the second eigenvalue of the Laplacian matrix can be used as an effective signature for locating a target folded structure given a set of RNA folds. Furthermore, the second eigenvector of the Laplacian matrix can be used to partition large RNA structures into smaller fragments. An illustrative example is given for the use of the second eigenvalue to predict mutations that may cause structural rearrangements, thereby disrupting stable motifs.
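A small worked example may help show how the second eigenvalue separates tree shapes. The two four-vertex trees below are generic graphs chosen for illustration, not RNA tree graphs taken from the paper. The Laplacian is L = D − A, where D is the diagonal degree matrix and A the adjacency matrix, and λ₂ denotes the second smallest eigenvalue.

```latex
% Linear tree (path on 4 vertices): 1 - 2 - 3 - 4
L_{\text{path}} =
\begin{pmatrix}
 1 & -1 &  0 &  0 \\
-1 &  2 & -1 &  0 \\
 0 & -1 &  2 & -1 \\
 0 &  0 & -1 &  1
\end{pmatrix},
\qquad
\text{eigenvalues: } 0,\; 2-\sqrt{2},\; 2,\; 2+\sqrt{2},
\qquad
\lambda_2 = 2-\sqrt{2} \approx 0.586

% Branched tree (star on 4 vertices): vertex 1 joined to 2, 3 and 4
L_{\text{star}} =
\begin{pmatrix}
 3 & -1 & -1 & -1 \\
-1 &  1 &  0 &  0 \\
-1 &  0 &  1 &  0 \\
-1 &  0 &  0 &  1
\end{pmatrix},
\qquad
\text{eigenvalues: } 0,\; 1,\; 1,\; 4,
\qquad
\lambda_2 = 1
```

The more branched, compact tree has the larger second eigenvalue, so a mutation that changes a structure from a linear to a branched tree graph (or the reverse) shows up as a shift in λ₂.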
