Similar literature
Found 20 similar documents; search took 31 ms.
1.
Pseudoknots are an essential feature of RNA tertiary structures. Simple H-type pseudoknots have been studied extensively in terms of biological functions, computational prediction, and energy models. Intramolecular kissing hairpins are a more complex and biologically important type of pseudoknot in which two hairpin loops form base pairs. They are hard to predict using free energy minimization due to high computational requirements. Heuristic methods that allow arbitrary pseudoknots strongly depend on the quality of energy parameters, which are not yet available for complex pseudoknots. We present an extension of the heuristic pseudoknot prediction algorithm DotKnot, which covers H-type pseudoknots and intramolecular kissing hairpins. Our framework allows for easy integration of advanced H-type pseudoknot energy models. For a test set of RNA sequences containing kissing hairpins and other types of pseudoknot structures, DotKnot outperforms competing methods from the literature. DotKnot is available as a web server under http://dotknot.csse.uwa.edu.au.

2.
Accurate prediction of RNA pseudoknotted secondary structures from the base sequence is a challenging computational problem. Since prediction algorithms rely on thermodynamic energy models to identify low-energy structures, prediction accuracy relies in large part on the quality of free energy change parameters. In this work, we use our earlier constraint generation and Boltzmann likelihood parameter estimation methods to obtain new energy parameters for two energy models for secondary structures with pseudoknots, namely, the Dirks–Pierce (DP) and the Cao–Chen (CC) models. To train our parameters, and also to test their accuracy, we create a large data set of both pseudoknotted and pseudoknot-free secondary structures. In addition to structural data our training data set also includes thermodynamic data, for which experimentally determined free energy changes are available for sequences and their reference structures. When incorporated into the HotKnots prediction algorithm, our new parameters result in significantly improved secondary structure prediction on our test data set. Specifically, the prediction accuracy when using our new parameters improves from 68% to 79% for the DP model, and from 70% to 77% for the CC model.

3.
Accurate prediction of pseudoknotted nucleic acid secondary structure is an important computational challenge. Prediction algorithms based on dynamic programming aim to find a structure with minimum free energy according to some thermodynamic ("sum of loop energies") model that is implicit in the recurrences of the algorithm. However, a clear definition of what exactly are the loops in pseudoknotted structures, and their associated energies, has been lacking. In this work, we present a complete classification of loops in pseudoknotted nucleic secondary structures, and describe the Rivas and Eddy and other energy models as sum-of-loops energy models. We give a linear time algorithm for parsing a pseudoknotted secondary structure into its component loops. We give two applications of our parsing algorithm. The first is a linear time algorithm to calculate the free energy of a pseudoknotted secondary structure. This is useful for heuristic prediction algorithms, which are widely used since (pseudoknotted) RNA secondary structure prediction is NP-hard. The second application is a linear time algorithm to test the generality of the dynamic programming algorithm of Akutsu for secondary structure prediction. Together with previous work, we use this algorithm to compare the generality of state-of-the-art algorithms on real biological structures.

4.
Shu Z, Bevilacqua PC. Biochemistry, 1999, 38(46): 15369-15379
Hairpins are the most common elements of RNA secondary structure, playing important roles in RNA tertiary architecture and forming protein binding sites. Triloops are common in a variety of naturally occurring RNA hairpins, but little is known about their thermodynamic stability. Reported here are the sequences and thermodynamic parameters for a variety of stable and unstable triloop hairpins. Temperature gradient gel electrophoresis (TGGE) can be used to separate a simple RNA combinatorial library based on thermal stability [Bevilacqua, J. M., and Bevilacqua, P. C. (1998) Biochemistry 45, 15877-15884]. Here we introduce the application of TGGE to separating and analyzing a complex RNA combinatorial library based on thermal stability, using an RNA triloop library. Several rounds of in vitro selection of an RNA triloop library were carried out using TGGE, and preferences for exceptionally stable and unstable closing base pairs and loop sequences were identified. For stable hairpins, the most common closing base pair is CG, and U-rich loop sequences are preferred. Closing base pairs of GC and UA result in moderately stable hairpins when combined with a stable loop sequence. For unstable hairpins, the most common closing base pairs are AU and UG, and U-rich loop sequences are no longer preferred. In general, the contributions of the closing base pair and loop sequence to overall hairpin stability appear to be additive. Thermodynamic parameters for individual hairpins determined by UV melting are generally consistent with outcomes from selection experiments, with hairpins containing a CG closing base pair having a ΔΔG°37 that is 2.1-2.5 kcal/mol more favorable than hairpins with other closing base pairs. Sequences and thermodynamic rules for triloop hairpins should aid in RNA structure prediction and determination of whether naturally occurring triloop hairpins are thermodynamically stable.

5.
The secondary structure of encapsidated MS2 genomic RNA poses an interesting RNA folding challenge. Cryoelectron microscopy has demonstrated that encapsidated MS2 RNA is well-ordered. Models of MS2 assembly suggest that the RNA hairpin-protein interactions and the appropriate placement of hairpins in the MS2 RNA secondary structure can guide the formation of the correct icosahedral particle. The RNA hairpin motif that is recognized by the MS2 capsid protein dimers, however, is energetically unfavorable, and thus free energy predictions are biased against this motif. Computer programs called Crumple, Sliding Windows, and Assembly provide useful tools for prediction of viral RNA secondary structures when the traditional assumptions of RNA structure prediction by free energy minimization may not apply. These methods allow incorporation of global features of the RNA fold and motifs that are difficult to include directly in minimum free energy predictions. For example, with MS2 RNA the experimental data from SELEX experiments, crystallography, and theoretical calculations of the path for the series of hairpins can be incorporated in the RNA structure prediction, and thus the influence of free energy considerations can be modulated. This approach thoroughly explores conformational space and generates an ensemble of secondary structures. The predictions from this new approach can test hypotheses and models of viral assembly and guide construction of complete three-dimensional models of virus particles.

6.
MOTIVATION: Modeling RNA pseudoknotted structures remains challenging. Methods have previously been developed to model RNA stem-loops successfully using stochastic context-free grammars (SCFG) adapted from computational linguistics; however, the additional complexity of pseudoknots has made modeling them more difficult. Formally a context-sensitive grammar is required, which would impose a large increase in complexity. RESULTS: We introduce a new grammar modeling approach for RNA pseudoknotted structures based on parallel communicating grammar systems (PCGS). Our new approach can specify pseudoknotted structures, while avoiding context-sensitive rules, using a single CFG synchronized with a number of regular grammars. Technically, the stochastic version of the grammar model can be as simple as an SCFG. As with SCFG, the new approach permits automatic generation of a single-RNA structure prediction algorithm for each specified pseudoknotted structure model. This approach also makes it possible to develop full probabilistic models of pseudoknotted structures to allow the prediction of consensus structures by comparative analysis and structural homology recognition in database searches.

7.
Accurate free energy estimation is essential for RNA structure prediction. The widely used Turner's energy model works well for nested structures. For pseudoknotted RNAs, however, there is no effective rule for estimation of loop entropy and free energy. In this work we present a new free energy estimation method, termed the pseudoknot predictor in three-dimensional space (pk3D), which goes beyond Turner's model. Our approach treats nested and pseudoknotted structures alike in one unifying physical framework, regardless of how complex the RNA structures are. We first test the ability of pk3D in selecting native structures from a large number of decoys for a set of 43 pseudoknotted RNA molecules, with lengths ranging from 23 to 113. We find that pk3D performs slightly better than the Dirks and Pierce extension of Turner's rule. We then test pk3D for blind secondary structure prediction, and find that pk3D gives the best sensitivity and comparable positive predictive value (related to specificity) in predicting pseudoknotted RNA secondary structures, when compared with other methods. A unique strength of pk3D is that it also generates spatial arrangement of structural elements of the RNA molecule. Comparison of three-dimensional structures predicted by pk3D with the native structure measured by nuclear magnetic resonance or X-ray experiments shows that the predicted spatial arrangement of stems and loops is often similar to that found in the native structure. These close-to-native structures can be used as starting points for further refinement to derive accurate three-dimensional structures of RNA molecules, including those with pseudoknots.

8.
We present a novel topological classification of RNA secondary structures with pseudoknots. It is based on the topological genus of the circular diagram associated to the RNA base-pair structure. The genus is a positive integer number whose value quantifies the topological complexity of the folded RNA structure. In such a representation, planar diagrams correspond to pure RNA secondary structures and have zero genus, whereas non-planar diagrams correspond to pseudoknotted structures and have higher genus. The topological genus allows for the definition of topological folding motifs, similar in spirit to those introduced and commonly used in protein folding. We analyze real RNA structures from the databases Worldwide Protein Data Bank and Pseudobase and classify them according to their topological genus. For simplicity, we limit our analysis by considering only Watson-Crick complementary base pairs and G-U wobble base pairs. We compare the results of our statistical survey with existing theoretical and numerical models. We also discuss possible applications of this classification and show how it can be used for identifying new RNA structural motifs.
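The genus described in this abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes the standard Euler-characteristic relation for a one-backbone chord diagram, g = (P + 1 - r) / 2, where P is the number of base pairs and r is the number of boundary cycles of the diagram. The function name `genus` and the example pair lists are hypothetical.

```python
def genus(pairs):
    """Topological genus of a base-pair diagram on a single backbone.

    Assumption (not from the abstract): g = (P + 1 - r) / 2, where r counts
    the boundary cycles obtained by alternately crossing a base pair and
    stepping to the next paired position along the backbone.
    """
    if not pairs:
        return 0
    # Unpaired nucleotides do not change the genus, so keep only paired
    # positions and relabel them 0..m-1 in backbone order.
    ends = sorted(pos for pair in pairs for pos in pair)
    if len(set(ends)) != len(ends):
        raise ValueError("each position may take part in at most one pair")
    index = {pos: k for k, pos in enumerate(ends)}
    m = len(ends)
    partner = [0] * m
    for i, j in pairs:
        partner[index[i]] = index[j]
        partner[index[j]] = index[i]
    # Count cycles of the permutation "jump to partner, then step forward".
    seen = [False] * m
    cycles = 0
    for start in range(m):
        if seen[start]:
            continue
        cycles += 1
        k = start
        while not seen[k]:
            seen[k] = True
            k = (partner[k] + 1) % m
    return (len(pairs) + 1 - cycles) // 2


nested = [(1, 10), (2, 9)]               # plain helix, planar
h_type = [(1, 6), (3, 9)]                # two crossing pairs
kissing = [(1, 10), (5, 25), (20, 30)]   # two hairpins joined by a loop-loop pair
print(genus(nested), genus(h_type), genus(kissing))
```

Under this convention a nested structure has genus 0, while the H-type and kissing-hairpin toy examples both have genus 1.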

9.
RNA secondary structure prediction using free energy minimization is one method to gain an approximation of structure. Constraints generated by enzymatic mapping or chemical modification can improve the accuracy of secondary structure prediction. We report a facile method that identifies single-stranded regions in RNA using short, randomized DNA oligonucleotides and RNase H cleavage. These regions are then used as constraints in secondary structure prediction. This method was used to improve the secondary structure prediction of Escherichia coli 5S rRNA. The lowest free energy structure without constraints has only 27% of the base pairs present in the phylogenetic structure. The addition of constraints from RNase H cleavage improves the prediction to 100% of base pairs. The same method was used to generate secondary structure constraints for yeast tRNAPhe, which is accurately predicted in the absence of constraints (95%). Although RNase H mapping does not improve secondary structure prediction, it does eliminate all other suboptimal structures predicted within 10% of the lowest free energy structure. The method is advantageous over other single-stranded nucleases since RNase H is functional in physiological conditions. Moreover, it can be used for any RNA to identify accessible binding sites for oligonucleotides or small molecules.

10.
A stable RNA helix requires at least three base pairs. Surprisingly, a tertiary kissing complex formed between two GACG hairpin loops contains only two GC pairs. In the NMR structure of this complex, the two flanking adenosines stack on the kissing GC pair. This observation raised the possibility that the 5’-dangling adenines contribute to the formation and stability of the kissing interaction. To test this hypothesis, we took a two-pronged approach to examine the effects of various mutational and chemical modifications of the flanking adenosines on the folding of the kissing complex. Using mass spectrometry, we studied formation of kissing dimers formed by different hairpins. Using optical tweezers, we monitored mechanical unfolding of an intramolecular kissing complex at the single-molecule level. In both experiments, replacing adenine with uridine abolished the kissing interaction, suggesting that a minimal kissing complex must contain two GC pairs flanked by inter-strand stacking adenines. The stabilizing effect of the adenines can be explained by the fact that the stacking purine nucleobases shield the hydrogen bonds of the adjacent GC pairs, preventing them from fraying. Unlike in the context of secondary structure, the 5’-unpaired adenines in the tertiary structure are structurally constrained in a way that allows for effective stacking onto the adjacent base pairs.

11.
Computational tools for prediction of the secondary structure of two or more interacting nucleic acid molecules are useful for understanding mechanisms for ribozyme function, determining the affinity of an oligonucleotide primer to its target, and designing good antisense oligonucleotides, novel ribozymes, DNA code words, or nanostructures. Here, we introduce new algorithms for prediction of the minimum free energy pseudoknot-free secondary structure of two or more nucleic acid molecules, and for prediction of alternative low-energy (sub-optimal) secondary structures for two nucleic acid molecules. We provide a comprehensive analysis of our predictions against secondary structures of interacting RNA molecules drawn from the literature. Analysis of our tools on 17 sequences of up to 200 nucleotides that do not form pseudoknots shows that they have 79% accuracy, on average, for the minimum free energy predictions. When the best of 100 sub-optimal foldings is taken, the average accuracy increases to 91%. The accuracy decreases as the sequences increase in length and as the number of pseudoknots and tertiary interactions increases. Our algorithms extend the free energy minimization algorithm of Zuker and Stiegler for secondary structure prediction, and the sub-optimal folding algorithm by Wuchty et al. Implementations of our algorithms are freely available in the package MultiRNAFold.

12.
We present HotKnots, a new heuristic algorithm for the prediction of RNA secondary structures including pseudoknots. Based on the simple idea of iteratively forming stable stems, our algorithm explores many alternative secondary structures, using a free energy minimization algorithm for pseudoknot free secondary structures to identify promising candidate stems. In an empirical evaluation of the algorithm with 43 sequences taken from the Pseudobase database and from the literature on pseudoknotted structures, we found that overall, in terms of the sensitivity and specificity of predictions, HotKnots outperforms the well-known Pseudoknots algorithm of Rivas and Eddy and the NUPACK algorithm of Dirks and Pierce, both based on dynamic programming approaches for limited classes of pseudoknotted structures. It also outperforms the heuristic Iterated Loop Matching algorithm of Ruan and colleagues, and in many cases gives better results than the genetic algorithm from the STAR package of van Batenburg and colleagues and the recent pknotsRG-mfe algorithm of Reeder and Giegerich. The HotKnots algorithm has been implemented in C/C++ and is available from http://www.cs.ubc.ca/labs/beta/Software/HotKnots.

13.
Free energy minimization has been the most popular method for RNA secondary structure prediction for decades. It is based on a set of empirical free energy change parameters derived from experiments using a nearest-neighbor model. In this study, a program, MaxExpect, that predicts RNA secondary structure by maximizing the expected base-pair accuracy, is reported. This approach was first pioneered in the program CONTRAfold, using pair probabilities predicted with a statistical learning method. Here, a partition function calculation that utilizes the free energy change nearest-neighbor parameters is used to predict base-pair probabilities as well as probabilities of nucleotides being single-stranded. MaxExpect predicts both the optimal structure (having highest expected pair accuracy) and suboptimal structures to serve as alternative hypotheses for the structure. Tested on a large database of different types of RNA, the maximum expected accuracy structures are, on average, of higher accuracy than minimum free energy structures. Accuracy is measured by sensitivity, the percentage of known base pairs correctly predicted, and positive predictive value (PPV), the percentage of predicted pairs that are in the known structure. By favoring double-strandedness or single-strandedness, a higher sensitivity or PPV of prediction can be favored, respectively. Using MaxExpect, the average PPV of optimal structure is improved from 66% to 68% at the same sensitivity level (73%) compared with free energy minimization.
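The two accuracy measures defined in this abstract can be written out as a short sketch. It assumes exact matching of base pairs, which may be stricter than the scoring used in the study; the function name `sensitivity_ppv` and the example pairs are hypothetical.

```python
def sensitivity_ppv(known_pairs, predicted_pairs):
    """Return (sensitivity, PPV) as fractions between 0 and 1.

    sensitivity = fraction of known pairs that were predicted
    PPV         = fraction of predicted pairs that are in the known structure
    Assumption: a predicted pair counts only if it matches a known pair exactly.
    """
    known = {tuple(sorted(p)) for p in known_pairs}
    predicted = {tuple(sorted(p)) for p in predicted_pairs}
    correct = known & predicted
    sensitivity = len(correct) / len(known) if known else 0.0
    ppv = len(correct) / len(predicted) if predicted else 0.0
    return sensitivity, ppv


known = [(1, 10), (2, 9), (3, 8), (4, 7)]
predicted = [(1, 10), (2, 9), (3, 8), (12, 20), (13, 19)]
sens, ppv = sensitivity_ppv(known, predicted)
print(f"sensitivity = {sens:.2f}, PPV = {ppv:.2f}")
```

In this toy case three of the four known pairs are recovered and three of the five predicted pairs are correct, so predicting fewer, more confident pairs would raise PPV at the cost of sensitivity.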

14.
An RNA secondary structure is saturated if no base pairs can be added without violating the definition of secondary structure. Here we describe a new algorithm, RNAsat, which for a given RNA sequence a, an integral temperature 0

15.
16.
The paper investigates the computational problem of predicting RNA secondary structures. The general belief is that allowing pseudoknots makes the problem hard. Existing polynomial-time algorithms are heuristic algorithms with no performance guarantee and can handle only limited types of pseudoknots. In this paper, we initiate the study of predicting RNA secondary structures with a maximum number of stacking pairs while allowing arbitrary pseudoknots. We obtain two approximation algorithms with worst-case approximation ratios of 1/2 and 1/3 for planar and general secondary structures, respectively. For an RNA sequence of n bases, the approximation algorithm for planar secondary structures runs in O(n^3) time while that for the general case runs in linear time. Furthermore, we prove that allowing pseudoknots makes it NP-hard to maximize the number of stacking pairs in a planar secondary structure. This result is in contrast with the recent NP-hard results on pseudoknots which are based on optimizing some general and complicated energy functions.
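The objective in this abstract, the number of stacking pairs, can be made concrete with a minimal sketch. The abstract does not define the term, so the code assumes the common convention that base pairs (i, j) and (i+1, j-1) together form one stacking pair; the function name `count_stacking_pairs` is hypothetical, and this only evaluates a given structure rather than implementing the paper's approximation algorithms.

```python
def count_stacking_pairs(pairs):
    """Count stacking pairs in a set of base pairs.

    Assumption: base pairs (i, j) and (i + 1, j - 1) that are both present
    form one stacking pair. Crossing (pseudoknotted) pairs are allowed.
    """
    pair_set = {(min(i, j), max(i, j)) for i, j in pairs}
    return sum(1 for (i, j) in pair_set if (i + 1, j - 1) in pair_set)


helix = [(1, 10), (2, 9), (3, 8)]   # three consecutive pairs
isolated = [(1, 10), (3, 8)]        # no two pairs are adjacent
print(count_stacking_pairs(helix), count_stacking_pairs(isolated))
```

A helix of k consecutive base pairs contributes k - 1 stacking pairs under this convention, so isolated pairs add nothing to the objective.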

17.
A number of non-coding RNAs are known to contain functionally important or conserved pseudoknots. However, pseudoknotted structures are more complex than orthodox structures, and most methods for analyzing secondary structures do not handle them. I present here a way to decompose and represent general secondary structures which extends the tree representation of the stem-loop structure, and use this to analyze the frequency of pseudoknots in known and in random secondary structures. This comparison shows that, though a number of pseudoknots exist, they are still relatively rare and mostly of the simpler kinds. In contrast, random secondary structures tend to be heavily knotted, and the number of available structures increases dramatically when allowing pseudoknots. Therefore, methods for structure prediction and non-coding RNA identification that allow pseudoknots are likely to be much less powerful than those that do not, unless they penalize pseudoknots appropriately.

18.
The RNA PK5 (GCGAUUUCUGACCGCUUUUUUGUCAG) forms a pseudoknotted structure at low temperatures and a hairpin containing an A.C opposition at higher temperatures (J. Mol. Biol. 214, 455-470 (1990)). CD and absorption spectra of PK5 were measured at several temperatures. A basis set of spectra was fit to the spectra of PK5 using a method that can provide estimates of the numbers of A.U, G.C, and G.U base pairs as well as the number of each of 11 nearest-neighbor base pairs in an RNA (Biopolymers 31, 373-384 (1991)). The fits were close, indicating that PK5 retained the A conformation in the pseudoknot structure and that the fitting technique is not hindered by pseudoknots or A.C oppositions. The results from the analysis were consistent with the pseudoknotted structure at low temperatures and with the hairpin structure at higher temperatures. We concluded that the method of spectral analysis should be useful for determining the secondary structures of other RNAs containing pseudoknots and A.C oppositions.

19.
Beniaminov A. D., Ulyanov N. B., Samokhin A. B., Ivanov V. I., Du Z., Minyat E. E. Molecular Biology, 2003, 37(3): 446-455
The slipped loop structure (SLS), earlier identified as an unusual DNA structure, was found to be a possible element of RNA folding. In order to experimentally test this suggestion, model oligoribonucleotides capable of forming the SLS were synthesized. Treatment of the oligoribonucleotides with nuclease S1 and RNases specific for single- and double-stranded RNA demonstrated the steric possibility of SLS formation. To determine the possible functional role of SLS-RNA, various naturally occurring RNAs were screened in silico. Among the most interesting findings were dimerization initiation sites of avian retroviral genomic RNAs. Analysis of RNA from 31 viruses showed that formation of the intermolecular SLS during RNA dimerization is theoretically possible, competing with the formation of an alternative hairpin structure. Identification of the secondary structure of selected RNA dimers employing nuclease digestion techniques as well as covariance analysis of the retroviral RNA dimerization initiation site sequences were used to show that the alternative conformation (loop–loop interaction of two hairpins, or kissing hairpins) is the most preferred. Alternative structures and conformational transitions in RNA dimerization mechanisms in avian retroviruses are discussed.

20.
The origin of replication (oriR) involved in the initiation of (-) strand enterovirus RNA synthesis is a quasi-globular multi-domain RNA structure which is maintained by a tertiary kissing interaction. The kissing interaction is formed by base pairing of complementary sequences within the predominant hairpin-loop structures of the enteroviral 3' untranslated region. In this report, we have fully characterised the kissing interaction. Site-directed mutations which affected the different base pairs involved in the kissing interaction were generated in an infectious coxsackie B3 virus cDNA clone. The kissing interaction appeared to consist of 6 bp. Distortion of the interaction by mispairing of each of the base pairs involved in this higher order RNA structure resulted in either temperature sensitive or lethal phenotypes. The nucleotide constitution of the base which gaps the major groove of the kissing domain was not relevant for virus growth. The reciprocal exchange of the complete sequence involved in the kissing resulted in a mutant virus with wild type virus growth characteristics, arguing that the base pair constitution is of less importance for the initiation of (-) strand RNA synthesis than the existence of the tertiary structure itself.
