Similar literature
 20 similar articles found; search time 31 ms
1.
Fast evaluation of internal loops in RNA secondary structure prediction.   Total citations: 7, self-citations: 0, citations by others: 7
MOTIVATION: Though not as abundant in known biological processes as proteins, RNA molecules serve as more than mere intermediaries between DNA and proteins. Research in the last 15 years demonstrates that RNA molecules serve in many roles, including catalysis. Furthermore, RNA secondary structure prediction based on free energy rules for stacking and loop formation remains one of the few major breakthroughs in the field of structure prediction, as minimum free energy structures and related quantities can be computed with full mathematical rigor. However, with the current energy parameters, the algorithms used hitherto suffer the disadvantage of either employing heuristics that risk (though this is highly unlikely) missing the optimal structure or becoming prohibitively time consuming for moderate to large sequences. RESULTS: We present a new method to evaluate internal loops utilizing currently used energy rules. This method reduces the time complexity of this part of the structure prediction from O(n^4) to O(n^3), thus reducing the overall complexity to O(n^3). Even when the size of evaluated internal loops is bounded by k (a commonly used heuristic), the method presented has a competitive edge by reducing the time complexity of internal loop evaluation from O(k^2 n^2) to O(k n^2). The method also applies to the calculation of the equilibrium partition function. AVAILABILITY: Source code for an RNA secondary structure prediction program implementing this method is available at ftp://www.ibc.wustl.edu/pub/zuker/zuker .tar.Z
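To make the complexity figures in this abstract concrete, the following minimal sketch counts how many (closing pair, inner pair) combinations a naive evaluation would have to visit. It is not the method of the paper: the function name count_two_loops is hypothetical, and base complementarity and the minimum hairpin length are ignored, so the counts are only an upper bound on the real search space.

```python
from math import comb


def count_two_loops(n, k=None):
    """Count index quadruples i < p < q < j on a sequence of length n.

    (i, j) plays the role of the closing pair and (p, q) the inner pair.
    The loop size is the number of unpaired positions between them,
    (p - i - 1) + (j - q - 1). If k is given, only loops of size <= k
    are counted (the bounded-loop heuristic mentioned in the abstract).
    Assumption: every position may pair with every other position.
    """
    total = 0
    for i in range(n):
        for j in range(i + 3, n):
            for p in range(i + 1, j - 1):
                for q in range(p + 1, j):
                    size = (p - i - 1) + (j - q - 1)
                    if k is None or size <= k:
                        total += 1
    return total


if __name__ == "__main__":
    for n in (10, 20, 40):
        # Unbounded count grows like n^4; bounded count like k^2 * n^2.
        print(n, count_two_loops(n), count_two_loops(n, k=4))
```

Without a bound, the count equals the number of ways to choose four positions out of n, which grows as n^4. With a bound k, each closing pair has on the order of k^2 admissible inner pairs, which gives the O(k^2 n^2) figure that the abstract reports improving to O(k n^2).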

2.
Thermodynamic parameters for internal loops of unpaired adenosines in oligoribonucleotides have been measured by optical melting studies. Comparisons are made between helices containing symmetric and asymmetric loops. Asymmetric loops destabilize a helix more than symmetric loops. The differences in free energy between symmetric and asymmetric loops are roughly half the magnitude suggested from a study of parameters required to give accurate predictions of RNA secondary structure [Papanicolaou, C., Gouy, M., & Ninio, J. (1984) Nucleic Acids Res. 12, 31-44]. Circular dichroism spectra indicate no major structural difference between helices containing symmetric and asymmetric loops. The measured sequence dependence of internal loop stability is not consistent with approximations used in current algorithms for predicting RNA secondary structure.

3.
Badhwar J, Karri S, Cass CK, Wunderlich EL, Znosko BM. Biochemistry, 2007, 46(50): 14715-14724
Thermodynamic data for RNA 1 × 2 nucleotide internal loops are lacking. Thermodynamic data that are available for 1 × 2 loops, however, are for loops that rarely occur in nature. In order to identify the most frequently occurring 1 × 2 nucleotide internal loops, a database of 955 RNA secondary structures was compiled and searched. Twenty-four RNA duplexes containing the most common 1 × 2 nucleotide loops were optically melted, and the thermodynamic parameters ΔH°, ΔS°, ΔG°37, and Tm for each duplex were determined. This data set more than doubles the number of 1 × 2 nucleotide loops previously studied. A table of experimental free energy contributions for frequently occurring 1 × 2 nucleotide loops (as opposed to a predictive model) is likely to result in better prediction of RNA secondary structure from sequence. In order to improve free energy calculations for duplexes containing 1 × 2 nucleotide loops that do not have experimental free energy contributions, the data collected here were combined with data from 21 previously studied 1 × 2 loops. Using linear regression, the entire dataset was used to derive nearest neighbor parameters that can be used to predict the thermodynamics of previously unmeasured 1 × 2 nucleotide loops. The ΔG°37,loop and ΔH°loop nearest neighbor parameters derived here were compared to values that were published previously for 1 × 2 nucleotide loops but were derived from either a significantly smaller dataset of 1 × 2 nucleotide loops or from internal loops of various sizes [Lu, Z. J., Turner, D. H., and Mathews, D. H. (2006) Nucleic Acids Res. 34, 4912-4924]. Most of these values were found to be within experimental error, suggesting that previous approximations and assumptions associated with the derivation of those nearest neighbor parameters were valid. ΔS°loop nearest neighbor parameters are also reported for 1 × 2 nucleotide loops. Both the experimental thermodynamics and the nearest neighbor parameters reported here can be used to improve secondary structure prediction from sequence.

4.
Algorithms for prediction of RNA secondary structure (the set of base pairs that form when an RNA molecule folds) are valuable to biologists who aim to understand RNA structure and function. Improving the accuracy and efficiency of prediction methods is an ongoing challenge, particularly for pseudoknotted secondary structures, in which base pairs overlap. This challenge is biologically important, since pseudoknotted structures play essential roles in functions of many RNA molecules, such as splicing and ribosomal frameshifting. State-of-the-art methods, which are based on free energy minimization, have high run-time complexity (typically Θ(n^5) or worse), and can handle (minimize over) only limited types of pseudoknotted structures. We propose a new approach for prediction of pseudoknotted structures, motivated by the hypothesis that RNA structures fold hierarchically, with pseudoknot-free (non-overlapping) base pairs forming first, and pseudoknots forming later so as to minimize energy relative to the folded pseudoknot-free structure. Our HFold algorithm uses two-phase energy minimization to predict hierarchically formed secondary structures in O(n^3) time, matching the complexity of the best algorithms for pseudoknot-free secondary structure prediction via energy minimization. Our algorithm can handle a wide range of biological structures, including kissing hairpins and nested kissing hairpins, which have previously required Θ(n^6) time.

5.
Hausmann NZ, Znosko BM. Biochemistry, 2012, 51(26): 5359-5368
To better elucidate RNA structure-function relationships and to improve the design of pharmaceutical agents that target specific RNA motifs, an understanding of RNA primary, secondary, and tertiary structure is necessary. The prediction of RNA secondary structure from sequence is an intermediate step in predicting RNA three-dimensional structure. RNA secondary structure is typically predicted using a nearest neighbor model based on free energy parameters. The current free energy parameters for 2 × 3 nucleotide loops are based on a 23-member data set of 2 × 3 loops and internal loops of other sizes. A database of representative RNA secondary structures was searched to identify 2 × 3 nucleotide loops that occur in nature. Seventeen of the most frequent 2 × 3 nucleotide loops in this database were studied by optical melting experiments. Fifteen of these loops melted in a two-state manner, and the associated experimental ΔG°(37,2×3) values are, on average, 0.6 and 0.7 kcal/mol different from the values predicted for these internal loops using the predictive models proposed by Lu, Turner, and Mathews [Lu, Z. J., Turner, D. H., and Mathews, D. H. (2006) Nucleic Acids Res. 34, 4912-4924] and Chen and Turner [Chen, G., and Turner, D. H. (2006) Biochemistry 45, 4025-4043], respectively. These new ΔG°(37,2×3) values can be used to update the current algorithms that predict secondary structure from sequence. To improve free energy calculations for duplexes containing 2 × 3 nucleotide loops that still do not have experimentally determined free energy contributions, an updated predictive model was derived. This new model resulted from a linear regression analysis of the data reported here combined with 31 previously studied 2 × 3 nucleotide internal loops. 
Most of the values for the parameters in this new predictive model are within experimental error of those of the previous models, suggesting that approximations and assumptions associated with the derivation of the previous nearest neighbor parameters were valid. The updated predictive model predicts free energies of 2 × 3 nucleotide internal loops within 0.4 kcal/mol, on average, of the experimental free energy values. Both the experimental values and the updated predictive model can be used to improve secondary structure prediction from sequence.

6.
The accurate prediction of the secondary and tertiary structure of an RNA with different folding algorithms is dependent on several factors, including the energy functions. However, an RNA higher-order structure cannot be predicted accurately from its sequence based on a limited set of energy parameters. The inter- and intramolecular forces between this RNA and other small molecules and macromolecules, in addition to other factors in the cell such as pH, ionic strength, and temperature, influence the complex dynamics associated with transition of a single stranded RNA to its secondary and tertiary structure. Since all of the factors that affect the formation of an RNA's 3D structure cannot be determined experimentally, statistically derived potential energy has been used in the prediction of protein structure. In the current work, we evaluate the statistical free energy of various secondary structure motifs, including base-pair stacks, hairpin loops, and internal loops, using their statistical frequency obtained from the comparative analysis of more than 50,000 RNA sequences stored in the RNA Comparative Analysis Database (rCAD) at the Comparative RNA Web (CRW) Site. Statistical energy was computed from the structural statistics for several datasets. While the statistical energy for a base-pair stack correlates with experimentally derived free energy values, suggesting a Boltzmann-like distribution, variation is observed between different molecules and their location on the phylogenetic tree of life. Our statistical energy values calculated for several structural elements were utilized in the Mfold RNA-folding algorithm. The combined statistical energy values for base-pair stacks, hairpins and internal loop flanks result in a significant improvement in the accuracy of secondary structure prediction; the hairpin flanks contribute the most.
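The abstract does not state the formula used to turn motif frequencies into energies. A common form for statistically derived potentials is the inverse Boltzmann relation, E = -RT ln(f_observed / f_reference), and the sketch below assumes that form. The function name, the choice of reference frequency, and the default temperature of 310.15 K (37 °C) are illustrative assumptions, not details taken from the paper.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 0.0019872


def statistical_energy(observed_freq, reference_freq, temperature_k=310.15):
    """Inverse-Boltzmann statistical energy in kcal/mol.

    observed_freq: frequency of a motif (e.g. a base-pair stack) in a
        set of comparative structures.
    reference_freq: frequency expected under a reference state. How the
        reference state is defined is an assumption of this sketch.
    """
    if observed_freq <= 0 or reference_freq <= 0:
        raise ValueError("frequencies must be positive")
    return -R_KCAL * temperature_k * math.log(observed_freq / reference_freq)


if __name__ == "__main__":
    # A motif seen twice as often as the reference expects is favorable.
    print(statistical_energy(0.2, 0.1))
```

Under this relation a motif that is over-represented relative to the reference receives a negative (stabilizing) energy, and one that occurs exactly as often as expected receives zero.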

7.
We have previously shown that a distal GU-rich downstream element of the mouse IgM secretory poly(A) site is important for polyadenylation in vivo and for polyadenylation specific complex formation in vitro. This element can be predicted to form a stem-loop structure with two asymmetric internal loops. As stem-loop structures commonly define protein RNA binding sites, we have probed the biological activity of the secondary structure of this element. We show that mutations affecting the stem of the structure abolish the biological activity of this element in vivo and in vitro at the level of cleavage and polyadenylation specificity factor/cleavage stimulation factor complex formation and that both internal loops contribute to the enhancing effect of the sequence in vivo. Lead (II) cleavage patterns and RNase H probing of the sequence element in vitro are consistent with the predicted secondary structure. Furthermore, mobility on native PAGE suggests a bent structure. We propose that the secondary structure of this downstream element optimizes its interaction with components of the polyadenylation complex.

8.
The yeast Saccharomyces cerevisiae ribosomal protein L30 negatively autoregulates its production by binding to a helix-loop-helix structure formed in its pre-mRNA and its mRNA. A three-dimensional solution structure of the L30 protein in complex with its regulatory RNA has been solved using NMR spectroscopy. In the complex, the helix-loop-helix RNA adopts a sharply bent conformation at the internal loop region. Unusual RNA features include a purine stack, a reverse Hoogsteen base pair (G11anti-G56syn) and highly distorted backbones. The L30 protein is folded in a three-layer α/β/α sandwich topology, and three loops at one end of the sandwich make base-specific contacts with the RNA internal loop. The protein-RNA binding interface is divided into two clusters, including hydrophobic and aromatic stacking interactions centering around G56, and base-specific hydrogen-bonding contacts to A57, G58 and the G10-U60 wobble base pair. Both the protein and the RNA exhibit a partially induced fit for binding, where loops in the protein and the internal loop in the RNA become more ordered upon complex formation. The specific interactions formed between loops on L30 and the internal loop on the mRNA constitute a novel loop-loop recognition motif where an intimate RNA-protein interface is formed between regions on both molecules that lack regular secondary structure.

9.
Herein, we report the development of a microarray platform to select RNA motif-ligand interactions that allows simultaneous screening of both RNA and chemical space. We used this platform to identify the RNA internal loops that bind 6'-N-5-hexynoate kanamycin A (1). Selected internal loops that bind 1 were studied in detail and commonly display an adenine across from a cytosine independent of the size of the loop. Additional preferences are also observed. For 3 × 3 nucleotide loops, there is a preference for purines, and for 2 × 2 nucleotide loops there is a preference for pyrimidines neighbored by an adenine across from a cytosine. This technique has several advantageous features for selecting RNA motif-ligand interactions: (1) higher affinity RNA motif-ligand interactions are identified by harvesting bound RNAs from lower ligand loadings; (2) bound RNAs are harvested from the array via gel extraction, mitigating kinetic biases in selections; and (3) multiple selections are completed on a single array surface. To further demonstrate that multiple selections can be completed in parallel on the same array surface, we selected the RNA internal loops from a 4096-member RNA internal loop library that bound a four-member aminoglycoside library. These experiments probed 16,384 (4 aminoglycosides × 4096-member RNA library) interactions in a single experiment. These studies allow for parallel screening of both chemical and RNA space to improve our understanding of RNA-ligand interactions. This information may facilitate the rational and modular design of small molecules targeting RNA.

10.
Three-way multibranch loops (junctions) are common in RNA secondary structures. Computer algorithms such as RNAstructure and MFOLD do not consider the identity of unpaired nucleotides in multibranch loops when predicting secondary structure. There is limited experimental data, however, to parametrize this aspect of these algorithms. In this study, UV optical melting and a fluorescence competition assay are used to measure stabilities of multibranch loops containing up to five unpaired adenosines or uridines or a loop E motif. These results provide a test of our understanding of the factors affecting multibranch loop stability and provide revised parameters for predicting stability. The results should help to improve predictions of RNA secondary structure.

11.
Lorenz WA, Clote P. PLoS ONE, 2011, 6(1): e16178
An RNA secondary structure is locally optimal if there is no lower energy structure that can be obtained by the addition or removal of a single base pair, where energy is defined according to the widely accepted Turner nearest neighbor model. Locally optimal structures form kinetic traps, since any evolution away from a locally optimal structure must involve energetically unfavorable folding steps. Here, we present a novel, efficient algorithm to compute the partition function over all locally optimal secondary structures of a given RNA sequence. Our software, RNAlocopt, runs in O(n^3) time and O(n^2) space. Additionally, RNAlocopt samples a user-specified number of structures from the Boltzmann subensemble of all locally optimal structures. We apply RNAlocopt to show that (1) the number of locally optimal structures is far smaller than the total number of structures; indeed, the number of locally optimal structures is approximately equal to the square root of the number of all structures, (2) the structural diversity of this subensemble may be either similar to or quite different from the structural diversity of the entire Boltzmann ensemble, a situation that depends on the type of input RNA, (3) the (modified) maximum expected accuracy structure, computed by taking into account base pairing frequencies of locally optimal structures, is a more accurate prediction of the native structure than other current thermodynamics-based methods. The software RNAlocopt constitutes a technical breakthrough in our study of the folding landscape for RNA secondary structures. For the first time, locally optimal structures (kinetic traps in the Turner energy model) can be rapidly generated for long RNA sequences, previously impossible with methods that involved exhaustive enumeration. Use of locally optimal structures leads to state-of-the-art secondary structure prediction, as benchmarked against methods involving the computation of minimum free energy and of maximum expected accuracy. Web server and source code available at http://bioinformatics.bc.edu/clotelab/RNAlocopt/.
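The definition of local optimality can be illustrated with a deliberately simplified energy model. The sketch below does not use the Turner model and is not RNAlocopt: it assumes a toy energy of -1 per base pair, under which removing a pair always raises the energy, so a structure is locally optimal exactly when no further pair can be added. The names CAN_PAIR, parse_pairs and is_saturated, and the minimum hairpin size of 3, are assumptions of this sketch.

```python
# Canonical and wobble pairs allowed in this toy model.
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def parse_pairs(structure):
    """Return the set of base pairs (i, j) of a pseudoknot-free dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def is_saturated(sequence, structure, min_hairpin=3):
    """True if no single base pair can be added to the structure.

    Under the toy energy (-1 per pair) this coincides with local
    optimality. With the Turner model the test would instead compare
    real loop energies before and after each single-pair move.
    """
    pairs = parse_pairs(structure)
    paired = {x for pair in pairs for x in pair}
    free = [x for x in range(len(sequence)) if x not in paired]
    for a, i in enumerate(free):
        for j in free[a + 1:]:
            if j - i <= min_hairpin:
                continue  # hairpin loop would be too short
            if (sequence[i], sequence[j]) not in CAN_PAIR:
                continue
            crosses = any(p < i < q < j or i < p < j < q for p, q in pairs)
            if not crosses:
                return False  # (i, j) can be added, so not saturated
    return True


if __name__ == "__main__":
    print(is_saturated("GGGAAAACCC", "(((....)))"))
    print(is_saturated("GGGAAAACCC", ".((....))."))
```

In the second call the outer G and C are unpaired and can still pair around the existing helix, so that structure is not locally optimal under the toy model.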

12.
Accurate prediction of pseudoknotted nucleic acid secondary structure is an important computational challenge. Prediction algorithms based on dynamic programming aim to find a structure with minimum free energy according to some thermodynamic ("sum of loop energies") model that is implicit in the recurrences of the algorithm. However, a clear definition of what exactly are the loops in pseudoknotted structures, and their associated energies, has been lacking. In this work, we present a complete classification of loops in pseudoknotted nucleic acid secondary structures, and describe the Rivas and Eddy and other energy models as sum-of-loops energy models. We give a linear time algorithm for parsing a pseudoknotted secondary structure into its component loops. We give two applications of our parsing algorithm. The first is a linear time algorithm to calculate the free energy of a pseudoknotted secondary structure. This is useful for heuristic prediction algorithms, which are widely used since (pseudoknotted) RNA secondary structure prediction is NP-hard. The second application is a linear time algorithm to test the generality of the dynamic programming algorithm of Akutsu for secondary structure prediction. Together with previous work, we use this algorithm to compare the generality of state-of-the-art algorithms on real biological structures.

13.
Ribonucleic acid (RNA) enjoys increasing interest in molecular biology; despite this interest, fundamental algorithms are lacking, e.g. for identifying local motifs. Like proteins, RNA molecules have a distinctive structure. Therefore, in addition to sequence information, structure plays an important part in assessing the similarity of RNAs. Furthermore, common sequence-structure features in two or several RNA molecules are often only spatially local, where possibly large parts of the molecules are dissimilar. Consequently, we address the problem of comparing RNA molecules by computing an optimal local alignment with respect to sequence and structure information. While local alignment is superior to global alignment for identifying local similarities, no general local sequence-structure alignment algorithms are currently known. We suggest a new general definition of locality for sequence-structure alignments that is biologically motivated and efficiently tractable. To show the former, we discuss locality of RNA and prove that the defined locality means connectivity by atomic and non-atomic bonds. To show the latter, we present an efficient algorithm for the newly defined pairwise local sequence-structure alignment (lssa) problem for RNA. For molecules of lengths n and m, the algorithm has worst-case time complexity of O(n^2 × m^2 × max(n, m)) and a space complexity of only O(n × m). An implementation of our algorithm is available at http://www.bio.inf.uni-jena.de. Its runtime is competitive with global sequence-structure alignment.

14.
The attachment sites of the primary binding proteins L1, L2 and L23 on 23 S ribosomal RNA of Escherichia coli were examined by a chemical and ribonuclease footprinting method using several probes with different specificities. The results show that the sites are confined to localized RNA regions within the large ribonuclease-protected ribonucleoprotein fragments that were characterized earlier. They are as follows: (1) L1 recognizes a tertiary structural motif in domain V centred on two interacting internal loops; the main protein interaction sites occur at the internal loop/helix junctions. (2) The L2 site constitutes a single irregular stem/loop structure in the centre of domain IV where non-Watson-Crick pairing is likely to occur. (3) L23 recognizes a tertiary structural motif involving a single terminal loop structure and part of an adjacent internal loop at the centre of domain III. Each of the three primary binding proteins, whose presence is essential for ribosomal assembly, has been associated with important ribosomal functions: L1 lies in the E-site for deacylated tRNA binding while L2 and L23 have been implicated in the P and A substrate sites, respectively, of the peptidyl transferase centre. Moreover, each of the protein sites, but particularly those of L2 and L23, lies at the centre of RNA domains where they can maximally influence both the assembly of secondary binding proteins and the function of the RNA region.

15.
MOTIVATION: S-attributed grammars (a generalization of classical context-free grammars) provide a versatile formalism for sequence analysis which allows long-range constraints to be expressed: the RNA folding problem is a typical application. Efficient algorithms have been developed to solve problems expressed with these tools, which generally compute the optimal attribute of the sequence with respect to the grammar. However, it is often more meaningful and/or interesting from the biological point of view to consider almost optimal attributes as well as approximate sequences; we thus need more flexible and powerful algorithms able to perform these generalized analyses. RESULTS: In this paper we present a basic algorithm which, given a grammar G and a sequence ω, computes the optimal attribute for all (approximate) strings ω′ in L(G) such that d(ω, ω′) ≤ M, and whose complexity is O(n^(r+1)) in time and O(n^2) in space (r is the maximal length of the right-hand side of any production of G). We will also give some extensions and possible improvements of this algorithm.

16.
MOTIVATION: We describe algorithms implemented in a new software package, RNAbor, to investigate structures in a neighborhood of an input secondary structure S of an RNA sequence s. The input structure could be the minimum free energy structure, the secondary structure obtained by analysis of the X-ray structure or by comparative sequence analysis, or an arbitrary intermediate structure. RESULTS: A secondary structure T of s is called a δ-neighbor of S if T and S differ by exactly δ base pairs. RNAbor computes the number (N(δ)), the Boltzmann partition function (Z(δ)) and the minimum free energy (MFE(δ)) and corresponding structure over the collection of all δ-neighbors of S. This computation is done simultaneously for all δ ≤ m, in run time O(mn^3) and memory O(mn^2), where n is the sequence length. We apply RNAbor for the detection of possible RNA conformational switches, and compare RNAbor with the switch detection method paRNAss. We also provide examples of how RNAbor can at times improve the accuracy of secondary structure prediction. AVAILABILITY: http://bioinformatics.bc.edu/clotelab/RNAbor/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
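The δ-neighbor definition above amounts to a base-pair distance: the number of pairs present in one structure but not the other. The sketch below computes that distance for two pseudoknot-free structures written in dot-bracket notation. It only illustrates the definition and is not part of RNAbor; the function names and the dot-bracket input format are assumptions of this sketch.

```python
def base_pairs(structure):
    """Return the set of base pairs (i, j) of a pseudoknot-free dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def base_pair_distance(s, t):
    """Number of base pairs in exactly one of the two structures.

    T is a delta-neighbor of S when this value equals delta.
    """
    if len(s) != len(t):
        raise ValueError("structures must have the same length")
    return len(base_pairs(s) ^ base_pairs(t))


if __name__ == "__main__":
    # Opening the inner pair of a two-pair helix gives a 1-neighbor.
    print(base_pair_distance("((...))", "(.....)"))
```

Using the symmetric difference of the two pair sets means that shifting a helix by a few positions counts both the removed and the newly formed pairs.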

17.
PURPOSE: To investigate the importance of two possible mechanisms of tyrosine oxidation on the yield of protein dimerization. The model chosen is hen and turkey egg-white lysozymes, which differ by seven amino acids, among which one tyrosine is in the 3 position. MATERIALS AND METHODS: Aqueous solutions of proteins were oxidized by •OH or N3• free radicals produced by gamma or pulse irradiation in an atmosphere of N2O. Protein dimers were quantified by SDS-PAGE and reverse-phase HPLC. Dityrosines were identified by absorption and fluorescence. RESULTS: Using N3• free radicals, the initial yields of dimerization are equal to (8.6 ± 0.7) × 10^-9 mol J^-1 for both proteins. Using •OH free radicals, they become equal to (1.23 ± 0.1) × 10^-8 and (4.42 ± 0.1) × 10^-8 mol J^-1 for hen and turkey egg-white lysozymes, respectively (gamma radiolysis). DISCUSSION: N3• radicals react primarily with tryptophan residues only. Tyrosine gets oxidized by intramolecular long-range electron migration, whereas •OH may react directly with tyrosines. We propose a low participation of Tyr3 in turkey protein in the intramolecular process, because Tyr3 is far from all tryptophans. On the other hand, Tyr3 is very accessible to solvent and in a flexible area; thus collisions with •OH could easily be followed by intermolecular dimerization.

18.
Dynamic programming algorithms that predict RNA secondary structure by minimizing the free energy have had one important limitation. They were able to predict only one optimal structure. Given the uncertainties of the thermodynamic data and the effects of proteins and other environmental factors on structure, the optimal structure predicted by these methods may not have biological significance. We present a dynamic programming algorithm that can determine optimal and suboptimal secondary structures for an RNA. The power and utility of the method is demonstrated in the folding of the intervening sequence of the rRNA of Tetrahymena. By first identifying the major secondary structures corresponding to the lowest free energy minima, a secondary structure of possible biological significance is derived.

19.
We present an algorithm that calculates the optimal binding conformation and free energy of two RNA molecules, one or both oligomeric. This algorithm has applications to modeling DNA microarrays, RNA splice-site recognitions and other antisense problems. Although other recent algorithms perform the same calculation in time proportional to the sum of the lengths cubed, O((N1 + N2)^3), our oligomer binding algorithm, called bindigo, scales as the product of the sequence lengths, O(N1 × N2). The algorithm performs well in practice with the aid of a heuristic for large asymmetric loops. To demonstrate its speed and utility, we use bindigo to investigate the binding proclivities of U1 snRNA to mRNA donor splice sites.
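The O(N1 × N2) scaling is characteristic of dynamic programs that fill one table cell per pair of positions, one from each strand. The sketch below is a much simpler stand-in and not the bindigo algorithm: it finds only the longest run of consecutive Watson-Crick pairs between two strands, with no loops and no energy model. The function name and the scoring scheme are assumptions of this sketch.

```python
# Watson-Crick pairs only; wobble pairs are left out of this toy model.
WC = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def longest_duplex(strand_a, strand_b):
    """Length of the longest uninterrupted antiparallel duplex.

    Both strands are given 5'->3'. strand_b is reversed so that position
    i of strand_a faces position j of the reversed strand. The table has
    (N1 + 1) x (N2 + 1) cells, each filled in constant time, which is
    where the product scaling comes from.
    """
    b_rev = strand_b[::-1]
    n1, n2 = len(strand_a), len(b_rev)
    prev = [0] * (n2 + 1)
    best = 0
    for i in range(1, n1 + 1):
        curr = [0] * (n2 + 1)
        for j in range(1, n2 + 1):
            if (strand_a[i - 1], b_rev[j - 1]) in WC:
                # Extend the run that ended at the previous pair of positions.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # GGAUCC is self-complementary, so two copies pair along their full length.
    print(longest_duplex("GGAUCC", "GGAUCC"))
```

A realistic binding calculation would replace the run length with stacking and loop energies, but the table shape, and therefore the product scaling, would stay the same as long as each cell is filled in bounded time.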

20.