首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 891 毫秒
1.
Predicting RNA secondary structure is often the first step to determining the structure of RNA. Prediction approaches have historically avoided searching for pseudoknots because of the extreme combinatorial and time complexity of the problem. Yet neglecting pseudoknots limits the utility of such approaches. Here, an algorithm utilizing structure mapping and thermodynamics is introduced for RNA pseudoknot prediction that finds the minimum free energy and identifies information about the flexibility of the RNA. The heuristic approach takes advantage of the 5' to 3' folding direction of many biological RNA molecules and is consistent with the hierarchical folding hypothesis and the contact order model. Mapping methods are used to build and analyze the folded structure for pseudoknots and to add important 3D structural considerations. The program can predict some well known pseudoknot structures correctly. The results of this study suggest that many functional RNA sequences are optimized for proper folding. They also suggest directions we can proceed in the future to achieve even better results.  相似文献   

2.
The function of many RNAs depends crucially on their structure. Therefore, the design of RNA molecules with specific structural properties has many potential applications, e.g. in the context of investigating the function of biological RNAs, of creating new ribozymes, or of designing artificial RNA nanostructures. Here, we present a new algorithm for solving the following RNA secondary structure design problem: given a secondary structure, find an RNA sequence (if any) that is predicted to fold to that structure. Unlike the (pseudoknot-free) secondary structure prediction problem, this problem appears to be hard computationally. Our new algorithm, "RNA Secondary Structure Designer (RNA-SSD)", is based on stochastic local search, a prominent general approach for solving hard combinatorial problems. A thorough empirical evaluation on computationally predicted structures of biological sequences and artificially generated RNA structures as well as on empirically modelled structures from the biological literature shows that RNA-SSD substantially out-performs the best known algorithm for this problem, RNAinverse from the Vienna RNA Package. In particular, the new algorithm is able to solve structures, consistently, for which RNAinverse is unable to find solutions. The RNA-SSD software is publically available under the name of RNA Designer at the RNASoft website (www.rnasoft.ca).  相似文献   

3.
Computational tools for prediction of the secondary structure of two or more interacting nucleic acid molecules are useful for understanding mechanisms for ribozyme function, determining the affinity of an oligonucleotide primer to its target, and designing good antisense oligonucleotides, novel ribozymes, DNA code words, or nanostructures. Here, we introduce new algorithms for prediction of the minimum free energy pseudoknot-free secondary structure of two or more nucleic acid molecules, and for prediction of alternative low-energy (sub-optimal) secondary structures for two nucleic acid molecules. We provide a comprehensive analysis of our predictions against secondary structures of interacting RNA molecules drawn from the literature. Analysis of our tools on 17 sequences of up to 200 nucleotides that do not form pseudoknots shows that they have 79% accuracy, on average, for the minimum free energy predictions. When the best of 100 sub-optimal foldings is taken, the average accuracy increases to 91%. The accuracy decreases as the sequences increase in length and as the number of pseudoknots and tertiary interactions increases. Our algorithms extend the free energy minimization algorithm of Zuker and Stiegler for secondary structure prediction, and the sub-optimal folding algorithm by Wuchty et al. Implementations of our algorithms are freely available in the package MultiRNAFold.  相似文献   

4.
Algorithms for prediction of RNA secondary structure-the set of base pairs that form when an RNA molecule folds-are valuable to biologists who aim to understand RNA structure and function. Improving the accuracy and efficiency of prediction methods is an ongoing challenge, particularly for pseudoknotted secondary structures, in which base pairs overlap. This challenge is biologically important, since pseudoknotted structures play essential roles in functions of many RNA molecules, such as splicing and ribosomal frameshifting. State-of-the-art methods, which are based on free energy minimization, have high run-time complexity (typically Theta(n(5)) or worse), and can handle (minimize over) only limited types of pseudoknotted structures. We propose a new approach for prediction of pseudoknotted structures, motivated by the hypothesis that RNA structures fold hierarchically, with pseudoknot-free (non-overlapping) base pairs forming first, and pseudoknots forming later so as to minimize energy relative to the folded pseudoknot-free structure. Our HFold algorithm uses two-phase energy minimization to predict hierarchically formed secondary structures in O(n(3)) time, matching the complexity of the best algorithms for pseudoknot-free secondary structure prediction via energy minimization. Our algorithm can handle a wide range of biological structures, including kissing hairpins and nested kissing hairpins, which have previously required Theta(n(6)) time.  相似文献   

5.

Background

Ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation. Their secondary structures are crucial for the RNA functionality, and the prediction of the secondary structures is widely studied. Our previous research shows that cutting long sequences into shorter chunks, predicting secondary structures of the chunks independently using thermodynamic methods, and reconstructing the entire secondary structure from the predicted chunk structures can yield better accuracy than predicting the secondary structure using the RNA sequence as a whole. The chunking, prediction, and reconstruction processes can use different methods and parameters, some of which produce more accurate predictions than others. In this paper, we study the prediction accuracy and efficiency of three different chunking methods using seven popular secondary structure prediction programs that apply to two datasets of RNA with known secondary structures, which include both pseudoknotted and non-pseudoknotted sequences, as well as a family of viral genome RNAs whose structures have not been predicted before. Our modularized MapReduce framework based on Hadoop allows us to study the problem in a parallel and robust environment.

Results

On average, the maximum accuracy retention values are larger than one for our chunking methods and the seven prediction programs over 50 non-pseudoknotted sequences, meaning that the secondary structure predicted using chunking is more similar to the real structure than the secondary structure predicted by using the whole sequence. We observe similar results for the 23 pseudoknotted sequences, except for the NUPACK program using the centered chunking method. The performance analysis for 14 long RNA sequences from the Nodaviridae virus family outlines how the coarse-grained mapping of chunking and predictions in the MapReduce framework exhibits shorter turnaround times for short RNA sequences. However, as the lengths of the RNA sequences increase, the fine-grained mapping can surpass the coarse-grained mapping in performance.

Conclusions

By using our MapReduce framework together with statistical analysis on the accuracy retention results, we observe how the inversion-based chunking methods can outperform predictions using the whole sequence. Our chunk-based approach also enables us to predict secondary structures for very long RNA sequences, which is not feasible with traditional methods alone.
  相似文献   

6.
RNA pseudoknot prediction in energy-based models.   总被引:11,自引:0,他引:11  
RNA molecules are sequences of nucleotides that serve as more than mere intermediaries between DNA and proteins, e.g., as catalytic molecules. Computational prediction of RNA secondary structure is among the few structure prediction problems that can be solved satisfactorily in polynomial time. Most work has been done to predict structures that do not contain pseudoknots. Allowing pseudoknots introduces modeling and computational problems. In this paper we consider the problem of predicting RNA secondary structures with pseudoknots based on free energy minimization. We first give a brief comparison of energy-based methods for predicting RNA secondary structures with pseudoknots. We then prove that the general problem of predicting RNA secondary structures containing pseudoknots is NP complete for a large class of reasonable models of pseudoknots.  相似文献   

7.
As one of the earliest problems in computational biology, RNA secondary structure prediction (sometimes referred to as "RNA folding") problem has attracted attention again, thanks to the recent discoveries of many novel non-coding RNA molecules. The two common approaches to this problem are de novo prediction of RNA secondary structure based on energy minimization and the consensus folding approach (computing the common secondary structure for a set of unaligned RNA sequences). Consensus folding algorithms work well when the correct seed alignment is part of the input to the problem. However, seed alignment itself is a challenging problem for diverged RNA families. In this paper, we propose a novel framework to predict the common secondary structure for unaligned RNA sequences. By matching putative stacks in RNA sequences, we make use of both primary sequence information and thermodynamic stability for prediction at the same time. We show that our method can predict the correct common RNA secondary structures even when we are given only a limited number of unaligned RNA sequences, and it outperforms current algorithms in sensitivity and accuracy.  相似文献   

8.
9.
RNA molecules play integral roles in gene regulation, and understanding their structures gives us important insights into their biological functions. Despite recent developments in template-based and parameterized energy functions, the structure of RNA--in particular the nonhelical regions--is still difficult to predict. Knowledge-based potentials have proven efficient in protein structure prediction. In this work, we describe two differentiable knowledge-based potentials derived from a curated data set of RNA structures, with all-atom or coarse-grained representation, respectively. We focus on one aspect of the prediction problem: the identification of native-like RNA conformations from a set of near-native models. Using a variety of near-native RNA models generated from three independent methods, we show that our potential is able to distinguish the native structure and identify native-like conformations, even at the coarse-grained level. The all-atom version of our knowledge-based potential performs better and appears to be more effective at discriminating near-native RNA conformations than one of the most highly regarded parameterized potential. The fully differentiable form of our potentials will additionally likely be useful for structure refinement and/or molecular dynamics simulations.  相似文献   

10.
Fast evaluation of internal loops in RNA secondary structure prediction.   总被引:7,自引:0,他引:7  
MOTIVATION: Though not as abundant in known biological processes as proteins, RNA molecules serve as more than mere intermediaries between DNA and proteins. Research in the last 15 years demonstrates that RNA molecules serve in many roles, including catalysis. Furthermore, RNA secondary structure prediction based on free energy rules for stacking and loop formation remains one of the few major breakthroughs in the field of structure prediction, as minimum free energy structures and related quantities can be computed with full mathematical rigor. However, with the current energy parameters, the algorithms used hitherto suffer the disadvantage of either employing heuristics that risk (though highly unlikely) missing the optimal structure or becoming prohibitively time consuming for moderate to large sequences. RESULTS: We present a new method to evaluate internal loops utilizing currently used energy rules. This method reduces the time complexity of this part of the structure prediction from O(n4) to O(n3), thus reducing the overall complexity to O(n3). Even when the size of evaluated internal loops is bounded by k (a commonly used heuristic), the method presented has a competitive edge by reducing the time complexity of internal loop evaluation from O(k2n2) to O(kn2). The method also applies to the calculation of the equilibrium partition function. AVAILABILITY: Source code for an RNA secondary structure prediction program implementing this method is available at ftp://www.ibc.wustl.edu/pub/zuker/zuker .tar.Z  相似文献   

11.
RNA secondary structure is often predicted from sequence by free energy minimization. Over the past two years, advances have been made in the estimation of folding free energy change, the mapping of secondary structure and the implementation of computer programs for structure prediction. The trends in computer program development are: efficient use of experimental mapping of structures to constrain structure prediction; use of statistical mechanics to improve the fidelity of structure prediction; inclusion of pseudoknots in secondary structure prediction; and use of two or more homologous sequences to find a common structure.  相似文献   

12.
RNA secondary structure prediction using free energy minimization is one method to gain an approximation of structure. Constraints generated by enzymatic mapping or chemical modification can improve the accuracy of secondary structure prediction. We report a facile method that identifies single-stranded regions in RNA using short, randomized DNA oligonucleotides and RNase H cleavage. These regions are then used as constraints in secondary structure prediction. This method was used to improve the secondary structure prediction of Escherichia coli 5S rRNA. The lowest free energy structure without constraints has only 27% of the base pairs present in the phylogenetic structure. The addition of constraints from RNase H cleavage improves the prediction to 100% of base pairs. The same method was used to generate secondary structure constraints for yeast tRNAPhe, which is accurately predicted in the absence of constraints (95%). Although RNase H mapping does not improve secondary structure prediction, it does eliminate all other suboptimal structures predicted within 10% of the lowest free energy structure. The method is advantageous over other single-stranded nucleases since RNase H is functional in physiological conditions. Moreover, it can be used for any RNA to identify accessible binding sites for oligonucleotides or small molecules.  相似文献   

13.
With more and more ribonucleic acid (RNA) secondary structures accumulated, the need for comparing different RNA secondary structures often arises in function prediction and evolutionary analysis. Numerous efficient algorithms were developed for comparing different RNA secondary structures, but challenges remain. In this paper, six new models based on the linear regression model were proposed for the comparison of RNA secondary structures. The proposed models were tested on a mixed data, containing six secondary structures from RNase P RNAs, three secondary structures from SSU rRNA and five secondary structures from 16S ribosomal RNAs. The results have shown the effectiveness of the proposed models. Moreover, the time complexity of our models is favorable by comparing with that of the existing methods which solve the similar problem.  相似文献   

14.
Prediction of RNA secondary structure based on helical regions distribution   总被引:5,自引:0,他引:5  
MOTIVATION: RNAs play an important role in many biological processes and knowing their structure is important in understanding their function. Due to difficulties in the experimental determination of RNA secondary structure, the methods of theoretical prediction for known sequences are often used. Although many different algorithms for such predictions have been developed, this problem has not yet been solved. It is thus necessary to develop new methods for predicting RNA secondary structure. The most-used at present is Zuker's algorithm which can be used to determine the minimum free energy secondary structure. However many RNA secondary structures verified by experiments are not consistent with the minimum free energy secondary structures. In order to solve this problem, a method used to search a group of secondary structures whose free energy is close to the global minimum free energy was developed by Zuker in 1989. When considering a group of secondary structures, if there is no experimental data, we cannot tell which one is better than the others. This case also occurs in combinatorial and heuristic methods. These two kinds of methods have several weaknesses. Here we show how the central limit theorem can be used to solve these problems. RESULTS: An algorithm for predicting RNA secondary structure based on helical regions distribution is presented, which can be used to find the most probable secondary structure for a given RNA sequence. It consists of three steps. First, list all possible helical regions. Second, according to central limit theorem, estimate the occurrence probability of every helical region based on the Monte Carlo simulation. Third, add the helical region with the biggest probability to the current structure and eliminate the helical regions incompatible with the current structure. The above processes can be repeated until no more helical regions can be added. Take the current structure as the final RNA secondary structure. In order to demonstrate the confidence of the program, a test on three RNA sequences: tRNAPhe, Pre-tRNATyr, and Tetrahymena ribosomal RNA intervening sequence, is performed. AVAILABILITY: The program is written in Turbo Pascal 7.0. The source code is available upon request. CONTACT: Wujj@nic.bmi.ac.cn or Liwj@mail.bmi.ac.cn   相似文献   

15.
de Boer FK  Hogeweg P 《PloS one》2012,7(1):e29952
It is still not clear how prebiotic replicators evolved towards the complexity found in present day organisms. Within the most realistic scenario for prebiotic evolution, known as the RNA world hypothesis, such complexity has arisen from replicators consisting solely of RNA. Within contemporary life, remarkably many RNAs are involved in modifying other RNAs. In hindsight, such RNA-RNA modification might have helped in alleviating the limits of complexity posed by the information threshold for RNA-only replicators. Here we study the possible role of such self-modification in early evolution, by modeling the evolution of protocells as evolving replicators, which have the opportunity to incorporate these mechanisms as a molecular tool. Evolution is studied towards a set of 25 arbitrary 'functional' structures, while avoiding all other (misfolded) structures, which are considered to be toxic and increase the death-rate of a protocell. The modeled protocells contain a genotype of different RNA-sequences while their phenotype is the ensemble of secondary structures they can potentially produce from these RNA-sequences. One of the secondary structures explicitly codes for a simple sequence-modification tool. This 'RNA-adapter' can block certain positions on other RNA-sequences through antisense base-pairing. The altered sequence can produce an alternative secondary structure, which may or may not be functional. We show that the modifying potential of interacting RNA-sequences enables these protocells to evolve high fitness under high mutation rates. Moreover, our model shows that because of toxicity of misfolded molecules, redundant coding impedes the evolution of self-modification machinery, in effect restraining the evolvability of coding structures. Hence, high mutation rates can actually promote the evolution of complex coding structures by reducing redundant coding. Protocells can successfully use RNA-adapters to modify their genotype-phenotype mapping in order to enhance the coding capacity of their genome and fit more information on smaller sized genomes.  相似文献   

16.
王金华  骆志刚  管乃洋  严繁妹  靳新  张雯 《遗传》2007,29(7):889-897
多数RNA分子的结构在进化中是高度保守的, 其中很多包含伪结。而RNA伪结的预测一直是一个棘手问题, 很多RNA 二级结构预测算法都不能预测伪结。文章提出一种基于迭代法预测带伪结RNA 二级结构的新方法。该方法在给潜在碱基对打分时综合了热力学和协变信息, 通过基于最小自由能RNA折叠算法的多次迭代选出所有的碱基对。测试结果表明: 此方法几乎能预测到所有的伪结。与其他方法相比, 敏感度接近最优, 而特异性达到最优。  相似文献   

17.
MOTIVATION: Several algorithms have been developed for drawing RNA secondary structures, however none of these can be used to draw RNA pseudoknot structures. In the sense of graph theory, a drawing of RNA secondary structures is a tree, whereas a drawing of RNA pseudoknots is a graph with inner cycles within a pseudoknot as well as possible outer cycles formed between a pseudoknot and other structural elements. Thus, RNA pseudoknots are more difficult to visualize than RNA secondary structures. Since no automatic method for drawing RNA pseudoknots exists, visualizing RNA pseudoknots relies on significant amount of manual work and does not yield satisfactory results. The task of visualizing RNA pseudoknots by hand becomes more challenging as the size and complexity of the RNA pseudoknots increase. RESULTS: We have developed a new representation and an algorithm for drawing H-type pseudoknots with RNA secondary structures. Compared to existing representations of H-type pseudoknots, the new representation ensures uniform and clear drawings with no edge crossing for any H-type pseudoknots. To the best of our knowledge, this is the first algorithm for automatically drawing RNA pseudoknots with RNA secondary structures. The algorithm has been implemented in a Java program, which can be executed on any computing system. Experimental results demonstrate that the algorithm generates an aesthetically pleasing drawing of all H-type pseudoknots. The results have also shown that the drawing has high readability, enabling the user to quickly and easily recognize the whole RNA structure as well as the pseudoknots themselves.  相似文献   

18.
RNA–RNA binding is an important phenomenon observed for many classes of non-coding RNAs and plays a crucial role in a number of regulatory processes. Recently several MFE folding algorithms for predicting the joint structure of two interacting RNA molecules have been proposed. Here joint structure means that in a diagram representation the intramolecular bonds of each partner are pseudoknot-free, that the intermolecular binding pairs are noncrossing, and that there is no so-called “zigzag” configuration. This paper presents the combinatorics of RNA interaction structures including their generating function, singularity analysis as well as explicit recurrence relations. In particular, our results imply simple asymptotic formulas for the number of joint structures.  相似文献   

19.
RNA secondary-structure folding algorithms predict the existence of connected networks of RNA sequences with identical secondary structures. Fitness landscapes that are based on the mapping between RNA sequence and RNA secondary structure hence have many neutral paths. A neutral walk on these fitness landscapes gives access to a virtually unlimited number of secondary structures that are a single point mutation from the neutral path. This shows that neutral evolution explores phenotype space and can play a role in adaptation. Received: 23 December 1995 / Accepted: 17 March 1996  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号