首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
2.
MOTIVATION: Searching genomes for non-coding RNAs (ncRNAs) by their secondary structure has become an important goal for bioinformatics. For pseudoknot-free structures, ncRNA search can be effective based on the covariance model and CYK-type dynamic programming. However, the computational difficulty in aligning an RNA sequence to a pseudoknot has prohibited fast and accurate search of arbitrary RNA structures. Our previous work introduced a graph model for RNA pseudoknots and proposed to solve the structure-sequence alignment by graph optimization. Given k candidate regions in the target sequence for each of the n stems in the structure, we could compute a best alignment in time O(k(t)n) based upon a tree width t decomposition of the structure graph. However, to implement this method to programs that can routinely perform fast yet accurate RNA pseudoknot searches, we need novel heuristics to ensure that, without degrading the accuracy, only a small number of stem candidates need to be examined and a tree decomposition of a small tree width can always be found for the structure graph. RESULTS: The current work builds on the previous one with newly developed preprocessing algorithms to reduce the values for parameters k and t and to implement the search method into a practical program, called RNATOPS, for RNA pseudoknot search. In particular, we introduce techniques, based on probabilistic profiling and distance penalty functions, which can identify for every stem just a small number k (e.g. k 相似文献   

3.
王金华  骆志刚  管乃洋  严繁妹  靳新  张雯 《遗传》2007,29(7):889-897
多数RNA分子的结构在进化中是高度保守的, 其中很多包含伪结。而RNA伪结的预测一直是一个棘手问题, 很多RNA 二级结构预测算法都不能预测伪结。文章提出一种基于迭代法预测带伪结RNA 二级结构的新方法。该方法在给潜在碱基对打分时综合了热力学和协变信息, 通过基于最小自由能RNA折叠算法的多次迭代选出所有的碱基对。测试结果表明: 此方法几乎能预测到所有的伪结。与其他方法相比, 敏感度接近最优, 而特异性达到最优。  相似文献   

4.
RNA secondary structure is often predicted from sequence by free energy minimization. Over the past two years, advances have been made in the estimation of folding free energy change, the mapping of secondary structure and the implementation of computer programs for structure prediction. The trends in computer program development are: efficient use of experimental mapping of structures to constrain structure prediction; use of statistical mechanics to improve the fidelity of structure prediction; inclusion of pseudoknots in secondary structure prediction; and use of two or more homologous sequences to find a common structure.  相似文献   

5.
Gupta A  Rahman R  Li K  Gribskov M 《RNA biology》2012,9(2):187-199
The close relationship between RNA structure and function underlines the significance of accurately predicting RNA structures from sequence information. Structural topologies such as pseudoknots are of particular interest due to their ubiquity and direct involvement in RNA function, but identifying pseudoknots is a computationally challenging problem and existing heuristic approaches usually perform poorly for RNA sequences of even a few hundred bases. We survey the performance of pseudoknot prediction methods on a data set of full-length RNA sequences representing varied sequence lengths, and biological RNA classes such as RNase P RNA, Group I Intron, tmRNA and tRNA. Pseudoknot prediction methods are compared with minimum free energy and suboptimal secondary structure prediction methods in terms of correct base-pairs, stems and pseudoknots and we find that the ensemble of suboptimal structure predictions succeeds in identifying correct structural elements in RNA that are usually missed in MFE and pseudoknot predictions. We propose a strategy to identify a comprehensive set of non-redundant stems in the suboptimal structure space of a RNA molecule by applying heuristics that reduce the structural redundancy of the predicted suboptimal structures by merging slightly varying stems that are predicted to form in local sequence regions. This reduced-redundancy set of structural elements consistently outperforms more specialized approaches.in data sets. Thus, the suboptimal folding space can be used to represent the structural diversity of an RNA molecule more comprehensively than optimal structure prediction approaches alone.  相似文献   

6.
MOTIVATION: Several algorithms have been developed for drawing RNA secondary structures, however none of these can be used to draw RNA pseudoknot structures. In the sense of graph theory, a drawing of RNA secondary structures is a tree, whereas a drawing of RNA pseudoknots is a graph with inner cycles within a pseudoknot as well as possible outer cycles formed between a pseudoknot and other structural elements. Thus, RNA pseudoknots are more difficult to visualize than RNA secondary structures. Since no automatic method for drawing RNA pseudoknots exists, visualizing RNA pseudoknots relies on significant amount of manual work and does not yield satisfactory results. The task of visualizing RNA pseudoknots by hand becomes more challenging as the size and complexity of the RNA pseudoknots increase. RESULTS: We have developed a new representation and an algorithm for drawing H-type pseudoknots with RNA secondary structures. Compared to existing representations of H-type pseudoknots, the new representation ensures uniform and clear drawings with no edge crossing for any H-type pseudoknots. To the best of our knowledge, this is the first algorithm for automatically drawing RNA pseudoknots with RNA secondary structures. The algorithm has been implemented in a Java program, which can be executed on any computing system. Experimental results demonstrate that the algorithm generates an aesthetically pleasing drawing of all H-type pseudoknots. The results have also shown that the drawing has high readability, enabling the user to quickly and easily recognize the whole RNA structure as well as the pseudoknots themselves.  相似文献   

7.
Computational tools for prediction of the secondary structure of two or more interacting nucleic acid molecules are useful for understanding mechanisms for ribozyme function, determining the affinity of an oligonucleotide primer to its target, and designing good antisense oligonucleotides, novel ribozymes, DNA code words, or nanostructures. Here, we introduce new algorithms for prediction of the minimum free energy pseudoknot-free secondary structure of two or more nucleic acid molecules, and for prediction of alternative low-energy (sub-optimal) secondary structures for two nucleic acid molecules. We provide a comprehensive analysis of our predictions against secondary structures of interacting RNA molecules drawn from the literature. Analysis of our tools on 17 sequences of up to 200 nucleotides that do not form pseudoknots shows that they have 79% accuracy, on average, for the minimum free energy predictions. When the best of 100 sub-optimal foldings is taken, the average accuracy increases to 91%. The accuracy decreases as the sequences increase in length and as the number of pseudoknots and tertiary interactions increases. Our algorithms extend the free energy minimization algorithm of Zuker and Stiegler for secondary structure prediction, and the sub-optimal folding algorithm by Wuchty et al. Implementations of our algorithms are freely available in the package MultiRNAFold.  相似文献   

8.
MOTIVATION: Since the whole genome sequences of many species have been determined, computational prediction of RNA secondary structures and computational identification of those non-coding RNA regions by comparative genomics become important. Therefore, more advanced alignment methods are required. Recently, an approach of structural alignment for RNA sequences has been introduced to solve these problems. Pair hidden Markov models on tree structures (PHMMTSs) proposed by Sakakibara are efficient automata-theoretic models for structural alignment of RNA secondary structures, although PHMMTSs are incapable of handling pseudoknots. On the other hand, tree adjoining grammars (TAGs), a subclass of context-sensitive grammars, are suitable for modeling pseudoknots. Our goal is to extend PHMMTSs by incorporating TAGs to be able to handle pseudoknots. RESULTS: We propose pair stochastic TAGs (PSTAGs) for aligning and predicting RNA secondary structures including a simple type of pseudoknot which can represent most known pseudoknot structures. First, we extend PHMMTSs defined on alignment of 'trees' to PSTAGs defined on alignment of 'TAG trees' which represent derivation processes of TAGs and are functionally equivalent to derived trees of TAGs. Then, we develop an efficient dynamic programming algorithm of PSTAGs for obtaining an optimal structural alignment including pseudoknots. We implement the PSTAG algorithm and demonstrate the properties of the algorithm by using it to align and predict several small pseudoknot structures. We believe that our implemented program based on PSTAGs is the first grammar-based and practically executable software for comparative analyses of RNA pseudoknot structures, and, further, non-coding RNAs.  相似文献   

9.
RNA pseudoknot prediction in energy-based models.   总被引:11,自引:0,他引:11  
RNA molecules are sequences of nucleotides that serve as more than mere intermediaries between DNA and proteins, e.g., as catalytic molecules. Computational prediction of RNA secondary structure is among the few structure prediction problems that can be solved satisfactorily in polynomial time. Most work has been done to predict structures that do not contain pseudoknots. Allowing pseudoknots introduces modeling and computational problems. In this paper we consider the problem of predicting RNA secondary structures with pseudoknots based on free energy minimization. We first give a brief comparison of energy-based methods for predicting RNA secondary structures with pseudoknots. We then prove that the general problem of predicting RNA secondary structures containing pseudoknots is NP complete for a large class of reasonable models of pseudoknots.  相似文献   

10.
The paper investigates the computational problem of predicting RNA secondary structures. The general belief is that allowing pseudoknots makes the problem hard. Existing polynomial-time algorithms are heuristic algorithms with no performance guarantee and can handle only limited types of pseudoknots. In this paper, we initiate the study of predicting RNA secondary structures with a maximum number of stacking pairs while allowing arbitrary pseudoknots. We obtain two approximation algorithms with worst-case approximation ratios of 1/2 and 1/3 for planar and general secondary structures, respectively. For an RNA sequence of n bases, the approximation algorithm for planar secondary structures runs in O(n(3)) time while that for the general case runs in linear time. Furthermore, we prove that allowing pseudoknots makes it NP-hard to maximize the number of stacking pairs in a planar secondary structure. This result is in contrast with the recent NP-hard results on psuedoknots which are based on optimizing some general and complicated energy functions.  相似文献   

11.
Pseudoknots are an essential feature of RNA tertiary structures. Simple H-type pseudoknots have been studied extensively in terms of biological functions, computational prediction, and energy models. Intramolecular kissing hairpins are a more complex and biologically important type of pseudoknot in which two hairpin loops form base pairs. They are hard to predict using free energy minimization due to high computational requirements. Heuristic methods that allow arbitrary pseudoknots strongly depend on the quality of energy parameters, which are not yet available for complex pseudoknots. We present an extension of the heuristic pseudoknot prediction algorithm DotKnot, which covers H-type pseudoknots and intramolecular kissing hairpins. Our framework allows for easy integration of advanced H-type pseudoknot energy models. For a test set of RNA sequences containing kissing hairpins and other types of pseudoknot structures, DotKnot outperforms competing methods from the literature. DotKnot is available as a web server under http://dotknot.csse.uwa.edu.au.  相似文献   

12.
Algorithms for prediction of RNA secondary structure-the set of base pairs that form when an RNA molecule folds-are valuable to biologists who aim to understand RNA structure and function. Improving the accuracy and efficiency of prediction methods is an ongoing challenge, particularly for pseudoknotted secondary structures, in which base pairs overlap. This challenge is biologically important, since pseudoknotted structures play essential roles in functions of many RNA molecules, such as splicing and ribosomal frameshifting. State-of-the-art methods, which are based on free energy minimization, have high run-time complexity (typically Theta(n(5)) or worse), and can handle (minimize over) only limited types of pseudoknotted structures. We propose a new approach for prediction of pseudoknotted structures, motivated by the hypothesis that RNA structures fold hierarchically, with pseudoknot-free (non-overlapping) base pairs forming first, and pseudoknots forming later so as to minimize energy relative to the folded pseudoknot-free structure. Our HFold algorithm uses two-phase energy minimization to predict hierarchically formed secondary structures in O(n(3)) time, matching the complexity of the best algorithms for pseudoknot-free secondary structure prediction via energy minimization. Our algorithm can handle a wide range of biological structures, including kissing hairpins and nested kissing hairpins, which have previously required Theta(n(6)) time.  相似文献   

13.
Dynamic programming algorithms that predict RNA secondary structure by minimizing the free energy have had one important limitation. They were able to predict only one optimal structure. Given the uncertainties of the thermodynamic data and the effects of proteins and other environmental factors on structure, the optimal structure predicted by these methods may not have biological significance. We present a dynamic programming algorithm that can determine optimal and suboptimal secondary structures for an RNA. The power and utility of the method is demonstrated in the folding of the intervening sequence of the rRNA of Tetrahymena. By first identifying the major secondary structures corresponding to the lowest free energy minima, a secondary structure of possible biological significance is derived.  相似文献   

14.
We present TT2NE, a new algorithm to predict RNA secondary structures with pseudoknots. The method is based on a classification of RNA structures according to their topological genus. TT2NE is guaranteed to find the minimum free energy structure regardless of pseudoknot topology. This unique proficiency is obtained at the expense of the maximum length of sequences that can be treated, but comparison with state-of-the-art algorithms shows that TT2NE significantly improves the quality of predictions. Analysis of TT2NE's incorrect predictions sheds light on the need to study how sterical constraints limit the range of pseudoknotted structures that can be formed from a given sequence. An implementation of TT2NE on a public server can be found at http://ipht.cea.fr/rna/tt2ne.php.  相似文献   

15.
The field of RNA structure prediction has experienced significant advances in the past several years, thanks to the availability of new experimental data and improved computational methodologies. These methods determine RNA secondary structures and pseudoknots from sequence alignments, thermodynamics-based dynamic programming algorithms, genetic algorithms and combined approaches. Computational RNA three-dimensional modeling uses this information in conjunction with manual manipulation, constraint satisfaction methods, molecular mechanics and molecular dynamics. The ultimate goal of automatically producing RNA three-dimensional models from given secondary and tertiary structure data, however, is still not fully realized. Recent developments in the computational prediction of RNA structure have helped bridge the gap between RNA secondary structure prediction, including pseudoknots, and three-dimensional modeling of RNA.  相似文献   

16.
本文在最大权重匹配(Maximum Weighted Matching,MWM)算法的基顾础上引入与茎区长度相关的动态权重,采用一种递归算法逐步寻找具有最大权重和的茎区。从而最终确定RNA的二级结构.该算法避开了繁杂的自由能计算,同样也能达到较高的预测精确度并且还能预测到大多数类型的潜在假结(pseudoknots).  相似文献   

17.
The algorithm and the program for the prediction of RNA secondary structure with pseudoknot formation have been proposed. The algorithm simulates stepwise folding by generating random structures using Monte Carlo method, followed by the selection of helices to final structure on the basis of both their probabilities of occurrence in a random structure and free energy parameters. The program versions have been tested on ribosomal RNA structures and on RNAs with pseudoknots evidenced by experimental data. It is shown that the simulation of folding during RNA synthesis improves the results. The introduction of pseudoknot formation permits to predict the pseudoknotted structures and to improve the prediction of long-range interactions. The computer program is rather fast and allows to predict the structures for long RNAs without using large memory volumes in usual personal computer.  相似文献   

18.
This article describes the latest version of an RNA folding algorithm that predicts both optimal and suboptimal solutions based on free energy minimization. A number of RNA's with known structures deduced from comparative sequence analysis are folded to test program performance. The group of solutions obtained for each molecule is analysed to determine how many of the known helixes occur in the optimal solution and in the best suboptimal solution. In most cases, a structure about 80% correct is found with a free energy within 2% of the predicted lowest free energy structure.  相似文献   

19.
It is a significant challenge to predict RNA secondary structures including pseudoknots. Here, a new algorithm capable of predicting pseudoknots of any topology, ProbKnot, is reported. ProbKnot assembles maximum expected accuracy structures from computed base-pairing probabilities in O(N2) time, where N is the length of the sequence. The performance of ProbKnot was measured by comparing predicted structures with known structures for a large database of RNA sequences with fewer than 700 nucleotides. The percentage of known pairs correctly predicted was 69.3%. Additionally, the percentage of predicted pairs in the known structure was 61.3%. This performance is the highest of four tested algorithms that are capable of pseudoknot prediction. The program is available for download at: http://rna.urmc.rochester.edu/RNAstructure.html.  相似文献   

20.
The accurate prediction of the secondary and tertiary structure of an RNA with different folding algorithms is dependent on several factors, including the energy functions. However, an RNA higher-order structure cannot be predicted accurately from its sequence based on a limited set of energy parameters. The inter- and intramolecular forces between this RNA and other small molecules and macromolecules, in addition to other factors in the cell such as pH, ionic strength, and temperature, influence the complex dynamics associated with transition of a single stranded RNA to its secondary and tertiary structure. Since all of the factors that affect the formation of an RNAs 3D structure cannot be determined experimentally, statistically derived potential energy has been used in the prediction of protein structure. In the current work, we evaluate the statistical free energy of various secondary structure motifs, including base-pair stacks, hairpin loops, and internal loops, using their statistical frequency obtained from the comparative analysis of more than 50,000 RNA sequences stored in the RNA Comparative Analysis Database (rCAD) at the Comparative RNA Web (CRW) Site. Statistical energy was computed from the structural statistics for several datasets. While the statistical energy for a base-pair stack correlates with experimentally derived free energy values, suggesting a Boltzmann-like distribution, variation is observed between different molecules and their location on the phylogenetic tree of life. Our statistical energy values calculated for several structural elements were utilized in the Mfold RNA-folding algorithm. The combined statistical energy values for base-pair stacks, hairpins and internal loop flanks result in a significant improvement in the accuracy of secondary structure prediction; the hairpin flanks contribute the most.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号