Similar articles
 Found 20 similar articles; search took 78 ms
1.
Abstract

The process of designing novel RNA sequences by inverse RNA folding, available in tools such as RNAinverse and InfoRNA, can be thought of as a reconstruction of RNAs from secondary structure. In this reconstruction problem, no physical measures are considered as additional constraints that are independent of structure, aside from the goal of reaching the same secondary structure as the input using energy minimization methods. An extension of the reconstruction problem can be formulated, since for many natural RNAs it is desirable to analyze the sequence and structure of RNA molecules using various quantifiable physical measures. Prior works that used secondary structure predictions have shown that natural RNAs differ significantly from random RNAs in some of these measures. Thus, we relax the problem of reconstructing RNAs from secondary structure into reconstructing RNAs from shapes, and in turn incorporate physical quantities as constraints. This allows for the design of novel RNA sequences by inverse folding while considering various physical quantities of interest such as thermodynamic stability, mutational robustness, and linguistic complexity. Physical measures can be taken into account at the expense of altering, for example, the number of nucleotides in stems and loops. We use evolutionary computation for the new reconstruction problem and illustrate the procedure on various natural RNAs.

2.
As the raw material for evolution, arbitrary RNA sequences represent the baseline for RNA structure formation and a standard to which evolved structures can be compared. Here, we set out to probe, using physical and chemical methods, the structural properties of RNAs having randomly generated oligonucleotide sequences that were of sufficient length and information content to encode complex, functional folds, yet were unbiased by either genealogical or functional constraints. Typically, these unevolved, nonfunctional RNAs had sequence-specific secondary structure configurations and compact magnesium-dependent conformational states comparable to those of evolved RNA isolates. But unlike evolved sequences, arbitrary sequences were prone to having multiple competing conformations. Thus, for RNAs the size of small ribozymes, natural selection seems necessary to achieve uniquely folding sequences, but not to account for the well-ordered secondary structures and overall compactness observed in nature.

3.
As one of the earliest problems in computational biology, the RNA secondary structure prediction problem (sometimes referred to as "RNA folding") has attracted attention again, thanks to the recent discoveries of many novel non-coding RNA molecules. The two common approaches to this problem are de novo prediction of RNA secondary structure based on energy minimization and the consensus folding approach (computing the common secondary structure for a set of unaligned RNA sequences). Consensus folding algorithms work well when the correct seed alignment is part of the input to the problem. However, seed alignment itself is a challenging problem for diverged RNA families. In this paper, we propose a novel framework to predict the common secondary structure for unaligned RNA sequences. By matching putative stacks in RNA sequences, we make use of both primary sequence information and thermodynamic stability for prediction at the same time. We show that our method can predict the correct common RNA secondary structures even when we are given only a limited number of unaligned RNA sequences, and it outperforms current algorithms in sensitivity and accuracy.

4.
In addition to characteristic structural properties imposed by evolutionary modification, evolved, single-stranded RNAs also display characteristic structural properties imposed by intrinsic physical constraints on RNA polymer folding. The balance of intrinsic and functionally selected characters in the folded conformation of evolved secondary structures was determined by comparing the predicted secondary structures of evolved and unevolved (random) RNA sequences. Though evolved conformations are significantly more ordered than conformations of random-sequence RNA, this analysis demonstrates that the majority of conformational order within evolved structures results not from evolutionary optimization but from constraints imposed by rules intrinsic to RNA polymer folding. Received: 25 November 1998 / Accepted: 12 February 1999

5.
MOTIVATION: Most non-coding RNAs are characterized by a specific secondary and tertiary structure that determines their function. Here, we investigate the folding energy of the secondary structure of non-coding RNA sequences, such as microRNA precursors, transfer RNAs and ribosomal RNAs in several eukaryotic taxa. Statistical biases are assessed by a randomization test, in which the predicted minimum free energy of folding is compared with values obtained for structures inferred from randomly shuffling the original sequences. RESULTS: In contrast with transfer RNAs and ribosomal RNAs, the majority of the microRNA sequences clearly exhibit a folding free energy that is considerably lower than that for shuffled sequences, indicating a high tendency in the sequence towards a stable secondary structure. A possible usage of this statistical test in the framework of the detection of genuine miRNA sequences is discussed.
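The randomization test described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract compares predicted minimum free energies, whereas the sketch swaps in a much simpler stability score (the maximum number of nested base pairs, computed with a Nussinov-style dynamic program), so a higher score stands in for a lower folding energy. The function names `max_pairs` and `shuffle_test`, the minimum loop length of 3, and the number of shuffles are all hypothetical choices, not taken from the paper.

```python
import random
from statistics import mean, pstdev

# Watson-Crick and G-U wobble pairs (assumed pairing rules for this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs (Nussinov-style dynamic program).

    Stand-in for a real folding energy: more pairs is treated as more stable.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):  # case: k pairs with j
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


def shuffle_test(seq, n_shuffles=200, seed=0):
    """Compare the score of `seq` with scores of shuffled copies.

    Returns (observed score, z-score, empirical one-sided p-value).
    """
    rng = random.Random(seed)
    observed = max_pairs(seq)
    letters = list(seq)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # mononucleotide shuffle keeps base composition
        null.append(max_pairs("".join(letters)))
    mu, sigma = mean(null), pstdev(null)
    z = (observed - mu) / sigma if sigma > 0 else 0.0
    # Fraction of shuffles at least as "stable" as the original sequence.
    p = (1 + sum(s >= observed for s in null)) / (n_shuffles + 1)
    return observed, z, p


if __name__ == "__main__":
    print(shuffle_test("GGGAAAUCC", n_shuffles=50))
```

To get closer to the test in the abstract, `max_pairs` would be replaced by a real minimum free energy predictor, and the comparison direction would be reversed, since a lower energy means a more stable fold.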

6.
RNA molecules, which are found in all living cells, fold into characteristic structures that account for their diverse functional activities. Many of these RNA structures consist of a collection of fundamental RNA motifs. The various combinations of RNA basic components form different RNA classes and define their unique structural and functional properties. The availability of many genome sequences makes it possible to search computationally for functional RNAs. Biological experiments indicate that functional RNAs have characteristic RNA structural motifs represented by specific combinations of base pairings and conserved nucleotides in the loop regions. Searching for these well-ordered RNA structures and their homologues in genomic sequences is very helpful for understanding RNA-based gene regulation. In this paper, we consider the following problem: given an RNA sequence with a known secondary structure, efficiently determine candidate segments in genomic sequences that can potentially form RNA secondary structures similar to the given RNA secondary structure. Our new bottom-up approach first searches for all potential stem-loops similar to those of the given RNA secondary structure, and then, based on the located stem-loops, detects potential homologous structural RNAs in genomic sequences.

7.
RNA molecules with structure dependent functions are uniquely folded   Cited by: 3 (self-citations: 3, citations by others: 0)

8.
Comparative sequence analysis addresses the problem of RNA folding and RNA structural diversity, and is responsible for determining the folding of many RNA molecules, including 5S, 16S, and 23S rRNAs, tRNA, RNAse P RNA, and Group I and II introns. Initially this method was utilized to fold these sequences into their secondary structures. More recently, this method has revealed numerous tertiary correlations, elucidating novel RNA structural motifs, several of which have been experimentally tested and verified, substantiating the general application of this approach. As successful as the comparative methods have been in elucidating higher-order structure, it is clear that additional structure constraints remain to be found. Deciphering such constraints requires more sensitive and rigorous protocols, in addition to RNA sequence datasets that contain additional phylogenetic diversity and an overall increase in the number of sequences. Various RNA databases, including the tRNA and rRNA sequence datasets, continue to grow in number as well as diversity. Described herein is the development of more rigorous comparative analysis protocols. Our initial development and applications on different RNA datasets have been very encouraging. Such analyses on tRNA, 16S and 23S rRNA are substantiating previously proposed associations and are now beginning to reveal additional constraints on these molecules. A subset of these involves several positions that correlate simultaneously with one another, implying that units larger than a base pair can be under a phylogenetic constraint.

9.
Functional RNAs can fold into intricate structures using a number of different secondary and tertiary structural motifs. Many factors contribute to the overall free energy of the target fold. This study aims at quantifying the entropic costs coming from the loss of conformational freedom when the sugar-phosphate backbone is subjected to constraints imposed by secondary and tertiary contacts. Motivated by insights from topology theory, we design a diagrammatic scheme to represent different types of RNA structures so that constraints associated with a folded structure may be segregated into mutually independent subsets, enabling the total conformational entropy loss to be easily calculated as a sum of independent terms. We used high-throughput Monte Carlo simulations to simulate large ensembles of single-stranded RNA sequences in solution to validate the assumptions behind our diagrammatic scheme, examining the entropic costs for hairpin initiation and formation of many multiway junctions. Our diagrammatic scheme aids in the factorization of secondary/tertiary constraints into distinct topological classes and facilitates the discovery of interrelationships among multiple constraints on RNA folds. This perspective, which to our knowledge is novel, leads to useful insights into the inner workings of some functional RNA sequences, demonstrating how they might operate by transforming their structures among different topological classes.

10.
11.
Among all biological macromolecules, RNAs are unique in their functional versatility, which includes encoding or transferring genetic information and performing catalysis. These biological functions are highly dependent upon RNA folding and structure. Since the discovery of catalytic RNAs in the early 1980s, a recent breakthrough came from the identification of a wealth of micro RNAs, small interfering RNAs and regulatory RNAs, all involved in modulation of gene expression. The structure of these novel RNAs, either free or in complex with specific ligands, can be analyzed using various experimental strategies, including X-ray crystallography, cryo-electron microscopy, nuclear magnetic resonance spectroscopy, structure-specific probes (some of which can be used in living cells), RNA engineering, thermal denaturation and mass spectrometry. Among these, X-ray crystallography has recently enabled determination of the structures of several large and complex RNAs, as well as of ribonucleoprotein complexes. The database of RNA structure has grown tremendously since the recent crystal structure analyses of the prokaryotic ribosome and its subunits. These methods are now widely applied to a variety of biologically relevant RNAs.

12.
How RNA folds.   Cited by: 9 (self-citations: 0, citations by others: 9)
We describe the RNA folding problem and contrast it with the much more difficult protein folding problem. RNA has four similar monomer units, whereas proteins have 20 very different residues. The folding of RNA is hierarchical in that secondary structure is much more stable than tertiary folding. In RNA the two levels of folding (secondary and tertiary) can be experimentally separated by the presence or absence of Mg²⁺. Secondary structure can be predicted successfully from experimental thermodynamic data on secondary structure elements: helices, loops, and bulges. Tertiary interactions can then be added without much distortion of the secondary structure. These observations suggest a folding algorithm to predict the structure of an RNA from its sequence. However, to solve the RNA folding problem one needs thermodynamic data on tertiary structure interactions, and identification and characterization of metal-ion binding sites. These data, together with force versus extension measurements on single RNA molecules, should provide the information necessary to test and refine the proposed algorithm.

13.

Background

Ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation. Their secondary structures are crucial for RNA functionality, and the prediction of secondary structures is widely studied. Our previous research shows that cutting long sequences into shorter chunks, predicting secondary structures of the chunks independently using thermodynamic methods, and reconstructing the entire secondary structure from the predicted chunk structures can yield better accuracy than predicting the secondary structure using the RNA sequence as a whole. The chunking, prediction, and reconstruction processes can use different methods and parameters, some of which produce more accurate predictions than others. In this paper, we study the prediction accuracy and efficiency of three different chunking methods using seven popular secondary structure prediction programs applied to two datasets of RNA with known secondary structures, which include both pseudoknotted and non-pseudoknotted sequences, as well as a family of viral genome RNAs whose structures have not been predicted before. Our modularized MapReduce framework based on Hadoop allows us to study the problem in a parallel and robust environment.

Results

On average, the maximum accuracy retention values are larger than one for our chunking methods and the seven prediction programs over 50 non-pseudoknotted sequences, meaning that the secondary structure predicted using chunking is more similar to the real structure than the secondary structure predicted by using the whole sequence. We observe similar results for the 23 pseudoknotted sequences, except for the NUPACK program using the centered chunking method. The performance analysis for 14 long RNA sequences from the Nodaviridae virus family outlines how the coarse-grained mapping of chunking and predictions in the MapReduce framework exhibits shorter turnaround times for short RNA sequences. However, as the lengths of the RNA sequences increase, the fine-grained mapping can surpass the coarse-grained mapping in performance.

Conclusions

By using our MapReduce framework together with statistical analysis on the accuracy retention results, we observe how the inversion-based chunking methods can outperform predictions using the whole sequence. Our chunk-based approach also enables us to predict secondary structures for very long RNA sequences, which is not feasible with traditional methods alone.
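The chunk, predict, and reconstruct idea from this abstract can be sketched as below. This is a minimal illustration only: it uses fixed-length, non-overlapping chunks, which is an assumption and not necessarily one of the three chunking methods studied in the paper, and `toy_hairpin` is a hypothetical stand-in for a real prediction program. The names `chunk_bounds` and `predict_by_chunks` are likewise invented for this sketch, and the MapReduce/Hadoop layer is not modeled.

```python
# Watson-Crick and G-U wobble pairs (assumed pairing rules for this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def chunk_bounds(length, chunk_size):
    """Split [0, length) into consecutive, non-overlapping (start, end) windows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [(s, min(s + chunk_size, length)) for s in range(0, length, chunk_size)]


def toy_hairpin(seq, min_loop=3):
    """Hypothetical predictor: pair the two ends inward while they are complementary.

    Returns 0-based (i, j) pairs local to `seq`.
    """
    pairs = []
    i, j = 0, len(seq) - 1
    while j - i > min_loop and (seq[i], seq[j]) in PAIRS:
        pairs.append((i, j))
        i += 1
        j -= 1
    return pairs


def predict_by_chunks(seq, chunk_size, predict=toy_hairpin):
    """Predict each chunk independently, then merge pairs in whole-sequence coordinates."""
    merged = []
    for start, end in chunk_bounds(len(seq), chunk_size):
        for i, j in predict(seq[start:end]):
            # Shift chunk-local indices by the chunk offset.
            merged.append((i + start, j + start))
    return sorted(merged)


if __name__ == "__main__":
    sequence = "GGGAAAUCC" + "GCAAAAGC"
    print(predict_by_chunks(sequence, chunk_size=9))
```

One consequence of this design is that base pairs spanning a chunk boundary can never be reported, so the choice of chunk boundaries matters; each chunk is also independent of the others, which is what makes the prediction step easy to run in parallel.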

14.
A linear segment in which a number of pairs of intervals of equal length are identified as potential stems is the subject of a folding problem analogous to inference of RNA secondary structure. A quantity of free energy (or equivalently, energy per unit length) is associated with each stem, and the various types of loops are assigned energy costs as a function of their lengths. Inference of stable structures can then be carried out in the same way as in RNA folding. More importantly, perturbation of stem lengths and energy densities (modelling various mutational processes affecting nucleotide sequences) allows the delineation of domains of stability of various foldings, through the explicit calculation of their boundaries, in a low-dimensional parameter space.

15.
Functional RNA regions are often related to recurrent secondary structure patterns (or motifs), which can exert their role in several different ways, particularly in dictating the interaction with RNA-binding proteins, and acting in the regulation of a large number of cellular processes. Among the available motif-finding tools, the majority focus on sequence patterns, sometimes including secondary structure as an additional constraint to improve their performance. Nonetheless, secondary structure motifs may be concurrent with their sequence counterparts or even encode a stronger functional signal. Current methods for searching structural motifs generally require long pipelines and/or high computational efforts or previously aligned sequences. Here, we present BEAM (BEAr Motif finder), a novel method for structural motif discovery from a set of unaligned RNAs, taking advantage of a recently developed encoding for RNA secondary structure named BEAR (Brand nEw Alphabet for RNAs) and of evolutionary substitution rates of secondary structure elements. Tested in a varied set of scenarios, from small- to large-scale, BEAM is successful in retrieving structural motifs even in highly noisy data sets, such as those that can arise in CLIP-Seq or other high-throughput experiments.

16.
Ndifon W. Bio Systems, 2005, 82(3): 257-265
The kinetic folding of RNA sequences into secondary structures is modeled as a complex adaptive system, the components of which are possible RNA structural rearrangements (SRs) and their associated bases and base pairs. RNA bases and base pairs engage in local stacking interactions that determine the probabilities (or fitnesses) of possible SRs. Meanwhile, selection operates at the level of SRs; an autonomous stochastic process periodically (i.e., from one time step to another) selects a subset of possible SRs for realization based on the fitnesses of the SRs. Using examples based on selected natural and synthetic RNAs, the model is shown to reproduce characteristic (nonlinear) RNA folding dynamics such as the attainment by RNAs of alternative stable states. Possible applications of the model to the analysis of properties of fitness landscapes, and of the RNA sequence-to-structure mapping are discussed.

17.

Background  

RNA exhibits a variety of structural configurations. Here we consider a structure to be tantamount to the noncrossing Watson-Crick and G-U base pairings (secondary structure) and additional cross-serial base pairs. These interactions are called pseudoknots and are observed across the whole spectrum of RNA functionalities. In the context of studying natural RNA structures, searching for new ribozymes and designing artificial RNA, it is of interest to find RNA sequences folding into a specific structure and to analyze their induced neutral networks. Since the established inverse folding algorithms, RNAinverse, RNA-SSD as well as INFO-RNA, are limited to RNA secondary structures, we present in this paper the inverse folding algorithm Inv, which can deal with 3-noncrossing, canonical pseudoknot structures.

18.
B D James, G J Olsen, J S Liu, N R Pace. Cell, 1988, 52(1): 19-26
Secondary structure models for the ribonuclease (RNAase) P RNAs of Bacillus subtilis and E. coli were derived by a phylogenetic comparative analysis of published sequences as well as four novel ones. The RNAase P RNA genes from Bacillus megaterium, Bacillus brevis, Bacillus stearothermophilus, and Pseudomonas fluorescens were cloned, sequenced, and compared with the other available sequences. Regions of pairing were identified by the occurrence of homologous complementary sequences that vary among the compared molecules. A common core of primary and secondary structure can be identified in all these RNAase P RNAs. The previously noted striking differences between the Bacillus and the enteric RNAase P RNAs arise not only from point mutations, but from the addition or deletion of structural domains. The primary and secondary structural features that are common to all of the RNAase P RNAs are likely to be the elements involved in the binding and cleavage of tRNA precursors, and in the interaction with the RNAase P protein.

19.
We investigated the relationship between RNA structure and folding rates accounting for hierarchical structural formation. Folding rates of two-state folding proteins correlate well with relative contact order, a quantitative measure of the number and sequence distance between tertiary contacts. These proteins do not form stable structures prior to the rate-limiting step. In contrast, most secondary structures are stably formed prior to the rate-limiting step in RNA folding. Accordingly, we introduce "reduced contact order", a metric that reflects only the number of residues available to participate in the conformational search after the formation of secondary structure. Plotting the folding rates and the reduced contact order from ten different RNAs suggests that RNA folding can be divided into two classes. To examine this division, folding rates of circularly permuted isomers are compared for two RNAs, one from each class. Folding rates vary by tenfold for circularly permuted Bacillus subtilis RNase P RNA isomers, whereas folding rates vary by only 1.2-fold for circularly permuted catalytic domains. This difference is likely related to the dissimilar natures of their rate-limiting steps.
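For orientation, the relative contact order mentioned in this abstract is commonly computed as the average sequence separation of contacting residues divided by the chain length, RCO = (1 / (L * N)) * sum over contacts of |i - j|, where L is the number of residues and N the number of contacts. The sketch below implements only that standard protein-folding quantity. The function name `relative_contact_order` and the contact list are hypothetical, and the abstract's "reduced contact order" is deliberately not reproduced because its exact formula is not given here.

```python
def relative_contact_order(contacts, length):
    """Relative contact order: mean sequence separation of contacts / chain length.

    `contacts` is an iterable of (i, j) residue index pairs; `length` is the
    total number of residues in the chain.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    contacts = list(contacts)
    if not contacts:
        return 0.0
    total_separation = sum(abs(j - i) for i, j in contacts)
    return total_separation / (len(contacts) * length)


if __name__ == "__main__":
    # Two made-up contacts in a 10-residue chain: separations 9 and 5.
    print(relative_contact_order([(0, 9), (2, 7)], length=10))
```

A chain dominated by contacts between residues that are close in sequence gives a low value, while one dominated by long-range contacts gives a high value.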

20.
Accompanying recent advances in determining RNA secondary structure is the growing appreciation for the importance of relatively simple topological constraints, encoded at the secondary structure level, in defining the overall architecture, folding pathways, and dynamic adaptability of RNA. A new view is emerging in which tertiary interactions do not define RNA 3D structure, but rather, help select specific conformers from an already narrow, topologically pre-defined conformational distribution. Studies are providing fundamental insights into the nature of these topological constraints, how they are encoded by the RNA secondary structure, and how they interplay with other interactions, breathing new meaning into RNA secondary structure. New approaches have been developed that take advantage of topological constraints in determining RNA backbone conformation based on secondary structure, and a limited set of other, easily accessible constraints. Topological constraints are also providing a much-needed framework for rationalizing and describing RNA dynamics and structural adaptation. Finally, studies suggest that topological constraints may play important roles in steering RNA folding pathways. Here, we review recent advances in our understanding of topological constraints encoded by the RNA secondary structure.
