Similar articles
 20 similar articles found (search time: 46 ms)
1.
The process of designing novel RNA sequences by inverse RNA folding, available in tools such as RNAinverse and InfoRNA, can be thought of as a reconstruction of RNAs from secondary structure. In this reconstruction problem, no physical measures are considered as additional constraints that are independent of structure, aside from the goal of reaching the same secondary structure as the input using energy minimization methods. An extension of the reconstruction problem can be formulated, since for many natural RNAs it is desirable to analyze the sequence and structure of RNA molecules using various quantifiable physical measures. In prior works that used secondary structure predictions, it has been shown that natural RNAs differ significantly from random RNAs in some of these measures. Thus, we relax the problem of reconstructing RNAs from secondary structure into reconstructing RNAs from shapes, and in turn incorporate physical quantities as constraints. This allows for the design of novel RNA sequences by inverse folding while considering various physical quantities of interest such as thermodynamic stability, mutational robustness, and linguistic complexity. Physical measures can be taken into account at the expense of altering, for example, the number of nucleotides in stems and loops. We use evolutionary computation for the new reconstruction problem and illustrate the procedure on various natural RNAs.

2.
As the raw material for evolution, arbitrary RNA sequences represent the baseline for RNA structure formation and a standard to which evolved structures can be compared. Here, we set out to probe, using physical and chemical methods, the structural properties of RNAs having randomly generated oligonucleotide sequences that were of sufficient length and information content to encode complex, functional folds, yet were unbiased by either genealogical or functional constraints. Typically, these unevolved, nonfunctional RNAs had sequence-specific secondary structure configurations and compact magnesium-dependent conformational states comparable to those of evolved RNA isolates. But unlike evolved sequences, arbitrary sequences were prone to having multiple competing conformations. Thus, for RNAs the size of small ribozymes, natural selection seems necessary to achieve uniquely folding sequences, but not to account for the well-ordered secondary structures and overall compactness observed in nature.

3.

Background  

RNA exhibits a variety of structural configurations. Here we consider a structure to be tantamount to the noncrossing Watson-Crick and G-U base pairings (secondary structure) and additional cross-serial base pairs. These interactions are called pseudoknots and are observed across the whole spectrum of RNA functionalities. In the context of studying natural RNA structures, searching for new ribozymes, and designing artificial RNA, it is of interest to find RNA sequences folding into a specific structure and to analyze their induced neutral networks. Since the established inverse folding algorithms RNAinverse, RNA-SSD, and INFO-RNA are limited to RNA secondary structures, we present in this paper the inverse folding algorithm Inv, which can deal with 3-noncrossing, canonical pseudoknot structures.
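The distinction drawn above between noncrossing secondary structures and pseudoknotted, 3-noncrossing structures can be made concrete with a small check on a list of base pairs. The following is a minimal sketch, not the Inv algorithm itself; the function names and the representation of a structure as a list of `(i, j)` index pairs are assumptions made for illustration.

```python
from itertools import combinations


def crosses(a, b):
    """Return True if arcs a and b cross, i.e. i < k < j < l after ordering."""
    (i, j), (k, l) = sorted([tuple(sorted(a)), tuple(sorted(b))])
    return i < k < j < l


def is_secondary_structure(pairs):
    """True if no two base pairs cross (a pseudoknot-free structure)."""
    return not any(crosses(a, b) for a, b in combinations(pairs, 2))


def is_3_noncrossing(pairs):
    """True if the structure contains no three mutually crossing base pairs."""
    return not any(
        crosses(a, b) and crosses(a, c) and crosses(b, c)
        for a, b, c in combinations(pairs, 3)
    )


# Hypothetical example: two crossing arcs form a simple pseudoknot.
knot = [(0, 5), (3, 8)]
print(is_secondary_structure(knot), is_3_noncrossing(knot))
```

The brute-force enumeration over pairs and triples is chosen for readability; it is only practical for short structures.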

4.
In addition to characteristic structural properties imposed by evolutionary modification, evolved, single-stranded RNAs also display characteristic structural properties imposed by intrinsic physical constraints on RNA polymer folding. The balance of intrinsic and functionally selected characters in the folded conformation of evolved secondary structures was determined by comparing the predicted secondary structures of evolved and unevolved (random) RNA sequences. Though evolved conformations are significantly more ordered than conformations of random-sequence RNA, this analysis demonstrates that the majority of conformational order within evolved structures results not from evolutionary optimization but from constraints imposed by rules intrinsic to RNA polymer folding. Received: 25 November 1998 / Accepted: 12 February 1999

5.
MOTIVATION: The structure of RNA molecules is often crucial for their function. Therefore, secondary structure prediction has gained much interest. Here, we consider the inverse RNA folding problem, which means designing RNA sequences that fold into a given structure. RESULTS: We introduce a new algorithm for the inverse folding problem (INFO-RNA) that consists of two parts: a dynamic programming method that produces good initial sequences, followed by an improved stochastic local search that uses an effective neighbor selection method. During the initialization, we design a sequence that, among all sequences, adopts the given structure with the lowest possible energy. For the selection of neighbors during the search, we use a one-step look-ahead that applies an additional energy-based criterion. Afterwards, the pre-ordered neighbors are tested using the actual optimization criterion of minimizing the structure distance between the target structure and the minimum free energy (mfe) structure of the considered neighbor. We compared our algorithm to RNAinverse and RNA-SSD on artificial and biological test sets. INFO-RNA performed better than RNAinverse and, in most cases, gave better results than RNA-SSD, probably the best inverse RNA folding tool available. AVAILABILITY: www.bioinf.uni-freiburg.de?Subpages/software.html.
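The optimization criterion mentioned above, a structure distance between the target structure and a candidate's mfe structure, can be illustrated with the base-pair distance on dot-bracket strings. This is a sketch under assumptions: the abstract does not say which distance INFO-RNA uses, and the function names here are hypothetical.

```python
def parse_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def base_pair_distance(target, candidate):
    """Count the base pairs present in exactly one of the two structures."""
    if len(target) != len(candidate):
        raise ValueError("structures must have equal length")
    return len(parse_pairs(target) ^ parse_pairs(candidate))


# A distance of 0 means the candidate reproduces the target exactly.
print(base_pair_distance("((..))", "(....)"))
```

In a local search of the kind described, a neighbor sequence would be accepted when its predicted structure lowers this distance; the prediction step itself is outside the scope of the sketch.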

6.
MOTIVATION: Most non-coding RNAs are characterized by a specific secondary and tertiary structure that determines their function. Here, we investigate the folding energy of the secondary structure of non-coding RNA sequences, such as microRNA precursors, transfer RNAs and ribosomal RNAs in several eukaryotic taxa. Statistical biases are assessed by a randomization test, in which the predicted minimum free energy of folding is compared with values obtained for structures inferred from randomly shuffling the original sequences. RESULTS: In contrast with transfer RNAs and ribosomal RNAs, the majority of the microRNA sequences clearly exhibit a folding free energy that is considerably lower than that for shuffled sequences, indicating a high tendency in the sequence towards a stable secondary structure. A possible use of this statistical test for detecting genuine miRNA sequences is discussed.
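The randomization test described above can be sketched as a z-score of the native energy against energies of shuffled copies. This is an illustration under assumptions: `energy_fn` stands for any folding-energy predictor supplied by the caller, `toy_energy` is a hypothetical stand-in (not a thermodynamic model), and a plain mononucleotide shuffle is assumed because the abstract does not specify the shuffling scheme.

```python
import random
import statistics

# Watson-Crick and G-U wobble pairs used by the toy scoring function.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_energy(sequence):
    """Hypothetical stand-in: minus the number of pairs in one end-to-end hairpin."""
    n = len(sequence)
    return -sum((sequence[i], sequence[n - 1 - i]) in PAIRS for i in range(n // 2))


def shuffle_z_score(sequence, energy_fn, n_shuffles=100, seed=0):
    """Z-score of the native energy relative to shuffled versions of the sequence."""
    rng = random.Random(seed)
    native = energy_fn(sequence)
    shuffled = []
    for _ in range(n_shuffles):
        letters = list(sequence)
        rng.shuffle(letters)  # preserves mononucleotide composition only
        shuffled.append(energy_fn("".join(letters)))
    spread = statistics.stdev(shuffled)
    if spread == 0:
        raise ValueError("shuffled energies show no variation")
    return (native - statistics.mean(shuffled)) / spread


# A negative z-score means the native sequence scores lower than its shuffles.
print(shuffle_z_score("GGGGAAAACCCC", toy_energy))
```

In practice `energy_fn` would wrap a real minimum free energy predictor; the toy function only keeps the sketch self-contained.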

7.
As one of the earliest problems in computational biology, the RNA secondary structure prediction problem (sometimes referred to as "RNA folding") has attracted attention again, thanks to the recent discoveries of many novel non-coding RNA molecules. The two common approaches to this problem are de novo prediction of RNA secondary structure based on energy minimization and the consensus folding approach (computing the common secondary structure for a set of unaligned RNA sequences). Consensus folding algorithms work well when the correct seed alignment is part of the input to the problem. However, seed alignment itself is a challenging problem for diverged RNA families. In this paper, we propose a novel framework to predict the common secondary structure for unaligned RNA sequences. By matching putative stacks in RNA sequences, we make use of both primary sequence information and thermodynamic stability for prediction at the same time. We show that our method can predict the correct common RNA secondary structures even when we are given only a limited number of unaligned RNA sequences, and it outperforms current algorithms in sensitivity and accuracy.

8.
RNA molecules with structure dependent functions are uniquely folded   Total citations: 3 (self-citations: 3, citations by others: 0)

9.
Ndifon W. Bio Systems, 2005, 82(3): 257-265
The kinetic folding of RNA sequences into secondary structures is modeled as a complex adaptive system, the components of which are possible RNA structural rearrangements (SRs) and their associated bases and base pairs. RNA bases and base pairs engage in local stacking interactions that determine the probabilities (or fitnesses) of possible SRs. Meanwhile, selection operates at the level of SRs; an autonomous stochastic process periodically (i.e., from one time step to another) selects a subset of possible SRs for realization based on the fitnesses of the SRs. Using examples based on selected natural and synthetic RNAs, the model is shown to reproduce characteristic (nonlinear) RNA folding dynamics such as the attainment by RNAs of alternative stable states. Possible applications of the model to the analysis of properties of fitness landscapes, and of the RNA sequence-to-structure mapping are discussed.

10.
RNA molecules, which are found in all living cells, fold into characteristic structures that account for their diverse functional activities. Many of these RNA structures consist of a collection of fundamental RNA motifs. The various combinations of RNA basic components form different RNA classes and define their unique structural and functional properties. The availability of many genome sequences makes it possible to search computationally for functional RNAs. Biological experiments indicate that functional RNAs have characteristic RNA structural motifs represented by specific combinations of base pairings and conserved nucleotides in the loop regions. Searching for those well-ordered RNA structures and their homologues in genomic sequences is very helpful for understanding RNA-based gene regulation. In this paper, we consider the following problem: given an RNA sequence with a known secondary structure, efficiently determine candidate segments in genomic sequences that can potentially form RNA secondary structures similar to the given RNA secondary structure. Our new bottom-up approach first searches for all potential stem-loops similar to those of the given RNA secondary structure, and then, based on the located stem-loops, detects potential homologous structural RNAs in genomic sequences.

11.
Beniaminov A. D., Ulyanov N. B., Samokhin A. B., Ivanov V. I., Du Z., Minyat E. E. Molecular Biology, 2003, 37(3): 446-455
The slipped loop structure (SLS), earlier identified as an unusual DNA structure, was found to be a possible element of RNA folding. In order to experimentally test this suggestion, model oligoribonucleotides capable of forming the SLS were synthesized. Treatment of the oligoribonucleotides with nuclease S1 and RNases specific for single- and double-stranded RNA demonstrated the steric possibility of SLS formation. To determine the possible functional role of SLS-RNA, various naturally occurring RNAs were screened in silico. Among the most interesting findings were dimerization initiation sites of avian retroviral genomic RNAs. Analysis of RNA from 31 viruses showed that formation of the intermolecular SLS during RNA dimerization is theoretically possible, competing with the formation of an alternative hairpin structure. Identification of the secondary structure of selected RNA dimers by nuclease digestion techniques, together with covariance analysis of the retroviral RNA dimerization initiation site sequences, showed that the alternative conformation (loop–loop interaction of two hairpins, or kissing hairpins) is the most preferred. Alternative structures and conformational transitions in RNA dimerization mechanisms in avian retroviruses are discussed.

12.

Background

Ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation. Their secondary structures are crucial for the RNA functionality, and the prediction of the secondary structures is widely studied. Our previous research shows that cutting long sequences into shorter chunks, predicting secondary structures of the chunks independently using thermodynamic methods, and reconstructing the entire secondary structure from the predicted chunk structures can yield better accuracy than predicting the secondary structure using the RNA sequence as a whole. The chunking, prediction, and reconstruction processes can use different methods and parameters, some of which produce more accurate predictions than others. In this paper, we study the prediction accuracy and efficiency of three different chunking methods using seven popular secondary structure prediction programs applied to two datasets of RNA with known secondary structures, which include both pseudoknotted and non-pseudoknotted sequences, as well as a family of viral genome RNAs whose structures have not been predicted before. Our modularized MapReduce framework based on Hadoop allows us to study the problem in a parallel and robust environment.

Results

On average, the maximum accuracy retention values are larger than one for our chunking methods and the seven prediction programs over 50 non-pseudoknotted sequences, meaning that the secondary structure predicted using chunking is more similar to the real structure than the secondary structure predicted by using the whole sequence. We observe similar results for the 23 pseudoknotted sequences, except for the NUPACK program using the centered chunking method. The performance analysis for 14 long RNA sequences from the Nodaviridae virus family outlines how the coarse-grained mapping of chunking and predictions in the MapReduce framework exhibits shorter turnaround times for short RNA sequences. However, as the lengths of the RNA sequences increase, the fine-grained mapping can surpass the coarse-grained mapping in performance.

Conclusions

By using our MapReduce framework together with statistical analysis on the accuracy retention results, we observe how the inversion-based chunking methods can outperform predictions using the whole sequence. Our chunk-based approach also enables us to predict secondary structures for very long RNA sequences, which is not feasible with traditional methods alone.
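The chunk, predict, and reconstruct pipeline described above can be sketched in a few lines. This is an illustration under assumptions: it uses plain fixed-length, non-overlapping chunks rather than any of the three chunking methods studied in the paper, `predict_fn` stands for any per-chunk predictor that returns a dot-bracket string, and `toy_predict` is a hypothetical placeholder.

```python
def chunk_sequence(sequence, chunk_len):
    """Split a sequence into consecutive, non-overlapping chunks."""
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    return [sequence[i:i + chunk_len] for i in range(0, len(sequence), chunk_len)]


def predict_by_chunks(sequence, predict_fn, chunk_len):
    """Predict each chunk independently and concatenate the dot-bracket results."""
    parts = []
    for chunk in chunk_sequence(sequence, chunk_len):
        structure = predict_fn(chunk)
        if len(structure) != len(chunk):
            raise ValueError("predictor returned a structure of the wrong length")
        parts.append(structure)
    # Each chunk structure is balanced on its own, so the concatenation is too.
    return "".join(parts)


def toy_predict(chunk):
    """Hypothetical placeholder: pair the two ends of chunks of length >= 5."""
    if len(chunk) < 5:
        return "." * len(chunk)
    return "(" + "." * (len(chunk) - 2) + ")"


print(predict_by_chunks("GGGAACCCAA", toy_predict, chunk_len=5))
```

Because the chunks are independent, the per-chunk calls are the natural unit to distribute across workers in a MapReduce-style setup.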

13.
In general, the RNA prediction problem includes genetic mapping, physical mapping, and structure prediction. The ultimate goal of structure prediction is to obtain the three-dimensional structure of biomolecules through computation. The key to solving this problem is an appropriate representation of the biological structures. Although problems concerning representations of certain biological structures, such as secondary structures, are either NP-complete or of high complexity, a few approximation algorithms and techniques, mainly of polynomial complexity, have been constructed for the prediction of RNA secondary structures. In this paper, a new class of Motzkin paths is introduced, the so-called semi-elevated inverse Motzkin peakless paths, for the representation of two interacting RNA molecules. The basic combinatorial interpretations of single RNA secondary structures are extended via these new Motzkin paths to two RNA molecules and can be applied to prediction methods for joint structures formed by interacting RNAs.
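The standard correspondence between a single RNA secondary structure and a peakless Motzkin path, which the abstract above extends to two interacting molecules, can be sketched as follows. The sketch covers only the single-structure case; the semi-elevated inverse variant is not defined in the abstract and is not attempted here. Function names and the U/D/H step letters are assumptions.

```python
# Opening bracket = up step, closing bracket = down step, unpaired base = level step.
STEP = {"(": "U", ")": "D", ".": "H"}
DELTA = {"U": 1, "D": -1, "H": 0}


def to_motzkin_path(dot_bracket):
    """Translate a dot-bracket string into a word over the steps U, D, H."""
    return "".join(STEP[ch] for ch in dot_bracket)


def is_motzkin_path(path):
    """True if the path never drops below the axis and ends on it."""
    height = 0
    for step in path:
        height += DELTA[step]
        if height < 0:
            return False
    return height == 0


def is_peakless(path):
    """True if no up step is immediately followed by a down step."""
    return "UD" not in path


path = to_motzkin_path("((..))")
print(path, is_motzkin_path(path), is_peakless(path))
```

The peakless condition encodes the requirement that a hairpin loop contains at least one unpaired base, since a peak would correspond to two adjacent bases pairing with each other.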

14.
15.
Accompanying recent advances in determining RNA secondary structure is the growing appreciation for the importance of relatively simple topological constraints, encoded at the secondary structure level, in defining the overall architecture, folding pathways, and dynamic adaptability of RNA. A new view is emerging in which tertiary interactions do not define RNA 3D structure, but rather, help select specific conformers from an already narrow, topologically pre-defined conformational distribution. Studies are providing fundamental insights into the nature of these topological constraints, how they are encoded by the RNA secondary structure, and how they interplay with other interactions, breathing new meaning into RNA secondary structure. New approaches have been developed that take advantage of topological constraints in determining RNA backbone conformation based on secondary structure and a limited set of other, easily accessible constraints. Topological constraints are also providing a much-needed framework for rationalizing and describing RNA dynamics and structural adaptation. Finally, studies suggest that topological constraints may play important roles in steering RNA folding pathways. Here, we review recent advances in our understanding of topological constraints encoded by the RNA secondary structure.

16.
A computer program is presented which determines the secondary structure of linear RNA molecules by simulating a hypothetical process of folding. This process implies the concept of 'nucleation centres', regions in RNA which locally trigger the folding. During the simulation, the RNA is allowed to fold into pseudoknotted structures, unlike all other programs predicting RNA secondary structure. The simulation uses published, experimentally determined free energy values for nearest neighbour base pair stackings and loop regions, except for new extrapolated values for loops larger than seven nucleotides. The free energy value for a loop arising from pseudoknot formation is set to a single, estimated value of 4.2 kcal/mole. Especially in the case of long RNA sequences, our program appears superior to other secondary structure prediction programs described so far, as tests on tRNAs, the LSU intron of Tetrahymena thermophila and a number of plant viral RNAs show. In addition, pseudoknotted structures are often predicted successfully. The program is written in mainframe APL and is adapted to run on IBM compatible PCs, Atari ST and Macintosh personal computers. On an 8 MHz 8088 standard PC without coprocessor, using STSC APL, it folds a sequence of 700 nucleotides in one and a half hours.

17.

Background  

We are interested in the problem of predicting secondary structure for small sets of homologous RNAs, by incorporating limited comparative sequence information into an RNA folding model. The Sankoff algorithm for simultaneous RNA folding and alignment is a basis for approaches to this problem. There are two open problems in applying a Sankoff algorithm: development of a good unified scoring system for alignment and folding and development of practical heuristics for dealing with the computational complexity of the algorithm.

18.
A linear segment in which a number of pairs of intervals of equal length are identified as potential stems is the subject of a folding problem analogous to inference of RNA secondary structure. A quantity of free energy (or equivalently, energy per unit length) is associated with each stem, and the various types of loops are assigned energy costs as a function of their lengths. Inference of stable structures can then be carried out in the same way as in RNA folding. More important, perturbation of stem lengths and energy densities (modelling various mutational processes affecting nucleotide sequences) allows the delineation of domains of stability of various foldings, through the explicit calculation of their boundaries, in a low-dimensional parameter space.

19.
RNA multi-structure landscapes   Total citations: 6 (self-citations: 0, citations by others: 6)
Statistical properties of RNA folding landscapes obtained by the partition function algorithm (McCaskill 1990) are investigated in detail. The pair correlation of free energies as a function of the Hamming distance is used as a measure for the ruggedness of the landscape. The calculation of the partition function contains information about the entire ensemble of secondary structures as a function of temperature and opens the door to all quantities of thermodynamic interest, in contrast with the conventional minimal free energy approach. A metric distance of structure ensembles is introduced and pair correlations at the level of the structures themselves are computed. Just as with landscapes based on most stable secondary structure prediction, the landscapes defined on the full biophysical GCAU alphabet are much smoother than the landscapes restricted to pure GC sequences and the correlation lengths are almost constant fractions of the chain lengths. Correlation functions for multi-structure landscapes exhibit an increased correlation length, especially near the melting temperature. However, the main effect on evolution is rather an effective increase in sampling for finite populations where each sequence explores multiple structures. Correspondence to: P. Schuster

20.
How RNA folds.   Total citations: 9 (self-citations: 0, citations by others: 9)
We describe the RNA folding problem and contrast it with the much more difficult protein folding problem. RNA has four similar monomer units, whereas proteins have 20 very different residues. The folding of RNA is hierarchical in that secondary structure is much more stable than tertiary folding. In RNA the two levels of folding (secondary and tertiary) can be experimentally separated by the presence or absence of Mg²⁺. Secondary structure can be predicted successfully from experimental thermodynamic data on secondary structure elements: helices, loops, and bulges. Tertiary interactions can then be added without much distortion of the secondary structure. These observations suggest a folding algorithm to predict the structure of an RNA from its sequence. However, to solve the RNA folding problem one needs thermodynamic data on tertiary structure interactions, and identification and characterization of metal-ion binding sites. These data, together with force versus extension measurements on single RNA molecules, should provide the information necessary to test and refine the proposed algorithm.
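The first, secondary-structure level of the hierarchical folding described above can be illustrated with a small dynamic program. This sketch deliberately swaps in Nussinov-style base-pair maximization for the thermodynamic data on helices, loops, and bulges that the abstract refers to, so it is a simplified stand-in and not the proposed algorithm. The function name and the `min_loop` parameter are assumptions.

```python
# Watson-Crick and G-U wobble pairs allowed in the simplified model.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs, with hairpin loops of >= min_loop bases."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j], inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: position j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


# A three-pair stem closing a three-base hairpin loop.
print(max_base_pairs("GGGAAAUCC"))
```

An energy-based predictor has the same recursive shape but replaces the pair count with stacking and loop free energies.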
