Similar literature
Found 20 similar articles (search time: 0 ms)
1.
We adapt here a surprising technique, the boustrophedon method, to speed up the sampling of RNA secondary structures from the Boltzmann low-energy ensemble. This technique is simple and its implementation straightforward, as it only requires a permutation in the order of some operations already performed in the stochastic traceback stage of these algorithms. It nevertheless greatly improves their worst-case complexity from to , for n the size of the original sequence. Moreover, the average-case complexity of the generation is shown to be improved from to in a Boltzmann-weighted homopolymer model based on the Nussinov–Jacobson free-energy model. These results are extended to the more realistic Turner free-energy model through experiments performed on both structured (Drosophila melanogaster mRNA 5S) and hybrid (Staphylococcus aureus RNAIII) RNA sequences, using a boustrophedon-modified version of the popular software UnaFold. This improvement allows for the sampling of greater and more significant sets of structures in a given time.
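The reordering idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: during a stochastic traceback step, one candidate is drawn with probability proportional to its weight by scanning the candidates until a random threshold is crossed. Scanning alternately from both ends makes the cost of a draw proportional to the distance from the nearer end. The names `boustrophedon_order` and `sample_index` are hypothetical, and `total` is assumed to be already known (in a real sampler it would be a precomputed partition-function entry).

```python
import random


def boustrophedon_order(n):
    """Yield indices 0, n-1, 1, n-2, ... alternating between both ends."""
    lo, hi = 0, n - 1
    while lo <= hi:
        yield lo
        if lo != hi:
            yield hi
        lo += 1
        hi -= 1


def sample_index(weights, total, r=None):
    """Draw an index with probability weights[k] / total.

    `total` is assumed to be precomputed (hypothetical parameter), so the
    scan itself stops as soon as the random threshold `r` is crossed.
    """
    if not weights:
        raise ValueError("weights must be non-empty")
    if r is None:
        r = random.random() * total
    k = 0
    for k in boustrophedon_order(len(weights)):
        r -= weights[k]
        if r < 0:
            return k
    return k  # fallback for floating-point round-off
```

With the threshold fixed by hand, `sample_index([1, 2, 3, 4], total=10, r=1.5)` visits index 0 and then index 3, where it stops.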

2.
Ensemble-based approaches to RNA secondary structure prediction have become increasingly appreciated in recent years. Here, we utilize sampling and clustering of the Boltzmann ensemble of RNA secondary structures to investigate whether biological sequences exhibit ensemble features that are distinct from their random shuffles. Representative messenger RNAs (mRNAs), structural RNAs, and precursor microRNAs (miRNAs) are analyzed for nine ensemble features. These include structure clustering features, the energy gap between the minimum free energy (MFE) and the ensemble, the numbers of high-frequency base pairs in the ensemble and in clusters, the average base-pair distance between the MFE structure and the ensemble, and between-cluster and within-cluster sums of squares. For each of the features, we observe a lack of significant distinction between mRNAs and their random shuffles. For five features, significant differences are found between structural RNAs and random counterparts. For seven features including the five for structural RNAs, much greater differences are observed between precursor miRNAs and random shuffles. These findings reveal differences in the Boltzmann structure ensemble among different types of functional RNAs. In addition, for two ensemble features, we observe distinctive, non-overlapping distributions for precursor miRNAs and random shuffles. A distributional separation can be particularly useful for the prediction of miRNA genes.
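One of the ensemble features named above, the average base-pair distance between the MFE structure and a sample, can be illustrated with a short sketch. It assumes structures are given in dot-bracket notation without pseudoknots and takes the base-pair distance to be the number of pairs present in exactly one of the two structures. The function names are hypothetical and not taken from the paper.

```python
def pairs(dot_bracket):
    """Return the set of base pairs (i, j) encoded by a dot-bracket string."""
    stack, found = [], set()
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            found.add((stack.pop(), i))
    return found


def bp_distance(a, b):
    """Number of base pairs present in exactly one of the two structures."""
    return len(pairs(a) ^ pairs(b))


def mean_distance_to(reference, sample):
    """Average base-pair distance between a reference (e.g. the MFE
    structure) and every structure in a sampled ensemble."""
    return sum(bp_distance(reference, s) for s in sample) / len(sample)
```

For example, `mean_distance_to("((..))", ["((..))", "(....)", "......"])` averages the distances 0, 1 and 2.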

3.
We suggest a new algorithm to search a given set of RNA sequences for conserved secondary structures. The algorithm is based on alignment of the sequences for potential helical strands. This procedure can be used to search for new structured RNAs and new regulatory elements. It is efficient for genome-scale analysis. The results of various tests run with this algorithm are shown.

4.
A mathematical model for analyzing the secondary structures of RNA is developed that is based on the connection matrix associated with the planar p-h graph. The classification of the elementary structures allows the introduction of the basis of structural space from which to build the global secondary structure. All admissible solutions belong to the configuration space and can be obtained directly from its basis.

5.
In this paper we investigate the Bernoulli model for random secondary structures of ribonucleic acid (RNA) molecules. Assuming that two random bases can form a hydrogen bond with probability p, we prove asymptotic equivalents for the average number of hairpins and bulges, the average loop length, the expected order, the expected number of secondary structures of size n and order k, and further parameters, all depending on p. In this way we get an insight into the change of shape of a random structure during the process . Afterwards we compare the computed parameters for random structures in the Bernoulli model to the corresponding quantities for actual secondary structures of large subunit rRNA molecules found in the database of Wuyts et al. This makes it possible to identify those parameters which behave (almost) randomly and those which do not and thus should be considered as interesting, e.g., with respect to the biological functions or the algorithmic prediction of RNA secondary structures.

6.
A statistical reference for RNA secondary structures with minimum free energies is computed by folding large ensembles of random RNA sequences. Four nucleotide alphabets are used: two binary alphabets, AU and GC, the biophysical AUGC and the synthetic GCXK alphabet. RNA secondary structures are made of structural elements, such as stacks, loops, joints, and free ends. Statistical properties of these elements are computed for small RNA molecules of chain lengths up to 100. The results of RNA structure statistics depend strongly on the particular alphabet chosen. The statistical reference is compared with the data derived from natural RNA molecules with similar base frequencies. Secondary structures are represented as trees. Tree editing provides a quantitative measure for the distance $d_t$ between two structures. We compute a structure density surface as the conditional probability of two structures having distance t given that their sequences have distance h. This surface indicates that the vast majority of possible minimum free energy secondary structures occur within a fairly small neighborhood of any typical (random) sequence. Correlation lengths for secondary structures in their tree representations are computed from probability densities. They are appropriate measures for the complexity of the sequence-structure relation. The correlation length also provides a quantitative estimate for the mean sensitivity of structures to point mutations. © 1993 John Wiley & Sons, Inc.

7.
8.
9.
An algorithm for comparing multiple RNA secondary structures   Cited by: 1 (self-citations: 0; citations by others: 1)
A new distributed computational procedure is presented for rapidly determining the similarity of multiple conformations of RNA secondary structures. A data abstraction scheme is utilized to reduce the quantity of data that must be handled to determine the degree of similarity among multiple structures. The method has been used to compare 200 structures with easy visualization of both those structures and substructures that are similar and those that are vastly different. It has the capability of processing many more conformations as a function of research requirements. The algorithm is described as well as some suggestions for future uses and extensions. Received on October 29, 1987; accepted on May 4, 1988

10.
11.
A program for predicting significant RNA secondary structures   Cited by: 1 (self-citations: 0; citations by others: 1)
We describe a program for the analysis of RNA secondary structure. There are two new features in this program. (i) To get vector speeds on a vector pipeline machine (such as Cray X-MP/24) we have vectorized the secondary structure dynamic algorithm. (ii) The statistical significance of a locally ‘optimal’ secondary structure is assessed by a Monte Carlo method. The results can be depicted graphically including profiles of the stability of local secondary structures and the distribution of the potentially significant secondary structures in the RNA molecules. Interesting regions where both the potentially significant secondary structures and ‘open’ structures (single-stranded coils) occur can be identified by the plots mentioned above. Furthermore, the speed of the vectorized code allows repeated Monte Carlo simulations with different overlapping window sizes. Thus, the optimal size of the significant secondary structure occurring in the interesting region can be assessed by repeating the Monte Carlo simulation. The power of the program is demonstrated in the analysis of local secondary structures of human T-cell lymphotropic virus type III (HIV). Received on August 17, 1987; accepted on January 5, 1988
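The Monte Carlo significance test mentioned in (ii) can be sketched in a simplified form. The version below is an assumption-laden illustration rather than the program described above: it compares the score of a sequence with the scores of randomly shuffled copies and reports a z-score. The `score` callable is a hypothetical stand-in for a folding-stability score, and the windowing used by the original program is omitted.

```python
import random
import statistics


def shuffle_z_score(seq, score, n_shuffles=100, seed=0):
    """Z-score of score(seq) against scores of shuffled copies of seq.

    `score` is any function mapping a sequence string to a number
    (hypothetical placeholder for a folding-stability score).
    """
    rng = random.Random(seed)
    observed = score(seq)
    null_scores = []
    for _ in range(n_shuffles):
        letters = list(seq)
        rng.shuffle(letters)  # preserves base composition
        null_scores.append(score("".join(letters)))
    mean = statistics.mean(null_scores)
    spread = statistics.pstdev(null_scores)
    if spread == 0:
        return 0.0  # the score does not vary under shuffling
    return (observed - mean) / spread
```

A score that depends only on base composition, such as `lambda s: s.count("G")`, yields 0.0 by construction, since shuffling preserves composition.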

12.
Lorenz WA, Clote P. PLoS ONE, 2011, 6(1): e16178
An RNA secondary structure is locally optimal if there is no lower energy structure that can be obtained by the addition or removal of a single base pair, where energy is defined according to the widely accepted Turner nearest neighbor model. Locally optimal structures form kinetic traps, since any evolution away from a locally optimal structure must involve energetically unfavorable folding steps. Here, we present a novel, efficient algorithm to compute the partition function over all locally optimal secondary structures of a given RNA sequence. Our software, RNAlocopt, runs in $O(n^3)$ time and $O(n^2)$ space. Additionally, RNAlocopt samples a user-specified number of structures from the Boltzmann subensemble of all locally optimal structures. We apply RNAlocopt to show that (1) the number of locally optimal structures is far fewer than the total number of structures; indeed, the number of locally optimal structures is approximately equal to the square root of the number of all structures, (2) the structural diversity of this subensemble may be either similar to or quite different from the structural diversity of the entire Boltzmann ensemble, a situation that depends on the type of input RNA, (3) the (modified) maximum expected accuracy structure, computed by taking into account base pairing frequencies of locally optimal structures, is a more accurate prediction of the native structure than other current thermodynamics-based methods. The software RNAlocopt constitutes a technical breakthrough in our study of the folding landscape for RNA secondary structures. For the first time, locally optimal structures (kinetic traps in the Turner energy model) can be rapidly generated for long RNA sequences, previously impossible with methods that involved exhaustive enumeration. Use of locally optimal structures leads to state-of-the-art secondary structure prediction, as benchmarked against methods involving the computation of minimum free energy and of maximum expected accuracy.
Web server and source code available at http://bioinformatics.bc.edu/clotelab/RNAlocopt/.

13.
RNA secondary structure prediction is a classical problem in bioinformatics. The most efficient approach to this problem is based on the idea of comparative analysis. In this approach the algorithms utilize a multiple alignment of the RNA sequences and find a common RNA structure. This paper describes a new algorithm for this task. This algorithm does not require a predefined multiple alignment. The main idea of the algorithm is based on a MEME-like iterative search for an abstract profile on different levels. On the first level the algorithm searches for common blocks in the RNA sequences and creates a chain of these blocks. In the next step the algorithm refines the chain of common blocks. In the last stage the algorithm searches for sets of common helices that have consistent locations relative to the common blocks. The algorithm was tested on sets of tRNA with a subset of junk sequences and on RFN riboswitches. The algorithm is implemented as a web server (http://bioinf.fbb.msu.ru/RNAAlign/).

14.
Many different programs have been developed for the prediction of the secondary structure of an RNA sequence. Some of these programs generate an ensemble of structures, all of which have free energy close to that of the optimal structure, making it important to be able to quantify how similar these different structures are. To deal with this problem, we define a new class of metrics, the mountain metrics, on the set of RNA secondary structures of a fixed length. We compare properties of these metrics with other well known metrics on RNA secondary structures. We also study some global and local properties of these metrics.
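A distance built on the mountain representation can be illustrated as follows. This sketch assumes, without confirmation from the abstract, that the metric is an $l_p$ distance between mountain height vectors, where the height at position i is the number of base pairs enclosing or opening at i. The function names are hypothetical and structures are given in dot-bracket notation.

```python
def mountain(dot_bracket):
    """Mountain heights: +1 at each '(', -1 at each ')', running sum."""
    height, heights = 0, []
    for symbol in dot_bracket:
        if symbol == "(":
            height += 1
        elif symbol == ")":
            height -= 1
        heights.append(height)
    return heights


def mountain_distance(a, b, p=1):
    """l_p distance between the mountain vectors of two structures
    of the same length (assumed form of the metric)."""
    if len(a) != len(b):
        raise ValueError("structures must have the same length")
    total = sum(abs(x - y) ** p for x, y in zip(mountain(a), mountain(b)))
    return total ** (1.0 / p)
```

For `"((..))"` and `"(....)"` the mountain vectors are `[1, 2, 2, 2, 1, 0]` and `[1, 1, 1, 1, 1, 0]`, so the distance with `p=1` is 3.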

15.
RNA secondary structures and their prediction   Cited by: 1 (self-citations: 0; citations by others: 1)
This is a review of past and present attempts to predict the secondary structure of ribonucleic acids (RNAs) through mathematical and computer methods. Related areas covering classification, enumeration and graphical representations of structures are also covered. Various general prediction techniques are discussed, especially the use of thermodynamic criteria to construct an optimal structure. The emphasis in this approach is on the use of dynamic programming algorithms to minimize free energy. One such algorithm is introduced which comprises existing ones as special cases. Issued as NRCC No. 23684.
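The dynamic programming approach can be illustrated with a Nussinov-style recursion. It is a deliberately simplified stand-in for free-energy minimization, not the general algorithm introduced in the review: each base pair is assumed to contribute the same energy, so minimizing energy amounts to maximizing the number of pairs. The minimum hairpin loop length `min_loop` and the set of allowed pairs are assumptions.

```python
def nussinov(seq, min_loop=3):
    """Maximum number of nested base pairs in `seq` (simplified energy
    model: every pair counts equally; hairpin loops need at least
    `min_loop` unpaired bases)."""
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]
```

The table is filled in order of increasing subsequence length, so every entry used on the right-hand side is already final; the triple loop gives cubic running time in the sequence length.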

16.
Base-pair probability profiles of RNA secondary structures   Cited by: 7 (self-citations: 0; citations by others: 7)
Dynamic programming algorithms are able to predict optimal and suboptimal secondary structures of RNA. These suboptimal or alternative secondary structures are important for the biological function of RNA. The distribution of secondary structures present in solution is governed by the thermodynamic equilibrium between the different structures. An algorithm is presented which approximates the total partition function by a Boltzmann-weighted summation of optimal and suboptimal secondary structures at several temperatures. A clear representation of the equilibrium distribution of secondary structures is derived from a two-dimensional bonding matrix with base-pairing probability as the third dimension. The temperature dependence of the equilibrium distribution gives the denaturation behavior of the nucleic acid, which may be compared to experimental optical denaturation curves after correction for the hypochromicities of the different base-pairs. Similarly, temperature-induced mobility changes detected in temperature-gradient gel electrophoresis of nucleic acids may be interpreted on the basis of the temperature dependence of the equilibrium distribution. Results are illustrated for natural circular and synthetic linear potato spindle tuber viroid RNA respectively, and are compared to experimental data.
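The Boltzmann-weighted summation can be illustrated with a minimal sketch. It assumes the optimal and suboptimal structures are already available as dot-bracket strings with free energies in kcal/mol, and it treats each energy as independent of temperature, which the real method would not. The function names and the input format are hypothetical.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal / (mol * K)


def pairs(dot_bracket):
    """Return the set of base pairs (i, j) encoded by a dot-bracket string."""
    stack, found = [], set()
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            found.add((stack.pop(), i))
    return found


def base_pair_probabilities(structures, temperature=310.15):
    """Approximate base-pair probabilities from a finite list of
    (dot_bracket, energy_kcal_per_mol) tuples at `temperature` kelvin."""
    weights = [math.exp(-energy / (GAS_CONSTANT * temperature))
               for _, energy in structures]
    partition = sum(weights)  # approximate partition function
    probabilities = {}
    for (dot_bracket, _), weight in zip(structures, weights):
        for pair in pairs(dot_bracket):
            probabilities[pair] = probabilities.get(pair, 0.0) + weight / partition
    return probabilities
```

Calling the function at several values of `temperature` gives a crude temperature profile for each pair; two structures of equal energy each receive weight one half.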

17.
The secondary structure of an RNA molecule is of great importance and influences, e.g., the interaction of tRNA molecules with proteins or the stabilization of mRNA molecules. The classification of secondary structures by means of their order has proved useful in numerous applications. In 1978, Waterman, who gave the first precise formal framework for the topic, suggested determining the number a(n,p) of secondary structures of size n and given order p. Since then, no satisfactory result has been found. Based on an observation due to Viennot et al., we will derive generating functions for the secondary structures of order p from generating functions for binary tree structures with Horton-Strahler number p. These generating functions enable us to compute a precise asymptotic equivalent for a(n,p). Furthermore, we will determine the related number of structures when the number of unpaired bases shows up as an additional parameter. Our approach proves to be general enough to compute the average order of a secondary structure together with all the r-th moments and to enumerate substructures such as hairpins or bulges in dependence on the order of the secondary structures considered.

18.
BACKGROUND: A small class of RNA molecules, in particular the tiny genomes of viroids, are circular. Yet most structure prediction algorithms handle only linear RNAs. The most straightforward approach is to compute circular structures from 'internal' and 'external' substructures separated by a base pair. This is incompatible, however, with the memory-saving approach of the Vienna RNA Package which builds a linear RNA structure from shorter (internal) structures only. RESULT: Here we describe how circular secondary structures can be obtained without additional memory requirements as a kind of 'post-processing' of the linear structures. AVAILABILITY: The circular folding algorithm is implemented in the current version of the RNAfold program of the Vienna RNA Package, which can be downloaded from http://www.tbi.univie.ac.at/RNA/

19.
Computer-aided prediction of RNA secondary structures.   Cited by: 3 (self-citations: 5; citations by others: 3)
A brief survey of computer algorithms that have been developed to generate predictions of the secondary structures of RNA molecules is presented. Two particular methods are described in some detail. The first utilizes a thermodynamic energy minimization algorithm that takes into account the likelihood that short-range folding tends to be favored over long-range interactions. The second utilizes an interactive computer graphic modelling algorithm that enables the user to consider thermodynamic criteria as well as structural data obtained by nuclease susceptibility, chemical reactivity and phylogenetic studies. Examples of structures for prokaryotic 16S and 23S ribosomal RNAs, several eukaryotic 5S ribosomal RNAs and rabbit beta-globin messenger RNA are presented as case studies in order to describe the two techniques. An argument is made for integrating the two approaches presented in this paper, enabling the user to generate proposed structures using thermodynamic criteria, allowing interactive refinement of these structures through the application of experimentally derived data.

20.
It is a classical result of Stein and Waterman that the asymptotic number of RNA secondary structures is $1.104366 \cdot n^{-3/2} \cdot 2.618034^n$. Motivated by the kinetics of RNA secondary structure formation, we are interested in determining the asymptotic number of secondary structures that are locally optimal, with respect to a particular energy model. In the Nussinov energy model, where each base pair contributes $-1$ towards the energy of the structure, locally optimal structures are exactly the saturated structures, for which we have previously shown that asymptotically, there are $1.07427 \cdot n^{-3/2} \cdot 2.35467^n$ many saturated structures for a sequence of length $n$. In this paper, we consider the base stacking energy model, a mild variant of the Nussinov model, where each stacked base pair contributes $-1$ toward the energy of the structure. Locally optimal structures with respect to the base stacking energy model are exactly those secondary structures whose stems cannot be extended. Such structures were first considered by Evers and Giegerich, who described a dynamic programming algorithm to enumerate all locally optimal structures. In this paper, we apply methods from enumerative combinatorics to compute the asymptotic number of such structures. Additionally, we consider analogous combinatorial problems for secondary structures with annotated single-stranded, stacking nucleotides (dangles).
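The Stein and Waterman count can be checked numerically with a short sketch. It assumes the standard combinatorial setting in which any two positions may pair and every hairpin loop contains at least one unpaired base. The recurrence conditions on the last position: either it is unpaired, or it pairs with some earlier position k, splitting the structure into two independent parts. The function names are hypothetical.

```python
def count_structures(n):
    """Number S(n) of secondary structures on n positions, with at least
    one unpaired position inside every hairpin loop (assumed model)."""
    counts = [1, 1, 1]  # S(0), S(1), S(2): only the empty structure
    for m in range(3, n + 1):
        total = counts[m - 1]  # position m unpaired
        for k in range(1, m - 1):  # position m paired with position k
            total += counts[k - 1] * counts[m - k - 1]
        counts.append(total)
    return counts[n]


def stein_waterman_asymptotic(n):
    """Asymptotic estimate quoted in the abstract above."""
    return 1.104366 * n ** -1.5 * 2.618034 ** n
```

The first values are 1, 1, 1, 2, 4, 8, 17, 37, and the ratio `count_structures(n) / stein_waterman_asymptotic(n)` approaches 1 as `n` grows.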
