Similar literature
 Found 20 similar articles (search time: 46 ms)
1.
A kinetic approach to the prediction of RNA secondary structures   Total citations: 3, self-citations: 0, citations by others: 3
A new approach to the prediction of RNA secondary structures, based on analysis of the kinetics of molecular self-organisation, is proposed. A Markov process is used to describe structural rearrangements during secondary structure formation, and this process is modelled by a Monte Carlo method. Examples of calculating the kinetic ensemble of secondary structures by this method are given, and the distribution of time-dependent probabilities within the ensembles is obtained. An effective method of searching for the equilibrium ensemble is also suggested. It is based on the construction of a tree of all possible secondary structures of the RNA. By assigning each structure a probability according to its free energy, the Boltzmann equilibrium ensemble can be obtained.
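
The last step of this abstract, assigning each structure a probability according to its free energy, can be sketched as follows. This is only an illustration of standard Boltzmann weighting, not the authors' code; the function name `boltzmann_probabilities`, the energy unit (kcal/mol) and the default `rt` (RT at about 37 °C) are assumptions made for the example.

```python
import math


def boltzmann_probabilities(free_energies, rt=0.6163):
    """Return Boltzmann probabilities for a non-empty list of free energies.

    free_energies: free energy of each structure, assumed to be in kcal/mol.
    rt: gas constant times temperature in the same unit (assumed ~37 degrees C).
    """
    # Shift by the minimum energy so the exponentials cannot overflow.
    g_min = min(free_energies)
    weights = [math.exp(-(g - g_min) / rt) for g in free_energies]
    z = sum(weights)  # partition function of the enumerated ensemble
    return [w / z for w in weights]


if __name__ == "__main__":
    # Three hypothetical structures; the lowest free energy is the most probable.
    print(boltzmann_probabilities([-5.0, -3.0, 0.0]))
```

Shifting by the minimum energy leaves the probabilities unchanged because the common factor cancels in the normalisation.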

2.
Abstract

Measuring the (dis)similarity between RNA secondary structures is critical for the study of RNA secondary structures and has implications for RNA functional characterization. Although a number of methods have been developed for comparing RNA structural similarities, their applications have been limited by the complexity of the required computation. In this paper, we present a novel method for comparing the similarity of RNA secondary structures generated from the same RNA sequence, i.e., a secondary structure ensemble, using a matrix representation of the RNA structures. Relevant features of the RNA secondary structures can be easily extracted through singular value decomposition (SVD) of the representing matrices. We have mapped the feature vectors of the singular values to a kernel space, where (dis)similarities among the mapped feature vectors become more evident, making clustering of RNA secondary structures easier to handle. The pair-wise comparison of RNA structures is achieved by computing the distance between the singular value vectors in the kernel space. We have applied a fuzzy kernel clustering method, using this similarity metric, to cluster the RNA secondary structure ensembles. Our application results suggest that our fuzzy kernel clustering method is highly promising for classification of RNA structure ensembles, because of its low computational complexity and high clustering accuracy.

3.
Abstract

In this paper, we propose a 6-D representation of RNA secondary structures. The use of the 6-D representation is illustrated by constructing structure invariants. Comparisons with the similarity/dissimilarity results based on the 6-D representation for a set of RNA secondary structures are considered to illustrate the use of our structure invariants, which are based on the entries of derived sequence matrices restricted to a band of selected width along the main diagonal.

4.
RNA multi-structure landscapes   Total citations: 6, self-citations: 0, citations by others: 6
Statistical properties of RNA folding landscapes obtained by the partition function algorithm (McCaskill 1990) are investigated in detail. The pair correlation of free energies as a function of the Hamming distance is used as a measure for the ruggedness of the landscape. The calculation of the partition function contains information about the entire ensemble of secondary structures as a function of temperature and opens the door to all quantities of thermodynamic interest, in contrast with the conventional minimal free energy approach. A metric distance of structure ensembles is introduced and pair correlations at the level of the structures themselves are computed. Just as with landscapes based on most stable secondary structure prediction, the landscapes defined on the full biophysical GCAU alphabet are much smoother than the landscapes restricted to pure GC sequences and the correlation lengths are almost constant fractions of the chain lengths. Correlation functions for multi-structure landscapes exhibit an increased correlation length, especially near the melting temperature. However, the main effect on evolution is rather an effective increase in sampling for finite populations where each sequence explores multiple structures. Correspondence to: P. Schuster

5.
Abstract

This paper develops mathematical methods for describing and analyzing RNA secondary structures. It was motivated by the need to develop rigorous yet efficient methods to treat transitions from one secondary structure to another, which we propose here may occur as motions of loops within RNAs having appropriate sequences. In this approach a molecular sequence is described as a vector of the appropriate length. The concept of symmetries between nucleic acid sequences is developed, and the 48 possible different types of symmetries are described. Each secondary structure possible for a particular nucleotide sequence determines a symmetric, signed permutation matrix. The collection of all possible secondary structures comprises all matrices of this type whose left multiplication with the sequence vector leaves that vector unchanged. A transition between two secondary structures is given by the product of the two corresponding structure matrices. This formalism provides an efficient method for describing nucleic acid sequences that allows questions relating to secondary structures and transitions to be addressed using the powerful methods of abstract algebra. In particular, it facilitates the determination of possible secondary structures, including those containing pseudoknots. Although this paper concentrates on RNA structure, this formalism can also be applied to DNA.

6.
Abstract

The secondary structures of the Tetrahymena thermophila rRNA IVS sequence involved in the self-splicing reactions are theoretically investigated with a previously proposed, refined computer method that selects a set of the lowest free energy RNA secondary structures under constraints from model hypotheses and experimental evidence. The secondary structures obtained are characterized by the close proximity of the self-reaction sites and account for double-mutation experiments and differential digestion data.

7.
8.
Abstract

A modification of the Gibbs ensemble Monte Carlo computer simulation method for fluid phase equilibrium is described. The modification, which is based on the assumption of a thermodynamic model for the vapor phase, reduces the computational time for the simulation as compared to the original Gibbs ensemble method. Since the computational time is largely proportional to the number of particle-particle interactions, avoiding the direct simulation of the vapor phase typically leads to a thirty to forty percent reduction in computational time. For a pure Lennard-Jones (12,6) fluid the results obtained at moderate reduced temperatures, T/Tc < 0.8, are in good agreement with the full Gibbs ensemble method.

9.
Abstract

The complexity of functions evolving in an evolution process is expected to be limited by the duration of the process, among other factors. This paper outlines a general method of deriving function-complexity limitations based on mathematical statistics and independent of the details of the biological or genetic mechanism by which the function evolves. Limitations on the emergence of life are derived; these limitations indicate the possibility of very fast evolution and are consistent with the “RNA world” hypothesis. The discussed method is general and can be used to characterize the evolution of more specific functions of biological organisms and to relate functions to genetic structures. The derived general limitations indicate that co-evolution of multiple functions and species could be a slow process, whereas evolution of a specific function might proceed very fast, so that no trace of intermediate forms (species) is preserved in fossil records of phenotype or DNA structure; this is consistent with the picture of “punctuated equilibrium”.

10.
J S McCaskill. Biopolymers, 1990, 29(6-7): 1105-1119
A novel application of dynamic programming to the folding problem for RNA enables one to calculate the full equilibrium partition function for secondary structure and the probabilities of various substructures. In particular, both the partition function and the probabilities of all base pairs are computed by a recursive scheme of polynomial order N³ in the sequence length N. The temperature dependence of the partition function gives information about melting behavior for the secondary structure. The pair binding probabilities, the computation of which depends on the partition function, are visually summarized in a "box matrix" display and this provides a useful tool for examining the full ensemble of probable alternative equilibrium structures. The calculation of this ensemble representation allows a proper application and assessment of the predictive power of the secondary structure method, and yields important information on alternatives and intermediates in addition to local information about base pair opening and slippage. The results are illustrated for representative tRNA, 5S RNA, and self-replicating and self-splicing RNA molecules, and allow a direct comparison with enzymatic structure probes. The effect of changes in the thermodynamic parameters on the equilibrium ensemble provides a further sensitivity check to the predictions.
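
To make the cubic recursion concrete, here is a minimal sketch of a partition function computed by dynamic programming. It is not McCaskill's algorithm as published: it replaces the loop-based thermodynamic model with a toy model in which every base pair contributes the same free energy. The names `partition_function`, `pair_energy`, `rt` and `min_loop` and all default values are assumptions chosen for illustration.

```python
import math

# Watson-Crick and G-U wobble pairs allowed in this toy model.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def partition_function(seq, pair_energy=-1.0, rt=0.6163, min_loop=3):
    """Partition function over all nested secondary structures of seq.

    Toy energy model (an assumption): each base pair adds pair_energy
    (kcal/mol); rt is RT in the same unit; a hairpin loop must enclose
    at least min_loop unpaired bases.
    """
    n = len(seq)
    w = math.exp(-pair_energy / rt)  # Boltzmann factor of one base pair
    z = [[1.0] * n for _ in range(n)]

    def zval(i, j):
        # Empty intervals contain only the empty structure.
        return 1.0 if j < i else z[i][j]

    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            total = zval(i, j - 1)  # case 1: base j is unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    total += zval(i, k - 1) * zval(k + 1, j - 1) * w
            z[i][j] = total
    return zval(0, n - 1)


if __name__ == "__main__":
    print(partition_function("GGGAAAUCCC"))
```

The two loops over `i` and `j` with the inner loop over `k` give the order-N³ running time mentioned in the abstract. For the toy sequence `GAAAC` with `min_loop=3` only two structures exist (the open chain and a single G-C pair), so the function returns 1 + exp(-pair_energy/rt).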

11.
Abstract

A simple statistical model describing the folding of nucleic acids is proposed. For long sequences the real configuration of the secondary structure is a quasi-equilibrium state that cannot be characterised by minimal free energy. This is because the time required to achieve complete thermal equilibrium considerably exceeds the lifetime of the molecule. The formation of the secondary structure is represented as a random walk process in the space of all possible molecular configurations. The quasi-equilibrium structure is obtained by successive linking and disruption of helix segments with probabilities determined by the rate constants of the corresponding unimolecular reactions. The probabilities of configurations consisting of all possible compatible helices are calculated. Structures of some tRNAs and ribosomal RNAs are analysed.
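
A random walk over configurations driven by rate constants can be sketched as a continuous-time Markov chain simulated with the standard Gillespie scheme. This is a generic illustration rather than the model of the paper: the state names, the rate table and the function `simulate_ctmc` are hypothetical.

```python
import random


def simulate_ctmc(rates, start, t_end, rng=random):
    """Simulate one trajectory of a continuous-time Markov chain.

    rates: dict mapping each state to a dict {next_state: rate_constant}.
    Returns a list of (time, state) pairs, starting with (0.0, start).
    """
    t, state = 0.0, start
    trajectory = [(t, state)]
    while True:
        out = rates.get(state, {})
        total = sum(out.values())
        if total <= 0:
            break  # absorbing state: no transition can leave it
        t += rng.expovariate(total)  # exponential waiting time
        if t > t_end:
            break
        # Pick the next state with probability proportional to its rate.
        r = rng.random() * total
        for nxt, k in out.items():
            r -= k
            if r < 0:
                break
        state = nxt
        trajectory.append((t, state))
    return trajectory


if __name__ == "__main__":
    # Hypothetical rate constants for forming and disrupting one helix.
    rates = {"open": {"helix": 2.0}, "helix": {"open": 0.5}}
    print(simulate_ctmc(rates, "open", t_end=5.0, rng=random.Random(1)))
```

Averaging the time spent in each state over many such trajectories gives an estimate of the time-dependent configuration probabilities.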

12.
Abstract

In this paper, we propose a 3-D graphical representation of RNA secondary structures. Based on this representation, we outline an approach that constructs a 3-component vector whose components are the normalized leading eigenvalues of the L/L matrices associated with an RNA secondary structure. An examination of similarities/dissimilarities among the secondary structures at the 3′-terminus of different viruses illustrates the utility of the approach.

13.

We are developing a program to calculate optimal RNA secondary structures. The model uses di-nucleotide pairing energies as with most traditional approaches. However, for long-range entropy interactions, the approach uses an entropy-loss model based on the accumulated sum of the entropy of bonding between each base-pair weighted inversely by the correlation of the RNA sequence (the Kuhn length). Stiff RNA forms very different structures from flexible RNA. The results demonstrate that the long-range folding is largely governed by this entropy and the Kuhn length.

14.
Abstract

A new modification of the Gibbs ensemble Monte Carlo computer simulation method for fluid phase equilibria is described. The modification is based on a thermodynamic model for the vapor phase, and uses an equation of state to account for the weak interactions between the vapor phase molecules. Reductions in the computational time by 30–40% as compared to the original Gibbs ensemble method are obtained. The algorithm is applied to Lennard-Jones (12,6) fluids and their mixtures and the results are in good agreement with results obtained from simulations using the full Gibbs ensemble method.

15.
Predicting secondary structures of RNA molecules is one of the fundamental problems of and thus a challenging task in computational structural biology. Over the past decades, mainly two different approaches have been considered to compute predictions of RNA secondary structures from a single sequence: the first one relies on physics-based and the other on probabilistic RNA models. Particularly, the free energy minimization (MFE) approach is usually considered the most popular and successful method. Moreover, based on the paradigm-shifting work by McCaskill which proposes the computation of partition functions (PFs) and base pair probabilities based on thermodynamics, several extended partition function algorithms, statistical sampling methods and clustering techniques have been invented over the last years. However, the accuracy of the corresponding algorithms is limited by the quality of underlying physics-based models, which include a vast number of thermodynamic parameters and are still incomplete. The competing probabilistic approach is based on stochastic context-free grammars (SCFGs) or corresponding generalizations, like conditional log-linear models (CLLMs). These methods abstract from free energies and instead try to learn about the structural behavior of the molecules by learning (a manageable number of) probabilistic parameters from trusted RNA structure databases. In this work, we introduce and evaluate a sophisticated SCFG design that mirrors state-of-the-art physics-based RNA structure prediction procedures by distinguishing between all features of RNA that imply different energy rules. This SCFG actually serves as the foundation for a statistical sampling algorithm for RNA secondary structures of a single sequence that represents a probabilistic counterpart to the sampling extension of the PF approach. Furthermore, some new ways to derive meaningful structure predictions from generated sample sets are presented. 
They are used to compare the predictive accuracy of our model to that of other probabilistic and energy-based prediction methods. Particularly, comparisons to lightweight SCFGs and corresponding CLLMs for RNA structure prediction indicate that more complex SCFG designs might yield higher accuracy but eventually require more comprehensive and pure training sets. Investigations on both the accuracies of predicted foldings and the overall quality of generated sample sets (especially on an abstraction level, called abstract shapes of generated structures, that is relevant for biologists) yield the conclusion that the Boltzmann distribution of the PF sampling approach is more centered than the ensemble distribution induced by the sophisticated SCFG model, which implies a greater structural diversity within generated samples. In general, neither of the two distinct ensemble distributions is more adequate than the other and the corresponding results obtained by statistical sampling can be expected to bear fundamental differences, such that the method to be preferred for a particular input sequence strongly depends on the considered RNA type.

16.
A new approach to the problem of predicting RNA secondary structures, based on the kinetic analysis of self-organising molecules, is proposed. Structural rearrangements that take place during the formation of secondary structures are described in terms of a Markov process. A set of states and transition probabilities is defined. Monte Carlo methods are used to describe this process. Time-dependent probability distributions of various secondary structures are given. Examples of calculations for ensembles of secondary structures of some tRNAs are described. An effective method of studying the steady-state ensemble, based on a quick enumeration of all possible variants of the secondary structures of RNAs, is given. By assigning to each of these structures a probability as a function of its free energy, it is possible to obtain the Boltzmann ensemble of secondary structures.

17.
Abstract

Several protein structures have been reported to contain intricate knots of the polypeptide backbone but the mechanism of the (un)folding process of knotted proteins remains unknown. The members of the SPOUT superfamily of RNA methyltransferases are some of the most intensely studied systems for investigation of the knot formation and function. YibK (whose biochemical function remains unknown) is the representative protein of the SPOUT superfamily. This protein exhibits a deep trefoil knot at the C-terminus.

We conducted an extensive computational analysis of the unfolding process for the monomeric form of YibK. In order to predict the (un)folding pathway of YibK, we have calculated the order of secondary structure disassembly using UNFOLD, and performed thermal unfolding simulations using classical Molecular Dynamics (MD), as well as simulations employing reduced representation of the peptide chain using either MD with the UNRES method or the Monte Carlo (MC) unfolding with the REFINER method.

Results obtained from all methods used in this work are in qualitative agreement. We found that YibK unfolds through four intermediate states. The trefoil knot in YibK disappears at the end of the unfolding process, long after the protein loses its native topology. We observed that the C-terminus leaves the knotting loop folded into a hairpin-like structure, in agreement with the results of coarse-grained simulation reported earlier. We propose that the folding pathway of YibK corresponds to the reversed sequence of events observed in the unfolding pathway elucidated in this study. Thus, we predict that the knot formation is the slowest part of the YibK folding process.

18.
Abstract

Except for tRNA, the tertiary structures of RNA molecules are largely unknown. The many possibilities in the arrangement of different helices in space and the flexibility in the single-stranded loops that connect the helical regions make the modeling of the tertiary structure of an RNA molecule a very complex task. Here, we introduce an approach to fold RNA tertiary structure based only on the information of the secondary structure and the stereochemistry of the molecule. This approach was used to construct an atomic structure of a pseudoknot (bases 500–545) in the E. coli 16S RNA. The resulting structure is a closely packed molecule that is consistent with the predicted secondary structure and stereochemically feasible. This new approach is very general and easily adaptable. Experimental data (e.g., NMR, fluorescence energy transfer, etc.), as they become available, can be incorporated directly into the approach to improve the accuracy of the modeled structure.

19.
20.
An RNA molecule, particularly a long-chain mRNA, may exist as a population of structures. Furthermore, multiple structures have been demonstrated to play important functional roles. Thus, a representation of the ensemble of probable structures is of interest. We present a statistical algorithm to sample rigorously and exactly from the Boltzmann ensemble of secondary structures. The forward step of the algorithm computes the equilibrium partition functions of RNA secondary structures with recent thermodynamic parameters. Using conditional probabilities computed with the partition functions in a recursive sampling process, the backward step of the algorithm quickly generates a statistically representative sample of structures. With cubic run time for the forward step, quadratic run time in the worst case for the sampling step, and quadratic storage, the algorithm is efficient for broad applicability. We demonstrate that, by classifying sampled structures, the algorithm enables a statistical delineation and representation of the Boltzmann ensemble. Applications of the algorithm show that alternative biological structures are revealed through sampling. Statistical sampling provides a means to estimate the probability of any structural motif, with or without constraints. For example, the algorithm enables probability profiling of single-stranded regions in RNA secondary structure. Probability profiling for specific loop types is also illustrated. By overlaying probability profiles, a mutual accessibility plot can be displayed for predicting RNA:RNA interactions. Boltzmann probability-weighted density of states and free energy distributions of sampled structures can be readily computed. We show that a sample of moderate size from the ensemble of an enormous number of possible structures is sufficient to guarantee statistical reproducibility in the estimates of typical sampling statistics. 
Our applications suggest that the sampling algorithm may be well suited to prediction of mRNA structure and target accessibility. The algorithm is applicable to the rational design of small interfering RNAs (siRNAs), antisense oligonucleotides, and trans-cleaving ribozymes in gene knock-down studies.
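
The forward and backward steps described in this abstract can be sketched on a toy model. The code below is not the published algorithm: it assumes that every base pair contributes the same free energy instead of using loop-based thermodynamic parameters, and the names `fill_partition`, `sample_structure`, `pair_energy`, `rt` and `min_loop` are hypothetical.

```python
import math
import random

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def fill_partition(seq, w, min_loop):
    """Forward step: partition functions of all subsequences (cubic time)."""
    n = len(seq)
    z = [[1.0] * n for _ in range(n)]

    def zval(i, j):
        return 1.0 if j < i else z[i][j]  # empty interval: one structure

    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            total = zval(i, j - 1)  # base j unpaired
            for k in range(i, j - min_loop):  # base j paired with k
                if (seq[k], seq[j]) in PAIRS:
                    total += zval(i, k - 1) * zval(k + 1, j - 1) * w
            z[i][j] = total
    return zval


def sample_structure(seq, pair_energy=-1.0, rt=0.6163, min_loop=3, rng=random):
    """Backward step: draw one dot-bracket structure with Boltzmann probability."""
    w = math.exp(-pair_energy / rt)  # toy model: one weight per base pair
    zval = fill_partition(seq, w, min_loop)
    structure = ["."] * len(seq)
    stack = [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if j <= i:
            continue
        r = rng.random() * zval(i, j) - zval(i, j - 1)
        if r < 0:  # base j stays unpaired
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                r -= zval(i, k - 1) * zval(k + 1, j - 1) * w
                if r < 0:  # base j pairs with k; recurse on both sides
                    structure[k], structure[j] = "(", ")"
                    stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
        else:
            stack.append((i, j - 1))  # guard against floating-point round-off
    return "".join(structure)


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(sample_structure("GGGAAAUCCC", rng=rng))
```

Counting how often a motif appears in many sampled structures gives an estimate of its probability, which is the idea behind the probability profiling mentioned above.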
