Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Dynamic programming algorithms that predict RNA secondary structure by minimizing the free energy have had one important limitation. They were able to predict only one optimal structure. Given the uncertainties of the thermodynamic data and the effects of proteins and other environmental factors on structure, the optimal structure predicted by these methods may not have biological significance. We present a dynamic programming algorithm that can determine optimal and suboptimal secondary structures for an RNA. The power and utility of the method are demonstrated in the folding of the intervening sequence of the rRNA of Tetrahymena. By first identifying the major secondary structures corresponding to the lowest free energy minima, a secondary structure of possible biological significance is derived.

2.
Free energy minimization has been the most popular method for RNA secondary structure prediction for decades. It is based on a set of empirical free energy change parameters derived from experiments using a nearest-neighbor model. In this study, a program, MaxExpect, that predicts RNA secondary structure by maximizing the expected base-pair accuracy, is reported. This approach was pioneered in the program CONTRAfold, using pair probabilities predicted with a statistical learning method. Here, a partition function calculation that utilizes the free energy change nearest-neighbor parameters is used to predict base-pair probabilities as well as probabilities of nucleotides being single-stranded. MaxExpect predicts both the optimal structure (having highest expected pair accuracy) and suboptimal structures to serve as alternative hypotheses for the structure. Tested on a large database of different types of RNA, the maximum expected accuracy structures are, on average, of higher accuracy than minimum free energy structures. Accuracy is measured by sensitivity, the percentage of known base pairs correctly predicted, and positive predictive value (PPV), the percentage of predicted pairs that are in the known structure. By favoring double-strandedness or single-strandedness, a higher sensitivity or PPV of prediction can be favored, respectively. Using MaxExpect, the average PPV of optimal structure is improved from 66% to 68% at the same sensitivity level (73%) compared with free energy minimization.
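
The two accuracy measures defined in this abstract can be illustrated with a minimal sketch. It assumes a structure is given as a collection of (i, j) index pairs and that a predicted pair counts as correct only on an exact match; the function name and the exact-match rule are assumptions made here, and published benchmarks may score pairs more leniently.

```python
def sensitivity_ppv(known_pairs, predicted_pairs):
    """Return (sensitivity, PPV) for two collections of base pairs (i, j)."""
    # Normalize so that (i, j) and (j, i) count as the same pair.
    known = {tuple(sorted(p)) for p in known_pairs}
    predicted = {tuple(sorted(p)) for p in predicted_pairs}
    correct = len(known & predicted)
    # Sensitivity: fraction of known pairs that were predicted.
    sensitivity = correct / len(known) if known else 0.0
    # PPV: fraction of predicted pairs that are in the known structure.
    ppv = correct / len(predicted) if predicted else 0.0
    return sensitivity, ppv


known = [(1, 10), (2, 9), (3, 8), (4, 7)]      # hypothetical reference structure
predicted = [(1, 10), (2, 9), (5, 12)]         # hypothetical prediction
sens, ppv = sensitivity_ppv(known, predicted)
```

Here two of the four known pairs are recovered and two of the three predicted pairs are correct, so predicting fewer pairs tends to raise PPV at the cost of sensitivity, which is the trade-off the abstract describes.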

3.
RNA secondary structure prediction using free energy minimization is one method to gain an approximation of structure. Constraints generated by enzymatic mapping or chemical modification can improve the accuracy of secondary structure prediction. We report a facile method that identifies single-stranded regions in RNA using short, randomized DNA oligonucleotides and RNase H cleavage. These regions are then used as constraints in secondary structure prediction. This method was used to improve the secondary structure prediction of Escherichia coli 5S rRNA. The lowest free energy structure without constraints has only 27% of the base pairs present in the phylogenetic structure. The addition of constraints from RNase H cleavage improves the prediction to 100% of base pairs. The same method was used to generate secondary structure constraints for yeast tRNAPhe, which is accurately predicted in the absence of constraints (95%). Although RNase H mapping does not improve secondary structure prediction, it does eliminate all other suboptimal structures predicted within 10% of the lowest free energy structure. The method is advantageous over other single-stranded nucleases since RNase H is functional in physiological conditions. Moreover, it can be used for any RNA to identify accessible binding sites for oligonucleotides or small molecules.

4.
New results for calculating nucleic acid secondary structure by free energy minimization and phylogenetic comparisons have recently been reported. A complete set of DNA energy parameters is now available and the RNA parameters have been improved. Although databases of RNA secondary structures are still derived and expanded using computer-assisted, ad hoc comparative analysis, a number of new computer algorithms combine covariation analysis with energy methods.

5.
A method for assessing the statistical significance of RNA folding   Total citations: 9 (self-citations: 0; citations by others: 9)
We have developed a statistical method that is designed for analyzing potential RNA folded substructures. The statistical significance of RNA folding is assessed by the segment score. The segment score is defined as the difference between the lowest free energy calculated for the real biological sequence and the mean of the lowest free energies from random permutations of the real segment sequence, divided by the standard deviation of the random sample. This procedure was applied to the well-studied Escherichia coli 16S rRNA and potato spindle tuber viroid (PSTV) RNA. The results showed that the predictions of the locally significant secondary structures in these two molecules are in accord with the universally conserved local secondary structure elements (Gutell, Weiser & Noller, 1985, Prog. Nucl. Acid Res. molec. Biol. 32, 155-216; Riesner & Gross, 1985, A. Rev. Biochem. 54, 531-564). In addition, a statistical analysis indicated that the lowest free energies of a random sample set follow an approximately normal distribution. A reasonable size for the random sample set was determined statistically. Moreover, the statistical evaluation has been carried out using three different sets of energy rules. Two sets (Salser, 1977, Cold Spring Harb. Symp. Quant Biol. 42, 985-1002; Freier, Kierzek, Jaeger, Sugimoto, Caruthers, Neilson & Turner, 1986, Proc. natn. Acad. Sci. U.S.A. 83, 9373-9377) take into account stacking energies and are based on experimental data and their computational extension (Salser, 1977); the third set is a simplistic "unitary matrix" approach, where any base-pair is given a weight of "minus one" and an unpaired base is "zero". The Freier energy rules usually yield the strongest indication of significant folding regions. However, the results derived from a paired comparisons test do not provide sufficient evidence for concluding that a different set of energy rules is effective in changing the segment score level for local stem-loop structures in the 16S rRNA.
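
The segment score defined in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the energy is the simplistic "unitary matrix" rule (minus one per base pair), computed with a basic pair-maximization recursion rather than the authors' folding program; G-U pairs, a minimum hairpin loop of three unpaired bases, the sample size, and all function names are choices made here, not taken from the paper.

```python
import random
import statistics

# Allowed pairs; including G-U wobble pairs is an assumption of this sketch.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def unitary_energy(seq, min_loop=3):
    """Lowest 'unitary matrix' energy: -1 per base pair, 0 per unpaired base."""
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]  # best[i][j] = max pairs within seq[i..j]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j is unpaired
            for k in range(i, j - min_loop):  # case: position k pairs with j
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return -best[0][n - 1]


def segment_score(seq, samples=100, seed=0):
    """(E_real - mean(E_shuffled)) / stdev(E_shuffled) over random permutations."""
    rng = random.Random(seed)
    real = unitary_energy(seq)
    letters = list(seq)
    shuffled = []
    for _ in range(samples):
        rng.shuffle(letters)  # permutation keeps the base composition
        shuffled.append(unitary_energy("".join(letters)))
    spread = statistics.stdev(shuffled)
    if spread == 0:
        return 0.0  # all permutations fold equally well; the score is undefined
    return (real - statistics.mean(shuffled)) / spread


score = segment_score("GGGAAAACCC", samples=50, seed=1)
```

A more negative score means the real segment folds to a lower energy than permutations with the same composition. The recursion is cubic in segment length, so this sketch is suited only to short segments.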

6.
MOTIVATION: Function derives from structure; therefore, there is a need for methods to predict functional RNA structures. RESULTS: The Dynalign algorithm, which predicts the lowest free energy secondary structure common to two unaligned RNA sequences, is extended to the prediction of a set of low-energy structures. Dot plots can be drawn to show all base pairs in structures within an energy increment. Dynalign predicts more well-defined structures than structure prediction using a single sequence; in 5S rRNA sequences, the average number of base pairs in structures with energy within 20% of the lowest energy structure is 317 using Dynalign, but 569 using a single sequence. Structure prediction with Dynalign can also be constrained according to experiment or comparative analysis. The accuracy, measured as sensitivity and positive predictive value, of Dynalign is greater than predictions with a single sequence. AVAILABILITY: Dynalign can be downloaded at http://rna.urmc.rochester.edu

7.
An algorithm and a program for the prediction of RNA secondary structure with pseudoknot formation are proposed. The algorithm simulates stepwise folding by generating random structures using the Monte Carlo method, followed by the selection of helices for the final structure on the basis of both their probabilities of occurrence in a random structure and free energy parameters. The program versions have been tested on ribosomal RNA structures and on RNAs with pseudoknots evidenced by experimental data. It is shown that the simulation of folding during RNA synthesis improves the results. The introduction of pseudoknot formation makes it possible to predict pseudoknotted structures and improves the prediction of long-range interactions. The computer program is rather fast and can predict the structures of long RNAs without using large amounts of memory on an ordinary personal computer.

8.
An improved dynamic programming algorithm is reported for RNA secondary structure prediction by free energy minimization. Thermodynamic parameters for the stabilities of secondary structure motifs are revised to include expanded sequence dependence as revealed by recent experiments. Additional algorithmic improvements include reduced search time and storage for multibranch loop free energies and improved imposition of folding constraints. An extended database of 151,503 nt in 955 structures determined by comparative sequence analysis was assembled to allow optimization of parameters not based on experiments and to test the accuracy of the algorithm. On average, the predicted lowest free energy structure contains 73% of known base-pairs when domains of fewer than 700 nt are folded; this compares with 64% accuracy for previous versions of the algorithm and parameters. For a given sequence, a set of 750 generated structures contains one structure that, on average, has 86% of known base-pairs. Experimental constraints, derived from enzymatic and flavin mononucleotide cleavage, improve the accuracy of structure predictions.

9.
We have applied the Pipas-McMahon algorithm based on free energy calculations to the search for a 5S RNA base-pair structure common to all known sequences. We find that a 'Y' shaped model is consistently among the structures having the lowest free energy using 5S RNA sequences from either eukaryotic or prokaryotic sources. Comparison of this 'Y' structure with models which have recently been proposed shows these models to be remarkably similar, and the minor differences are explicable based on the technique used to obtain the model. That prokaryotic and eukaryotic 5S RNA can adopt a similar secondary structure is strong support for its resistance to change during evolution.

10.
We have developed a method for detecting more stable and significant folding regions relative to others in the sequence. The algorithm is based on the calculation of the lowest free energy of RNA secondary structures and Monte Carlo simulation. For any given RNA segment, the stability and statistical significance of RNA folding are assessed by two measures: the stability score and the significance score. The stability score measures the degree of thermodynamic stability of the segment between all possible biological segments in the RNA sequence. The significance score characterizes the specific arrangement of the nucleotides in the segment that could imply a structural role for the sequence information. Using these two measures, we are able to detect a series of distinct folding regions where highly stable and statistically significant secondary structures occur in human immunodeficiency virus (HIV) and simian immunodeficiency virus (SIV) sequences. Received on April 4, 1990; accepted on October 2, 1990

11.
MOTIVATION: A k-point mutant of a given RNA sequence s = s(1), ..., s(n) is an RNA sequence s' = s'(1), ..., s'(n) obtained by mutating exactly k positions in s; i.e. the Hamming distance between s and s' equals k. To understand the effect of pointwise mutation in RNA, we consider the distribution of energies of all secondary structures of k-point mutants of a given RNA sequence. RESULTS: Here we describe a novel algorithm to compute the mean and standard deviation of energies of all secondary structures of k-point mutants of a given RNA sequence. We then focus on the tail of the energy distribution and compute, using the algorithm AMSAG, the k-superoptimal structure; i.e. the secondary structure of a ≤k-point mutant having least free energy over all secondary structures of all k'-point mutants of a given RNA sequence, for k' ≤ k. Evidence is presented that the k-superoptimal secondary structure is often closer, as measured by base pair distance and two additional distance measures, to the secondary structure derived by comparative sequence analysis than that derived by the Zuker minimum free energy structure of the original (wild type or unmutated) RNA.
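
The definition of a k-point mutant can be made concrete with a short brute-force sketch. It only illustrates the definition and is not the algorithm the abstract describes; the function names and the four-letter RNA alphabet default are assumptions of this example.

```python
from itertools import combinations, product


def k_point_mutants(seq, k, alphabet="ACGU"):
    """Yield every sequence at Hamming distance exactly k from seq."""
    for positions in combinations(range(len(seq)), k):
        # At each chosen position, allow any letter other than the original.
        choices = [[c for c in alphabet if c != seq[p]] for p in positions]
        for letters in product(*choices):
            mutant = list(seq)
            for p, c in zip(positions, letters):
                mutant[p] = c
            yield "".join(mutant)


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


mutants = list(k_point_mutants("GACU", 2))
```

For a sequence of length n over four letters there are C(n, k) * 3^k such mutants, so explicit enumeration is practical only for small n and k.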

12.
Jang S, Kim E, Pak Y. Proteins 2007, 66(1):53-60
Recently, we have shown that a modified energy model based on the param99 force field with the generalized Born (GB) solvation model produces reliable free energy landscapes of mini-proteins with a ββα motif (BBA5, 1FSD, and 1PSV), with the native structures of the mini-proteins located in their lowest free energy minimum states. One of the main features in the modified energy model is a significant improvement for more balanced treatments of alpha and beta strands in proteins. In this study, using the replica exchange molecular dynamics (REMD) simulation method with this new force field, we have carried out extensive ab initio folding studies of several well-known peptides with alpha or beta strands (C-peptide, EK-peptide, le0q, and gbl). Starting from fully extended conformations as the initial conditions, all of the native-like structures of the target peptides were successfully identified by REMD, with reasonable representations of free energy surfaces. The present simulation results with the modified energy model are consistent with experiments, demonstrating an extended applicability of the energy model to folding studies of a variety of alpha-helices, beta-strands, and alpha/beta proteins.

13.
Flexible docking between a protein (lysozyme) and an inhibitor (tri-N-acetyl-D-glucosamine, tri-NAG) was carried out by an enhanced conformational sampling method, multicanonical molecular dynamics simulation. We used a flexible all-atom model to express lysozyme, tri-NAG, and water molecules surrounding the two bio-molecules. The advantages of this sampling method are as follows: the conformation of the system is widely sampled without trapping at energy minima, a thermally equilibrated conformational ensemble at an arbitrary temperature can be reconstructed from the simulation trajectory, and the thermodynamic weight can be assigned to each sampled conformation. During the simulation, exchanges between the binding and free (i.e., unbinding) states of the protein and the inhibitor were repeatedly observed. The conformational ensemble reconstructed at 300 K involved various conformational clusters. The main outcome of the current study is that the most populated conformational cluster (i.e., the cluster of the lowest free energy) was assigned to the native complex structure (i.e., the X-ray complex structure). The simulation also produced non-native complex structures, where the protein and the inhibitor bound with different modes from that of the native complex structure, as well as the unbinding structures. A free-energy barrier (i.e., activation free energy) was clearly detected between the native complex structures and the other structures. The thermal fluctuations of tri-NAG in the lowest free-energy complex correlated well with the X-ray B-factors of tri-NAG in the X-ray complex structure. The existence of the free-energy barrier ensures that the lowest free-energy structure can be discriminated naturally from the other structures. In other words, the multicanonical molecular dynamics simulation can predict the native complex structure without any empirical objective function. The current study also showed that the flexible all-atom model and the physico-chemically defined atomic-level force field can reproduce the native complex structure. A drawback of the current method is that it requires a time-consuming computation due to the exhaustive conformational sampling. We discuss the possibility of combining the current method with conventional docking methods.

14.
We present a computer method to determine nucleic acid secondary structures. It is based on three steps: 1) a search for all possible helical regions, relying on a mathematical approach derived from the convolution theorem that uses a tetradimensional complex vector representation of the bases along the sequence; 2) a 'tree' search for a set of minimum free energy structures, with the aid of an approximate energy evaluation to reduce the computer time requirements; 3) the exact calculation and refinement of the energies. A method to introduce experimental data and reconcile them with the free energy minimization criterion is shown. In order to demonstrate the reliability of the program, a test on four RNA sequences is performed. The method has a computer time requirement proportional to N^2, where N is the length of the sequence, and retrieves a set of optimal free energy structures.

15.
A complete set of nearest neighbor parameters to predict the enthalpy change of RNA secondary structure formation was derived. These parameters can be used with available free energy nearest neighbor parameters to extend the secondary structure prediction of RNA sequences to temperatures other than 37°C. The parameters were tested by predicting the secondary structures of sequences with known secondary structure that are from organisms with known optimal growth temperatures. Compared with the previous set of enthalpy nearest neighbor parameters, the sensitivity of base pair prediction improved from 65.2 to 68.9% at optimal growth temperatures ranging from 10 to 60°C. Base pair probabilities were predicted with a partition function and the positive predictive value of structure prediction is 90.4% when considering the base pairs in the lowest free energy structure with pairing probability of 0.99 or above. Moreover, a strong correlation is found between the predicted melting temperatures of RNA sequences and the optimal growth temperatures of the host organism. This indicates that organisms that live at higher temperatures have evolved RNA sequences with higher melting temperatures.
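
One common way to combine enthalpy parameters with free energy parameters at 37°C is to assume that the enthalpy and entropy changes do not depend on temperature, so that ΔG(T) = ΔH − TΔS with ΔS = (ΔH − ΔG37) / 310.15 K. The sketch below applies that relation; the function name and the numerical values are hypothetical and are not parameters from the paper.

```python
def free_energy_at(delta_g37, delta_h, temp_celsius):
    """Extrapolate a free energy change (kcal/mol) from 37 °C to temp_celsius.

    Assumes delta_h and the derived entropy change are temperature independent.
    """
    t37 = 310.15  # 37 °C in kelvin
    delta_s = (delta_h - delta_g37) / t37  # entropy change, kcal/(mol*K)
    return delta_h - (temp_celsius + 273.15) * delta_s


# Hypothetical motif: -3.0 kcal/mol at 37 °C with an enthalpy change of -10.0 kcal/mol.
dg_60 = free_energy_at(-3.0, -10.0, 60.0)
```

With a favorable (negative) enthalpy change larger in magnitude than the free energy change, the extrapolated free energy becomes less negative as temperature rises, consistent with structures melting at higher temperatures.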

16.
Efficient siRNA selection using hybridization thermodynamics   Total citations: 1 (self-citations: 1; citations by others: 0)
Small interfering RNA (siRNA) are widely used to infer gene function. Here, insights in the equilibrium of siRNA-target hybridization are used for selection of efficient siRNA. The accessibilities of siRNA and target mRNA for hybridization, as measured by folding free energy change, are shown to be significantly correlated with efficacy. For this study, a partition function calculation that considers all possible secondary structures is used to predict target site accessibility; a significant improvement over calculations that consider only the predicted lowest free energy structure or a set of low free energy structures. The predicted thermodynamic features, in addition to siRNA sequence features, are used as input for a support vector machine that selects functional siRNA. The method works well for predicting efficient siRNA (efficacy >70%) in a large siRNA data set from Novartis. The positive predictive value (percentage of sites predicted to be efficient for silencing that are) is as high as 87.6%. The sensitivity and specificity are 22.7 and 96.5%, respectively. When tested on data from different sources, the positive predictive value increased 8.1% by adding equilibrium terms to 25 local sequence features. Prediction of hybridization affinity using partition functions is now available in the RNAstructure software package.

17.
This article describes the latest version of an RNA folding algorithm that predicts both optimal and suboptimal solutions based on free energy minimization. A number of RNAs with known structures deduced from comparative sequence analysis are folded to test program performance. The group of solutions obtained for each molecule is analysed to determine how many of the known helices occur in the optimal solution and in the best suboptimal solution. In most cases, a structure about 80% correct is found with a free energy within 2% of the predicted lowest free energy structure.

18.
A computer program is presented which determines the secondary structure of linear RNA molecules by simulating a hypothetical process of folding. This process implies the concept of 'nucleation centres', regions in RNA which locally trigger the folding. During the simulation, the RNA is allowed to fold into pseudoknotted structures, unlike all other programs predicting RNA secondary structure. The simulation uses published, experimentally determined free energy values for nearest neighbour base pair stackings and loop regions, except for new extrapolated values for loops larger than seven nucleotides. The free energy value for a loop arising from pseudoknot formation is set to a single, estimated value of 4.2 kcal/mole. Especially in the case of long RNA sequences, our program appears superior to other secondary structure predicting programs described so far, as tests on tRNAs, the LSU intron of Tetrahymena thermophila and a number of plant viral RNAs show. In addition, pseudoknotted structures are often predicted successfully. The program is written in mainframe APL and is adapted to run on IBM compatible PCs, Atari ST and Macintosh personal computers. On an 8 MHz 8088 standard PC without coprocessor, using STSC APL, it folds a sequence of 700 nucleotides in one and a half hours.

19.
Prediction of RNA secondary structure based on helical regions distribution   Total citations: 5 (self-citations: 0; citations by others: 5)
MOTIVATION: RNAs play an important role in many biological processes and knowing their structure is important in understanding their function. Due to difficulties in the experimental determination of RNA secondary structure, methods of theoretical prediction for known sequences are often used. Although many different algorithms for such predictions have been developed, this problem has not yet been solved. It is thus necessary to develop new methods for predicting RNA secondary structure. The most-used at present is Zuker's algorithm, which can be used to determine the minimum free energy secondary structure. However, many RNA secondary structures verified by experiments are not consistent with the minimum free energy secondary structures. In order to solve this problem, a method used to search a group of secondary structures whose free energy is close to the global minimum free energy was developed by Zuker in 1989. When considering a group of secondary structures, if there is no experimental data, we cannot tell which one is better than the others. This case also occurs in combinatorial and heuristic methods. These two kinds of methods have several weaknesses. Here we show how the central limit theorem can be used to solve these problems. RESULTS: An algorithm for predicting RNA secondary structure based on helical regions distribution is presented, which can be used to find the most probable secondary structure for a given RNA sequence. It consists of three steps. First, list all possible helical regions. Second, according to the central limit theorem, estimate the occurrence probability of every helical region based on the Monte Carlo simulation. Third, add the helical region with the biggest probability to the current structure and eliminate the helical regions incompatible with the current structure. The above processes can be repeated until no more helical regions can be added. Take the current structure as the final RNA secondary structure. In order to demonstrate the reliability of the program, a test on three RNA sequences (tRNAPhe, Pre-tRNATyr, and the Tetrahymena ribosomal RNA intervening sequence) is performed. AVAILABILITY: The program is written in Turbo Pascal 7.0. The source code is available upon request. CONTACT: Wujj@nic.bmi.ac.cn or Liwj@mail.bmi.ac.cn
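
The first of the three steps in this abstract, listing all possible helical regions, can be sketched as below. This is not the authors' Turbo Pascal program: the minimum helix length, the minimum hairpin loop size, the inclusion of G-U pairs, and the function name are assumptions of this example, and the probability estimation and greedy selection of the later steps are not reproduced.

```python
# Allowed pairs; including G-U wobble pairs is an assumption of this sketch.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def helical_regions(seq, min_len=3, min_loop=3):
    """Return maximal helices as (i, j, length), i.e. pairs (i+t, j-t) for t < length."""
    n = len(seq)

    def can_pair(i, j):
        # The distance check runs first, so indices are valid when seq is read.
        return j - i > min_loop and (seq[i], seq[j]) in PAIRS

    regions = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if not can_pair(i, j):
                continue
            # Skip pairs that merely continue a helix starting further out.
            if i > 0 and j < n - 1 and can_pair(i - 1, j + 1):
                continue
            length = 1
            while can_pair(i + length, j - length):
                length += 1  # extend the stack inward
            if length >= min_len:
                regions.append((i, j, length))
    return regions


regions = helical_regions("GGGAAAACCC")
```

Each tuple gives the outermost pair (0-based indices) and the number of stacked pairs. A later selection step would also need a compatibility test between helices, which is left out here.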

20.
Franc Avbelj, John Moult. Proteins 1995, 23(2):129-141
Experimental evidence and theoretical models both suggest that protein folding begins by specific short regions of the polypeptide chain intermittently assuming conformations close to their final ones. The independent folding properties and small size of these folding initiation sites make them suitable subjects for computational methods aimed at deriving structure from sequence. We have used a torsion space Monte Carlo procedure together with an all-atom free energy function to investigate the folding of a set of such sites. The free energy function is derived by a potential of mean force analysis of experimental protein structures. The most important contributions to the total free energy are the local main chain electrostatics, main chain hydrogen bonds, and the burial of nonpolar area. Six proposed independent folding units and four control peptides 11–14 residues long have been investigated. Thirty Monte Carlo simulations were performed on each peptide, starting from different random conformations. Five of the six folding units adopted conformations close to the experimental ones in some of the runs. None of the controls did so, as expected. The generated conformations which are close to the experimental ones have among the lowest free energies encountered, although some less native-like low free energy conformations were also found. The effectiveness of the method on these peptides, which have a wide variety of experimental conformations, is encouraging in two ways: First, it provides independent evidence that these regions of the sequences are able to adopt native-like conformations early in folding, and therefore are most probably key components of the folding pathways. Second, it demonstrates that available simulation methods and free energy functions are able to produce reasonably accurate structures. Extensions of the methods to the folding of larger portions of proteins are suggested. © 1995 Wiley-Liss, Inc.
