20 similar articles found; search took 0 ms.
1.
A computer program written in C++ has been developed that can detect all potential H-type RNA pseudoknots within any given RNA sequence. There is no limit on the length of the input sequence. A validation run of the program using the full-length (8173 nt) genomic mRNA of simian retrovirus type-1 (SRV-1) identifies the established -1 frameshift-stimulating pseudoknot at the gag-pro junction as the most stable pseudoknot within the genomic mRNA.
2.
3.
RNA secondary structure is often predicted from sequence by free energy minimization. Over the past two years, advances have been made in the estimation of folding free energy change, the mapping of secondary structure and the implementation of computer programs for structure prediction. The trends in computer program development are: efficient use of experimental mapping of structures to constrain structure prediction; use of statistical mechanics to improve the fidelity of structure prediction; inclusion of pseudoknots in secondary structure prediction; and use of two or more homologous sequences to find a common structure.
4.
5.
Because of the availability of an abundance of RNA sequence information, the ability to rapidly and accurately predict the secondary structure of RNA from sequence is becoming increasingly important. A common method for predicting RNA secondary structure from sequence is free energy minimization. Therefore, accurate free energy contributions for every RNA secondary structure motif are necessary for accurate secondary structure predictions. Tandem mismatches are prevalent in naturally occurring sequences and are biologically important. A common method for predicting the stability of a sequence asymmetric tandem mismatch relies on the stabilities of the two corresponding sequence symmetric tandem mismatches [Mathews, D. H., Sabina, J., Zuker, M., and Turner, D. H. (1999) J. Mol. Biol. 288, 911-940]. To improve the prediction of sequence asymmetric tandem mismatches, the experimental thermodynamic parameters for the 22 previously unmeasured sequence symmetric tandem mismatches are reported. These new data, however, do not improve prediction of the free energy contributions of sequence asymmetric tandem mismatches. Therefore, a new model, independent of sequence symmetric tandem mismatch free energies, is proposed. This model consists of two penalties to account for destabilizing tandem mismatches, two bonuses to account for stabilizing tandem mismatches, and two penalties to account for A-U and G-U adjacent base pairs. This model improves the prediction of asymmetric tandem mismatch free energy contributions and is likely to improve the prediction of RNA secondary structure from sequence.
6.
Two loci encoding human U4 RNA, designated U4/7 and U4/14, have been isolated and sequenced. Both are pseudogenes in that their sequences do not match any identified human U4 RNA species perfectly. The U4/7 locus harbours a full-length pseudogene of 144 bp with eight base substitutions in the structural region. This pseudogene might be derived from a hitherto unidentified human U4 RNA gene. The second locus, U4/14, has a complex structure; the structural sequence of a U4 gene has apparently been integrated into an Alu sequence.
7.
Nebel ME Scheid A 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2011,8(6):1468-1482
There are two customary ways of predicting RNA secondary structures: minimizing the free energy of a conformation according to a thermodynamic model and maximizing the probability of a folding according to a stochastic model. In most cases, stochastic grammars are used for the latter alternative, applying the maximum likelihood principle to determine a grammar's probabilities. In this paper, building on such a stochastic model, we analyze the expected minimum free energy of an RNA molecule according to Turner's energy rules. Even if the parameters of our grammar are chosen with respect to structural properties of native molecules only (and therefore independently of the molecules' free energy), we prove formulae for the expected minimum free energy and the corresponding variance as functions of the molecule's size which perfectly fit the native behavior of free energies. This gives proof of the high quality of our stochastic model, making it a handy tool for further investigations. In fact, the stochastic model for RNA secondary structures presented in this work has, for example, been used as the basis of a new algorithm for the (nonuniform) generation of random RNA secondary structures.
8.
Predicting a set of minimal free energy RNA secondary structures common to two sequences (total citations: 5; self-citations: 0; citations by others: 5)
Mathews DH 《Bioinformatics (Oxford, England)》2005,21(10):2246-2253
MOTIVATION: Function derives from structure; therefore, there is a need for methods to predict functional RNA structures. RESULTS: The Dynalign algorithm, which predicts the lowest free energy secondary structure common to two unaligned RNA sequences, is extended to the prediction of a set of low-energy structures. Dot plots can be drawn to show all base pairs in structures within an energy increment. Dynalign predicts more well-defined structures than structure prediction using a single sequence; in 5S rRNA sequences, the average number of base pairs in structures with energy within 20% of the lowest energy structure is 317 using Dynalign, but 569 using a single sequence. Structure prediction with Dynalign can also be constrained according to experiment or comparative analysis. The accuracy of Dynalign, measured as sensitivity and positive predictive value, is greater than that of predictions with a single sequence. AVAILABILITY: Dynalign can be downloaded at http://rna.urmc.rochester.edu
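The two accuracy measures named in this abstract, sensitivity and positive predictive value, are usually computed over base pairs. The following is a minimal sketch under stated assumptions: a structure is represented as a collection of (i, j) index pairs, a predicted pair counts only if it matches a known pair exactly, and the function name `pair_accuracy` is hypothetical. Published benchmarks may score pairs more leniently than this.

```python
def pair_accuracy(predicted, known):
    """Return (sensitivity, ppv) for two collections of base pairs (i, j).

    Assumption: a predicted pair is correct only if it matches a known
    pair exactly; (i, j) and (j, i) are treated as the same pair.
    """
    pred = {tuple(sorted(p)) for p in predicted}
    ref = {tuple(sorted(p)) for p in known}
    true_pos = len(pred & ref)
    # Sensitivity: fraction of known pairs that were predicted.
    sensitivity = true_pos / len(ref) if ref else 0.0
    # PPV: fraction of predicted pairs that are in the known structure.
    ppv = true_pos / len(pred) if pred else 0.0
    return sensitivity, ppv


if __name__ == "__main__":
    known = [(1, 10), (2, 9), (3, 8), (4, 7)]
    predicted = [(1, 10), (2, 9), (3, 8), (20, 30)]
    print(pair_accuracy(predicted, known))
```

Reporting both numbers matters because a method can raise sensitivity simply by predicting more pairs, which lowers the positive predictive value.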
9.
Torsional rigidity of DNA and length dependence of the free energy of DNA supercoiling (total citations: 44; self-citations: 0; citations by others: 44)
By analyzing the Boltzmann populations of DNA topoisomers that differ only in their linking numbers, the dependence of the free energy $\Delta G_\tau$ of DNA supercoiling on the linking number $\alpha$ has been determined for DNA rings as small as 200 base-pairs (bp) in length. All experimental data can be fitted by the relation $\Delta G_\tau = K(\alpha - \alpha^\circ)^2$, where $\alpha^\circ$ is a constant for a given DNA at a given set of conditions and $K$ is a DNA length-dependent proportionality constant. For DNA rings with length $N$ larger than 2000 bp, $K$ is inversely proportional to $N$ and the product $NK$ is nearly constant at around 1150 RT·bp. For rings smaller than 2000 bp, $NK$ increases steadily with decreasing $N$; for a 200 bp ring $NK$ is 3900 RT·bp. The increase in $NK$ when $N$ decreases can be interpreted as a result of the decrease in the contribution of the fluctuation in the writhing number to the equilibrium distribution in $\alpha$. Assuming that the writhing contribution approaches zero for DNA rings 200 bp in size, the torsional rigidity of the DNA double helix is calculated to be $2.9 \times 10^{-19}$ erg cm. In addition, the large value of $K$ for the small circles allows precise calculation of the helical repeat of DNA. For the 210 bp rings, the repeat is measured to be 10.54 bp.
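The quadratic relation in this abstract is easy to evaluate numerically for large rings. The sketch below assumes only what the abstract states for N > 2000 bp, namely that the product NK is about 1150 RT·bp, and returns the free energy in units of RT. The function names and the example ring size are hypothetical, and the code deliberately refuses smaller rings, for which the abstract reports that NK rises.

```python
import math

NK_LARGE_RING = 1150.0  # RT * bp; value quoted in the abstract for N > 2000 bp


def supercoiling_free_energy_rt(n_bp, delta_linking):
    """Free energy of supercoiling in units of RT for a ring of n_bp base pairs.

    delta_linking is the linking number offset from the constant of the
    quadratic fit. Valid only for N > 2000 bp, where NK is roughly constant.
    """
    if n_bp <= 2000:
        raise ValueError("NK is not constant for rings of 2000 bp or fewer")
    k = NK_LARGE_RING / n_bp  # K is inversely proportional to N
    return k * delta_linking ** 2


def relative_population(n_bp, delta_linking):
    """Boltzmann weight of a topoisomer relative to the zero-offset state."""
    return math.exp(-supercoiling_free_energy_rt(n_bp, delta_linking))


if __name__ == "__main__":
    # Hypothetical 4600 bp ring, two turns away from the minimum.
    print(supercoiling_free_energy_rt(4600, 2))
    print(relative_population(4600, 2))
```

Working in units of RT keeps the Boltzmann weight free of any temperature argument, which matches how the abstract reports NK.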
10.
A set of free energy values is suggested for RNA H-pseudoknot loops. The parameters are adjusted to be consistent with the theory of polymer thermodynamics and known data on pseudoknots. The values can be used for estimates of pseudoknot stabilities and computer predictions of RNA structures.
11.
Accurate prediction of RNA pseudoknotted secondary structures from the base sequence is a challenging computational problem. Since prediction algorithms rely on thermodynamic energy models to identify low-energy structures, prediction accuracy relies in large part on the quality of free energy change parameters. In this work, we use our earlier constraint generation and Boltzmann likelihood parameter estimation methods to obtain new energy parameters for two energy models for secondary structures with pseudoknots, namely, the Dirks–Pierce (DP) and the Cao–Chen (CC) models. To train our parameters, and also to test their accuracy, we create a large data set of both pseudoknotted and pseudoknot-free secondary structures. In addition to structural data our training data set also includes thermodynamic data, for which experimentally determined free energy changes are available for sequences and their reference structures. When incorporated into the HotKnots prediction algorithm, our new parameters result in significantly improved secondary structure prediction on our test data set. Specifically, the prediction accuracy when using our new parameters improves from 68% to 79% for the DP model, and from 70% to 77% for the CC model.
12.
13.
Nodavirus coat protein imposes dodecahedral RNA structure independent of nucleotide sequence and length
Tihova M Dryden KA Le TV Harvey SC Johnson JE Yeager M Schneemann A 《Journal of virology》2004,78(6):2897-2905
The nodavirus Flock house virus (FHV) has a bipartite, positive-sense RNA genome that is packaged into an icosahedral particle displaying T=3 symmetry. The high-resolution X-ray structure of FHV has shown that 10 bp of well-ordered, double-stranded RNA are located at each of the 30 twofold axes of the virion, but it is not known which portions of the genome form these duplex regions. The regular distribution of double-stranded RNA in the interior of the virus particle indicates that large regions of the encapsidated genome are engaged in secondary structure interactions. Moreover, the RNA is restricted to a topology that is unlikely to exist during translation or replication. We used electron cryomicroscopy and image reconstruction to determine the structure of four types of FHV particles that differed in RNA and protein content. RNA-capsid interactions were primarily mediated via the N and C termini, which are essential for RNA recognition and particle assembly. A substantial fraction of the packaged nucleic acid, either viral or heterologous, was organized as a dodecahedral cage of duplex RNA. The similarity in tertiary structure suggests that RNA folding is independent of sequence and length. Computational modeling indicated that RNA duplex formation involves both short-range and long-range interactions. We propose that the capsid protein is able to exploit the plasticity of the RNA secondary structures, capturing those that are compatible with the geometry of the dodecahedral cage.
14.
Reinert G Waterman MS 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2007,4(1):153-156
A mixed Poisson approximation and a Poisson approximation for the length of the longest exact match of a random sequence across another sequence are provided, where the match is required to start at position 1 in the first sequence. This problem arises when looking for suitable anchors in whole genome alignments.
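The statistic whose distribution this abstract approximates can be computed directly for a concrete pair of sequences. The sketch below is only an illustration of the quantity itself, not of the Poisson approximations: it returns the length of the longest prefix of the first sequence that occurs anywhere in the second. The function name `longest_prefix_match` is hypothetical, and the simple loop favours clarity over speed.

```python
def longest_prefix_match(query, target):
    """Length of the longest prefix of `query` found as a substring of `target`.

    The match must start at position 1 of `query` but may start anywhere
    in `target`. If a prefix of length L + 1 occurs, so does the prefix of
    length L, so the prefix can be extended one character at a time.
    """
    length = 0
    while length < len(query) and query[:length + 1] in target:
        length += 1
    return length


if __name__ == "__main__":
    # Prefix "ACG" occurs in the target, but "ACGT" does not.
    print(longest_prefix_match("ACGTT", "GGACGAAC"))
```

For whole-genome inputs a suffix array or suffix automaton of the target would be the usual replacement for the repeated substring test.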
15.
We develop a novel method of assessing the similarity between two biological sequences without the need for alignment. The proposed method uses the free energy of nearest-neighbor interactions as a simple measure of dissimilarity. It is used to search for similarities between a query sequence and three complex datasets. The sensitivity and selectivity are computed and evaluated, and the performance of the proposed distance measure is compared. Analysis of real data shows that it is an efficient, sensitive and highly selective algorithm for comparing large datasets of DNA sequences.
16.
There are two crucial problems with statistical measures for sequence comparison: overlapping structures and background information of words in biological sequences. Word normalization in the improved composition vector method took these problems into account and achieved better performance in evolutionary analysis. The word normalization is desirable, but not sufficient, because it assumes that the four bases A, C, T, and G occur randomly with equal chance. This paper proposes an improved word normalization which uses a Markov model to estimate the exact k-word distribution according to the observed biological sequence and thus can adjust the background information of the k-word frequencies in biological sequences. The improved word normalization was tested in three experiments and compared with the existing word normalization. The experimental results confirm that the improved word normalization, using a Markov model to estimate the exact k-word distribution in biological sequences, is more efficient.
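One common way to estimate a k-word background from the sequence itself is the maximal-order Markov estimate, which predicts the count of a word from the counts of its two overlapping (k-1)-words and their shared (k-2)-word. The sketch below illustrates that general idea only; it is an assumption that this resembles the paper's estimator, the abstract does not give the actual normalization formula, and the function names are hypothetical.

```python
from collections import Counter


def count_words(seq, k):
    """Count all overlapping k-words in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def expected_kword_counts(seq, k):
    """Return (observed, expected) counts for every k-word present in seq.

    Expected counts use a Markov model of order k - 2:
        E[w1..wk] = N(w1..w(k-1)) * N(w2..wk) / N(w2..w(k-1))
    This is an illustrative estimator, not necessarily the one in the paper.
    """
    if k < 3:
        raise ValueError("k must be at least 3 for an order k-2 Markov estimate")
    observed = count_words(seq, k)
    shorter = count_words(seq, k - 1)
    core = count_words(seq, k - 2)
    expected = {
        # core[w[1:-1]] is nonzero because w itself occurs in seq.
        w: shorter[w[:-1]] * shorter[w[1:]] / core[w[1:-1]]
        for w in observed
    }
    return observed, expected


if __name__ == "__main__":
    obs, exp = expected_kword_counts("ACGACGACG", 3)
    for word in sorted(obs):
        print(word, obs[word], exp[word])
```

Comparing observed with expected counts, for example as a ratio or a standardized difference, then yields a background-adjusted score that does not assume the four bases are equally likely.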
17.
A rugged free energy landscape separates multiple functional RNA folds throughout denaturation (total citations: 1; self-citations: 0; citations by others: 1)
The dynamic mechanisms by which RNAs acquire biologically functional structures are of increasing importance to the rapidly expanding fields of RNA therapeutics and biotechnology. Large energy barriers separating misfolded and functional states arising from alternate base pairing are a well-appreciated characteristic of RNA. In contrast, it is typically assumed that functionally folded RNA occupies a single native basin of attraction that is free of deeply dividing energy barriers (ergodic hypothesis). This assumption is widely used as an implicit basis to interpret experimental ensemble-averaged data. Here, we develop an experimental approach to isolate persistent sub-populations of a small RNA enzyme and show by single molecule fluorescence resonance energy transfer (smFRET), biochemical probing and high-resolution mass spectrometry that commitment to one of several catalytically active folds occurs unexpectedly high on the RNA folding energy landscape, resulting in partially irreversible folding. Our experiments reveal the retention of molecular heterogeneity following the complete loss of all native secondary and tertiary structure. Our results demonstrate a surprising longevity of molecular heterogeneity and advance our current understanding beyond that of non-functional misfolds of RNA kinetically trapped on a rugged folding-free energy landscape.
18.
We describe a computational method for the prediction of RNA secondary structure that uses a combination of free energy and comparative sequence analysis strategies. Using a homology-based sequence alignment as a starting point, all favorable pairings with respect to the Turner energy function are identified. Each potentially paired region within a multiple sequence alignment is scored using a function that combines both predicted free energy and sequence covariation with optimized weightings. High scoring regions are ranked and sequentially incorporated to define a growing secondary structure. Using a single set of optimized parameters, it is possible to accurately predict the foldings of several test RNAs defined previously by extensive phylogenetic and experimental data (including tRNA, 5 S rRNA, SRP RNA, tmRNA, and 16 S rRNA). The algorithm correctly predicts approximately 80% of the secondary structure. A range of parameters have been tested to define the minimal sequence information content required to accurately predict secondary structure and to assess the importance of individual terms in the prediction scheme. This analysis indicates that prediction accuracy most strongly depends upon covariational information and only weakly on the energetic terms. However, relatively few sequences prove sufficient to provide the covariational information required for an accurate prediction. Secondary structures can be accurately defined by alignments with as few as five sequences and predictions improve only moderately with the inclusion of additional sequences.
19.
Jian Zhang Joseph Dundas Ming Lin Rong Chen Wei Wang Jie Liang 《RNA (New York, N.Y.)》2009,15(12):2248-2263
Accurate free energy estimation is essential for RNA structure prediction. The widely used Turner's energy model works well for nested structures. For pseudoknotted RNAs, however, there is no effective rule for estimation of loop entropy and free energy. In this work we present a new free energy estimation method, termed the pseudoknot predictor in three-dimensional space (pk3D), which goes beyond Turner's model. Our approach treats nested and pseudoknotted structures alike in one unifying physical framework, regardless of how complex the RNA structures are. We first test the ability of pk3D in selecting native structures from a large number of decoys for a set of 43 pseudoknotted RNA molecules, with lengths ranging from 23 to 113. We find that pk3D performs slightly better than the Dirks and Pierce extension of Turner's rule. We then test pk3D for blind secondary structure prediction, and find that pk3D gives the best sensitivity and comparable positive predictive value (related to specificity) in predicting pseudoknotted RNA secondary structures, when compared with other methods. A unique strength of pk3D is that it also generates spatial arrangement of structural elements of the RNA molecule. Comparison of three-dimensional structures predicted by pk3D with the native structure measured by nuclear magnetic resonance or X-ray experiments shows that the predicted spatial arrangement of stems and loops is often similar to that found in the native structure. These close-to-native structures can be used as starting points for further refinement to derive accurate three-dimensional structures of RNA molecules, including those with pseudoknots.
20.
Shippy R Fulmer-Smentek S Jensen RV Jones WD Wolber PK Johnson CD Pine PS Boysen C Guo X Chudin E Sun YA Willey JC Thierry-Mieg J Thierry-Mieg D Setterquist RA Wilson M Lucas AB Novoradovskaya N Papallo A Turpaz Y Baker SC Warrington JA Shi L Herman D 《Nature biotechnology》2006,24(9):1123-1131
We have assessed the utility of RNA titration samples for evaluating microarray platform performance and the impact of different normalization methods on the results obtained. As part of the MicroArray Quality Control project, we investigated the performance of five commercial microarray platforms using two independent RNA samples and two titration mixtures of these samples. Focusing on 12,091 genes common across all platforms, we determined the ability of each platform to detect the correct titration response across the samples. Global deviations from the response predicted by the titration ratios were observed. These differences could be explained by variations in relative amounts of messenger RNA as a fraction of total RNA between the two independent samples. Overall, both the qualitative and quantitative correspondence across platforms was high. In summary, titration samples may be regarded as a valuable tool, not only for assessing microarray platform performance and different analysis methods, but also for determining some underlying biological features of the samples.