共查询到20条相似文献,搜索用时 0 毫秒
1.
Thomas B. Hansen Morten T. Ven? Christian K. Damgaard J?rgen Kjems 《Nucleic acids research》2016,44(6):e58
CircRNAs are novel members of the non-coding RNA family. For several decades circRNAs have been known to exist, however only recently the widespread abundance has become appreciated. Annotation of circRNAs depends on sequencing reads spanning the backsplice junction and therefore map as non-linear reads in the genome. Several pipelines have been developed to specifically identify these non-linear reads and consequently predict the landscape of circRNAs based on deep sequencing datasets. Here, we use common RNAseq datasets to scrutinize and compare the output from five different algorithms; circRNA_finder, find_circ, CIRCexplorer, CIRI, and MapSplice and evaluate the levels of bona fide and false positive circRNAs based on RNase R resistance. By this approach, we observe surprisingly dramatic differences between the algorithms specifically regarding the highly expressed circRNAs and the circRNAs derived from proximal splice sites. Collectively, this study emphasizes that circRNA annotation should be handled with care and that several algorithms should ideally be combined to achieve reliable predictions. 相似文献
2.
DNA and RNA strands are employed in novel ways in the construction of nanostructures, as molecular tags in libraries of polymers and in therapeutics. New software tools for prediction and design of molecular structure will be needed in these applications. The RNAsoft suite of programs provides tools for predicting the secondary structure of a pair of DNA or RNA molecules, testing that combinatorial tag sets of DNA and RNA molecules have no unwanted secondary structure and designing RNA strands that fold to a given input secondary structure. The tools are based on standard thermodynamic models of RNA secondary structure formation. RNAsoft can be found online at http://www.RNAsoft.ca. 相似文献
3.
Cavagna A Cimarelli A Giardina I Orlandi A Parisi G Procaccini A Santagati R Stefanini F 《Mathematical biosciences》2008,214(1-2):32-37
The statistical characterization of the spatial structure of large animal groups has been very limited so far, mainly due to a lack of empirical data, especially in three dimensions (3D). Here we focus on the case of large flocks of starlings (Sturnus vulgaris) in the field. We reconstruct the 3D positions of individual birds within flocks of up to few thousands of elements. In this respect our data constitute a unique set. We perform a statistical analysis of flocks' structure by using two quantities that are new to the field of collective animal behaviour, namely the conditional density and the pair correlation function. These tools were originally developed in the context of condensed matter theory. We explain what is the meaning of these two quantities, how to measure them in a reliable way, and why they are useful in assessing the density fluctuations and the statistical correlations across the group. We show that the border-to-centre density gradient displayed by starling flocks gives rise to an anomalous behaviour of the conditional density. We also find that the pair correlation function has a structure incompatible with a crystalline arrangement of birds. In fact, our results suggest that flocks are somewhat intermediate between the liquid and the gas phase of physical systems. 相似文献
4.
Secondary structure prediction for aligned RNA sequences 总被引:19,自引:0,他引:19
Most functional RNA molecules have characteristic secondary structures that are highly conserved in evolution. Here we present a method for computing the consensus structure of a set aligned RNA sequences taking into account both thermodynamic stability and sequence covariation. Comparison with phylogenetic structures of rRNAs shows that a reliability of prediction of more than 80% is achieved for only five related sequences. As an application we show that the Early Noduline mRNA contains significant secondary structure that is supported by sequence covariation. 相似文献
5.
Kay C Wiese Alain A Deschenes Andrew G Hendriks 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2008,5(1):25-41
This paper presents two in-depth studies on RnaPredict, an evolutionary algorithm for RNA secondary structure prediction. The first study is an analysis of the performance of two thermodynamic models, Individual Nearest Neighbor (INN) and Individual Nearest Neighbor Hydrogen Bond (INN-HB). The correlation between the free energy of predicted structures and the sensitivity is analyzed for 19 RNA sequences. Although some variance is shown, there is a clear trend between a lower free energy and an increase in true positive base pairs. With increasing sequence length, this correlation generally decreases. In the second experiment, the accuracy of the predicted structures for these 19 sequences are compared against the accuracy of the structures generated by the mfold dynamic programming algorithm (DPA) and also to known structures. RnaPredict is shown to outperform the minimum free energy structures produced by mfold and has comparable performance when compared to sub-optimal structures produced by mfold. 相似文献
6.
Andronescu M Condon A Hoos HH Mathews DH Murphy KP 《Bioinformatics (Oxford, England)》2007,23(13):i19-i28
MOTIVATION: Accurate prediction of RNA secondary structure from the base sequence is an unsolved computational challenge. The accuracy of predictions made by free energy minimization is limited by the quality of the energy parameters in the underlying free energy model. The most widely used model, the Turner99 model, has hundreds of parameters, and so a robust parameter estimation scheme should efficiently handle large data sets with thousands of structures. Moreover, the estimation scheme should also be trained using available experimental free energy data in addition to structural data. RESULTS: In this work, we present constraint generation (CG), the first computational approach to RNA free energy parameter estimation that can be efficiently trained on large sets of structural as well as thermodynamic data. Our CG approach employs a novel iterative scheme, whereby the energy values are first computed as the solution to a constrained optimization problem. Then the newly computed energy parameters are used to update the constraints on the optimization function, so as to better optimize the energy parameters in the next iteration. Using our method on biologically sound data, we obtain revised parameters for the Turner99 energy model. We show that by using our new parameters, we obtain significant improvements in prediction accuracy over current state of-the-art methods. AVAILABILITY: Our CG implementation is available at http://www.rnasoft.ca/CG/. 相似文献
7.
Valencia A 《Comparative and Functional Genomics》2003,4(4):424-427
Multiple sequence alignments have much to offer to the understanding of protein structure, evolution and function. We are developing approaches to use this information in predicting protein-binding specificity, intra-protein and protein-protein interactions, and in reconstructing protein interaction networks. 相似文献
8.
Determining RNA secondary structure is important for understanding structure-function relationships and identifying potential drug targets. This paper reports the use of microarrays with heptamer 2'-O-methyl oligoribonucleotides to probe the secondary structure of an RNA and thereby improve the prediction of that secondary structure. When experimental constraints from hybridization results are added to a free-energy minimization algorithm, the prediction of the secondary structure of Escherichia coli 5S rRNA improves from 27 to 92% of the known canonical base pairs. Optimization of buffer conditions for hybridization and application of 2'-O-methyl-2-thiouridine to enhance binding and improve discrimination between AU and GU pairs are also described. The results suggest that probing RNA with oligonucleotide microarrays can facilitate determination of secondary structure. 相似文献
9.
10.
Mathews DH 《Journal of molecular biology》2006,359(3):526-532
RNA structure formation is hierarchical and, therefore, secondary structure, the sum of canonical base-pairs, can generally be predicted without knowledge of the three-dimensional structure. Secondary structure prediction algorithms evolved from predicting a single, lowest free energy structure to their current state where statistics can be determined from the thermodynamic ensemble. This article reviews the free energy minimization technique and the salient revolutions in the dynamic programming algorithm methods for secondary structure prediction. Emphasis is placed on highlighting the recently developed method, which statistically samples structures from the complete Boltzmann ensemble. 相似文献
11.
Current approaches to RNA structure prediction range from physics-based methods, which rely on thousands of experimentally measured thermodynamic parameters, to machine-learning (ML) techniques. While the methods for parameter estimation are successfully shifting toward ML-based approaches, the model parameterizations so far remained fairly constant. We study the potential contribution of increasing the amount of information utilized by RNA folding prediction models to the improvement of their prediction quality. This is achieved by proposing novel models, which refine previous ones by examining more types of structural elements, and larger sequential contexts for these elements. Our proposed fine-grained models are made practical thanks to the availability of large training sets, advances in machine-learning, and recent accelerations to RNA folding algorithms. We show that the application of more detailed models indeed improves prediction quality, while the corresponding running time of the folding algorithm remains fast. An additional important outcome of this experiment is a new RNA folding prediction model (coupled with a freely available implementation), which results in a significantly higher prediction quality than that of previous models. This final model has about 70,000 free parameters, several orders of magnitude more than previous models. Being trained and tested over the same comprehensive data sets, our model achieves a score of 84% according to the F?-measure over correctly-predicted base-pairs (i.e., 16% error rate), compared to the previously best reported score of 70% (i.e., 30% error rate). That is, the new model yields an error reduction of about 50%. Trained models and source code are available at www.cs.bgu.ac.il/?negevcb/contextfold. 相似文献
12.
13.
14.
ABSTRACT: BACKGROUND: Stochastic Context-Free Grammars (SCFGs) were applied successfully to RNA secondary structure prediction in the early 90s, and used in combination with comparative methods in the late 90s. The set of SCFGs potentially useful for RNA secondary structure prediction is very large, but a few intuitively designed grammars have remained dominant. In this paper we investigate two automatic search techniques for effective grammars - exhaustive search for very compact grammars and an evolutionary algorithm to find larger grammars. We also examine whether grammar ambiguity is as problematic to structure prediction as has been previously suggested. RESULTS: These search techniques were applied to predict RNA secondary structure on a maximal data set and revealed new and interesting grammars, though none are dramatically better than classic grammars. In general, results showed that many grammars with quite different structure could have very similar predictive ability. Many ambiguous grammars were found which were at least as effective as the best current unambiguous grammars. CONCLUSIONS: Overall the method of evolving SCFGs for RNA secondary structure prediction proved effective in finding many grammars that had strong predictive accuracy, as good or slightly better than those designed manually. Furthermore, several of the best grammars found were ambiguous, demonstrating that such grammars should not be disregarded. 相似文献
15.
MOTIVATION: Ribonucleic acid is vital in numerous stages of protein synthesis; it also possesses important functional and structural roles within the cell. The function of an RNA molecule within a particular organic system is principally determined by its structure. The current physical methods available for structure determination are time-consuming and expensive. Hence, computational methods for structure prediction are sought after. The energies involved by the formation of secondary structure elements are significantly greater than those of tertiary elements. Therefore, RNA structure prediction focuses on secondary structure. RESULTS: We present P-RnaPredict, a parallel evolutionary algorithm for RNA secondary structure prediction. The speedup provided by parallelization is investigated with five sequences, and a dramatic improvement in speedup is demonstrated, especially with longer sequences. An evaluation of the performance of P-RnaPredict in terms of prediction accuracy is made through comparison with 10 individual known structures from 3 RNA classes (5S rRNA, Group I intron 16S rRNA and 16S rRNA) and the mfold dynamic programming algorithm. P-RnaPredict is able to predict structures with higher true positive base pair counts and lower false positives than mfold on certain sequences. AVAILABILITY: P-RnaPredict is available for non-commercial usage. Interested parties should contact Kay C. Wiese (wiese@cs.sfu.ca). 相似文献
16.
A novel method is presented for joint prediction of alignment and common secondary structures of two RNA sequences. The joint consideration of common secondary structures and alignment is accomplished by structural alignment over a search space defined by the newly introduced motif called matched helical regions. The matched helical region formulation generalizes previously employed constraints for structural alignment and thereby better accommodates the structural variability within RNA families. A probabilistic model based on pseudo free energies obtained from precomputed base pairing and alignment probabilities is utilized for scoring structural alignments. Maximum a posteriori (MAP) common secondary structures, sequence alignment and joint posterior probabilities of base pairing are obtained from the model via a dynamic programming algorithm called PARTS. The advantage of the more general structural alignment of PARTS is seen in secondary structure predictions for the RNase P family. For this family, the PARTS MAP predictions of secondary structures and alignment perform significantly better than prior methods that utilize a more restrictive structural alignment model. For the tRNA and 5S rRNA families, the richer structural alignment model of PARTS does not offer a benefit and the method therefore performs comparably with existing alternatives. For all RNA families studied, the posterior probability estimates obtained from PARTS offer an improvement over posterior probability estimates from a single sequence prediction. When considering the base pairings predicted over a threshold value of confidence, the combination of sensitivity and positive predictive value is superior for PARTS than for the single sequence prediction. PARTS source code is available for download under the GNU public license at http://rna.urmc.rochester.edu. 相似文献
17.
Computational tools for prediction of the secondary structure of two or more interacting nucleic acid molecules are useful for understanding mechanisms for ribozyme function, determining the affinity of an oligonucleotide primer to its target, and designing good antisense oligonucleotides, novel ribozymes, DNA code words, or nanostructures. Here, we introduce new algorithms for prediction of the minimum free energy pseudoknot-free secondary structure of two or more nucleic acid molecules, and for prediction of alternative low-energy (sub-optimal) secondary structures for two nucleic acid molecules. We provide a comprehensive analysis of our predictions against secondary structures of interacting RNA molecules drawn from the literature. Analysis of our tools on 17 sequences of up to 200 nucleotides that do not form pseudoknots shows that they have 79% accuracy, on average, for the minimum free energy predictions. When the best of 100 sub-optimal foldings is taken, the average accuracy increases to 91%. The accuracy decreases as the sequences increase in length and as the number of pseudoknots and tertiary interactions increases. Our algorithms extend the free energy minimization algorithm of Zuker and Stiegler for secondary structure prediction, and the sub-optimal folding algorithm by Wuchty et al. Implementations of our algorithms are freely available in the package MultiRNAFold. 相似文献
18.
Shapiro BA Yingling YG Kasprzak W Bindewald E 《Current opinion in structural biology》2007,17(2):157-165
The field of RNA structure prediction has experienced significant advances in the past several years, thanks to the availability of new experimental data and improved computational methodologies. These methods determine RNA secondary structures and pseudoknots from sequence alignments, thermodynamics-based dynamic programming algorithms, genetic algorithms and combined approaches. Computational RNA three-dimensional modeling uses this information in conjunction with manual manipulation, constraint satisfaction methods, molecular mechanics and molecular dynamics. The ultimate goal of automatically producing RNA three-dimensional models from given secondary and tertiary structure data, however, is still not fully realized. Recent developments in the computational prediction of RNA structure have helped bridge the gap between RNA secondary structure prediction, including pseudoknots, and three-dimensional modeling of RNA. 相似文献
19.
Background
To understand an RNA sequence's mechanism of action, the structure must be known. Furthermore, target RNA structure is an important consideration in the design of small interfering RNAs and antisense DNA oligonucleotides. RNA secondary structure prediction, using thermodynamics, can be used to develop hypotheses about the structure of an RNA sequence. 相似文献20.
Accurate prediction of RNA pseudoknotted secondary structures from the base sequence is a challenging computational problem. Since prediction algorithms rely on thermodynamic energy models to identify low-energy structures, prediction accuracy relies in large part on the quality of free energy change parameters. In this work, we use our earlier constraint generation and Boltzmann likelihood parameter estimation methods to obtain new energy parameters for two energy models for secondary structures with pseudoknots, namely, the Dirks–Pierce (DP) and the Cao–Chen (CC) models. To train our parameters, and also to test their accuracy, we create a large data set of both pseudoknotted and pseudoknot-free secondary structures. In addition to structural data our training data set also includes thermodynamic data, for which experimentally determined free energy changes are available for sequences and their reference structures. When incorporated into the HotKnots prediction algorithm, our new parameters result in significantly improved secondary structure prediction on our test data set. Specifically, the prediction accuracy when using our new parameters improves from 68% to 79% for the DP model, and from 70% to 77% for the CC model. 相似文献