Found 20 similar articles (search time: 15 ms)
1.
Background
Depending on their specific structures, noncoding RNAs (ncRNAs) play important roles in many biological processes. Interest in developing new topological indices based on RNA graphs has been revived in recent years, as such indices can be used to compare, identify and classify RNAs. Although the topological indices presented before characterize the main topological features of RNA secondary structures, information on RNA structural details is ignored to some degree. Therefore, it is necessary to identify topological features with low degeneracy based on complete and fine-grained RNA graphical representations.
2.
We describe a computational method for the prediction of RNA secondary structure that uses a combination of free energy and comparative sequence analysis strategies. Using a homology-based sequence alignment as a starting point, all favorable pairings with respect to the Turner energy function are identified. Each potentially paired region within a multiple sequence alignment is scored using a function that combines both predicted free energy and sequence covariation with optimized weightings. High scoring regions are ranked and sequentially incorporated to define a growing secondary structure. Using a single set of optimized parameters, it is possible to accurately predict the foldings of several test RNAs defined previously by extensive phylogenetic and experimental data (including tRNA, 5 S rRNA, SRP RNA, tmRNA, and 16 S rRNA). The algorithm correctly predicts approximately 80% of the secondary structure. A range of parameters have been tested to define the minimal sequence information content required to accurately predict secondary structure and to assess the importance of individual terms in the prediction scheme. This analysis indicates that prediction accuracy most strongly depends upon covariational information and only weakly on the energetic terms. However, relatively few sequences prove sufficient to provide the covariational information required for an accurate prediction. Secondary structures can be accurately defined by alignments with as few as five sequences and predictions improve only moderately with the inclusion of additional sequences.
3.
Dandjinou AT Lévesque N Larose S Lucier JF Abou Elela S Wellinger RJ 《Current biology : CB》2004,14(13):1148-1158
BACKGROUND: Telomerase is a ribonucleoprotein complex whose RNA moiety dictates the addition of specific simple sequences onto chromosome ends. While relevant for certain human genetic diseases, the contribution of the essential telomerase RNA to RNP assembly still remains unclear. Phylogenetic analyses of vertebrate and ciliate telomerase RNAs revealed conserved elements that potentially organize protein subunits for RNP function. In contrast, the yeast telomerase RNA could not be fitted to any known structural model, and the limited number of known sequences from Saccharomyces species did not permit the prediction of a yeast specific conserved structure. RESULTS: We cloned and analyzed the complete telomerase RNA loci (TLC1) from all known Saccharomyces species belonging to the "sensu stricto" group. Complementation analyses in S. cerevisiae and end mappings of mature RNAs ensured the relevance of the cloned sequences. By using phylogenetic comparative analysis coupled with in vitro enzymatic probing, we derived a secondary structure prediction of the Saccharomyces cerevisiae TLC1 RNA. This conserved secondary structure prediction includes a central domain that is likely to orchestrate DNA synthesis and at least two accessory domains important for RNA stability and telomerase recruitment. The structure also reveals a potential tertiary interaction between two loops in the central core. CONCLUSIONS: The predicted secondary structure of the TLC1 RNA of S. cerevisiae reveals a distinct folding pattern featuring well-separated but conserved functional elements. The predicted structure now allows for a detailed and rationally designed study of the structure-function relationships within the telomerase RNP-complex in a genetically tractable system.
4.
MOTIVATION: RNAs play an important role in many biological processes and
knowing their structure is important in understanding their function. Due
to difficulties in the experimental determination of RNA secondary
structure, the methods of theoretical prediction for known sequences are
often used. Although many different algorithms for such predictions have
been developed, this problem has not yet been solved. It is thus necessary
to develop new methods for predicting RNA secondary structure. The
most-used at present is Zuker's algorithm which can be used to determine
the minimum free energy secondary structure. However, many RNA secondary
structures verified by experiments are not consistent with the minimum free
energy secondary structures. In order to solve this problem, a method used
to search a group of secondary structures whose free energy is close to the
global minimum free energy was developed by Zuker in 1989. When considering
a group of secondary structures, if there is no experimental data, we
cannot tell which one is better than the others. This case also occurs in
combinatorial and heuristic methods. These two kinds of methods have
several weaknesses. Here we show how the central limit theorem can be used
to solve these problems. RESULTS: An algorithm for predicting RNA secondary
structure based on helical regions distribution is presented, which can be
used to find the most probable secondary structure for a given RNA
sequence. It consists of three steps. First, list all possible helical
regions. Second, according to central limit theorem, estimate the
occurrence probability of every helical region based on the Monte Carlo
simulation. Third, add the helical region with the biggest probability to
the current structure and eliminate the helical regions incompatible with
the current structure. The above processes can be repeated until no more
helical regions can be added. Take the current structure as the final RNA
secondary structure. In order to demonstrate the confidence of the program,
a test on three RNA sequences: tRNAPhe, Pre-tRNATyr, and Tetrahymena
ribosomal RNA intervening sequence, is performed. AVAILABILITY: The program
is written in Turbo Pascal 7.0. The source code is available upon request.
CONTACT: Wujj@nic.bmi.ac.cn or Liwj@mail.bmi.ac.cn
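The three steps in the abstract above (list helical regions, rank them, add the best one and discard incompatible ones) can be sketched roughly as follows. This is a minimal illustration, not the authors' Turbo Pascal program: the function names are hypothetical, and helix length is used as a stand-in score because the abstract does not specify how the Monte Carlo probability estimate is computed.

```python
from typing import Callable, List, Tuple

# A helix (i, j, k) stands for the stacked pairs (i+d, j-d) for d in range(k).
Helix = Tuple[int, int, int]

# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def list_helices(seq: str, min_len: int = 3, min_loop: int = 3) -> List[Helix]:
    """Step 1: enumerate maximal runs of stacked complementary pairs."""
    n = len(seq)
    helices = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            # Skip (i, j) if it could be extended outward: not a helix start.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            k = 0
            while (j - k) - (i + k) > min_loop and (seq[i + k], seq[j - k]) in PAIRS:
                k += 1
            if k >= min_len:
                helices.append((i, j, k))
    return helices


def pairs_of(h: Helix) -> List[Tuple[int, int]]:
    i, j, k = h
    return [(i + d, j - d) for d in range(k)]


def compatible(a: Helix, b: Helix) -> bool:
    """Two helices are compatible if they share no base and do not cross."""
    for p, q in pairs_of(a):
        for r, s in pairs_of(b):
            if len({p, q, r, s}) < 4:
                return False  # a base would be paired twice
            if p < r < q < s or r < p < s < q:
                return False  # crossing pairs (pseudoknot)
    return True


def greedy_fold(helices: List[Helix], score: Callable[[Helix], float]) -> List[Helix]:
    """Step 3: repeatedly take the best helix and drop incompatible ones."""
    remaining = sorted(helices, key=score, reverse=True)
    chosen = []
    while remaining:
        best = remaining.pop(0)
        chosen.append(best)
        remaining = [h for h in remaining if compatible(best, h)]
    return chosen


def to_dot_bracket(n: int, helices: List[Helix]) -> str:
    s = ["."] * n
    for h in helices:
        for p, q in pairs_of(h):
            s[p], s[q] = "(", ")"
    return "".join(s)


seq = "GGGAAAACCC"
# ASSUMPTION: helix length replaces the Monte Carlo occurrence probability.
structure = greedy_fold(list_helices(seq), score=lambda h: h[2])
print(to_dot_bracket(len(seq), structure))
```

Swapping in a different `score` function is the only change needed to rank helices by an estimated probability instead of length.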
5.
Accurate prediction of RNA pseudoknotted secondary structures from the base sequence is a challenging computational problem. Since prediction algorithms rely on thermodynamic energy models to identify low-energy structures, prediction accuracy relies in large part on the quality of free energy change parameters. In this work, we use our earlier constraint generation and Boltzmann likelihood parameter estimation methods to obtain new energy parameters for two energy models for secondary structures with pseudoknots, namely, the Dirks–Pierce (DP) and the Cao–Chen (CC) models. To train our parameters, and also to test their accuracy, we create a large data set of both pseudoknotted and pseudoknot-free secondary structures. In addition to structural data our training data set also includes thermodynamic data, for which experimentally determined free energy changes are available for sequences and their reference structures. When incorporated into the HotKnots prediction algorithm, our new parameters result in significantly improved secondary structure prediction on our test data set. Specifically, the prediction accuracy when using our new parameters improves from 68% to 79% for the DP model, and from 70% to 77% for the CC model.
6.
Kishore J Doshi Jamie J Cannone Christian W Cobaugh Robin R Gutell 《BMC bioinformatics》2004,5(1):1-22
Background
A detailed understanding of an RNA's correct secondary and tertiary structure is crucial to understanding its function and mechanism in the cell. Free energy minimization with energy parameters based on the nearest-neighbor model and comparative analysis are the primary methods for predicting an RNA's secondary structure from its sequence. Version 3.1 of Mfold has been available since 1999. This version contains an expanded sequence dependence of energy parameters and the ability to incorporate coaxial stacking into free energy calculations. We test Mfold 3.1 by performing the largest and most phylogenetically diverse comparison of rRNA and tRNA structures predicted by comparative analysis and Mfold, and we use the results of our tests on 16S and 23S rRNA sequences to assess the improvement between Mfold 2.3 and Mfold 3.1.
Results
The average prediction accuracy for a 16S or 23S rRNA sequence with Mfold 3.1 is 41%, while the prediction accuracies for the majority of 16S and 23S rRNA structures tested are between 20% and 60%, with some having less than 20% prediction accuracy. The average prediction accuracy was 71% for 5S rRNA and 69% for tRNA. The majority of the 5S rRNA and tRNA sequences have prediction accuracies greater than 60%. The prediction accuracy of 16S rRNA base-pairs decreases exponentially as the number of nucleotides intervening between the 5' and 3' halves of the base-pair increases.
Conclusion
Our analysis indicates that the current set of nearest-neighbor energy parameters in conjunction with the Mfold folding algorithm is unable to consistently and reliably predict an RNA's correct secondary structure. For 16S or 23S rRNA structure prediction, Mfold 3.1 offers little improvement over Mfold 2.3. However, the nearest-neighbor energy parameters do work well for shorter RNA sequences such as tRNA or 5S rRNA, or for larger rRNAs when the contact distance between the base-pairs is less than 100 nucleotides.
7.
Prediction of alternative RNA secondary structures based on fluctuating thermodynamic parameters.
In this paper we present a new method for predicting a set of RNA secondary structures that are thermodynamically favored in RNA folding simulations. This method uses a large number of 'simulated energy rules' (SER) generated by perturbing the free energy parameters derived experimentally within the range of the experimental errors. The structure with the lowest free energy is computed for each SER. Structural comparisons are used to avoid multiple generation of similar structures. Computed structures are evaluated using the energy distribution of the lowest free energy structures derived in the simulation. Predicted structures can be graphically displayed with their occurring frequencies in the simulation by dot-plot representations. On average, about 90% of phylogenetic helixes in the known models of tRNA, Group I self-splicing intron, and Escherichia coli 16 S rRNA, were predicted using the method.
8.
Background
RNAMute is an interactive Java application that calculates the secondary structure of all single point mutations, given an RNA sequence, and organizes them into categories according to their similarity with respect to the wild type predicted structure. The secondary structure predictions are performed using the Vienna RNA package. Several alternatives are used for the categorization of single point mutations: Vienna's RNAdistance based on dot-bracket representation, as well as tree edit distance and second eigenvalue of the Laplacian matrix based on Shapiro's coarse grain tree graph representation.
9.
Li F Zheng Q Ryvkin P Dragomir I Desai Y Aiyer S Valladares O Yang J Bambina S Sabin LR Murray JI Lamitina T Raj A Cherry S Wang LS Gregory BD 《Cell reports》2012,1(1):69-82
Highlights
• RNA folding in miRNA target sites is distinct between animals
• There is a negative correlation between miRNA target-site structure and miRISC binding
• Conserved secondary structure features demarcate protein-coding regions of animal mRNAs
• There are conserved features of RNA secondary structure in animals
10.
11.
A set of nearest neighbor parameters for predicting the enthalpy change of RNA secondary structure formation (total citations: 2; self-citations: 1; citations by others: 2)
A complete set of nearest neighbor parameters to predict the enthalpy change of RNA secondary structure formation was derived. These parameters can be used with available free energy nearest neighbor parameters to extend the secondary structure prediction of RNA sequences to temperatures other than 37°C. The parameters were tested by predicting the secondary structures of sequences with known secondary structure that are from organisms with known optimal growth temperatures. Compared with the previous set of enthalpy nearest neighbor parameters, the sensitivity of base pair prediction improved from 65.2 to 68.9% at optimal growth temperatures ranging from 10 to 60°C. Base pair probabilities were predicted with a partition function and the positive predictive value of structure prediction is 90.4% when considering the base pairs in the lowest free energy structure with pairing probability of 0.99 or above. Moreover, a strong correlation is found between the predicted melting temperatures of RNA sequences and the optimal growth temperatures of the host organism. This indicates that organisms that live at higher temperatures have evolved RNA sequences with higher melting temperatures.
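As a rough illustration of how an enthalpy parameter lets a 37°C free energy be extended to another temperature, the sketch below applies the standard relation ΔG°(T) = ΔH° − TΔS°, with ΔS° obtained from the 37°C values. It assumes ΔH° and ΔS° are temperature independent; the function name and the numeric values are hypothetical and are not taken from the published parameter set.

```python
T37_KELVIN = 310.15  # 37 degrees Celsius expressed in kelvin


def free_energy_at(dg37: float, dh: float, temp_c: float) -> float:
    """Extrapolate a free energy change (kcal/mol) from 37 C to temp_c.

    dg37: free energy change at 37 C
    dh:   enthalpy change, assumed constant over the temperature range
    """
    temp_k = temp_c + 273.15
    # Entropy change implied by the 37 C values: dS = (dH - dG37) / T37.
    ds = (dh - dg37) / T37_KELVIN
    return dh - temp_k * ds


# Illustrative numbers only (not real nearest-neighbor parameters).
print(free_energy_at(dg37=-3.0, dh=-10.0, temp_c=60.0))
```

With a negative enthalpy and entropy change, the extrapolated free energy becomes less favorable as the temperature rises, which is the behavior expected for helix formation.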
12.
Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. (total citations: 89; self-citations: 0; citations by others: 89)
An improved dynamic programming algorithm is reported for RNA secondary structure prediction by free energy minimization. Thermodynamic parameters for the stabilities of secondary structure motifs are revised to include expanded sequence dependence as revealed by recent experiments. Additional algorithmic improvements include reduced search time and storage for multibranch loop free energies and improved imposition of folding constraints. An extended database of 151,503 nt in 955 structures determined by comparative sequence analysis was assembled to allow optimization of parameters not based on experiments and to test the accuracy of the algorithm. On average, the predicted lowest free energy structure contains 73% of known base-pairs when domains of fewer than 700 nt are folded; this compares with 64% accuracy for previous versions of the algorithm and parameters. For a given sequence, a set of 750 generated structures contains one structure that, on average, has 86% of known base-pairs. Experimental constraints, derived from enzymatic and flavin mononucleotide cleavage, improve the accuracy of structure predictions.
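The dynamic programming idea behind such predictors can be sketched with a much simpler objective. The example below is a Nussinov-style recursion that maximizes the number of base pairs; it is a deliberately simplified stand-in and not the free energy minimization with nearest-neighbor parameters that the abstract describes. The function name and the `min_loop` default are assumptions made for illustration.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Return the maximum number of nested base pairs for seq.

    dp[i][j] holds the best pair count for the subsequence seq[i..j].
    A pair (k, j) is allowed only if at least min_loop bases lie between.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j is unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


print(max_base_pairs("GGGAAAACCC"))
```

An energy-based predictor keeps the same table-filling structure but replaces the pair count with loop-dependent free energy terms and takes a minimum instead of a maximum.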
13.
Zsuzsanna Sükösd Bjarne Knudsen Morten Værum Jørgen Kjems Ebbe S Andersen 《BMC bioinformatics》2011,12(1):103
Background
The prediction of the structure of large RNAs remains a particular challenge in bioinformatics, due to the computational complexity and low levels of accuracy of state-of-the-art algorithms. The pfold model couples a stochastic context-free grammar to phylogenetic analysis for a high accuracy in predictions, but the time complexity of the algorithm and underflow errors have prevented its use for long alignments. Here we present PPfold, a multithreaded version of pfold, which is capable of predicting the structure of large RNA alignments accurately on practical timescales.
14.
Page RD 《Bioinformatics (Oxford, England)》2000,16(11):1042-1043
SUMMARY: Circles is a program for inferring RNA secondary structure using maximum weight matching. The program can read in an alignment in FASTA, ClustalW, or NEXUS format, compute a maximum weight matching, and export one or more secondary structures in various file formats. AVAILABILITY: The program is available at no cost from http://taxonomy.zoology.gla.ac.uk/rod/circles/ and requires Windows 95/98/NT. CONTACT: r.page@bio.gla.ac.uk
15.
RNA secondary structures are important in many biological processes and efficient structure prediction can give vital directions for experimental investigations. Many available programs for RNA secondary structure prediction only use a single sequence at a time. This may be sufficient in some applications, but often it is possible to obtain related RNA sequences with conserved secondary structure. These should be included in structural analyses to give improved results. This work presents a practical way of predicting RNA secondary structure that is especially useful when related sequences can be obtained. The method improves a previous algorithm based on an explicit evolutionary model and a probabilistic model of structures. Predictions can be done on a web server at http://www.daimi.au.dk/~compbio/pfold.
16.
A conserved secondary structure for telomerase RNA. (total citations: 41; self-citations: 0; citations by others: 41)
The RNA moiety of the ribonucleoprotein enzyme telomerase contains the template for telomeric DNA synthesis. We present a secondary structure model for telomerase RNA, derived by a phylogenetic comparative analysis of telomerase RNAs from seven tetrahymenine ciliates. The telomerase RNA genes from Tetrahymena malaccensis, T. pyriformis, T. hyperangularis, T. pigmentosa, T. hegewishii, and Glaucoma chattoni were cloned, sequenced, and compared with the previously cloned RNA gene from T. thermophila and with each other. To define secondary structures of these RNAs, homologous complementary sequences were identified by the occurrence of covariation among putative base pairs. Although their primary sequences have diverged rapidly overall, a strikingly conserved secondary structure was identified for all these telomerase RNAs. Short regions of nucleotide conservation include a block of 22 totally conserved nucleotides that contains the telomeric templating region.
17.
The function of many RNAs depends crucially on their structure. Therefore, the design of RNA molecules with specific structural properties has many potential applications, e.g. in the context of investigating the function of biological RNAs, of creating new ribozymes, or of designing artificial RNA nanostructures. Here, we present a new algorithm for solving the following RNA secondary structure design problem: given a secondary structure, find an RNA sequence (if any) that is predicted to fold to that structure. Unlike the (pseudoknot-free) secondary structure prediction problem, this problem appears to be hard computationally. Our new algorithm, "RNA Secondary Structure Designer (RNA-SSD)", is based on stochastic local search, a prominent general approach for solving hard combinatorial problems. A thorough empirical evaluation on computationally predicted structures of biological sequences and artificially generated RNA structures as well as on empirically modelled structures from the biological literature shows that RNA-SSD substantially out-performs the best known algorithm for this problem, RNAinverse from the Vienna RNA Package. In particular, the new algorithm is able to solve structures, consistently, for which RNAinverse is unable to find solutions. The RNA-SSD software is publicly available under the name of RNA Designer at the RNASoft website (www.rnasoft.ca).
18.
Comparative analysis of secondary structure of insect mitochondrial small subunit ribosomal RNA using maximum weighted matching (total citations: 2; self-citations: 0; citations by others: 2)
Page RD 《Nucleic acids research》2000,28(20):3839-3845
Comparative analysis is the preferred method of inferring RNA secondary structure, but its use requires considerable expertise and manual effort. As the importance of secondary structure for accurate sequence alignment and phylogenetic analysis becomes increasingly realised, the need for secondary structure models for diverse taxonomic groups becomes more pressing. The number of available structures bears little relation to the relative diversity or importance of the different taxonomic groups. Insects, for example, comprise the largest group of animals and yet are very poorly represented in secondary structure databases. This paper explores the utility of maximum weighted matching (MWM) to help automate the process of comparative analysis by inferring secondary structure for insect mitochondrial small subunit (12S) rRNA sequences. By combining information on correlated changes in substitutions and helix dot plots, MWM can rapidly generate plausible models of secondary structure. These models can be further refined using standard comparative techniques. This paper presents a secondary structure model for insect 12S rRNA based on an alignment of 225 insect sequences and an alignment for 16 exemplar insect sequences. This alignment is used as a template for a web server that automatically generates secondary structures for insect sequences.
19.
20.
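Research on algorithms for predicting RNA secondary structure has developed over nearly 40 years, and pseudoknots have been studied for nearly 30 years. During this period, RNA secondary structure prediction algorithms have made great progress, but the accuracy of pseudoknot prediction remains low. Heuristic algorithms handle complex pseudoknots comparatively well, which makes them the class of algorithms most likely to be the first to solve the pseudoknot prediction problem. To date, no report has systematically and specifically summarized the various heuristic algorithms for pseudoknot prediction together with their advantages and disadvantages. This paper describes in detail five algorithms that have been popular internationally in recent years, namely the greedy algorithm, the genetic algorithm, the ILM algorithm, the HotKnots algorithm and the FlexStem algorithm, and analyzes the strengths and weaknesses of each. Finally, it proposes that, for some time to come, efforts to improve pseudoknot prediction accuracy with heuristic algorithms should start from building more complete pseudoknot models, incorporating more influencing factors, and drawing on the strengths of different algorithms. The paper is intended as a reference for research on predicting RNA secondary structures that contain pseudoknots.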