Similar articles
20 similar articles found
1.
We analyze the secondary structure of two expansion segments (D2, D3) of the 28S ribosomal RNA (rRNA)-encoding gene region from 527 chalcidoid wasp taxa (Hymenoptera: Chalcidoidea) representing 18 of the 19 extant families. The sequences are compared in a multiple sequence alignment, with secondary structure inferred primarily from the evidence of compensatory base changes in conserved helices of the rRNA molecules. This covariation analysis yielded 36 helices that are composed of base pairs exhibiting positional covariation. Several additional regions are also involved in hydrogen bonding, and they form highly variable base-pairing patterns across the alignment. These are identified as regions of expansion and contraction or regions of slipped-strand compensation. Additionally, 31 single-stranded locales are characterized as regions of ambiguous alignment based on the difficulty in assigning positional homology in the presence of multiple adjacent indels. Based on comparative analysis of these sequences, the largest genetic study on any hymenopteran group to date, we report an annotated secondary structural model for the D2, D3 expansion segments that will prove useful in assigning positional nucleotide homology for phylogeny reconstruction in these and closely related apocritan taxa.

2.
Most functional RNA molecules have characteristic structures that are highly conserved in evolution. Many of them contain pseudoknots. Here, we present a method for computing consensus structures, including pseudoknots, based on alignments of a few sequences. The algorithm combines thermodynamic and covariation information to assign scores to all possible base pairs; the base pairs are then chosen with the help of the maximum weighted matching algorithm. We applied our algorithm to a number of different types of RNA known to contain pseudoknots. All pseudoknots were predicted correctly and more than 85 percent of the base pairs were identified.
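The selection step described in this abstract can be illustrated with a small sketch. It assumes that every candidate base pair already has a combined score; the scores below are made up for illustration and are not taken from the paper. It also swaps in an exhaustive bitmask search for a production maximum weighted matching algorithm, so it is only practical for short toy sequences. The function name `max_weight_matching` and its arguments are hypothetical.

```python
from functools import lru_cache


def max_weight_matching(n, weights):
    """Exact maximum weighted matching over positions 0..n-1.

    weights maps (i, j) with i < j to a hypothetical base-pair score.
    Returns (total_score, pairs). Exhaustive search: small n only.
    """

    @lru_cache(maxsize=None)
    def best(mask):
        # mask has bit p set while position p is still unassigned.
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1  # lowest unassigned position
        rest = mask & ~(1 << i)
        # Option 1: leave position i unpaired.
        total, pairs = best(rest)
        # Option 2: pair i with any unassigned j > i that has a score.
        for j in range(i + 1, n):
            if (rest >> j) & 1 and (i, j) in weights:
                sub_total, sub_pairs = best(rest & ~(1 << j))
                candidate = weights[(i, j)] + sub_total
                if candidate > total:
                    total, pairs = candidate, ((i, j),) + sub_pairs
        return total, pairs

    return best((1 << n) - 1)


# Made-up scores for an 8-nucleotide toy example.
scores = {(0, 3): 4.0, (0, 4): 3.0, (1, 3): 2.0, (2, 6): 2.5, (5, 7): 1.0}
total, pairs = max_weight_matching(8, scores)
print(total, pairs)
```

A matching only requires that each position pairs at most once, so it places no nesting constraint on the result. In the toy output, pairs (0, 4) and (2, 6) cross, which is the kind of arrangement a pseudoknot needs.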

3.
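Wang Jinhua, Luo Zhigang, Guan Naiyang, Yan Fanmei, Jin Xin, Zhang Wen 《遗传》 (Hereditas) 2007, 29(7): 889-897
The structures of most RNA molecules are highly conserved in evolution, and many of them contain pseudoknots. Predicting RNA pseudoknots has long been a difficult problem, and many RNA secondary structure prediction algorithms cannot predict them. This article proposes a new iterative method for predicting RNA secondary structures with pseudoknots. The method combines thermodynamic and covariation information when scoring potential base pairs, and selects all base pairs through repeated iterations of a minimum-free-energy RNA folding algorithm. Test results show that the method can predict almost all pseudoknots. Compared with other methods, its sensitivity is close to the best and its specificity is the best.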

4.
Shang L, Xu W, Ozer S, Gutell RR 《PloS one》 2012, 7(6): e39383
Covariation analysis is used to identify those positions with similar patterns of sequence variation in an alignment of RNA sequences. These constraints on the evolution of two positions are usually associated with a base pair in a helix. While mutual information (MI) has been used to accurately predict an RNA secondary structure and a few of its tertiary interactions, early studies revealed that phylogenetic event counting methods are more sensitive and provide extra confidence in the prediction of base pairs. We developed a novel and powerful phylogenetic events counting method (PEC) for quantifying positional covariation with the Gutell lab's new RNA Comparative Analysis Database (rCAD). The PEC and MI-based methods each identify unique base pairs, and jointly identify many other base pairs. In total, both methods in combination with an N-best and helix-extension strategy identify the maximal number of base pairs. While covariation methods have effectively and accurately predicted RNA secondary structure, only a few tertiary structure base pairs have been identified. Analyses presented herein and at the Gutell lab's Comparative RNA Web (CRW) Site reveal that the majority of these latter base pairs do not covary with one another. However, covariation analysis does reveal a weaker although significant covariation between sets of nucleotides that are in proximity in the three-dimensional RNA structure. This reveals that covariation analysis identifies other types of structural constraints beyond the two nucleotides that form a base pair.

5.
RNA secondary structure is not confined to a system of hairpins and can contain pseudoknots as well as topologically equivalent slipped-loop structure (SLS) conformations. A specific primary structure that directs folding to the pseudoknot or SLS is called an SL-palindrome (SLP). Using a computer program for searching for SLPs in genomic sequences, 419 primary structures of large ribosomal RNAs from different kingdoms (prokaryota, eukaryota, archaebacteria) as well as plastids and mitochondria were analyzed. A universal site was found in the peptidyltransferase center (PTC) capable of folding to a pseudoknot of 48 nucleotides in length. Phylogenetic conservation of its helices (concurrent replacements with no violation of base pairing, covariation) has been demonstrated. We suggest that the pseudoknot folds and unfolds reversibly at certain stages of ribosome functioning.

6.
Phylogenetic analysis of tmRNA secondary structure.
The bacterial tmRNA acts with dual tRNA-like and mRNA-like character to tag incomplete translation products for degradation. Comparative analysis of 17 tmRNA genes (including eight new sequences) has allowed us to deduce conserved features of the tmRNA secondary structure. Except in a segment that includes the first codon of the tag reading frame, tmRNA is highly structured, with four pseudoknots and a total of 11 conserved base pairing regions. The previously identified tRNA minihelix structure is connected by a long base paired region to a large structured domain composed of a pseudoknot, followed by the tag reading frame and a string of three rather similar pseudoknots. The conservation of numerous structural elements among diverse eubacterial species indicates that these elements have important functions beyond simply forming an endonuclease-resistant link between the reading frame and the tRNA-like domain.

7.
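RNA pseudoknot prediction is a difficult problem in RNA research. This article proposes an RNA pseudoknot prediction method based on stacking covariation information and minimum free energy. The method was tested on aligned RNA sequences of known structure (ClustalW alignments and structural alignments). It emphasizes the stacking covariation information that arises from interactions between adjacent base pairs, combines it with the minimum free energy method to score base pairs, and obtains the RNA secondary structure with pseudoknots by stepwise iteration. The results show that the method predicts pseudoknots correctly, that its average sensitivity and specificity are better than those of the reference algorithms, and that prediction performance is more stable on structural alignments than on ClustalW alignments. The article also discusses how different covariation weighting factors affect prediction performance, and finds that performance is best when the ratio of the weighting factors is l1:l2 = 5:1.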

8.
RAGA: RNA sequence alignment by genetic algorithm.
We describe a new approach for accurately aligning two homologous RNA sequences when the secondary structure of one of them is known. To do so we developed two software packages, called RAGA and PRAGA, which use a genetic algorithm approach to optimize the alignments. RAGA is mainly an extension of SAGA, an earlier package for multiple protein sequence alignment. In PRAGA several genetic algorithms run in parallel and exchange individual solutions. This method allows us to optimize an objective function that describes the quality of an RNA pairwise alignment, taking into account both primary and secondary structure, including pseudoknots. We report results obtained using PRAGA on nine test cases of pairs of eukaryotic small subunit rRNA sequences (nuclear and mitochondrial).

9.
BACKGROUND: With the ever-increasing number of sequenced RNAs and the establishment of new RNA databases, such as the Comparative RNA Web Site and Rfam, there is a growing need for accurately and automatically predicting RNA structures from multiple alignments. Since RNA secondary structure is often conserved in evolution, the well-known, but underused, mutual information measure for identifying covarying sites in an alignment can be useful for identifying structural elements. This article presents MIfold, a MATLAB toolbox that employs mutual information, or a related covariation measure, to display and predict conserved RNA secondary structure (including pseudoknots) from an alignment. RESULTS: We show that MIfold can be used to predict simple pseudoknots, and that the performance can be adjusted to make it either more sensitive or more selective. We also demonstrate that the overall performance of MIfold improves with the number of aligned sequences for certain types of RNA sequences. In addition, we show that, for these sequences, MIfold is more sensitive but less selective than the related RNAalifold structure prediction program and is comparable with the COVE structure prediction package. CONCLUSION: MIfold provides a useful supplementary tool to programs such as RNA Structure Logo, RNAalifold and COVE, and should be useful for automatically generating structural predictions for databases such as Rfam.
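The mutual information measure mentioned in this abstract can be sketched for a single pair of alignment columns. This is a minimal illustration of the standard definition, MI(x, y) = sum over (a, b) of f(a, b) * log2(f(a, b) / (f(a) * f(b))), and not the MIfold implementation (which is a MATLAB toolbox). The function name `mutual_information` is hypothetical, and the sketch assumes columns without gaps and without any sequence weighting.

```python
from collections import Counter
from math import log2


def mutual_information(col_x, col_y):
    """Mutual information (in bits) between two alignment columns.

    col_x and col_y are equal-length strings: one character per aligned
    sequence. Assumption: no gap handling, no sequence weighting.
    """
    if len(col_x) != len(col_y) or not col_x:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_x)
    freq_x = Counter(col_x)
    freq_y = Counter(col_y)
    freq_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (a, b), count in freq_xy.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_x[a] / n) * (freq_y[b] / n)))
    return mi


# Two columns that always change together (G-C, C-G, A-U, U-A) score high;
# a fully conserved column carries no covariation signal.
print(mutual_information("GGCCAAUU", "CCGGUUAA"))
print(mutual_information("AAAAAAAA", "CCGGUUAA"))
```

Scoring every column pair this way yields a matrix in which high values suggest candidate base pairs. A perfectly conserved pair scores zero, which is one reason the measure tends to improve as more diverse sequences are added to the alignment.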

10.
The RNA PK5 (GCGAUUUCUGACCGCUUUUUUGUCAG) forms a pseudoknotted structure at low temperatures and a hairpin containing an A.C opposition at higher temperatures (J. Mol. Biol. 214, 455-470 (1990)). CD and absorption spectra of PK5 were measured at several temperatures. A basis set of spectra was fit to the spectra of PK5 using a method that can provide estimates of the numbers of A.U, G.C, and G.U base pairs as well as the number of each of 11 nearest-neighbor base pairs in an RNA (Biopolymers 31, 373-384 (1991)). The fits were close, indicating that PK5 retained the A conformation in the pseudoknot structure and that the fitting technique is not hindered by pseudoknots or A.C oppositions. The results from the analysis were consistent with the pseudoknotted structure at low temperatures and with the hairpin structure at higher temperatures. We concluded that the method of spectral analysis should be useful for determining the secondary structures of other RNAs containing pseudoknots and A.C oppositions.

11.
The paper investigates the computational problem of predicting RNA secondary structures. The general belief is that allowing pseudoknots makes the problem hard. Existing polynomial-time algorithms are heuristic algorithms with no performance guarantee and can handle only limited types of pseudoknots. In this paper, we initiate the study of predicting RNA secondary structures with a maximum number of stacking pairs while allowing arbitrary pseudoknots. We obtain two approximation algorithms with worst-case approximation ratios of 1/2 and 1/3 for planar and general secondary structures, respectively. For an RNA sequence of n bases, the approximation algorithm for planar secondary structures runs in O(n^3) time while that for the general case runs in linear time. Furthermore, we prove that allowing pseudoknots makes it NP-hard to maximize the number of stacking pairs in a planar secondary structure. This result is in contrast with the recent NP-hardness results on pseudoknots, which are based on optimizing some general and complicated energy functions.

12.
RNA pseudoknot prediction in energy-based models.
RNA molecules are sequences of nucleotides that serve as more than mere intermediaries between DNA and proteins, e.g., as catalytic molecules. Computational prediction of RNA secondary structure is among the few structure prediction problems that can be solved satisfactorily in polynomial time. Most work has been done to predict structures that do not contain pseudoknots. Allowing pseudoknots introduces modeling and computational problems. In this paper we consider the problem of predicting RNA secondary structures with pseudoknots based on free energy minimization. We first give a brief comparison of energy-based methods for predicting RNA secondary structures with pseudoknots. We then prove that the general problem of predicting RNA secondary structures containing pseudoknots is NP-complete for a large class of reasonable models of pseudoknots.
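To make the polynomial-time claim for pseudoknot-free prediction concrete, here is a minimal sketch of Nussinov-style base-pair maximization. It is a deliberately simplified stand-in: the paper discusses free energy minimization, whereas this sketch only counts pairs and uses no energy model. The function name `max_nested_pairs`, the `min_loop` parameter, and the allowed pair set are assumptions made for illustration.

```python
def max_nested_pairs(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in seq.

    O(n^3) dynamic program. Assumptions: Watson-Crick and G-U pairs
    only, and at least min_loop unpaired bases inside every hairpin.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: j pairs with some k, splitting the interval in two
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


print(max_nested_pairs("GGGAAACCC"))
```

The recursion works because a nested pair (k, j) splits the sequence into two independent intervals. Crossing pairs break that independence, which gives some intuition for why allowing pseudoknots makes the problem harder.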

13.
MOTIVATION: Several algorithms have been developed for drawing RNA secondary structures; however, none of these can be used to draw RNA pseudoknot structures. In the sense of graph theory, a drawing of RNA secondary structures is a tree, whereas a drawing of RNA pseudoknots is a graph with inner cycles within a pseudoknot as well as possible outer cycles formed between a pseudoknot and other structural elements. Thus, RNA pseudoknots are more difficult to visualize than RNA secondary structures. Since no automatic method for drawing RNA pseudoknots exists, visualizing RNA pseudoknots relies on a significant amount of manual work and does not yield satisfactory results. The task of visualizing RNA pseudoknots by hand becomes more challenging as the size and complexity of the RNA pseudoknots increase. RESULTS: We have developed a new representation and an algorithm for drawing H-type pseudoknots with RNA secondary structures. Compared to existing representations of H-type pseudoknots, the new representation ensures uniform and clear drawings with no edge crossing for any H-type pseudoknots. To the best of our knowledge, this is the first algorithm for automatically drawing RNA pseudoknots with RNA secondary structures. The algorithm has been implemented in a Java program, which can be executed on any computing system. Experimental results demonstrate that the algorithm generates an aesthetically pleasing drawing of all H-type pseudoknots. The results have also shown that the drawing has high readability, enabling the user to quickly and easily recognize the whole RNA structure as well as the pseudoknots themselves.

14.
A number of non-coding RNAs are known to contain functionally important or conserved pseudoknots. However, pseudoknotted structures are more complex than orthodox ones, and most methods for analyzing secondary structures do not handle them. I present here a way to decompose and represent general secondary structures which extends the tree representation of the stem-loop structure, and use this to analyze the frequency of pseudoknots in known and in random secondary structures. This comparison shows that, though a number of pseudoknots exist, they are still relatively rare and mostly of the simpler kinds. In contrast, random secondary structures tend to be heavily knotted, and the number of available structures increases dramatically when allowing pseudoknots. Therefore, methods for structure prediction and non-coding RNA identification that allow pseudoknots are likely to be much less powerful than those that do not, unless they penalize pseudoknots appropriately.
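What separates a pseudoknotted structure from an orthodox (nested) one can be checked directly on a list of base pairs: two pairs (i, j) and (k, l) cross when i < k < j < l. The sketch below is a simple quadratic-time check and not the decomposition described in the abstract. The function name `crossing_pairs` and the zero-based position convention are assumptions.

```python
def crossing_pairs(pairs):
    """Return every pair of base pairs that cross each other.

    pairs is an iterable of (i, j) sequence positions. Two base pairs
    (i, j) and (k, l) cross when i < k < j < l. An empty result means
    the structure is nested, i.e. it contains no pseudoknot.
    """
    ordered = sorted((min(p), max(p)) for p in pairs)
    crossings = []
    for index, (i, j) in enumerate(ordered):
        for k, l in ordered[index + 1:]:
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings


hairpin = [(0, 9), (1, 8), (2, 7)]          # fully nested stem
h_type = [(0, 6), (1, 5), (3, 9), (4, 8)]   # two interleaved stems
print(crossing_pairs(hairpin))
print(crossing_pairs(h_type))
```

In the second example each pair of the first stem crosses each pair of the second, the interleaved arrangement typical of a simple H-type pseudoknot.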

15.
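Research on RNA secondary structure prediction algorithms has a history of nearly 40 years, and research on pseudoknots a history of nearly 30 years. Over this period, RNA secondary structure prediction algorithms have advanced considerably, but the accuracy of pseudoknot prediction remains low. Heuristic algorithms handle complex pseudoknots relatively well, which makes them the most likely to be the first to solve the pseudoknot prediction problem. To date, there has been no systematic review dedicated to the various heuristic algorithms for pseudoknot prediction and their strengths and weaknesses. This article describes in detail five algorithms that have been popular internationally in recent years (the greedy algorithm, the genetic algorithm, the ILM algorithm, the HotKnots algorithm, and the FlexStem algorithm) and analyzes the strengths and weaknesses of each. Finally, it proposes that, in the near future, efforts to improve pseudoknot prediction accuracy with heuristic algorithms should start from building more complete pseudoknot models, incorporating more influencing factors, and drawing on the strengths of different algorithms. The article is intended as a reference for research on predicting RNA secondary structures that contain pseudoknots.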

16.
The traditional way to infer RNA secondary structure involves an iterative process of alignment and evaluation of covariation statistics between all positions possibly involved in basepairing. Watson-Crick basepairs typically show covariations that score well when examples of two or more possible basepairs occur. This is not necessarily the case for non-Watson-Crick basepairing geometries. For example, for sheared (trans Hoogsteen/Sugar edge) pairs, one base is highly conserved (always A or mostly A with some C or U), while the other can vary (G or A and sometimes C and U as well). RNA motifs consist of ordered, stacked arrays of non-Watson-Crick basepairs that in the secondary structure representation form hairpin or internal loops, multi-stem junctions, and even pseudoknots. Although RNA motifs occur recurrently and contribute in a modular fashion to RNA architecture, it is usually not apparent which bases interact and whether it is by edge-to-edge H-bonding or solely by stacking interactions. Using a modular sequence-analysis approach, recurrent motifs related to the sarcin-ricin loop of 23S RNA and to loop E from 5S RNA were predicted in universally conserved regions of the large ribosomal RNAs (16S- and 23S-like) before the publication of high-resolution, atomic-level structures of representative examples of 16S and 23S rRNA molecules in their native contexts. This provides the opportunity to evaluate the predictive power of motif-level sequence analysis, with the goal of automating the process for predicting RNA motifs in genomic sequences. The process of inferring structure from sequence by constructing accurate alignments is a circular one. The crucial link that allows a productive iteration of motif modeling and realignment is the comparison of the sequence variations for each putative pair with the corresponding isostericity matrix to determine which basepairs are consistent both with the sequence and the geometrical data.

17.
《Journal of molecular biology》2019,431(8):1592-1603
The existence of evolutionary conservation in base pairing is strong evidence for functional elements of RNA structure, although available tools for rigorous identification of structural conservation are limited. R-scape is a recently developed program for statistical prediction of pairwise covariation from sequence alignments, but it initially showed limited utility on long RNAs, especially those of eukaryotic origin. Here we show that R-scape can be adapted for a more powerful analysis of structure conservation in long RNA molecules, including mammalian lncRNAs.

18.
Computational tools for prediction of the secondary structure of two or more interacting nucleic acid molecules are useful for understanding mechanisms for ribozyme function, determining the affinity of an oligonucleotide primer to its target, and designing good antisense oligonucleotides, novel ribozymes, DNA code words, or nanostructures. Here, we introduce new algorithms for prediction of the minimum free energy pseudoknot-free secondary structure of two or more nucleic acid molecules, and for prediction of alternative low-energy (sub-optimal) secondary structures for two nucleic acid molecules. We provide a comprehensive analysis of our predictions against secondary structures of interacting RNA molecules drawn from the literature. Analysis of our tools on 17 sequences of up to 200 nucleotides that do not form pseudoknots shows that they have 79% accuracy, on average, for the minimum free energy predictions. When the best of 100 sub-optimal foldings is taken, the average accuracy increases to 91%. The accuracy decreases as the sequences increase in length and as the number of pseudoknots and tertiary interactions increases. Our algorithms extend the free energy minimization algorithm of Zuker and Stiegler for secondary structure prediction, and the sub-optimal folding algorithm by Wuchty et al. Implementations of our algorithms are freely available in the package MultiRNAFold.

19.
The language of RNA: a formal grammar that includes pseudoknots
MOTIVATION: In a previous paper, we presented a polynomial time dynamic programming algorithm for predicting optimal RNA secondary structure including pseudoknots. However, a formal grammatical representation for RNA secondary structure with pseudoknots was still lacking. RESULTS: Here we show a one-to-one correspondence between that algorithm and a formal transformational grammar. This grammar class encompasses the context-free grammars and goes beyond to generate pseudoknotted structures. The pseudoknot grammar avoids the use of general context-sensitive rules by introducing a small number of auxiliary symbols used to reorder the strings generated by an otherwise context-free grammar. This formal representation of the residue correlations in RNA structure is important because it means we can build full probabilistic models of RNA secondary structure, including pseudoknots, and use them to optimally parse sequences in polynomial time.

20.
All large rRNAs possess a common core of secondary structure. However, large variations in the size of the molecule have arisen during evolution, which are accommodated over a dozen rapidly evolving domains. Most of the enlargement of the eukaryotic molecules (as compared to prokaryotes) is in fact restricted to only two of these divergent domains, which are dramatically expanded in vertebrates. We have derived secondary structure models for these two domains through a systematic comparison of all the pro- and eukaryotic sequences published so far. Within each of these domains, a subset of secondary structure elements which are specific to eukaryotes is detected. Archaebacterial-specific secondary structures can also be identified which appear to be maintained through a strong selective constraint. The relative preservation of such group-specific structures raises the issue of their potential involvement in some diversification of ribosomal functions among the three fundamental phylogenetic groups, eubacteria, archaebacteria and eukaryotes. We also show that eukaryotic ribosomal RNAs are subjected, over their entire length, to a unique type of compositional constraint which may largely differ among the major eukaryotic taxa.
