Similar literature
 Found 20 similar documents; search took 31 ms
1.
Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download.

2.
Recent progress in predicting RNA structure is moving towards filling the ‘gap’ in 2D RNA structure prediction where, for example, predicted internal loops often form non-canonical base pairs. This is increasingly recognized with the steady increase of known RNA 3D modules. There is a general interest in matching structural modules known from one molecule to other molecules for which the 3D structure is not known yet. We have created a pipeline, metaRNAmodules, which completely automates extracting putative modules from the FR3D database and mapping such modules to Rfam alignments to obtain comparative evidence. Subsequently, the modules, initially represented by a graph, are turned into models for the RMDetect program, which allows their discriminative power to be tested using real and randomized Rfam alignments. An initial extraction of 22 495 3D modules in all PDB files results in 977 internal loop and 17 hairpin modules with clear discriminatory power. Many of these modules describe only minor variants of each other. Indeed, mapping of the modules onto Rfam families results in 35 unique locations in 11 different families. The metaRNAmodules pipeline source for the internal loop modules is available at http://rth.dk/resources/mrm.

3.
Recent experimental and computational progress has revealed a large potential for RNA structure in the genome. This has been driven by computational strategies that exploit multiple genomes of related organisms to identify common sequences and secondary structures. However, these computational approaches have two main challenges: they are computationally expensive and they have a relatively high false discovery rate (FDR). Simultaneously, RNA 3D structure analysis has revealed modules composed of non-canonical base pairs which occur in non-homologous positions, apparently by independent evolution. These modules can, for example, occur inside structural elements which in RNA 2D predictions appear as internal loops. Hence one question is whether the use of such RNA 3D information can improve the prediction accuracy of RNA secondary structure at a genome-wide level. Here, we use RNAz in combination with 3D module prediction tools and apply them to a 13-way vertebrate sequence-based alignment. We find that RNA 3D modules predicted by metaRNAmodules and JAR3D are significantly enriched in the screened windows compared to their shuffled counterparts. The initially estimated FDR of 47.0% is lowered to below 25% when certain 3D module predictions are present in the window of the 2D prediction. We discuss the implications and prospects for further development of computational strategies for detection of RNA 2D structure in genomic sequence.

4.
Shu Z, Bevilacqua PC 《Biochemistry》1999,38(46):15369-15379
Hairpins are the most common elements of RNA secondary structure, playing important roles in RNA tertiary architecture and forming protein binding sites. Triloops are common in a variety of naturally occurring RNA hairpins, but little is known about their thermodynamic stability. Reported here are the sequences and thermodynamic parameters for a variety of stable and unstable triloop hairpins. Temperature gradient gel electrophoresis (TGGE) can be used to separate a simple RNA combinatorial library based on thermal stability [Bevilacqua, J. M., and Bevilacqua, P. C. (1998) Biochemistry 45, 15877-15884]. Here we introduce the application of TGGE to separating and analyzing a complex RNA combinatorial library based on thermal stability, using an RNA triloop library. Several rounds of in vitro selection of an RNA triloop library were carried out using TGGE, and preferences for exceptionally stable and unstable closing base pairs and loop sequences were identified. For stable hairpins, the most common closing base pair is CG, and U-rich loop sequences are preferred. Closing base pairs of GC and UA result in moderately stable hairpins when combined with a stable loop sequence. For unstable hairpins, the most common closing base pairs are AU and UG, and U-rich loop sequences are no longer preferred. In general, the contributions of the closing base pair and loop sequence to overall hairpin stability appear to be additive. Thermodynamic parameters for individual hairpins determined by UV melting are generally consistent with outcomes from selection experiments, with hairpins containing a CG closing base pair having a ΔΔG°37 2.1-2.5 kcal/mol more favorable than hairpins with other closing base pairs. Sequences and thermodynamic rules for triloop hairpins should aid in RNA structure prediction and determination of whether naturally occurring triloop hairpins are thermodynamically stable.

5.
RNA tertiary structure is crucial to its many non-coding molecular functions. RNA architecture is shaped by its secondary structure composed of stems, stacked canonical base pairs, enclosing loops. While stems are precisely captured by free-energy models, loops composed of non-canonical base pairs are not. Nor are distant interactions linking together those secondary structure elements (SSEs). Databases of conserved 3D geometries (a.k.a. modules) not captured by energetic models are leveraged for structure prediction and design, but the computational complexity has limited their study to local elements, loops. Representing the RNA structure as a graph has recently allowed this work to be expanded to pairs of SSEs, uncovering a hierarchical organization of these 3D modules, at great computational cost. Systematically capturing recurrent patterns on a large scale is a main challenge in the study of RNA structures. In this paper, we present an efficient algorithm to compute maximal isomorphisms in edge colored graphs. We extend this algorithm to a framework well suited to identify RNA modules, and fast enough to considerably generalize previous approaches. To exhibit the versatility of our framework, we first reproduce results identifying all common modules spanning more than 2 SSEs, in a few hours instead of weeks. The efficiency of our new algorithm is demonstrated by computing the maximal modules between any pair of entire RNAs in the non-redundant corpus of known RNA 3D structures. We observe that the biggest modules our method uncovers compose large shared substructures spanning hundreds of nucleotides and base pairs between the ribosomes of Thermus thermophilus, Escherichia coli, and Pseudomonas aeruginosa.

6.
Modular architecture is a hallmark of RNA structures, implying structural, and possibly functional, similarity among existing RNAs. To systematically delineate the existence of smaller topologies within larger structures, we develop and apply an efficient RNA secondary structure comparison algorithm using a newly developed two-dimensional RNA graphical representation. Our survey of similarity among 14 pseudoknots and subtopologies within ribosomal RNAs (rRNAs) uncovers eight pairs of structurally related pseudoknots with non-random sequence matches and reveals modular units in rRNAs. Significantly, three structurally related pseudoknot pairs have functional similarities not previously known: one pair involves the 3′ end of brome mosaic virus genomic RNA (PKB134) and the alternative hammerhead ribozyme pseudoknot (PKB173), both of which are replicase templates for viral RNA replication; the second pair involves structural elements for translation initiation and ribosome recruitment found in the viral internal ribosome entry site (PKB223) and the V4 domain of 18S rRNA (PKB205); the third pair involves 18S rRNA (PKB205) and viral tRNA-like pseudoknot (PKB134), which probably recruits ribosomes via structural mimicry and base complementarity. Additionally, we quantify the modularity of 16S and 23S rRNAs by showing that RNA motifs can be constructed from at least 210 building blocks. Interestingly, we find that the 5S rRNA and two tree modules within 16S and 23S rRNAs have similar topologies and tertiary shapes. These modules can be applied to design novel RNA motifs via build-up-like procedures for constructing sequences and folds.

7.
RNA structures are built from recurrent modules that can be identified by structural and comparative sequence analysis. In order to assemble sets of helices in compact architectures, modules that introduce bends and kinks are necessary. Among such modules, kink-turns form an important family that presents sequence and structural characteristics. Here, we describe an internal loop in the bacterial type A RNase P RNA that sets helices bound at the junctions exactly in the same relative positions as in kink-turns but without the structural signatures typical of kink-turns. Our work suggests that identifying a structural module in a subset of RNA sequences constitutes a strategy to identify distinct sequential motifs sharing common structural characteristics.

8.
One of the key issues in the theoretical prediction of RNA folding is the prediction of loop structure from the sequence. RNA loop free energies are dependent on the loop sequence content. However, most current models account only for the loop length-dependence. The previously developed “Vfold” model (a coarse-grained RNA folding model) provides an effective method to generate the complete ensemble of coarse-grained RNA loop and junction conformations. However, due to the lack of sequence-dependent scoring parameters, the method is unable to identify the native and near-native structures from the sequence. In this study, using a previously developed iterative method for extracting the knowledge-based potential parameters from the known structures, we derive a set of dinucleotide-based statistical potentials for RNA loops and junctions. A unique advantage of the approach is its ability to go beyond the (known) native structures by accounting for the full free energy landscape, including all the nonnative folds. The benchmark tests indicate that for given loop/junction sequences, the statistical potentials enable successful predictions for the coarse-grained 3D structures from the complete conformational ensemble generated by the Vfold model. The predicted coarse-grained structures can provide useful initial folds for further detailed structural refinement.

9.
10.
A new approach, graph-grammars, to encode RNA tertiary structure patterns is introduced and exemplified with the classical sarcin-ricin motif. The sarcin-ricin motif is found in the stem of the crucial ribosomal loop E (also referred to as the sarcin-ricin loop), which is sensitive to the alpha-sarcin and ricin toxins. Here, we generate a graph-grammar for the sarcin-ricin motif and apply it to derive putative sequences that would fold in this motif. The biological relevance of the derived sequences is confirmed by a comparison with those found in known sarcin-ricin sites in an alignment of over 800 bacterial 23S ribosomal RNAs. The comparison raised alternative alignments in a few sarcin-ricin sites, which were assessed using tertiary structure predictions and 3D modeling. The sarcin-ricin motif graph-grammar was built with indivisible nucleotide interaction cycles that were recently observed in structured RNAs. A comparison of the sequences and 3D structures of each cycle that constitutes the sarcin-ricin motif gave us additional insights about RNA sequence-structure relationships. In particular, this analysis revealed that the sequence space of an RNA motif depends on a structural context that goes beyond the single base pairing and base-stacking interactions.

11.
The natural bases of nucleic acids form a great variety of base pairs with at least two hydrogen bonds between them. They are classified in twelve main families, with the Watson–Crick family being one of them. In a given family, some of the base pairs are isosteric with one another, meaning that the positions and the distances between the C1′ carbon atoms are very similar. The isostericity of Watson–Crick pairs between the complementary bases forms the basis of RNA helices and of the resulting RNA secondary structure. Several defined suites of non-Watson–Crick base pairs assemble into RNA modules that form recurrent, rather regular, building blocks of the tertiary architecture of folded RNAs. RNA modules are intrinsic to RNA architecture and are therefore disconnected from a biological function specifically attached to an RNA sequence. RNA modules occur in all kingdoms of life and in structured RNAs with diverse functions. Because of chemical and geometrical constraints, isostericity between non-Watson–Crick pairs is restricted and this leads to higher sequence conservation in RNA modules with, consequently, greater difficulties in extracting 3D information from sequence analysis. Nucleic acid helices have to be recognised in several biological processes like replication or translational decoding. In polymerases and the ribosomal decoding site, the recognition occurs on the minor groove sides of the helical fragments. With the use of alternative conformations, protonated or tautomeric forms of the bases, some base pairs with Watson–Crick-like geometries can form and be stabilized. Several of these pairs with Watson–Crick-like geometries extend the concept of isostericity beyond the number of isosteric pairs formed between complementary bases. These observations therefore set limits and constraints to geometric selection in molecular recognition of complementary Watson–Crick pairs for fidelity in replication and translation processes.

12.
Cell-to-cell trafficking of RNA is an emerging biological principle that integrates systemic gene regulation, viral infection, antiviral response, and cell-to-cell communication. A key mechanistic question is how an RNA is specifically selected for trafficking from one type of cell into another type. Here, we report the identification of an RNA motif in Potato spindle tuber viroid (PSTVd) required for trafficking from palisade mesophyll to spongy mesophyll in Nicotiana benthamiana leaves. This motif, called loop 6, has the sequence 5'-CGA-3'...5'-GAC-3' flanked on both sides by cis Watson-Crick G/C and G/U wobble base pairs. We present a three-dimensional (3D) structural model of loop 6 that specifies all non-Watson-Crick base pair interactions, derived by isostericity-based sequence comparisons with 3D RNA motifs from the RNA x-ray crystal structure database. The model is supported by available chemical modification patterns, natural sequence conservation/variations in PSTVd isolates and related species, and functional characterization of all possible mutants for each of the loop 6 base pairs. Our findings and approaches have broad implications for studying the 3D RNA structural motifs mediating trafficking of diverse RNA species across specific cellular boundaries and for studying the structure-function relationships of RNA motifs in other biological processes.

13.
RNA is known to be involved in several cellular processes; however, it is only active when it is folded into its correct 3D conformation. The folding, bending and twisting of an RNA molecule is dependent upon the multitude of canonical and non-canonical secondary structure motifs. These motifs contribute to the structural complexity of RNA but also serve important integral biological functions, such as serving as recognition and binding sites for other biomolecules or small ligands. One of the most prevalent types of RNA secondary structure motifs is the single mismatch, which occurs when two canonical pairs are separated by a single non-canonical pair. To determine sequence–structure relationships and to identify structural patterns, we have systematically located, annotated and compared all available occurrences of the 30 most frequently occurring single mismatch-nearest neighbor sequence combinations found in experimentally determined 3D structures of RNA-containing molecules deposited into the Protein Data Bank. Hydrogen bonding, stacking and interaction of nucleotide edges for the mismatched and nearest neighbor base pairs are described and compared, allowing for the identification of several structural patterns. Such a database and comparison will allow researchers to gain insight into the structural features of unstudied sequences and to quickly look up studied sequences.

14.
L Odell, V Huang, M Jakacka, T Pan 《Nucleic Acids Research》1998,26(16):3717-3723
The ribozyme from bacterial ribonuclease P recognizes two structural modules in a tRNA substrate: the T stem-loop and the acceptor stem. These two modules are connected through a helical linker. The T stem-loop binds at a surface confined in a folding domain away from the active site. Substrates for the Bacillus subtilis RNase P RNA were previously selected in vitro that are shown to bind comparably well or better than a tRNA substrate. Chemical modification of P RNA-substrate complexes with dimethylsulfate and kethoxal was performed to determine how the P RNA recognizes three in vitro selected substrates. All three substrates bind at the surface known to interact with the T stem-loop of tRNA. Similar to a tRNA, the secondary structure of these substrates contains a helix around the cleavage site and a hairpin loop at the corresponding position of the T stem-loop. Unlike a tRNA, these two structural modules are connected through a non-helical linker. The two structural modules in the tRNA and in the selected substrates bind to two different domains in P RNA. The properties of substrate recognition exhibited by this ribozyme may be exploited to isolate new ribozyme-substrate pairs with interactive structural modules.

15.
Single-stranded RNA molecules can assume a wide range of tertiary structures beyond the canonical A-form double helix. Certain sequences, termed motifs, are more common than a random distribution would suggest. The existence of such motifs can be rationalized in structural terms. In this study, we have investigated the intrinsic structural stability of RNA terminal loop motifs using multiple MD simulations in explicit water. Representative loops were chosen from the major tetraloop motifs, including also the U-turn motif. Not all loops retain their folded starting structure, but lowering the temperature to 277 K, or adding adjacent base pairs from the stem to which the motif is attached, helps stabilize the folded loop structure.

16.
It is important to control CRISPR/Cas9 when sufficient editing is obtained. In the current study, rational engineering of guide RNAs (gRNAs) is performed to develop small-molecule-responsive CRISPR/Cas9. For our purpose, the sequences of gRNAs are modified to introduce ligand binding sites based on the rational design of ligand–RNA pairs. Using short target sequences, we demonstrate that the engineered RNA provides an excellent scaffold for binding small molecule ligands. Although the ‘stem–loop 1’ variants of gRNA induced variable cleavage activity for different target sequences, all ‘stem–loop 3’ variants are well tolerated for CRISPR/Cas9. We further demonstrate that this specific ligand–RNA interaction can be utilized for functional control of CRISPR/Cas9 in vitro and in human cells. Moreover, chemogenetic control of gene editing in human cells transfected with all-in-one plasmids encoding Cas9 and designer gRNAs is demonstrated. The strategy may become a general approach for generating switchable RNA or DNA for controlling other biological processes.

17.
Recent work has suggested that proteins in early evolution have gone through a stage of closed loop elements with a typical contour size of 25-35 residues. These closed loops are still the elementary protein units to this day, and can be used to spell out protein sequence/structure relationships through a relatively small number of protein prototypes. In this study we aimed to identify the sequences that are used to lock the loop ends to one another, and to show how an extensive dictionary of such locking pairs can be created using positional correlation data from a large proteome database, and structural data from PDB databases. Such a dictionary can be used in reconstructing the evolutionary pathway the modern proteins have gone through, and in identifying closed loop elements in modern proteins with yet unknown 3D structure.

18.
A mitochondrial aspartate tRNA (anticodon GUC) was isolated from a transplantable rat tumor, Morris hepatoma 5123D, and sequenced. The sequence, pGAGAUAUUm(1)AGUAAAAUAAUUACA psi AACCUUGUCAAGGUUAAGUUAUAGACUUAAAUCUAUAUAUCUUACCAOH, can be arranged in a cloverleaf structure. The RNA exhibits a number of unusual features, such as lack of the constant -G-G- and -T-psi-C- sequences in loops I and IV, respectively, small size of these loops, lack of the constant G.C base pair adjacent to loop IV, predominance of A.U base pairs in general, and presence of m1A in position 9. The RNA exhibits 82 and 70% homology with the DNA-derived putative sequences of human placenta and beef heart mitochondrial tRNA Asp, respectively, and bears little resemblance to other sequenced aspartate tRNAs of non-mitochondrial origin.

19.
We determined the melting temperatures (Tm) and thermodynamic parameters of 15 RNA and 19 DNA hairpins at 1 M NaCl, 0.01 M sodium phosphate, 0.1 mM EDTA, at pH 7. All these hairpins have loops of four bases, the most common loop size in 16S and 23S ribosomal RNAs. The RNA hairpins varied in loop sequence, loop-closing base pair (A.U, C.G, or G.C), base sequence of the stem, and stem size (four or five base pairs). The DNA hairpins varied in loop sequence, loop-closing base pair (C.G, or G.C), and base sequence of the four base-pair stem. Thermodynamic properties of a hairpin may be represented by nearest-neighbor interactions of the stem plus contributions from the loop. Thus, we obtained thermodynamic parameters for the formation of RNA and DNA tetraloops. For the tetraloops we studied, a free energy of loop formation (at 37 °C) of about +3 kcal/mol is most common for either RNA or DNA. There are extra stable loops with ΔG°37 near +1 kcal/mol, but the sequences are not necessarily the same for RNA and DNA. The closing base pair is also important; changing from C.G to G.C lowered the stability of several tetraloops in both RNA and DNA. These values will be useful in predicting RNA and DNA secondary structures.

20.
Abstract

Recent work has suggested that proteins in early evolution have gone through a stage of closed loop elements with a typical contour size of 25–35 residues. These closed loops are still the elementary protein units to this day, and can be used to spell out protein sequence/structure relationships through a relatively small number of protein prototypes. In this study we aimed to identify the sequences that are used to lock the loop ends to one another, and to show how an extensive dictionary of such locking pairs can be created using positional correlation data from a large proteome database, and structural data from PDB databases. Such a dictionary can be used in reconstructing the evolutionary pathway the modern proteins have gone through, and in identifying closed loop elements in modern proteins with yet unknown 3D structure.
