Similar literature
Found 20 similar articles (search time: 15 ms)
1.
The understanding of folding and function of RNA molecules depends on the identification and classification of interactions between ribonucleotide residues. We developed a new method named ClaRNA for computational classification of contacts in RNA 3D structures. Unique features of the program are the ability to identify imperfect contacts and to process coarse-grained models. Each doublet of spatially close ribonucleotide residues in a query structure is compared to clusters of reference doublets obtained by analysis of a large number of experimentally determined RNA structures, and assigned a score that describes its similarity to one or more known types of contacts, including pairing, stacking, base–phosphate and base–ribose interactions. The accuracy of ClaRNA is 0.997 for canonical base pairs, 0.983 for non-canonical pairs and 0.961 for stacking interactions. The generalized squared correlation coefficient (GC2) for ClaRNA is 0.969 for canonical base pairs, 0.638 for non-canonical pairs and 0.824 for stacking interactions. The classifier can be easily extended to include new types of spatial relationships between pairs or larger assemblies of nucleotide residues. ClaRNA is freely available via a web server that includes an extensive set of tools for processing and visualizing structural information about RNA molecules.

2.
RNA is now known to possess various structural, regulatory and enzymatic functions for survival of cellular organisms. Functional RNA structures are generally created by three-dimensional organization of small structural motifs, formed by base pairing between self-complementary sequences from different parts of the RNA chain. In addition to the canonical Watson–Crick or wobble base pairs, several non-canonical base pairs are found to be crucial to the structural organization of RNA molecules. They appear within different structural motifs and are found to stabilize the molecule through long-range intra-molecular interactions between basic structural motifs like double helices and loops. These base pairs also impart functional variation to the minor groove of A-form RNA helices, thus forming anchoring sites for metabolites and ligands. Non-canonical base pairs are formed by edge-to-edge hydrogen bonding interactions between the bases. A large number of theoretical studies have been done to detect and analyze these non-canonical base pairs within crystal or NMR-derived structures of different functional RNAs. Theoretical studies of these isolated base pairs using ab initio quantum chemical methods as well as molecular dynamics simulations of larger fragments have also established that many of these non-canonical base pairs are as stable as the canonical Watson–Crick base pairs. This review focuses on the various structural aspects of non-canonical base pairs in the organization of RNA molecules and the possible applications of these base pairs in predicting RNA structures with more accuracy.

3.
The success of comparative analysis in resolving RNA secondary structure and numerous tertiary interactions relies on the presence of base covariations. Although the majority of base covariations in aligned sequences are associated with Watson-Crick base pairs, many involve non-canonical or restricted base pair exchanges (e.g. only G:C/A:U), reflecting more specific structural constraints. We have developed a computer program that determines potential base pairing conformations for a given set of paired nucleotides in a sequence alignment. This program (ISOPAIR) assumes that the base pair conformation is maintained through sequence variation without significantly affecting the path of the sugar-phosphate backbone. ISOPAIR identifies such 'isomorphic' structures for any set of input base pair or base triple sequences. The program was applied to base pairs and triples with known structures and sequence exchanges. In several instances, isomorphic structures were correctly identified with ISOPAIR. Thus, ISOPAIR is useful when assessing non-canonical base pair conformations in comparative analysis. ISOPAIR applications are limited to those cases where unusual base pair exchanges indeed reflect a non-canonical conformation.

4.
Non-canonical base pairs, found mostly in RNA, often play a prominent role in maintaining its structural diversity. Higher order structures like base triples are also important in defining and stabilizing the tertiary folded structure of RNA. We have developed a new program, BPFIND, to analyze different types of canonical and non-canonical base pairs and base triples involving at least two direct hydrogen bonds formed between polar atoms of the bases or sugar O2' only. We considered 104 possible types of base pairs, out of which examples of 87 base pair types are found to occur in the available RNA crystal structures. Analysis indicates that approximately 32.7% of base pairs in the functional RNA structures are non-canonical, which include different types of GA and GU wobble base pairs apart from a wide range of base pair possibilities. We further noticed that more than 10.4% of these base pairs are involved in triplet formation, most of which play an important role in maintaining long-range tertiary contacts in the three-dimensional folded structure of RNA. Apart from detection, the program also gives a quantitative estimate of the conformational deformation of detected base pairs in comparison to an ideal planar base pair. This helps us to gain insight into the extent of their structural variations and thus assists in understanding their specific role in structural and functional diversity.

5.
A nuclear magnetic resonance (NMR) experiment is described for the direct detection of N-H···N hydrogen bonds (H-bonds) in 15N isotope-labeled biomolecules. This quantitative HNN-COSY (correlation spectroscopy) experiment detects and quantifies electron-mediated scalar couplings across the H-bond (H-bond scalar couplings), which connect magnetically active 15N nuclei of the H-bond donor and acceptor. Detectable H-bonds comprise the imino H-bonds in canonical Watson-Crick base pairs, many H-bonds in unusual nucleic acid base pairs and H-bonds between protein backbone or side-chain N-H donor and N acceptor moieties. Unlike other NMR observables, which provide only indirect evidence of the presence of H-bonds, the H-bond scalar couplings identify all partners of the H-bond (the donor, the donor proton and the acceptor) in a single experiment. The size of the scalar couplings can be related to H-bond geometries and, as a time average, to H-bond dynamics. The time required to detect the H-bonds is typically less than 1 d at millimolar concentrations for samples with a molecular weight of approximately 25 kDa or less. A 15N/13C-labeled potato spindle tuber viroid T1 RNA domain is used as an example to illustrate this procedure.

6.
The Biological Magnetic Resonance Data Bank contains NMR chemical shift depositions for 132 RNAs and RNA-containing complexes. We have analyzed the 1H NMR chemical shifts reported for non-exchangeable protons of residues that reside within A-form helical regions of these RNAs. The analysis focused on the central base pair within a stretch of three adjacent base pairs (BP triplets), and included both Watson–Crick (WC; G:C, A:U) and G:U wobble pairs. Chemical shift values were included for all 4³ possible WC-BP triplets, as well as 137 additional triplets that contain one or more G:U wobbles. Sequence-dependent chemical shift correlations were identified, including correlations involving terminating base pairs within the triplets and canonical and non-canonical structures adjacent to the BP triplets (i.e. bulges, loops, WC and non-WC BPs), despite the fact that the NMR data were obtained under different conditions of pH, buffer, ionic strength, and temperature. A computer program (RNAShifts) was developed that enables convenient comparison of RNA 1H NMR assignments with database predictions, which should facilitate future signal assignment/validation efforts and enable rapid identification of non-canonical RNA structures and RNA-ligand/protein interaction sites.

7.
RNA is known to be involved in several cellular processes; however, it is only active when it is folded into its correct 3D conformation. The folding, bending and twisting of an RNA molecule depend on a multitude of canonical and non-canonical secondary structure motifs. These motifs contribute to the structural complexity of RNA but also serve important integral biological functions, such as serving as recognition and binding sites for other biomolecules or small ligands. One of the most prevalent types of RNA secondary structure motif is the single mismatch, which occurs when two canonical pairs are separated by a single non-canonical pair. To determine sequence–structure relationships and to identify structural patterns, we have systematically located, annotated and compared all available occurrences of the 30 most frequently occurring single mismatch-nearest neighbor sequence combinations found in experimentally determined 3D structures of RNA-containing molecules deposited into the Protein Data Bank. Hydrogen bonding, stacking and interaction of nucleotide edges for the mismatched and nearest neighbor base pairs are described and compared, allowing for the identification of several structural patterns. Such a database and comparison will allow researchers to gain insight into the structural features of unstudied sequences and to quickly look up studied sequences.
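The definition of a single mismatch given in this abstract (one non-canonical pair flanked on both sides by canonical pairs) can be illustrated with a minimal Python sketch. It is not the authors' annotation procedure: the function name `find_single_mismatches`, the aligned two-strand input format, and the choice to count only Watson-Crick pairs as canonical are assumptions made for illustration.

```python
# Assumption of this sketch: "canonical" means Watson-Crick pairs only.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def find_single_mismatches(top, bottom):
    """Return 0-based positions of single mismatches in an aligned duplex.

    `top` is written 5'->3' and `bottom` is the partner strand written
    3'->5', so top[i] is paired with bottom[i] (hypothetical input format).
    """
    if len(top) != len(bottom):
        raise ValueError("strands must have equal length")
    is_canonical = [(a, b) in CANONICAL for a, b in zip(top, bottom)]
    hits = []
    # A single mismatch needs a canonical neighbor on each side,
    # so the first and last positions can never qualify.
    for i in range(1, len(top) - 1):
        if not is_canonical[i] and is_canonical[i - 1] and is_canonical[i + 1]:
            hits.append(i)
    return hits


single = find_single_mismatches("GGCAGCC", "CCGACGG")  # one A:A mismatch
tandem = find_single_mismatches("GCAAGC", "CGAACG")    # two adjacent A:A mismatches
```

In the first toy duplex only the A:A pair at position 3 is reported; in the second, the two adjacent mismatches flank each other, so neither counts as a single mismatch.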

8.
Non-canonical base pairs play important roles in organizing the complex three-dimensional folding of RNA. Here, we outline methodology developed both to analyze the spatial patterns of interacting base pairs in known RNA structures and to reconstruct models from the collective experimental information. We focus attention on the structural context and deformability of the seven pairing patterns found in greatest abundance in the helical segments in a set of well-resolved crystal structures, including (i–ii) the canonical A·U and G·C Watson–Crick base pairs, (iii) the G·U wobble pair, (iv) the sheared G·A pair, (v) the A·U Hoogsteen pair, (vi) the U·U wobble pair, and (vii) the G·A Watson–Crick-like pair. The non-canonical pairs stand out from the canonical associations in terms of apparent deformability, spanning a broader range of conformational states as measured by the six rigid-body parameters used to describe the spatial arrangements of the interacting bases, the root-mean-square deviations of the base-pair atoms, and the fluctuations in hydrogen-bonding geometry. The deformabilities, the modes of base-pair deformation, and the preferred sites of occurrence depend on sequence. We also characterize the positioning and overlap of the base pairs with respect to the base pairs that stack immediately above and below them in double-helical fragments. We incorporate the observed positions of the bases, base pairs, and intervening phosphorus atoms in models to predict the effects of the non-canonical interactions on overall helical structure.

9.
RNA tertiary structure is crucial to its many non-coding molecular functions. RNA architecture is shaped by its secondary structure, composed of stems (stacked canonical base pairs) enclosing loops. While stems are precisely captured by free-energy models, loops composed of non-canonical base pairs are not. Nor are distant interactions linking together those secondary structure elements (SSEs). Databases of conserved 3D geometries (a.k.a. modules) not captured by energetic models are leveraged for structure prediction and design, but the computational complexity has limited their study to local elements, i.e. loops. Representing the RNA structure as a graph has recently allowed this work to be extended to pairs of SSEs, uncovering a hierarchical organization of these 3D modules, at great computational cost. Systematically capturing recurrent patterns on a large scale is a main challenge in the study of RNA structures. In this paper, we present an efficient algorithm to compute maximal isomorphisms in edge-colored graphs. We extend this algorithm to a framework well suited to identifying RNA modules, and fast enough to considerably generalize previous approaches. To exhibit the versatility of our framework, we first reproduce results identifying all common modules spanning more than 2 SSEs, in a few hours instead of weeks. The efficiency of our new algorithm is demonstrated by computing the maximal modules between any pair of entire RNAs in the non-redundant corpus of known RNA 3D structures. We observe that the biggest modules our method uncovers compose large shared sub-structures spanning hundreds of nucleotides and base pairs between the ribosomes of Thermus thermophilus, Escherichia coli, and Pseudomonas aeruginosa.

10.
11.

Background

The prediction of secondary structure, i.e. the set of canonical base pairs between nucleotides, is a first step in developing an understanding of the function of an RNA sequence. The most accurate computational methods predict conserved structures for a set of homologous RNA sequences. These methods usually suffer from high computational complexity. In this paper, TurboFold, a novel and efficient method for secondary structure prediction for multiple RNA sequences, is presented.

Results

TurboFold takes, as input, a set of homologous RNA sequences and outputs estimates of the base pairing probabilities for each sequence. The base pairing probabilities for a sequence are estimated by combining intrinsic information, derived from the sequence itself via the nearest neighbor thermodynamic model, with extrinsic information, derived from the other sequences in the input set. For a given sequence, the extrinsic information is computed by using pairwise-sequence-alignment-based probabilities for co-incidence with each of the other sequences, along with estimated base pairing probabilities, from the previous iteration, for the other sequences. The extrinsic information is introduced as free energy modifications for base pairing in a partition function computation based on the nearest neighbor thermodynamic model. This process yields updated estimates of base pairing probability. The updated base pairing probabilities in turn are used to recompute extrinsic information, resulting in the overall iterative estimation procedure that defines TurboFold. TurboFold is benchmarked on a number of ncRNA datasets and compared against alternative secondary structure prediction methods. The iterative procedure in TurboFold is shown to improve estimates of base pairing probability with each iteration, though only small gains are obtained beyond three iterations. Secondary structures composed of base pairs with estimated probabilities higher than a significance threshold are shown to be more accurate for TurboFold than for alternative methods that estimate base pairing probabilities. TurboFold-MEA, which uses base pairing probabilities from TurboFold in a maximum expected accuracy algorithm for secondary structure prediction, has accuracy comparable to the best performing secondary structure prediction methods. 
The computational and memory requirements for TurboFold are modest and, in terms of sequence length and number of sequences, scale much more favorably than joint alignment and folding algorithms.

Conclusions

TurboFold is an iterative probabilistic method for predicting secondary structures for multiple RNA sequences that efficiently and accurately combines the information from the comparative analysis between sequences with the thermodynamic folding model. Unlike most other multi-sequence structure prediction methods, TurboFold does not enforce strict commonality of structures and is therefore useful for predicting structures for homologous sequences that have diverged significantly. TurboFold can be downloaded as part of the RNAstructure package at http://rna.urmc.rochester.edu.
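The Results paragraph above mentions secondary structures "composed of base pairs with estimated probabilities higher than a significance threshold". The Python sketch below shows only that final filtering step on a toy table of probabilities. It does not reproduce TurboFold or the RNAstructure package interface: the function name `pairs_above_threshold`, the dictionary input format, and the default threshold of 0.5 are assumptions chosen for illustration.

```python
def pairs_above_threshold(pair_probs, threshold=0.5):
    """Return the base pairs whose estimated probability exceeds `threshold`.

    `pair_probs` maps (i, j) nucleotide index pairs to estimated pairing
    probabilities (hypothetical input format). The result is sorted by index.
    """
    return sorted(pair for pair, p in pair_probs.items() if p > threshold)


# Toy probabilities, invented purely to exercise the function.
toy_probs = {(1, 20): 0.95, (2, 19): 0.90, (3, 18): 0.40, (5, 12): 0.05}
predicted = pairs_above_threshold(toy_probs)
```

When the pairing probabilities of each nucleotide sum to at most 1, any threshold of 0.5 or higher leaves each nucleotide with at most one partner, so the retained pairs cannot conflict with one another.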

12.
DNA microarray technology is well established and widely used, although it has several drawbacks. The accurate molecular recognition of the canonical nucleobases of probe and target is the basis for reliable results obtained from microarray hybridization experiments. However, the great flexibility of base pairs within the DNA molecule allows the formation of various secondary structures incorporating Watson-Crick base pairs as well as non-canonical base pair motifs, thus becoming a source of inaccuracy and inconsistency. The first part of this report provides an overview of unusual base pair motifs formed during molecular DNA interaction in solution, highlighting selected secondary structures employing non-Watson-Crick base pairs. The same mispairing phenomena observed in solution are expected to occur for immobilized probe molecules as well as for target oligonucleotides employed in microarray hybridization experiments; the effect of base pairing and oligonucleotide composition on hybridization is considered. The incorporation of nucleoside derivatives as close shape mimics of the four canonical nucleosides into the probe and target oligonucleotides is discussed as a chemical tool to resolve unwanted mispairing. The second part focuses on non-Watson-Crick base pairing during hybridization performed on microarrays. This is exemplified for the unusually stable dG.dA base pair.

13.
The structures of tandem non-canonical base pairs, a frequently recurring motif in RNA molecules, are reviewed and analysed. The tandem non-canonical base pair motifs can be roughly divided into three groups, containing seven subgroups based on their base pairing patterns and local geometries. Structural details and helical parameters that can be used to numerically distinguish between the subgroups are tabulated. Remarkably, while the individual helical twists of the tandem and adjacent base pair steps can be substantially smaller or larger than the typical A-form value of 32.7°, the average value is close to A-form. This and other striking regularities resulting from compensating geometrical adjustments, important for understanding and predicting the configurations of non-canonical base pair geometries, are discussed.

14.
Recent studies have shown that RNA structural motifs play essential roles in RNA folding and interaction with other molecules. Computational identification and analysis of RNA structural motifs remains a challenging task. Existing motif identification methods based on 3D structure may not properly compare motifs with high structural variations. Other structural motif identification methods consider only nested canonical base-pairing structures and cannot be used to identify complex RNA structural motifs that often consist of various non-canonical base pairs due to uncommon hydrogen bond interactions. In this article, we present a novel RNA structural alignment method for RNA structural motif identification, RNAMotifScan, which takes into consideration the isosteric (both canonical and non-canonical) base pairs and multi-pairings in RNA structural motifs. The utility and accuracy of RNAMotifScan is demonstrated by searching for kink-turn, C-loop, sarcin-ricin, reverse kink-turn and E-loop motifs against a 23S rRNA (PDBid: 1S72), which is well characterized for the occurrences of these motifs. Finally, we search these motifs against the RNA structures in the entire Protein Data Bank and the abundances of them are estimated. RNAMotifScan is freely available at our supplementary website (http://genome.ucf.edu/RNAMotifScan).

15.
The structures of tandem non-canonical base pairs, a frequently recurring motif in RNA molecules, are reviewed and analysed. The tandem non-canonical base pair motifs can be roughly divided into three groups, containing seven subgroups based on their base pairing patterns and local geometries. Structural details and helical parameters that can be used to numerically distinguish between the subgroups are tabulated. Remarkably, while the individual helical twists of the tandem and adjacent base pair steps can be substantially smaller or larger than the typical A-form value of 32.7 degrees, the average value is close to A-form. This and other striking regularities resulting from compensating geometrical adjustments, important for understanding and predicting the configurations of non-canonical base pair geometries, are discussed.

16.
A set of 43 337 splice junction pairs was extracted from mammalian GenBank annotated genes. Expressed sequence tag (EST) sequences support 22 489 of them. Of these, 98.71% contain canonical dinucleotides GT and AG for donor and acceptor sites, respectively; 0.56% hold non-canonical GC-AG splice site pairs; and the remaining 0.73% occur in many small groups (with a maximum size of 0.05%). Studying these groups we observe that many of them contain splicing dinucleotides shifted from the annotated splice junction by one position. After close examination of such cases we present a new classification consisting of only eight observed types of splice site pairs (out of 256 a priori possible combinations). EST alignments allow us to verify the exonic part of the splice sites, but many non-canonical cases may be due to intron sequencing errors. This idea is given substantial support when we compare the sequences of human genes having non-canonical splice sites deposited in GenBank by high throughput genome sequencing projects (HTG). A high proportion (156 out of 171) of the human non-canonical and EST-supported splice site sequences had a clear match in the human HTG. They can be classified after corrections as: 79 GC-AG pairs (of which one was an error that corrected to GC-AG), 61 errors that were corrected to GT-AG canonical pairs, six AT-AC pairs (of which two were errors that corrected to AT-AC), one case that was produced from a non-existent intron, seven cases that were found in HTG that were deposited to GenBank and finally only two remaining cases of supported non-canonical splice sites. If we assume that approximately the same situation is true for the whole set of annotated mammalian non-canonical splice sites, then 99.24% of splice site pairs should be GT-AG, 0.69% GC-AG, 0.05% AT-AC and finally only 0.02% could consist of other types of non-canonical splice sites.
We analyze several characteristics of EST-verified splice sites and build weight matrices for the major groups, which can be incorporated into gene prediction programs. We also present a set of EST-verified canonical splice sites larger by two orders of magnitude than the current one (22 199 entries versus approximately 600) and finally, a set of 290 EST-supported non-canonical splice sites. Both sets should be significant for future investigations of the splicing mechanism.
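The bookkeeping behind the percentages in this abstract (grouping introns by their donor and acceptor dinucleotides) can be sketched in Python. This is an illustrative reconstruction, not the authors' pipeline: the names `classify_intron` and `splice_site_frequencies` are hypothetical, and the input is assumed to be a list of intron sequences given 5'->3' on the coding strand. The sketch also enumerates the 16 x 16 = 256 a priori possible dinucleotide combinations mentioned in the text.

```python
from collections import Counter
from itertools import product


def classify_intron(intron):
    """Return the donor-acceptor dinucleotide pair of an intron, e.g. 'GT-AG'.

    The donor site is taken as the first two bases and the acceptor site as
    the last two bases of the intron (assumed 5'->3', coding strand).
    """
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to hold both splice sites")
    return f"{seq[:2]}-{seq[-2:]}"


def splice_site_frequencies(introns):
    """Return the fraction of introns falling into each donor-acceptor class."""
    counts = Counter(classify_intron(s) for s in introns)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


# All a priori possible combinations: 16 donor x 16 acceptor dinucleotides.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]
ALL_PAIRS = [f"{d}-{a}" for d in DINUCLEOTIDES for a in DINUCLEOTIDES]

# Toy introns, invented purely to exercise the functions.
toy_introns = ["GTAAGTCTTTCAG", "gtgagtccacag", "GCAAGTTTTCAG", "GTATGTTTGCAG"]
toy_freqs = splice_site_frequencies(toy_introns)
```

On the four toy introns, three fall into the GT-AG class and one into GC-AG.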

17.
A computer algorithm has been developed which identifies tRNA genes and tRNA-like structures in DNA sequences. The program searches the sequence string for specific base positions that correspond to the invariant and semi-invariant bases found in tRNAs. The tRNA nature of the sequence is confirmed by the presence of complementary base pairing at the tRNA's calculated 5' and 3' ends (which in situ constitutes the amino-acyl stem region). The program achieves greater than 96% accuracy when run against known tRNA sequences in the GenBank database. The program is modular and is readily modified to allow searching either a file or database. The program is written in "C" and operates on a D.E.C. Vax 750. The utility of the algorithm is demonstrated by the identification of a distinctive tRNA structure in an intron of a published bovine hemoglobin gene.
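The confirmation step described in this abstract, checking for complementary base pairing between the calculated 5' and 3' ends, can be illustrated with a short Python sketch. The original program was written in C and its exact scoring is not given here, so everything below is an assumption: the function name `count_stem_pairs`, the strict Watson-Crick complement table, the default stem length of 7 bp, and the requirement that the candidate sequence is already trimmed so that the stem occupies its first and last bases.

```python
# DNA alphabet, since the search runs on DNA sequences.
# Assumption: only strict Watson-Crick complementarity is counted.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def count_stem_pairs(candidate, stem_len=7):
    """Count complementary pairs between the two ends of `candidate`.

    Base k from the 5' end is compared with base k from the 3' end, which
    is how the two strands of an acceptor-type stem face each other.
    `candidate` is assumed to start and end exactly at the stem boundaries.
    """
    seq = candidate.upper()
    if len(seq) < 2 * stem_len:
        raise ValueError("candidate shorter than two stem lengths")
    five_prime = seq[:stem_len]
    three_prime = seq[-stem_len:][::-1]  # reversed so index k lines up with k
    return sum(COMPLEMENT.get(a) == b for a, b in zip(five_prime, three_prime))


# Toy candidates, invented purely to exercise the function.
perfect = count_stem_pairs("GCGGATT" + "ACGT" + "AATCCGC")
one_off = count_stem_pairs("GCGGATT" + "ACGT" + "AATCCGA")
```

A caller could accept a candidate only if the count reaches some minimum, for example 6 of 7 pairs; that cutoff is likewise an illustrative choice, not a value taken from the abstract.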

18.
Selenocysteine (Sec) is the 21st amino acid in translation. Sec tRNA (tRNASec) has an anticodon complementary to the UGA codon. We solved the crystal structure of human tRNASec. tRNASec has a 9-bp acceptor stem and a 4-bp T stem, in contrast with the 7-bp acceptor stem and the 5-bp T stem in the canonical tRNAs. The acceptor stem is kinked between the U6:U67 and G7:C66 base pairs, leading to a bent acceptor-T stem helix. tRNASec has a 6-bp D stem and a 4-nt D loop. The long D stem includes unique A14:U21 and G15:C20a pairs. The D-loop:T-loop interactions include the base pairs G18:U55 and U16:U59, and a unique base triple, U20:G19:C56. The extra arm consists of a 6-bp stem and a 4-nt loop. Remarkably, the D stem and the extra arm do not form tertiary interactions in tRNASec. Instead, tRNASec has an open cavity, in place of the tertiary core of a canonical tRNA. The linker residues, A8 and U9, connecting the acceptor and D stems, are not involved in tertiary base pairing. Instead, U9 is stacked on the first base pair of the extra arm. These features might allow tRNASec to be the target of the Sec synthesis/incorporation machineries.

19.
Sequence variation in a widespread, recurrent, structured RNA 3D motif, the Sarcin/Ricin (S/R), was studied to address three related questions: First, how do the stabilities of structured RNA 3D motifs, composed of non-Watson–Crick (non-WC) basepairs, compare to WC-paired helices of similar length and sequence? Second, what are the effects on the stabilities of such motifs of isosteric and non-isosteric base substitutions in the non-WC pairs? And third, is there selection for particular base combinations in non-WC basepairs, depending on the temperature regime to which an organism adapts? A survey of large and small subunit rRNAs from organisms adapted to different temperatures revealed the presence of systematic sequence variations at many non-WC paired sites of S/R motifs. UV melting analysis and enzymatic digestion assays of oligonucleotides containing the motif suggest that more stable motifs tend to be more rigid. We further found that the base substitutions at non-Watson–Crick pairing sites can significantly affect the thermodynamic stabilities of S/R motifs and these effects are highly context specific indicating the importance of base-stacking and base-phosphate interactions on motif stability. This study highlights the significance of non-canonical base pairs and their contributions to modulating the stability and flexibility of RNA molecules.

20.
It is shown that the recently developed quantitative J(NN) HNN-COSY experiment can be used for the direct identification of hydrogen bonds in non-canonical base pairs in RNA. Scalar (2h)J(NN) couplings across N-H···N hydrogen bonds are observed in imino hydrogen bonded GA base pairs of the hpGA RNA molecule, which contains a tandem GA mismatch, and in the reverse Hoogsteen AU base pairs of the E-loop of Escherichia coli 5S rRNA. These scalar couplings correlate the imino donor 15N nucleus of guanine or uridine with the acceptor N1 or N7 nucleus of adenine. The values of the corresponding (2h)J(NN) coupling constants are similar in size to those observed in Watson-Crick base pairs. The reverse Hoogsteen base pairs could be directly detected for the E-loop of E. coli 5S rRNA both in the free form and in a complex with the ribosomal protein L25. This supports the notion that the E-loop is a pre-folded RNA recognition site that is not subject to significant induced conformational changes. Since Watson-Crick GC and AU base pairs are also readily detected, the HNN-COSY experiment provides a useful and sensitive tool for the rapid identification of RNA secondary structure elements.
