Similar literature
20 similar articles found (search time: 15 ms)
1.
RNAMotif, an RNA secondary structure definition and search algorithm   Total citations: 26 (self-citations: 7, citations by others: 19)
RNA molecules fold into characteristic secondary and tertiary structures that account for their diverse functional activities. Many of these RNA structures are assembled from a collection of RNA structural motifs. These basic building blocks are used repeatedly, and in various combinations, to form different RNA types and define their unique structural and functional properties. Identification of recurring RNA structural motifs will therefore enhance our understanding of RNA structure and help associate elements of RNA structure with functional and regulatory elements. Our goal was to develop a computer program that can describe an RNA structural element of any complexity and then search any nucleotide sequence database, including the complete prokaryotic and eukaryotic genomes, for these structural elements. Here we describe in detail a new computational motif search algorithm, RNAMotif, and demonstrate its utility with some motif search examples. RNAMotif differs from other motif search tools in two important aspects: first, the structure definition language is more flexible and can specify any type of base–base interaction; second, RNAMotif provides a user-controlled scoring section that can be used to add capabilities that patterns alone cannot provide.
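The two-stage idea in this abstract (match a structural pattern, then apply a user-defined score) can be illustrated with a minimal Python sketch. This is not RNAMotif's descriptor language or its API: the function names, the fixed stem length, the loop-size range, and the G-C counting rule are all assumptions chosen for illustration.

```python
# Hypothetical illustration only; not RNAMotif's descriptor language or API.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_hairpins(seq, stem_len=4, loop_min=3, loop_max=8):
    """Yield (start, loop_len) for every stem/loop/stem match in seq."""
    n = len(seq)
    for start in range(n):
        for loop_len in range(loop_min, loop_max + 1):
            end = start + 2 * stem_len + loop_len  # one past the 3' end
            if end > n:
                break
            # Base start+k on the 5' strand must pair with end-1-k on the 3' strand.
            if all((seq[start + k], seq[end - 1 - k]) in PAIRS
                   for k in range(stem_len)):
                yield start, loop_len


def gc_pair_score(seq, start, loop_len, stem_len=4):
    """Example scoring rule (an assumption): count G-C / C-G pairs in the stem."""
    end = start + 2 * stem_len + loop_len
    return sum(
        1 for k in range(stem_len)
        if {seq[start + k], seq[end - 1 - k]} == {"G", "C"}
    )


def search(seq, min_score=3, stem_len=4):
    """Pattern match first, then keep only hits that pass the scoring rule."""
    hits = []
    for start, loop_len in find_hairpins(seq, stem_len=stem_len):
        score = gc_pair_score(seq, start, loop_len, stem_len)
        if score >= min_score:
            hits.append((start, loop_len, score))
    return hits
```

Keeping the score separate from the pattern means a rule the pattern cannot express, such as a minimum number of G-C pairs, can be changed without touching the matcher.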

2.
3.
New methods are described for finding recurrent three-dimensional (3D) motifs in RNA atomic-resolution structures. Recurrent RNA 3D motifs are sets of RNA nucleotides with similar spatial arrangements. They can be local or composite. Local motifs comprise nucleotides that occur in the same hairpin or internal loop. Composite motifs comprise nucleotides belonging to three or more different RNA strand segments or molecules. We use a base-centered approach to construct efficient, yet exhaustive search procedures using geometric, symbolic, or mixed representations of RNA structure that we implement in a suite of MATLAB programs, “Find RNA 3D” (FR3D). The first modules of FR3D preprocess structure files to classify base-pair and -stacking interactions. Each base is represented geometrically by the position of its glycosidic nitrogen in 3D space and by the rotation matrix that describes its orientation with respect to a common frame. Base-pairing and base-stacking interactions are calculated from the base geometries and are represented symbolically according to the Leontis/Westhof basepairing classification, extended to include base-stacking. These data are stored and used to organize motif searches. For geometric searches, the user supplies the 3D structure of a query motif which FR3D uses to find and score geometrically similar candidate motifs, without regard to the sequential position of their nucleotides in the RNA chain or the identity of their bases. To score and rank candidate motifs, FR3D calculates a geometric discrepancy by rigidly rotating candidates to align optimally with the query motif and then comparing the relative orientations of the corresponding bases in the query and candidate motifs. Given the growing size of the RNA structure database, it is impractical to explicitly compute the discrepancy for all conceivable candidate motifs, even for motifs with fewer than ten nucleotides. The screening algorithm that we describe finds all candidate motifs whose geometric discrepancy with respect to the query motif falls below a user-specified cutoff discrepancy. This technique can be applied to RMSD searches. Candidate motifs identified geometrically may be further screened symbolically to identify those that contain particular basepair types or base-stacking arrangements or that conform to sequence continuity or nucleotide identity constraints. Purely symbolic searches for motifs containing user-defined sequence, continuity and interaction constraints have also been implemented. We demonstrate that FR3D finds all occurrences, both local and composite and with nucleotide substitutions, of sarcin/ricin and kink-turn motifs in the 23S and 5S ribosomal RNA 3D structures of the H. marismortui 50S ribosomal subunit and assigns the lowest discrepancy scores to bona fide examples of these motifs. The search algorithms have been optimized for speed to allow users to search the non-redundant RNA 3D structure database on a personal computer in a matter of minutes.
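The abstract does not spell out how FR3D's screening step bounds the discrepancy, so the following Python sketch shows only a simplified stand-in: a distance-based pre-screen. It relies on the fact that rigid motion preserves the distances between base centres, so a candidate whose inter-base distances differ from the query's by more than a tolerance can be rejected without any superposition. All names and the tolerance are assumptions, and the brute-force enumeration is for illustration, not speed.

```python
import math
from itertools import permutations


def pairwise_distances(points):
    """Return {(i, j): distance} for all index pairs i < j."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    }


def passes_screen(query, candidate, tol):
    """True if every inter-base distance in candidate is within tol of the query's."""
    dq = pairwise_distances(query)
    dc = pairwise_distances(candidate)
    return all(abs(dq[k] - dc[k]) <= tol for k in dq)


def screen(query, base_centres, tol=1.0):
    """Return index tuples of ordered base subsets that survive the distance screen.

    Brute force over all ordered subsets; a real search would prune incrementally.
    """
    m = len(query)
    hits = []
    for idx in permutations(range(len(base_centres)), m):
        candidate = [base_centres[i] for i in idx]
        if passes_screen(query, candidate, tol):
            hits.append(idx)
    return hits
```

A distance screen of this kind is necessary but not sufficient: mirror images pass it, and base orientations are ignored, so survivors would still need a full superposition-based score.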

4.
The distribution of RNA motifs in natural sequences.   Total citations: 5 (self-citations: 3, citations by others: 2)
Functional analysis of genome sequences has largely ignored RNA genes and their structures. We introduce here the notion of 'ribonomics' to describe the search for the distribution of and eventually the determination of the physiological roles of these RNA structures found in the sequence databases. The utility of this approach is illustrated here by the identification in the GenBank database of RNA motifs having known binding or chemical activity. The frequency of these motifs indicates that most have originated from evolutionary drift and are selectively neutral. On the other hand, their distribution among species and their location within genes suggest that the destiny of these motifs may be more elaborate. For example, the hammerhead motif has a skewed organismal presence, is phylogenetically stable and recent work on a schistosome version confirms its in vivo biological activity. The under-representation of the valine-binding motif and the Rev-binding element in GenBank hints at a detrimental effect on cell growth or viability. Data on the presence and the location of these motifs may provide critical guidance in the design of experiments directed towards the understanding and the manipulation of RNA complexes and activities in vivo.

5.
Although artificial RNA motifs that can functionally replace the GNRA/receptor interaction, a class of RNA–RNA interacting motifs, have been isolated from RNA libraries and used to generate designer RNA structures, receptors for non-GNRA tetraloops have not been found in nature or selected from RNA libraries. In this study, we report the successful isolation of a receptor motif interacting with GAAC, a non-GNRA tetraloop, from randomized sequences embedded in a catalytic RNA. Biochemical characterization of the GAAC/receptor interacting motif within three structural contexts showed its binding affinity, selectivity and structural autonomy. The motif has binding affinity comparable with that of a GNRA/receptor, selectivity orthogonal to GNRA/receptors and structural autonomy even in a large RNA context. These features would be advantageous for use of the motif as a building block for designer RNAs. The isolated motif can also be used as a query sequence to search for unidentified naturally occurring GANC receptor motifs.

6.
7.
RNA molecules, which are found in all living cells, fold into characteristic structures that account for their diverse functional activities. Many of these RNA structures consist of a collection of fundamental RNA motifs. The various combinations of RNA basic components form different RNA classes and define their unique structural and functional properties. The availability of many genome sequences makes it possible to search computationally for functional RNAs. Biological experiments indicate that functional RNAs have characteristic RNA structural motifs represented by specific combinations of base pairings and conserved nucleotides in the loop regions. Searching for these well-ordered RNA structures and their homologues in genomic sequences is very helpful for understanding RNA-based gene regulation. In this paper, we consider the following problem: given an RNA sequence with a known secondary structure, efficiently determine candidate segments in genomic sequences that can potentially form RNA secondary structures similar to the given RNA secondary structure. Our new bottom-up approach first searches for all potential stem-loops similar to those of the given RNA secondary structure and then, based on the located stem-loops, detects potential homologous structural RNAs in genomic sequences.

8.
To address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D, a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool, designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally described in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding.

9.
RNA structural motifs are recurrent three-dimensional (3D) components found in the RNA architecture. These RNA structural motifs play important structural or functional roles and usually exhibit highly conserved 3D geometries and base-interaction patterns. Analysis of the RNA 3D structures and elucidation of their molecular functions heavily rely on efficient and accurate identification of these motifs. However, efficient RNA structural motif search tools are lacking due to the high complexity of these motifs. In this work, we present RNAMotifScanX, a motif search tool based on a base-interaction graph alignment algorithm. This novel algorithm enables automatic identification of both partially and fully matched motif instances. RNAMotifScanX considers noncanonical base-pairing interactions, base-stacking interactions, and sequence conservation of the motifs, which leads to significantly improved sensitivity and specificity as compared with other state-of-the-art search tools. RNAMotifScanX also adopts a carefully designed branch-and-bound technique, which enables ultra-fast search of large kink-turn motifs against a 23S rRNA. The software package RNAMotifScanX is implemented using GNU C++, and is freely available from http://genome.ucf.edu/RNAMotifScanX.

10.
11.
Traditional sequence-based search methods such as BLAST and FASTA can be used to identify sequence similarities. Recently, there has been growing interest in performing RNA shape similarity searches inside selected genes to locate RNA structure motifs that are known to possess functionally important roles. For example, in the newly discovered RNA genetic control elements called "riboswitches", the box domain is known to be highly conserved among various bacterial species in both its nucleotide composition and shape. However, in non-bacterial species, shape conservation is likely to become more important than sequence conservation when searching for riboswitch patterns. For this purpose, we present an approach tailored for detecting RNA shape similarities. We extend the Structure to String (STR2) method, which was initially proposed to locate shape similarities in proteins, to identify predicted secondary structures of RNAs. The STR2 for RNAs is a translation of a secondary structure to a string of characters, after which known sequence-based search algorithms with an efficient implementation are used. We validate that the STR2 succeeds in locating G-box riboswitches in prokaryotes, as expected. Subsequently, we show running examples of attempts to detect G-box riboswitch candidates in eukaryotes.
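The structure-to-string idea in this abstract can be sketched in a few lines of Python: encode a secondary structure as ordinary characters, then reuse plain string search. The abstract does not give the method's real alphabet, so the three-letter encoding below (L, R, U), the dot-bracket input format, and the optional run collapsing are all assumptions made for illustration.

```python
from itertools import groupby

# Assumed alphabet; the real encoding used by the method is not given in the abstract.
SYMBOLS = {"(": "L", ")": "R", ".": "U"}


def structure_to_string(dot_bracket, collapse_runs=False):
    """Translate dot-bracket notation into a plain character string."""
    encoded = "".join(SYMBOLS[c] for c in dot_bracket)
    if collapse_runs:
        # Keep one symbol per run so helices of different lengths look alike.
        encoded = "".join(symbol for symbol, _ in groupby(encoded))
    return encoded


def find_shape(query_db, target_db, collapse_runs=False):
    """Return all start offsets (in the encoded target) of the encoded query."""
    q = structure_to_string(query_db, collapse_runs)
    t = structure_to_string(target_db, collapse_runs)
    hits, pos = [], t.find(q)
    while pos != -1:
        hits.append(pos)
        pos = t.find(q, pos + 1)
    return hits
```

Once the structure is a string, any sequence-search tool can stand in for `str.find`; exact substring matching is used here only because it needs no external dependency.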

12.
A detailed knowledge of the mapping between sequence and structure spaces in populations of RNA molecules is essential to better understand their present-day functional properties, to envisage a plausible early evolution of RNA in a prebiotic chemical environment and to improve the design of in vitro evolution experiments, among others. Analysis of natural RNAs, as well as in vitro and computational studies, shows that certain RNA structural motifs are much more abundant than others, pointing to a complex relation between sequence and structure. Within this framework, we have investigated computationally the structural properties of a large pool (10^8 molecules) of single-stranded, 35 nt-long, random RNA sequences. The secondary structures obtained are ranked and classified into structure families. The number of structures in the main families is analytically calculated and compared with the numerical results. This permits a quantification of the fraction of structure space covered by a large pool of sequences. We further show that the number of structural motifs and their frequency are highly unbalanced with respect to the nucleotide composition: simple structures such as stem-loops and hairpins arise from sequences depleted in G, while more complex structures require an enrichment of G. In general, we observe a strong correlation between subfamilies (characterized by a fixed number of paired nucleotides) and nucleotide composition. Our results are compared to the structural repertoire obtained in a second pool where isolated base pairs are prohibited.

13.
Detection of common motifs in RNA secondary structures.   Total citations: 2 (self-citations: 2, citations by others: 0)
We describe a novel computerized system for comparison of RNA secondary structures and demonstrate its use for experimental studies. The system is able to screen a very large number of structures, to cluster similar structures and to detect specific structural motifs. In particular, the system is useful for detecting mutations with specific structural effects among all possible point mutations, and for predicting compensatory mutations that will restore the wild-type structure. The algorithms are independent of the folding rules that are used to generate the secondary structures.
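Screening all possible point mutations for structural effects can be sketched as follows. This is not the authors' system: the function names are hypothetical, and the folding step is passed in as an arbitrary callable, which mirrors the abstract's point that the comparison is independent of the folding rules.

```python
def point_mutants(seq, alphabet="ACGU"):
    """Yield (position, new_base, mutant) for every single-base substitution."""
    for i, old in enumerate(seq):
        for new in alphabet:
            if new != old:
                yield i, new, seq[:i] + new + seq[i + 1:]


def structure_changing_mutations(seq, fold):
    """Return (position, new_base) for mutants whose structure differs from wild type.

    `fold` is any callable mapping a sequence to a structure string; which
    folding program sits behind it is left open on purpose.
    """
    wild_type = fold(seq)
    return [
        (i, new)
        for i, new, mutant in point_mutants(seq)
        if fold(mutant) != wild_type
    ]
```

A sequence of length n has 3n single-base substitutions, so the screen costs 3n + 1 folding calls; finding compensatory mutations would repeat the same loop on each structure-changing mutant and keep the double mutants that fold back to the wild-type structure.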

14.
Understanding the structural repertoire of RNA is crucial for RNA genomics research. Yet current methods for finding novel RNAs are limited to small or known RNA families. To expand known RNA structural motifs, we develop a two-dimensional graphical representation approach for describing and estimating the size of RNA’s secondary structural repertoire, including naturally occurring and other possible RNA motifs. We employ tree graphs to describe RNA tree motifs and more general (dual) graphs to describe both RNA tree and pseudoknot motifs. Our estimates of RNA’s structural space are vastly smaller than the nucleotide sequence space, suggesting a new avenue for finding novel RNAs. Specifically, our survey shows that known RNA trees and pseudoknots represent only a small subset of all possible motifs, implying that some of the ‘missing’ motifs may represent novel RNAs. To help pinpoint RNA-like motifs, we show that the motifs of existing functional RNAs are clustered in a narrow range of topological characteristics. We also illustrate the applications of our approach to the design of novel RNAs and automated comparison of RNA structures; we report several occurrences of RNA motifs within larger RNAs. Thus, our graph theory approach to RNA structures has implications for RNA genomics, structure analysis and design.

15.
RNA is now known to possess various structural, regulatory and enzymatic functions for survival of cellular organisms. Functional RNA structures are generally created by three-dimensional organization of small structural motifs, formed by base pairing between self-complementary sequences from different parts of the RNA chain. In addition to the canonical Watson–Crick or wobble base pairs, several non-canonical base pairs are found to be crucial to the structural organization of RNA molecules. They appear within different structural motifs and are found to stabilize the molecule through long-range intra-molecular interactions between basic structural motifs like double helices and loops. These base pairs also impart functional variation to the minor groove of A-form RNA helices, thus forming anchoring sites for metabolites and ligands. Non-canonical base pairs are formed by edge-to-edge hydrogen bonding interactions between the bases. A large number of theoretical studies have been done to detect and analyze these non-canonical base pairs within crystal or NMR-derived structures of different functional RNAs. Theoretical studies of these isolated base pairs using ab initio quantum chemical methods as well as molecular dynamics simulations of larger fragments have also established that many of these non-canonical base pairs are as stable as the canonical Watson–Crick base pairs. This review focuses on the various structural aspects of non-canonical base pairs in the organization of RNA molecules and the possible applications of these base pairs in predicting RNA structures with more accuracy.

16.
Given the wealth of new RNA structures and the growing list of RNA functions in biology, it is of great interest to understand the repertoire of RNA folding motifs. The ability to identify new and known motifs within novel RNA structures, to compare tertiary structures with one another and to quantify the characteristics of a given RNA motif are major goals in the field of RNA research; however, there are few systematic ways to address these issues. Using a novel approach for visualizing and mathematically describing macromolecular structures, we have developed a means to quantitatively describe RNA molecules in order to rapidly analyze, compare and explore their features. This approach builds on the alternative eta, theta convention for describing RNA torsion angles and is executed using a new program called PRIMOS. Applying this methodology, we have successfully identified major regions of conformational change in the 50S and 30S ribosomal subunits, we have developed a means to search the database of RNA structures for the prevalence of known motifs and we have classified and identified new motifs. These applications illustrate the powerful capabilities of our new RNA structural convention, and they suggest future adaptations with important implications for bioinformatics and structural genomics.
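As a worked illustration of the eta, theta convention, the Python sketch below computes the two pseudotorsions for nucleotide i as dihedral angles over backbone atoms: eta over C4'(i-1), P(i), C4'(i), P(i+1), and theta over P(i), C4'(i), P(i+1), C4'(i+1). It is not PRIMOS code; the function names and the choice to take coordinates as plain (x, y, z) tuples are assumptions, and reading the atoms from a structure file is left out.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 axis (cis = 0, trans = 180)."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)  # normals of the two planes
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def eta_theta(c4_prev, p_i, c4_i, p_next, c4_next):
    """Pseudotorsions of nucleotide i from five backbone atom positions."""
    eta = dihedral(c4_prev, p_i, c4_i, p_next)
    theta = dihedral(p_i, c4_i, p_next, c4_next)
    return eta, theta
```

Angles come back in the range -180 to 180 degrees; adding 360 to negative values gives the 0 to 360 range often used when plotting eta against theta. Reducing each nucleotide to two numbers is what makes motif comparison a matter of comparing short numeric strings.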

17.
18.
Structural 3D motifs in RNA play an important role in RNA stability and function. Previous studies have focused on the characterization and discovery of 3D motifs in RNA secondary and tertiary structures. However, statistical analyses of the distribution of 3D motifs along the RNA appear to be lacking. Herein, we present a novel strategy for evaluating the distribution of 3D motifs along the RNA chain and for identifying those motifs whose distributions are significantly non-random. By applying it to the X-ray structure of the large ribosomal subunit from Haloarcula marismortui, helical motifs were found to cluster together along the chain and in the 3D structure, whereas the known tetraloops tend to be sequentially and spatially dispersed. That the distribution of key structural motifs such as tetraloops differs significantly from a random one suggests that our method could also be used to detect novel 3D motifs of any size in sufficiently long/large RNA structures. The motif distribution type can help in the prediction and design of 3D structures of large RNA molecules.

19.
A combination of algorithms to search RNA sequences for the potential for secondary structure formation, and to search large numbers of sequences for structural similarity, was used to search the 5'UTRs of annotated genes in the Escherichia coli genome for regulatory RNA structures. Using this approach, similar RNA structures that regulate genes in the thiamin metabolic pathway were identified. In addition, several putative regulatory structures were discovered upstream of genes involved in other metabolic pathways including glycerol metabolism and ethanol fermentation. The results demonstrate that this computational approach is a powerful tool for discovery of important RNA structures within prokaryotic organisms.

20.
RNA binding proteins recognize RNA targets in a sequence specific manner. Apart from the sequence, the secondary structure context of the binding site also affects the binding affinity. Binding sites are often located in single-stranded RNA regions and it was shown that the sequestration of a binding motif in a double-strand abolishes protein binding. Thus, it is desirable to include knowledge about RNA secondary structures when searching for the binding motif of a protein. We present the approach MEMERIS for searching sequence motifs in a set of RNA sequences and simultaneously integrating information about secondary structures. To abstract from specific structural elements, we precompute position-specific values measuring the single-strandedness of all substrings of an RNA sequence. These values are used as prior knowledge about the motif starts to guide the motif search. Extensive tests with artificial and biological data demonstrate that MEMERIS is able to identify motifs in single-stranded regions even if a stronger motif located in double-strand parts exists. The discovered motif occurrences in biological datasets mostly coincide with known protein-binding sites. This algorithm can be used for finding the binding motif of single-stranded RNA-binding proteins in SELEX or other biological sequence data.
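Turning single-strandedness values into a prior over motif start positions can be sketched as below. The abstract does not say which measure is used or how it enters the model, so this Python sketch assumes one simple choice: the mean unpaired probability over each window of motif width, plus a pseudocount, normalized to sum to 1. The per-base unpaired probabilities are assumed to come from some folding program and are just a list of floats here; all names are hypothetical.

```python
def start_position_prior(unpaired_prob, width, pseudocount=0.01):
    """Return a prior over motif start positions from per-base unpaired probabilities.

    Assumed measure: mean unpaired probability of each length-`width` window.
    """
    n_starts = len(unpaired_prob) - width + 1
    if n_starts <= 0:
        raise ValueError("motif width exceeds sequence length")

    window = sum(unpaired_prob[:width])  # running sum over the current window
    scores = []
    for start in range(n_starts):
        scores.append(window / width + pseudocount)
        if start + width < len(unpaired_prob):
            # Slide the window one base to the right.
            window += unpaired_prob[start + width] - unpaired_prob[start]

    total = sum(scores)
    return [s / total for s in scores]
```

The pseudocount keeps fully paired windows from getting a prior of exactly zero, so a strong sequence signal in a double-stranded region is down-weighted instead of being ruled out.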
