Similar literature
 20 similar articles found; search time 46 ms
1.
Abstract

In this paper, we propose a 6-D representation of RNA secondary structures. The use of the 6-D representation is illustrated by constructing structure invariants. To illustrate the use of these invariants, which are based on the entries of derived sequence matrices restricted to a band of selected width along the main diagonal, we compare them with the similarity/dissimilarity results based on the 6-D representation for a set of RNA secondary structures.

2.
B. Liao, T. Wang, K. Ding. Molecular Simulation, 2013, 39(14-15): 1063-1071
In this paper, we propose a seven-dimensional (7D) representation of ribonucleic acid (RNA) secondary structures. The use of the 7D representation is illustrated by constructing structure invariants. To illustrate the use of these invariants, which are based on the entries of derived sequence matrices restricted to a band of selected width along the main diagonal, we compare them with the similarity/dissimilarity results based on the 7D representation for a set of RNA 3 secondary structures at the 3′-terminus of different viruses.

3.
In this paper, we propose a 3-D graphical representation of RNA secondary structures. Based on this representation, we outline an approach that constructs a 3-component vector whose components are the normalized leading eigenvalues of the L/L matrices associated with an RNA secondary structure. An examination of the similarities/dissimilarities among the secondary structures at the 3′-terminus of different viruses illustrates the utility of the approach.

4.
Abstract

In this paper, we propose a 3-D graphical representation of RNA secondary structures. Based on this representation, we outline an approach that constructs a 3-component vector whose components are the normalized leading eigenvalues of the L/L matrices associated with an RNA secondary structure. An examination of the similarities/dissimilarities among the secondary structures at the 3′-terminus of different viruses illustrates the utility of the approach.

5.
The language of RNA: a formal grammar that includes pseudoknots   Total citations: 9 (self-citations: 0; citations by others: 9)
MOTIVATION: In a previous paper, we presented a polynomial time dynamic programming algorithm for predicting optimal RNA secondary structure including pseudoknots. However, a formal grammatical representation for RNA secondary structure with pseudoknots was still lacking. RESULTS: Here we show a one-to-one correspondence between that algorithm and a formal transformational grammar. This grammar class encompasses the context-free grammars and goes beyond to generate pseudoknotted structures. The pseudoknot grammar avoids the use of general context-sensitive rules by introducing a small number of auxiliary symbols used to reorder the strings generated by an otherwise context-free grammar. This formal representation of the residue correlations in RNA structure is important because it means we can build full probabilistic models of RNA secondary structure, including pseudoknots, and use them to optimally parse sequences in polynomial time.

6.
Scales in RNA, based on geometrical considerations, can be exploited for the analysis and prediction of RNA structures. By using spectral decomposition, geometric information that relates to a given RNA fold can be reduced to a single positive scalar number, the second eigenvalue of the Laplacian matrix corresponding to the tree-graph representation of the RNA secondary structure. Along with the free energy of the structure, being the most important scalar number in the prediction of RNA folding by energy minimization methods, the second eigenvalue of the Laplacian matrix can be used as an effective signature for locating a target folded structure given a set of RNA folds. Furthermore, the second eigenvector of the Laplacian matrix can be used to partition large RNA structures into smaller fragments. An illustrative example is given for the use of the second eigenvalue to predict mutations that may cause structural rearrangements, thereby disrupting stable motifs.
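
As a rough illustration of the quantity this abstract describes, the sketch below computes the second-smallest eigenvalue of the Laplacian L = D - A for a small tree graph. The function name and the two 5-vertex trees are hypothetical examples, not taken from the paper, and how a real RNA fold is mapped to a tree is left out entirely.

```python
import numpy as np


def laplacian_second_eigenvalue(n_vertices, edges):
    """Second-smallest eigenvalue of the graph Laplacian L = D - A."""
    adjacency = np.zeros((n_vertices, n_vertices))
    for u, v in edges:
        adjacency[u, v] = adjacency[v, u] = 1.0
    # Degree matrix minus adjacency matrix
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
    eigenvalues = np.linalg.eigvalsh(laplacian)
    return float(eigenvalues[1])


# Hypothetical trees on 5 vertices (illustrative only)
star = [(0, 1), (0, 2), (0, 3), (0, 4)]   # one branching vertex
path = [(0, 1), (1, 2), (2, 3), (3, 4)]   # unbranched chain

print(laplacian_second_eigenvalue(5, star))  # close to 1.0
print(laplacian_second_eigenvalue(5, path))  # close to 0.382
```

The compact star gives a larger value than the elongated path, which is the sense in which a single scalar can distinguish tree shapes of the same size.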

7.

Background  

It has become increasingly apparent that a comprehensive database of RNA motifs is essential in order to achieve new goals in genomic and proteomic research. Secondary RNA structures have frequently been represented by various modeling methods as graph-theoretic trees. Using graph theory as a modeling tool allows the vast resources of graphical invariants to be utilized to numerically identify secondary RNA motifs. The domination number of a graph is a graphical invariant that is sensitive to even a slight change in the structure of a tree. The invariants selected in this study are variations of the domination number of a graph. These graphical invariants are partitioned into two classes, and we define two parameters based on each of these classes. These parameters are calculated for all small order trees and a statistical analysis of the resulting data is conducted to determine if the values of these parameters can be utilized to identify which trees of orders seven and eight are RNA-like in structure.
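
To make the central invariant concrete, here is a minimal brute-force sketch of the ordinary domination number (the size of the smallest vertex set such that every vertex is in the set or adjacent to it). It does not implement the variations or the two parameters defined in the study; the function name and the example trees of order seven are assumptions chosen for illustration.

```python
from itertools import combinations


def domination_number(n_vertices, edges):
    """Size of a smallest dominating set, found by exhaustive search."""
    # Closed neighbourhood of v: v itself plus its neighbours
    closed_nbhd = [{v} for v in range(n_vertices)]
    for u, v in edges:
        closed_nbhd[u].add(v)
        closed_nbhd[v].add(u)
    everything = set(range(n_vertices))
    for size in range(1, n_vertices + 1):
        for candidate in combinations(range(n_vertices), size):
            dominated = set().union(*(closed_nbhd[v] for v in candidate))
            if dominated == everything:
                return size
    return 0  # only reached for a graph with no vertices


# Hypothetical trees of order seven (illustrative only)
star7 = [(0, i) for i in range(1, 7)]
path7 = [(i, i + 1) for i in range(6)]

print(domination_number(7, star7))  # 1
print(domination_number(7, path7))  # 3
```

Exhaustive search is exponential in the number of vertices, which is acceptable here only because the trees in question have seven or eight vertices.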

8.
In general, the RNA prediction problem includes genetic mapping, physical mapping and structure prediction. The ultimate goal of structure prediction is to obtain the three-dimensional structure of biomolecules through computation. The key to solving this problem is an appropriate representation of the biological structures. Although problems concerning representations of certain biological structures, such as secondary structures, are either NP-complete or of high complexity, a few approximation algorithms and techniques, mainly of polynomial complexity, have been constructed for the prediction of RNA secondary structures. In this paper, a new class of Motzkin paths is introduced, the so-called semi-elevated inverse Motzkin peakless paths, for the representation of two interacting RNA molecules. The basic combinatorial interpretations of single RNA secondary structures are extended via these new Motzkin paths to two RNA molecules and can be applied to prediction methods for joint structures formed by interacting RNAs.
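
The sketch below shows only the classical correspondence for a single RNA secondary structure, which this abstract takes as its starting point: a dot-bracket string maps to a Motzkin path (up, down and horizontal steps), and the path is peakless when no up step is immediately followed by a down step. It does not implement the semi-elevated inverse paths introduced in the paper; the function names and example strings are hypothetical.

```python
def to_motzkin_steps(dot_bracket):
    """Map '(' -> 'U' (up), ')' -> 'D' (down), '.' -> 'H' (horizontal)."""
    mapping = {"(": "U", ")": "D", ".": "H"}
    return "".join(mapping[c] for c in dot_bracket)


def is_peakless_motzkin(steps):
    """True if the path never dips below 0, ends at 0, and has no 'UD' peak."""
    height = 0
    for step in steps:
        height += {"U": 1, "D": -1, "H": 0}[step]
        if height < 0:
            return False
    # A 'UD' peak would be a base pair enclosing no unpaired base
    return height == 0 and "UD" not in steps


steps = to_motzkin_steps("((..))")
print(steps)                       # UUHHDD
print(is_peakless_motzkin(steps))  # True
print(is_peakless_motzkin("UD"))   # False
```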

9.
In this paper, we propose a nongraphical representation for protein secondary structures. By counting the frequency of occurrence of all possible four-tuples (i.e., four-letter words) of a protein secondary structure sequence, we construct a set of 3 × 3 matrices for the corresponding protein secondary structure sequence. Furthermore, the leading eigenvalues of these matrices are computed and considered as invariants for the protein secondary structure sequences. To illustrate the utility of our approach, we apply it to a set of real data to distinguish protein structural classes. The result indicates that it can be used to complement the classification of protein secondary structures.

10.
MOTIVATION: To predict the consensus secondary structure, possibly including pseudoknots, of a set of unaligned RNA sequences. RESULTS: We have designed a method based on a new representation of any RNA secondary structure as a set of structural relationships between the helices of the structure. We refer to this representation as a structural pattern. In a first step, we use thermodynamic parameters to select, for each sequence, the best secondary structures according to energy minimization and we represent each of them using its corresponding structural pattern. In a second step, we search for the repeated structural patterns, i.e. the largest structural patterns that occur in at least one sequence, i.e. included in at least one of the structural patterns associated with each sequence. Thanks to an efficient encoding of structural patterns, this search comes down to identifying the largest repeated word suffixes in a dictionary. In a third step, we compute the plausibility of each repeated structural pattern by checking if it occurs more frequently in the studied sequences than in random RNA sequences. We then suppose that the consensus secondary structure corresponds to the repeated structural pattern that displays the highest plausibility. We present several experiments concerning tRNA, fragments of 16S rRNA and 10Sa RNA (including pseudoknots); in each of them, we found the putative consensus secondary structure.

11.
A number of non-coding RNAs are known to contain functionally important or conserved pseudoknots. However, pseudoknotted structures are more complex than orthodox ones, and most methods for analyzing secondary structures do not handle them. I present here a way to decompose and represent general secondary structures which extends the tree representation of the stem-loop structure, and use this to analyze the frequency of pseudoknots in known and in random secondary structures. This comparison shows that, though a number of pseudoknots exist, they are still relatively rare and mostly of the simpler kinds. In contrast, random secondary structures tend to be heavily knotted, and the number of available structures increases dramatically when allowing pseudoknots. Therefore, methods for structure prediction and non-coding RNA identification that allow pseudoknots are likely to be much less powerful than those that do not, unless they penalize pseudoknots appropriately.

12.
We propose a novel representation of RNA secondary structure for a quick comparison of different structures. Secondary structure was viewed as a set of stems and each stem was represented by two values according to its position. Using this representation, we improved the results of the comparative sequence analysis method and of the minimum free-energy model. In the comparative sequence analysis method, a novel algorithm independent of multiple sequence alignment was developed to improve performance. When dealing with a single RNA sequence, the minimum free-energy model is improved by combining it with RNA class information. Secondary structure prediction experiments were done on tRNA and RNase P RNA; sensitivity and specificity were both improved. Furthermore, software programs were developed for non-commercial use.

13.
Abstract

In this paper, we propose a nongraphical representation for protein secondary structures. By counting the frequency of occurrence of all possible four-tuples (i.e., four-letter words) of a protein secondary structure sequence, we construct a set of 3 × 3 matrices for the corresponding protein secondary structure sequence. Furthermore, the leading eigenvalues of these matrices are computed and considered as invariants for the protein secondary structure sequences. To illustrate the utility of our approach, we apply it to a set of real data to distinguish protein structural classes. The result indicates that it can be used to complement the classification of protein secondary structures.

14.
We present a novel topological classification of RNA secondary structures with pseudoknots. It is based on the topological genus of the circular diagram associated with the RNA base-pair structure. The genus is a non-negative integer whose value quantifies the topological complexity of the folded RNA structure. In such a representation, planar diagrams correspond to pure RNA secondary structures and have zero genus, whereas non-planar diagrams correspond to pseudoknotted structures and have higher genus. The topological genus allows for the definition of topological folding motifs, similar in spirit to those introduced and commonly used in protein folding. We analyze real RNA structures from the databases Worldwide Protein Data Bank and Pseudobase and classify them according to their topological genus. For simplicity, we limit our analysis by considering only Watson-Crick complementary base pairs and G-U wobble base pairs. We compare the results of our statistical survey with existing theoretical and numerical models. We also discuss possible applications of this classification and show how it can be used for identifying new RNA structural motifs.
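
One way to see how a genus can be read off a base-pair list is sketched below. It is an illustrative construction, not the authors' code: it assumes the standard ribbon-graph reading of a linear arc diagram (positions 1..n on a backbone, one arc per base pair), counts boundary components by walking along the ribbons, and applies Euler's formula. The function name and the example structures are hypothetical.

```python
def genus(n, pairs):
    """Genus of the arc diagram on backbone positions 1..n with base pairs `pairs`."""
    partner = {v: v for v in range(1, n + 1)}  # unpaired positions map to themselves
    for i, j in pairs:
        partner[i], partner[j] = j, i
    # Boundary walk on the n + 1 backbone gaps 0..n (gap k lies just right of
    # position k). Arriving at position k + 1 from gap k, the boundary follows
    # the arc, if any, to the partner and continues in the gap to its right.
    successor = {k: partner[k + 1] for k in range(n)}
    successor[n] = 0  # the last gap wraps around below the backbone
    seen, boundaries = set(), 0
    for start in range(n + 1):
        if start in seen:
            continue
        boundaries += 1
        k = start
        while k not in seen:
            seen.add(k)
            k = successor[k]
    # Euler: n - ((n - 1) + P) = 2 - 2g - boundaries, with P = len(pairs)
    return (1 + len(pairs) - boundaries) // 2


print(genus(4, [(1, 4), (2, 3)]))          # nested pairs: 0
print(genus(4, [(1, 3), (2, 4)]))          # two crossing pairs (H-type): 1
print(genus(6, [(1, 3), (2, 5), (4, 6)]))  # kissing-hairpin pattern: 1
```

Nested (planar) structures give genus 0 and the simplest crossing patterns give genus 1, in line with the classification described in the abstract.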

15.
Many noncoding RNAs (ncRNAs) function through both their sequences and secondary structures. Thus, secondary structure derivation is an important issue in today's RNA research. The state-of-the-art structure annotation tools are based on comparative analysis, which derives consensus structure of homologous ncRNAs. Despite promising results from existing ncRNA aligning and consensus structure derivation tools, there is a need for more efficient and accurate ncRNA secondary structure modeling and alignment methods. In this work, we introduce a consensus structure derivation approach based on grammar string, a novel ncRNA secondary structure representation that encodes an ncRNA's sequence and secondary structure in the parameter space of a context-free grammar (CFG) and a full RNA grammar including pseudoknots. Being a string defined on a special alphabet constructed from a grammar, grammar string converts ncRNA alignment into sequence alignment. We derive consensus secondary structures from hundreds of ncRNA families from BraliBase 2.1 and 25 families containing pseudoknots using grammar string alignment. Our experiments have shown that grammar string-based structure derivation competes favorably in consensus structure quality with Murlet and RNASampler. Source code and experimental data are available at http://www.cse.msu.edu/~yannisun/grammar-string.

16.
MOTIVATION: Several algorithms have been developed for drawing RNA secondary structures; however, none of these can be used to draw RNA pseudoknot structures. In the sense of graph theory, a drawing of RNA secondary structures is a tree, whereas a drawing of RNA pseudoknots is a graph with inner cycles within a pseudoknot as well as possible outer cycles formed between a pseudoknot and other structural elements. Thus, RNA pseudoknots are more difficult to visualize than RNA secondary structures. Since no automatic method for drawing RNA pseudoknots exists, visualizing RNA pseudoknots relies on a significant amount of manual work and does not yield satisfactory results. The task of visualizing RNA pseudoknots by hand becomes more challenging as the size and complexity of the RNA pseudoknots increase. RESULTS: We have developed a new representation and an algorithm for drawing H-type pseudoknots with RNA secondary structures. Compared to existing representations of H-type pseudoknots, the new representation ensures uniform and clear drawings with no edge crossing for any H-type pseudoknots. To the best of our knowledge, this is the first algorithm for automatically drawing RNA pseudoknots with RNA secondary structures. The algorithm has been implemented in a Java program, which can be executed on any computing system. Experimental results demonstrate that the algorithm generates an aesthetically pleasing drawing of all H-type pseudoknots. The results have also shown that the drawing has high readability, enabling the user to quickly and easily recognize the whole RNA structure as well as the pseudoknots themselves.

17.
Magarshak et al. represented an RNA molecule as a complex vector and an RNA secondary structure Γ as a complex matrix S_Γ in such a way that compatibility of the molecule with the secondary structure Γ is characterized by an algebraic condition on its vector and S_Γ. They only considered Watson-Crick base pairs and their representation cannot be extended to allow for GU pairs. In this paper we study a generalization of Magarshak's representation that allows for these pairs, and in particular we provide a family of algebraic structures where that generalization can be carried out. We also show that this representation can be used to compare secondary structures, through transfer matrices which transform the representation of one secondary structure into the representation of the other.
Received: 10 December 2001 / Revised version: 7 May 2002 / Published online: 28 February 2003. Key words or phrases: RNA secondary structure; Algebra; Finite field

18.
The increasing importance of non-coding RNA in biology and medicine has led to a growing interest in the problem of RNA 3-D structure prediction. As is the case for proteins, RNA 3-D structure prediction methods require two key ingredients: an accurate energy function and a conformational sampling procedure. Both are only partly solved problems. Here, we focus on the problem of conformational sampling. The current state-of-the-art solution is based on fragment assembly methods, which construct plausible conformations by stringing together short fragments obtained from experimental structures. However, the discrete nature of the fragments necessitates the use of carefully tuned, unphysical energy functions, and their non-probabilistic nature impairs unbiased sampling. We offer a solution to the sampling problem that removes these important limitations: a probabilistic model of RNA structure that allows efficient sampling of RNA conformations in continuous space, and with associated probabilities. We show that the model captures several key features of RNA structure, such as its rotameric nature and the distribution of the helix lengths. Furthermore, the model readily generates native-like 3-D conformations for 9 out of 10 test structures, solely using coarse-grained base-pairing information. In conclusion, the method provides a theoretical and practical solution for a major bottleneck on the way to routine prediction and simulation of RNA structure and dynamics in atomic detail.

19.
We have developed a deductive database system, PACADE, for analyzing 3-D and secondary structures of proteins. The PACADE system consists of a relational database created from the Protein Data Bank and a deductive engine, DEE, based on logic programming. It has the following features: (1) The system has an inference mechanism, by means of which users can easily write and check biological hypotheses using logical and declarative rules instead of procedural programs. (2) The relational database of the PACADE system stores data on both the 3-D and secondary structures of proteins. The integration of these two levels of structure makes an abstract representation of the protein structure feasible. We describe herein the design, functions, and implementation of the PACADE system.

20.
RNA secondary structures are important in many biological processes and efficient structure prediction can give vital directions for experimental investigations. Many available programs for RNA secondary structure prediction only use a single sequence at a time. This may be sufficient in some applications, but often it is possible to obtain related RNA sequences with conserved secondary structure. These should be included in structural analyses to give improved results. This work presents a practical way of predicting RNA secondary structure that is especially useful when related sequences can be obtained. The method improves a previous algorithm based on an explicit evolutionary model and a probabilistic model of structures. Predictions can be done on a web server at http://www.daimi.au.dk/~compbio/pfold.
