Similar literature
20 similar documents found; search time 31 ms.
1.
RNA molecules, which are found in all living cells, fold into characteristic structures that account for their diverse functional activities. Many of these RNA structures consist of a collection of fundamental RNA motifs. The various combinations of RNA basic components form different RNA classes and define their unique structural and functional properties. The availability of many genome sequences makes it possible to search computationally for functional RNAs. Biological experiments indicate that functional RNAs have characteristic RNA structural motifs represented by specific combinations of base pairings and conserved nucleotides in the loop regions. Searching for those well-ordered RNA structures and their homologues in genomic sequences is very helpful for understanding RNA-based gene regulation. In this paper, we consider the following problem: given an RNA sequence with a known secondary structure, efficiently determine candidate segments in genomic sequences that can potentially form RNA secondary structures similar to the given RNA secondary structure. Our new bottom-up approach first searches for all potential stem-loops similar to those of the given RNA secondary structure and then, based on the located stem-loops, detects potential homologous structural RNAs in genomic sequences.
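
The first, bottom-up step described in this abstract (locating candidate stem-loops) can be illustrated with a naive scan. This is only a sketch under stated assumptions, not the authors' algorithm: the function name `find_stem_loops`, its parameters, and the decision to allow G-U wobble pairs are all hypothetical, and no similarity scoring against a query structure is attempted.

```python
# Canonical Watson-Crick pairs plus the G-U wobble pair (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stem_loops(seq, min_stem=3, min_loop=3, max_loop=8):
    """Return (start, end, stem_length, loop_length) for each candidate hairpin.

    For every loop position and loop length, the stem is extended outward
    for as long as the flanking bases can pair. Indices are 0-based and
    inclusive.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for loop_start in range(1, n):
        for loop_len in range(min_loop, max_loop + 1):
            i = loop_start - 1         # last base of the 5' arm
            j = loop_start + loop_len  # first base of the 3' arm
            stem = 0
            while i >= 0 and j < n and (seq[i], seq[j]) in PAIRS:
                stem += 1
                i -= 1
                j += 1
            if stem >= min_stem:
                # i and j have stepped one position past the stem ends.
                hits.append((i + 1, j - 1, stem, loop_len))
    return hits


print(find_stem_loops("GGGAAAACCC"))
```

A genome-scale search would need an indexed or filtered scan rather than this quadratic loop; the sketch only shows what a "potential stem-loop" is in concrete terms.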

2.
U2 RNA shares a structural domain with U1, U4, and U5 RNAs.   Cited by: 49 (self-citations: 9, citations by others: 40)
C Branlant, A Krol, J P Ebel, E Lazar, B Haendler, M Jacob. The EMBO Journal, 1982, 1(10): 1259-1265
We previously reported common structural features within the 3'-terminal regions of U1, U4, and U5 RNAs. To check whether these features also exist in U2 RNA, the primary and secondary structures of the 3'-terminal regions of chicken, pheasant, and rat U2 RNAs were examined. Whereas no difference was observed between pheasant and chicken, the chicken and rat sequences were only 82.5% homologous. Such divergence allowed us to propose a unique model of secondary structure based on maximum base-pairing and secondary structure conservation. The same model was obtained from the results of limited digestion of U2 RNA with various nucleases. Comparison of this structure with those of U1, U4, and U5 RNAs shows that the four RNAs share a common structure designated as domain A, and consisting of a free single-stranded region with the sequence Pu-A-(U)n-G-Pup flanked by two hairpins. The hairpin on the 3' side is very stable and has the sequence Py-N-Py-Gp in the loop. The presence of this common domain is discussed in connection with relationships among U RNAs and common protein binding sites.

3.
MOTIVATION: Since the whole genome sequences of many species have been determined, computational prediction of RNA secondary structures and computational identification of non-coding RNA regions by comparative genomics have become important. Therefore, more advanced alignment methods are required. Recently, an approach of structural alignment for RNA sequences has been introduced to solve these problems. Pair hidden Markov models on tree structures (PHMMTSs) proposed by Sakakibara are efficient automata-theoretic models for structural alignment of RNA secondary structures, although PHMMTSs are incapable of handling pseudoknots. On the other hand, tree adjoining grammars (TAGs), a subclass of context-sensitive grammars, are suitable for modeling pseudoknots. Our goal is to extend PHMMTSs by incorporating TAGs to be able to handle pseudoknots. RESULTS: We propose pair stochastic TAGs (PSTAGs) for aligning and predicting RNA secondary structures including a simple type of pseudoknot which can represent most known pseudoknot structures. First, we extend PHMMTSs defined on alignment of 'trees' to PSTAGs defined on alignment of 'TAG trees' which represent derivation processes of TAGs and are functionally equivalent to derived trees of TAGs. Then, we develop an efficient dynamic programming algorithm for PSTAGs to obtain an optimal structural alignment including pseudoknots. We implement the PSTAG algorithm and demonstrate the properties of the algorithm by using it to align and predict several small pseudoknot structures. We believe that our implemented program based on PSTAGs is the first grammar-based and practically executable software for comparative analyses of RNA pseudoknot structures, and, further, non-coding RNAs.

4.
Pb2+-catalyzed cleavage of RNA has been shown previously to be a useful probe for tertiary structure. In the present study, Pb2+ cleavage patterns were identified for ribonuclease P RNAs from three phylogenetically disparate organisms, Escherichia coli, Chromatium vinosum, and Bacillus subtilis, and for E. coli RNase P RNAs that had been altered by deletions. Each of the native RNAs undergoes cleavage at several sites in the core structure that is common to all bacterial RNase P RNAs. All the cleavages occur in non-paired regions of the secondary structure models of the RNAs, in regions likely to be involved in tertiary interactions. Two cleavage sites occur at homologous positions in all the native RNAs, regardless of sequence variation, suggesting common tertiary structural features. The Pb2+ cleavage sites in four deletion mutants of E. coli RNase P RNA differed from the native pattern, indicating alterations in the tertiary structures of the mutant RNAs. This conclusion is consistent with previously characterized properties of the mutant RNAs. The Pb2+ cleavage assay is thus a useful probe to reveal alteration of tertiary structure in RNase P RNA.

5.
A new approach is proposed for determining common RNA secondary structures within a set of homologous RNAs. The approach is a combination of phylogenetic and thermodynamic methods which is based on the prediction of optimal and suboptimal secondary structures, topological similarity searches and phylogenetic comparative analysis. The optimal and suboptimal RNA secondary structures are predicted by energy minimization. Structural comparison of the predicted RNA secondary structures is used to find conserved structures that are topologically similar in all these homologous RNAs. The validity of the conserved structural elements found is then checked by phylogenetic comparison of the sequences. This procedure is used to predict common structures of ribonuclease P (RNAase P) RNAs.

6.
We propose an ab initio method, named DiscoverR, for finding common patterns from two RNA secondary structures. The method works by representing RNA secondary structures as ordered labeled trees and performs tree pattern discovery using an efficient dynamic programming algorithm. DiscoverR is able to identify and extract the largest common substructures from two RNA molecules having different sizes without prior knowledge of the locations and topologies of these substructures. We also extend DiscoverR to find repeated regions in an RNA secondary structure, and apply this extended method to detect structural repeats in the 3'-untranslated region of a protein kinase gene. We describe the biological significance of a repeated hairpin found by our method, demonstrating the usefulness of the method. DiscoverR is implemented in Java; a jar file including the source code of the program is available for download at http://bioinformatics.njit.edu/DiscoverR.
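
The idea of representing a secondary structure as an ordered labeled tree can be made concrete with a small sketch. The abstract does not say which encoding DiscoverR uses, so everything here is an assumption for illustration: the input is taken to be pseudoknot-free dot-bracket notation, and the node labels "R" (artificial root), "P" (base pair), and "U" (unpaired base), as well as the function name `dot_bracket_to_tree`, are hypothetical.

```python
def dot_bracket_to_tree(structure):
    """Convert a dot-bracket string into a nested ordered tree.

    Each node is a tuple (label, children). A base pair becomes a "P"
    node whose children are the elements it encloses, in 5'-to-3' order;
    an unpaired base becomes a childless "U" node.
    """
    root = ("R", [])
    stack = [root]  # stack[-1] is the node currently being filled
    for ch in structure:
        if ch == "(":
            node = ("P", [])
            stack[-1][1].append(node)
            stack.append(node)
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced structure: extra ')'")
            stack.pop()
        elif ch == ".":
            stack[-1][1].append(("U", []))
        else:
            raise ValueError(f"unexpected character {ch!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced structure: missing ')'")
    return root


print(dot_bracket_to_tree("((..).)"))
```

Because sibling order is preserved, two such trees can be compared with ordered-tree algorithms; the pattern discovery step itself is not sketched here.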

7.
Li W, Liu Z, Lai L. Biopolymers, 1999, 49(6): 481-495
A general problem in comparative modeling and protein design is the conformational evaluation of loops with a certain sequence in specific environmental protein frameworks. Loops of different sequences and structures on similar scaffolds are common in the Protein Data Bank (PDB). In order to explore both their structural and sequential diversity, a database of loops connecting similar secondary structure fragments is constructed by searching the database of families of structurally similar proteins and the PDB. A total of 84 loop families having 2-13 residues are found among the well-determined structures of resolution better than 2.5 Å. Eight alpha-alpha, 20 alpha-beta, 19 beta-alpha, and 37 beta-beta families are identified. Every family contains more than 5 loop motifs. In each family, no two loops share the same sequence and all the frameworks are well superimposed. Forty-three new loop classes are distinguished in the database. The structural variability of loops in homologous proteins is examined and shown in 44 families. Motif families are characterized with geometric parameters and sequence patterns. The conformations of loops in each family are clustered into subfamilies using the average linkage cluster analysis method. Information such as geometric properties, sequence profile, sequential and structural variability in loops, structural alignment parameters, sequence similarities, and clustering results is provided. Correlations between the conformation of loops and loop sequence, motif sequence, and global sequence of the PDB chain are examined in order to find how loop structures depend on their sequences and how they are affected by the local and global environment. Strong correlations (R > 0.75) are only found in 24 families. The best R value is 0.98. The database is available through the Internet.

8.
The function of many RNAs depends crucially on their structure. Therefore, the design of RNA molecules with specific structural properties has many potential applications, e.g. in the context of investigating the function of biological RNAs, of creating new ribozymes, or of designing artificial RNA nanostructures. Here, we present a new algorithm for solving the following RNA secondary structure design problem: given a secondary structure, find an RNA sequence (if any) that is predicted to fold to that structure. Unlike the (pseudoknot-free) secondary structure prediction problem, this problem appears to be hard computationally. Our new algorithm, "RNA Secondary Structure Designer (RNA-SSD)", is based on stochastic local search, a prominent general approach for solving hard combinatorial problems. A thorough empirical evaluation on computationally predicted structures of biological sequences and artificially generated RNA structures as well as on empirically modelled structures from the biological literature shows that RNA-SSD substantially out-performs the best known algorithm for this problem, RNAinverse from the Vienna RNA Package. In particular, the new algorithm is able to solve structures, consistently, for which RNAinverse is unable to find solutions. The RNA-SSD software is publicly available under the name of RNA Designer at the RNASoft website (www.rnasoft.ca).
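
The design problem stated in this abstract takes a target structure as input and asks for a sequence. A weaker, necessary condition is easy to show in code: every position pair that the target structure pairs must hold bases that can pair. The sketch below is not RNA-SSD and contains no folding predictor; the names `pair_table`, `random_compatible_sequence`, and `is_compatible`, the dot-bracket input format, and the inclusion of G-U wobble pairs are all assumptions made for illustration. A sequence that passes this check may still fold into a different structure.

```python
import random

# Allowed (5' base, 3' base) combinations; G-U wobble is an assumption of this sketch.
PAIRS = [("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")]
PAIR_SET = set(PAIRS)


def pair_table(structure):
    """Map every paired position to its partner in a dot-bracket string."""
    stack, partner = [], {}
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            opener = stack.pop()
            partner[opener] = pos
            partner[pos] = opener
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure: missing ')'")
    return partner


def random_compatible_sequence(structure, rng=None):
    """Draw a random sequence whose paired positions hold pairable bases."""
    rng = rng or random.Random()
    partner = pair_table(structure)
    seq = [""] * len(structure)
    for pos, ch in enumerate(structure):
        if ch == ".":
            seq[pos] = rng.choice("ACGU")
        elif ch == "(":
            five, three = rng.choice(PAIRS)
            seq[pos] = five
            seq[partner[pos]] = three
    return "".join(seq)


def is_compatible(seq, structure):
    """Check the necessary condition: all paired positions can base-pair."""
    partner = pair_table(structure)
    return len(seq) == len(structure) and all(
        (seq[i], seq[j]) in PAIR_SET for i, j in partner.items()
    )


candidate = random_compatible_sequence("((...))", random.Random(0))
print(candidate, is_compatible(candidate, "((...))"))
```

A local search for design would start from such a candidate and repeatedly mutate it, using a structure prediction program to score how closely the predicted fold matches the target.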

9.
Metabolic networks of many cellular organisms share global statistical features. Their connectivity distributions follow the long-tailed power law and show the small-world property. In addition, their modular structures are organized in a hierarchical manner. Although the global topological organization of metabolic networks is well understood, their local structural organization is still not clear. Investigating local properties of metabolic networks is necessary to understand the nature of metabolism in living organisms. To identify the local structural organization of metabolic networks, we analysed the subgraphs of metabolic networks of 43 organisms from three domains of life. We first identified the network motifs of metabolic networks and identified the statistically significant subgraph patterns. We then compared metabolic networks from different domains and found that they have similar local structures and that the local structure of each metabolic network has its own taxonomical meaning. Organisms closer in taxonomy showed similar local structures. In addition, the common substrates of 43 metabolic networks were not randomly distributed, but were more likely to be constituents of cohesive subgraph patterns.

10.
Functional RNA regions are often related to recurrent secondary structure patterns (or motifs), which can exert their role in several different ways, particularly in dictating the interaction with RNA-binding proteins, and acting in the regulation of a large number of cellular processes. Among the available motif-finding tools, the majority focus on sequence patterns, sometimes including secondary structure as an additional constraint to improve their performance. Nonetheless, secondary structure motifs may be concurrent with their sequence counterparts or even encode a stronger functional signal. Current methods for searching structural motifs generally require long pipelines and/or high computational efforts or previously aligned sequences. Here, we present BEAM (BEAr Motif finder), a novel method for structural motif discovery from a set of unaligned RNAs, taking advantage of a recently developed encoding for RNA secondary structure named BEAR (Brand nEw Alphabet for RNAs) and of evolutionary substitution rates of secondary structure elements. Tested in a varied set of scenarios, from small- to large-scale, BEAM is successful in retrieving structural motifs even in highly noisy data sets, such as those that can arise in CLIP-Seq or other high-throughput experiments.

11.
MOTIVATION: Proteins of the same class often share a secondary structure packing arrangement but differ in how the secondary structure units are ordered in the sequence. We find that proteins that share a common core also share local sequence-structure similarities, and these can be exploited to align structures with different topologies. In this study, segments from a library of local sequence-structure alignments were assembled hierarchically, enforcing the compactness and conserved inter-residue contacts but not sequential ordering. Previous structure-based alignment methods often ignore sequence similarity, local structural equivalence and compactness. RESULTS: The new program, SCALI (Structural Core ALIgnment), can efficiently find conserved packing arrangements, even if they are non-sequentially ordered in space. SCALI alignments conserve remote sequence similarity and contain fewer alignment errors. Clustering of our pairwise non-sequential alignments shows that recurrent packing arrangements exist in topologically different structures. For example, the three-layer sandwich domain architecture may be divided into four structural subclasses based on internal packing arrangements. These subclasses represent an intermediate level of structure classification, more general than topology, but more specific than architecture as defined in CATH. A strategy is presented for developing a set of predictive hidden Markov models based on multiple SCALI alignments.

12.
MOTIVATION: Due to the importance of considering secondary structures in aligning functional RNAs, several pairwise sequence-structure alignment methods have been developed. They use extended alignment scores that evaluate secondary structure information in addition to sequence information. However, two problems for the multiple alignment step remain. First, how to combine pairwise sequence-structure alignments into a multiple alignment and second, how to generate secondary structure information for sequences whose explicit structural information is missing. RESULTS: We describe a novel approach for multiple alignment of RNAs (MARNA) taking into consideration both the primary and the secondary structures. It is based on pairwise sequence-structure comparisons of RNAs. From these sequence-structure alignments, libraries of weighted alignment edges are generated. The weights reflect the sequential and structural conservation. For sequences whose secondary structures are missing, the libraries are generated by sampling low energy conformations. The libraries are then processed by the T-Coffee system, which is a consistency based multiple alignment method. Furthermore, we are able to extract a consensus-sequence and -structure from a multiple alignment. We have successfully tested MARNA on several datasets taken from the Rfam database.

13.
Prediction of RNA-RNA interaction is a key to elucidating possible functions of small non-coding RNAs, and a number of computational methods have been proposed to analyze interacting RNA secondary structures. In this article, we focus on predicting binding sites of target RNAs that are expected to interact with regulatory antisense RNAs in a general form of interaction. For this purpose, we propose bistaRNA, a novel method for predicting multiple binding sites of target RNAs. bistaRNA employs binding profiles that represent scores for hybridized structures, which reduces the computational cost of interaction prediction. bistaRNA considers an ensemble of equilibrium interacting structures and seeks to maximize expected accuracy using dynamic programming. Experimental results on real interaction data validate the good accuracy and fast computation time of bistaRNA as compared with several competitive methods. Moreover, we aim to find new targets given specific antisense RNAs, which provides interesting insights into antisense RNA regulation. bistaRNA is implemented in C++. The program and Supplementary Material are available at http://rna.naist.jp/program/bistarna/.

14.
We present a novel method for structural comparison of protein structures. The approach consists of two main phases: 1) an initial search phase where, starting from aligned pairs of secondary structure elements, the space of 3D transformations is searched for similarities and 2) a subsequent refinement phase where interim solutions are subjected to parallel, local, iterative dynamic programming in the areas of possible improvement. The proposed method uses dynamic programming to find alignments but does not restrict solutions to be sequential. In addition, to deal with the problem of nonuniqueness of optimal similarities, we introduce a consensus scoring method in selecting the preferred similarity and provide a list of top-ranked solutions. The method, called FASE (flexible alignment of secondary structure elements), was tested on well-known data and various standard problems from the literature. The results show that FASE is able to find remote and weak similarities consistently using a reasonable run time. The method was tested (using the SCOP database) on its ability to discriminate interfold pairs from intrafold pairs at the level of the best existing methods. The method was then applied to the problem of finding circular permutations in proteins.

15.
The detection of Outer Membrane Proteins (OMP) in whole genomes is a topical question; their sequence characteristics have thus been intensively studied. This class of protein displays a common beta-barrel architecture, formed by adjacent antiparallel strands. However, due to the lack of available structures, few structural studies have been made on this class of proteins. Here we propose a novel OMP local structure investigation, based on a structural alphabet approach, i.e., the decomposition of 3D structures using a library of four-residue protein fragments. The optimal decomposition of structures using a hidden Markov model results in a specific structural alphabet of 20 fragments, six of them dedicated to the decomposition of beta-strands. This optimal alphabet, called SA20-OMP, is analyzed in detail, in terms of local structures and transitions between fragments. It highlights a particular and strong organization of beta-strands as series of regular canonical structural fragments. The comparison with alphabets learned on globular structures indicates that the internal organization of OMP structures is more constrained than in globular structures. The analysis of OMP structures using SA20-OMP reveals some recurrent structural patterns. The preferred location of fragments in the distinct regions of the membrane is investigated. The study of pairwise specificity of fragments reveals that some contacts between structural fragments in beta-sheets are clearly favored whereas others are avoided. This contact specificity is stronger in OMP than in globular structures. Moreover, SA20-OMP also captured sequential information. This can be integrated in a scoring function for structural model ranking with very promising results.

16.
Atomic force microscopy analysis of icosahedral virus RNA   Cited by: 6 (self-citations: 0, citations by others: 6)
Single-stranded genomic RNAs from four icosahedral viruses (poliovirus, turnip yellow mosaic virus (TYMV), brome mosaic virus (BMV), and satellite tobacco mosaic virus (STMV)) along with the RNA from the helical tobacco mosaic virus (TMV) were extracted using phenol/chloroform. The RNAs were imaged using atomic force microscopy (AFM) under dynamic conditions in which the RNA was observed to unfold. RNAs from the four icosahedral viruses initially exhibited highly condensed, uniform spherical shapes with diameters consistent with those expected from the interiors of their respective capsids. Upon incubation at 26 °C, poliovirus RNA gradually transformed into chains of globular domains having the appearance of thick, irregularly segmented fibers. These ultimately unwound further to reveal segmented portions of the fibers connected by single strands of RNA of 0.5-1 nm thickness. Virtually the same transformations were shown by TYMV and BMV RNA, and with heating, the RNA from STMV. Upon cooling, the chains of domains of poliovirus RNA and STMV RNA condensed and re-formed their original spherical shapes. TMV RNAs initially appeared as single-stranded threads of 0.5-1.0 nm diameter but took on the structure of the multidomain chains upon further incubation at room temperature. These ultimately condensed into short, thick chains of larger domains. Our observations suggest that classical extraction of RNA from icosahedral virions produces little effect on overall conformation. As tertiary structure is lost, however, it is evident that secondary structural elements are arranged in a sequential, linear fashion along the polynucleotide chain. At least in the case of poliovirus and STMV, the process of tertiary structure re-formation from the linear chain of secondary structural domains proceeds in the absence of protein. RNA base sequence, therefore, may be sufficient to encode the conformation of the encapsidated RNA even in the absence of coat proteins.

17.
18.
MOTIVATION: The functions of non-coding RNAs are strongly related to their secondary structures, but it is known that a secondary structure prediction of a single sequence is not reliable. Therefore, we have to collect similar RNA sequences with a common secondary structure for the analyses of a new non-coding RNA without knowing the exact secondary structure itself. Therefore, the sequence comparison in searching similar RNAs should consider not only their sequence similarities but also their potential secondary structures. Sankoff's algorithm predicts the common secondary structures of the sequences, but it is computationally too expensive to apply to large-scale analyses. Because we often want to compare a large number of cDNA sequences or to search similar RNAs in the whole genome sequences, much faster algorithms are required. RESULTS: We propose a new method of comparing RNA sequences based on the structural alignments of the fixed-length fragments of the stem candidates. The implemented software, SCARNA (Stem Candidate Aligner for RNAs), is fast enough to apply to the long sequences in the large-scale analyses. The accuracy of the alignments is better than or comparable to that of the much slower existing algorithms. AVAILABILITY: The web server of SCARNA with graphical structural alignment viewer is available at http://www.scarna.org/.

19.
The complete nucleotide sequence of the major species of cytoplasmic 5S ribosomal RNA of Euglena gracilis has been determined. The sequence is: 5' GGCGUACGGCCAUACUACCGGGAAUACACCUGAACCCGUUCGAUUUCAGAAGUUAAGCCUGGUCAGGCCCAGUUAGUACUGAGGUGGGCGACCACUUGGGAACACUGGGUGCUGUACGCUU-OH 3'. This sequence can be fitted to the secondary structural models recently proposed for eukaryotic 5S ribosomal RNAs (1,2). Several properties of the Euglena 5S RNA reveal a close phylogenetic relationship between this organism and the protozoa. Large stretches of nucleotide sequences in predominantly single-stranded regions of the RNA are homologous to those of the trypanosomatid protozoan Crithidia fasciculata. There is less homology when compared to the RNAs of the green alga Chlorella or to the RNAs of the higher plants. The sequence AGAAC near position 40 that is common to plant 5S RNAs is CGAUU in both Euglena and Crithidia. The Euglena 5S RNA has secondary structural features at positions 79-99 similar to those of the protozoa and different from those of the plants. The conclusions drawn from comparative studies of cytochrome c structures which indicate a close phylogenetic relatedness between Euglena and the trypanosomatid protozoa are supported by the comparative data with 5S ribosomal RNAs.

20.
Elements of local tertiary structure in RNA molecules are important in understanding structure-function relationships. The loop E motif, first identified in several eukaryotic RNAs at functional sites which share an exceptional propensity for UV crosslinking between specific bases, was subsequently shown to have a characteristic tertiary structure. Common sequences and secondary structures have allowed other examples of the E-loop motif to be recognized in a number of RNAs at sites of protein binding or other biological function. We would like to know if more elements of local tertiary structure, in addition to the E-loop, can be identified by such common features. The highly structured circular RNA genome of the hepatitis D virus (HDV) provides an ideal test molecule because it has extensive internal structure, a UV-crosslinkable tertiary element, and specific sites for functional interactions with proteins including host PKR. We have now found a UV-crosslinkable element of local tertiary structure in antigenomic HDV RNA which, although differing from the E-loop, has a very similar pattern of sequence and secondary structure to the UV-crosslinkable element found in the genomic strand. Despite the fact that the two structures map close to one another, the sequences comprising them are not the templates for each other. Instead, the template regions for each element are additional sites for potential higher order structure on their respective complementary strands. This wealth of recurring sequences interspersed with base-paired stems provides a context to examine other RNA species for such features and their correlations with biological function.
