首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Recent studies have shown that RNA structural motifs play essential roles in RNA folding and interaction with other molecules. Computational identification and analysis of RNA structural motifs remains a challenging task. Existing motif identification methods based on 3D structure may not properly compare motifs with high structural variations. Other structural motif identification methods consider only nested canonical base-pairing structures and cannot be used to identify complex RNA structural motifs that often consist of various non-canonical base pairs due to uncommon hydrogen bond interactions. In this article, we present a novel RNA structural alignment method for RNA structural motif identification, RNAMotifScan, which takes into consideration the isosteric (both canonical and non-canonical) base pairs and multi-pairings in RNA structural motifs. The utility and accuracy of RNAMotifScan is demonstrated by searching for kink-turn, C-loop, sarcin-ricin, reverse kink-turn and E-loop motifs against a 23S rRNA (PDBid: 1S72), which is well characterized for the occurrences of these motifs. Finally, we search these motifs against the RNA structures in the entire Protein Data Bank and the abundances of them are estimated. RNAMotifScan is freely available at our supplementary website (http://genome.ucf.edu/RNAMotifScan).  相似文献   

2.
The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access.  相似文献   

3.
RNA tertiary motifs play an important role in RNA folding and biochemical functions. To help interpret the complex organization of RNA tertiary interactions, we comprehensively analyze a data set of 54 high-resolution RNA crystal structures for motif occurrence and correlations. Specifically, we search seven recognized categories of RNA tertiary motifs (coaxial helix, A-minor, ribose zipper, pseudoknot, kissing hairpin, tRNA D-loop/T-loop, and tetraloop-tetraloop receptor) by various computer programs. For the nonredundant RNA data set, we find 613 RNA tertiary interactions, most of which occur in the 16S and 23S rRNAs. An analysis of these motifs reveals the diversity and variety of A-minor motif interactions and the various possible loop-loop receptor interactions that expand upon the tetraloop-tetraloop receptor. Correlations between motifs, such as pseudoknot or coaxial helix with A-minor, reveal higher-order patterns. These findings may ultimately help define tertiary structure restraints for RNA tertiary structure prediction. A complete annotation of the RNA diagrams for our data set is available at http://www.biomath.nyu.edu/motifs/.  相似文献   

4.
Computational methods such as sequence alignment and motif construction are useful in grouping related proteins into families, as well as helping to annotate new proteins of unknown function. These methods identify conserved amino acids in protein sequences, but cannot determine the specific functional or structural roles of conserved amino acids without additional study. In this work, we present 3MATRIX (http://3matrix.stanford.edu) and 3MOTIF (http://3motif.stanford.edu), a web-based sequence motif visualization system that displays sequence motif information in its appropriate three-dimensional (3D) context. This system is flexible in that users can enter sequences, keywords, structures or sequence motifs to generate visualizations. In 3MOTIF, users can search using discrete sequence motifs such as PROSITE patterns, eMOTIFs, or any other regular expression-like motif. Similarly, 3MATRIX accepts an eMATRIX position-specific scoring matrix, or will convert a multiple sequence alignment block into an eMATRIX for visualization. Each query motif is used to search the protein structure database for matches, in which the motif is then visually highlighted in three dimensions. Important properties of motifs such as sequence conservation and solvent accessible surface area are also displayed in the visualizations, using carefully chosen color shading schemes.  相似文献   

5.
6.
7.
8.
We developed PeptiSite, a comprehensive and reliable database of biologically and structurally characterized peptide-binding sites, in which each site is represented by an ensemble of its complexes with protein, peptide and small molecule partners. The unique features of the database include: (1) the ensemble site representation that provides a fourth dimension to the otherwise three dimensional data, (2) comprehensive characterization of the binding site architecture that may consist of a multimeric protein assembly with cofactors and metal ions and (3) analysis of consensus interaction motifs within the ensembles and identification of conserved determinants of these interactions. Currently the database contains 585 proteins with 650 peptide-binding sites. http://peptisite.ucsd.edu/ link allows searching for the sites of interest and interactive visualization of the ensembles using the ActiveICM web-browser plugin. This structural database for protein–peptide interactions enables understanding of structural principles of these interactions and may assist the development of an efficient peptide docking benchmark.  相似文献   

9.
Linking similar proteins structurally is a challenging task that may help in finding the novel members of a protein family. In this respect, identification of conserved sequence can facilitate understanding and classifying the exact role of proteins. However, the exact role of these conserved elements cannot be elucidated without structural and physiochemical information. In this work, we present a novel desktop application MotViz designed for searching and analyzing the conserved sequence segments within protein structure. With MotViz, the user can extract a complete list of sequence motifs from loaded 3D structures, annotate the motifs structurally and analyze their physiochemical properties. The conservation value calculated for an individual motif can be visualized graphically. To check the efficiency, predicted motifs from the data sets of 9 protein families were analyzed and MotViz algorithm was more efficient in comparison to other online motif prediction tools. Furthermore, a database was also integrated for storing, retrieving and performing the detailed functional annotation studies. In summary, MotViz effectively predicts motifs with high sensitivity and simultaneously visualizes them into 3D strucures. Moreover, MotViz is user-friendly with optimized graphical parameters and better processing speed due to the inclusion of a database at the back end. MotViz is available at http://www.fi-pk.com/motviz.html.  相似文献   

10.
We present the development of a web server, a protein short motif search tool that allows users to simultaneously search for a protein sequence motif and its secondary structure assignments. The web server is able to query very short motifs searches against PDB structural data from the RCSB Protein Databank, with the users defining the type of secondary structures of the amino acids in the sequence motif. The output utilises 3D visualisation ability that highlights the position of the motif in the structure and on the corresponding sequence. Researchers can easily observe the locations and conformation of multiple motifs among the results. Protein short motif search also has an application programming interface (API) for interfacing with other bioinformatics tools. AVAILABILITY: The database is available for free at http://birg3.fbb.utm.my/proteinsms.  相似文献   

11.
DAtA: database of Arabidopsis thaliana annotation   总被引:1,自引:0,他引:1       下载免费PDF全文
The Database of Arabidopsis thaliana Annotation (D At A) was created to enable easy access to and analysis of all the Arabidopsis genome project annotation. The database was constructed using the completed A.thaliana genomic sequence data currently in GenBank. An automated annotation process was used to predict coding sequences for GenBank records that do not include annotation. D At A also contains protein motifs and protein similarities derived from searches of the proteins in D At A with motif databases and the non-redundant protein database. The database is routinely updated to include new GenBank submissions for Arabidopsis genomic sequences and new Blast and protein motif search results. A web interface to D At A allows coding sequences to be searched by name, comment, blast similarity or motif field. In addition, browse options present lists of either all the protein names or identified motifs present in the sequenced A.thaliana genome. The database can be accessed at http://baggage. stanford.edu/group/arabprotein/  相似文献   

12.

Background

Identification of DNA/Protein motifs is a crucial problem for biologists. Computational techniques could be of great help in this identification. In this direction, many computational models for motifs have been proposed in the literature.

Methods

One such important model is the motif model. In this paper we describe a motif search web tool that predominantly employs this motif model. This web tool exploits the state-of-the art algorithms for solving the motif search problem.

Results

The online tool has been helping scientists identify many unknown motifs. Many of our predictions have been successfully verified as well. We hope that this paper will expose this crucial tool to many more scientists.

Availability and requirements

Project name: PMS - Panoptic Motif Search Tool. Project home page: http://pms.engr.uconn.edu or http://motifsearch.com. Licence: PMS tools will be readily available to any scientist wishing to use it for non-commercial purposes, without restrictions. The online tool is freely available without login.  相似文献   

13.
A multitude of motif-finding tools have been published, which can generally be assigned to one of three classes: expectation-maximization, Gibbs-sampling or enumeration. Irrespective of this grouping, most motif detection tools only take into account similarities across ungapped sequence regions, possibly causing short motifs located peripherally and in varying distance to a 'core' motif to be missed. We present a new method, adding to the set of expectation-maximization approaches, that permits the use of gapped alignments for motif elucidation. Availability: The program is available for download from: http://bioinfoserver.rsbs.anu.edu.au/downloads/mclip.jar. Supplementary information: http://bioinfoserver.rsbs.anu.edu.au/utils/mclip/info.php.  相似文献   

14.
RNAMotif, an RNA secondary structure definition and search algorithm   总被引:26,自引:7,他引:19       下载免费PDF全文
RNA molecules fold into characteristic secondary and tertiary structures that account for their diverse functional activities. Many of these RNA structures are assembled from a collection of RNA structural motifs. These basic building blocks are used repeatedly, and in various combinations, to form different RNA types and define their unique structural and functional properties. Identification of recurring RNA structural motifs will therefore enhance our understanding of RNA structure and help associate elements of RNA structure with functional and regulatory elements. Our goal was to develop a computer program that can describe an RNA structural element of any complexity and then search any nucleotide sequence database, including the complete prokaryotic and eukaryotic genomes, for these structural elements. Here we describe in detail a new computational motif search algorithm, RNAMotif, and demonstrate its utility with some motif search examples. RNAMotif differs from other motif search tools in two important aspects: first, the structure definition language is more flexible and can specify any type of base–base interaction; second, RNAMotif provides a user controlled scoring section that can be used to add capabilities that patterns alone cannot provide.  相似文献   

15.
MOTIVATION: We designed a general computational kernel for classification problems that require specific motif extraction and search from sequences. Instead of searching for explicit motifs, our approach finds the distribution of implicit motifs and uses as a feature for classification. Implicit motif distribution approach may be used as modus operandi for bioinformatics problems that require specific motif extraction and search, which is otherwise computationally prohibitive. RESULTS: A system named P2SL that infer protein subcellular targeting was developed through this computational kernel. Targeting-signal was modeled by the distribution of subsequence occurrences (implicit motifs) using self-organizing maps. The boundaries among the classes were then determined with a set of support vector machines. P2SL hybrid computational system achieved approximately 81% of prediction accuracy rate over ER targeted, cytosolic, mitochondrial and nuclear protein localization classes. P2SL additionally offers the distribution potential of proteins among localization classes, which is particularly important for proteins, shuttle between nucleus and cytosol. AVAILABILITY: http://staff.vbi.vt.edu/volkan/p2sl and http://www.i-cancer.fen.bilkent.edu.tr/p2sl CONTACT: rengul@bilkent.edu.tr.  相似文献   

16.
The EMOTIF database is a collection of more than 170 000 highly specific and sensitive protein sequence motifs representing conserved biochemical properties and biological functions. These protein motifs are derived from 7697 sequence alignments in the BLOCKS+ database (released on June 23, 2000) and all 8244 protein sequence alignments in the PRINTS database (version 27.0) using the emotif-maker algorithm developed by Nevill-Manning et al. (Nevill-Manning,C.G., Wu,T.D. and Brutlag,D.L. (1998) Proc. Natl Acad. Sci. USA, 95, 5865-5871; Nevill-Manning,C.G., Sethi,K.S., Wu,T. D. and Brutlag,D.L. (1997) ISMB-97, 5, 202-209). Since the amino acids and the groups of amino acids in these sequence motifs represent critical positions conserved in evolution, search algorithms employing the EMOTIF patterns can identify and classify more widely divergent sequences than methods based on global sequence similarity. The emotif protein pattern database is available at http://motif.stanford.edu/emotif/.  相似文献   

17.
MOTIVATION: RNA structure motifs contained in mRNAs have been found to play important roles in regulating gene expression. However, identification of novel RNA regulatory motifs using computational methods has not been widely explored. Effective tools for predicting novel RNA regulatory motifs based on genomic sequences are needed. RESULTS: We present a new method for predicting common RNA secondary structure motifs in a set of functionally or evolutionarily related RNA sequences. This method is based on comparison of stems (palindromic helices) between sequences and is implemented by applying graph-theoretical approaches. It first finds all possible stable stems in each sequence and compares stems pairwise between sequences by some defined features to find stems conserved across any two sequences. Then by applying a maximum clique finding algorithm, it finds all significant stems conserved across at least k sequences. Finally, it assembles in topological order all possible compatible conserved stems shared by at least k sequences and reports a number of the best assembled stem sets as the best candidate common structure motifs. This method does not require prior structural alignment of the sequences and is able to detect pseudoknot structures. We have tested this approach on some RNA sequences with known secondary structures, in which it is capable of detecting the real structures completely or partially correctly and outperforms other existing programs for similar purposes. AVAILABILITY: The algorithm has been implemented in C++ in a program called comRNA, which is available at http://ural.wustl.edu/softwares.html  相似文献   

18.
Abstract

The existence and identity of non-Watson-Crick base pairs (bps) within RNA bulges, internal loops, and hairpin loops cannot reliably be predicted by existing algorithms. We have developed the Isfold (Isosteric Folding) program as a tool to examine patterns of nucleotide substitutions from sequence alignments or mutation experiments and identify plausible bp interactions. We infer these interactions based on the observation that each non-Watson-Crick bp has a signature pattern of isosteric substitutions where mutations can be made that preserve the 3D structure. Isfold produces a dynamic representation of predicted bps within defined motifs in order of their probabilities. The software was developed under Windows XP, and is capable of running on PC and MAC with Matlab 7.1 (SP3) or higher. A PC standalone version that does not require Matlab also is available. This software and a user manual are freely available at www.ucsf.edu/frankel/isfold.  相似文献   

19.
20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号