Similar articles (20 found)
1.
An  J.  Wako  H.  Sarai  A. 《Molecular Biology》2001,35(6):905-910
An amino acid sequence pattern conserved among a family of proteins is called a motif. It is usually related to the specific function of the family. On the other hand, functions of proteins are realized through their 3D structures. Specific local structures, called structural motifs, are considered to be related to their functions. However, searching for common structural motifs in different proteins is much more difficult than searching for common sequence motifs. In this study we attempt to convert the information about structural motifs into a set of one-dimensional digital strings, i.e., a set of codes, in order to compare them more easily by computer and to investigate their relationship to function more quantitatively. By applying Delaunay tessellation to the 3D structure of a protein, we can assign each local structure a unique code that is defined so as to reflect its structural features. Since a structural motif is defined in this paper as a set of local structures, it is represented by a set of codes. To examine how well the sets of codes distinguish among sets of local structures with a given PROSITE pattern, which contain both true and false positives, we clustered them by introducing a similarity measure between sets of codes. The resulting clustering shows good agreement with results from direct structural comparison methods such as superposition. The structural motifs in homologous proteins are also properly clustered according to their sources. These results suggest that structural motifs can be well characterized by these sets of codes, and that the method can be used to compare structural motifs and relate them to function.

2.
Many proteins function by interacting with other small molecules (ligands). Identification of ligand‐binding sites (LBS) in proteins can therefore help to infer their molecular functions. A comprehensive comparison among local structures of LBSs was previously performed, in order to understand their relationships and to classify their structural motifs. However, a similar exhaustive comparison among local surfaces of LBSs (patches) has never been performed, due to computational complexity. To enhance our understanding of LBSs, it is worth performing such comparisons among patches and classifying them based on similarities of their surface configurations and electrostatic potentials. In this study, we first developed a rapid method to compare two patches. We then clustered patches corresponding to the same PDB chemical component identifier for a ligand, and selected a representative patch from each cluster. We subsequently compared the representative patches exhaustively and clustered them using a similarity score, PatSim. Finally, the resultant PatSim scores were compared with similarities of atomic structures of the LBSs and those of the ligand‐binding protein sequences and functions. Consequently, we classified the patches into ~2000 well‐characterized clusters. We found that about 63% of these clusters are used in identical protein folds, although about 25% of the clusters are conserved in distantly related proteins and even in proteins with cross‐fold similarity. Furthermore, we showed that patches with higher PatSim scores have the potential to be involved in similar biological processes.

3.
The Rossmann-like fold is the most prevalent and diversified doubly-wound superfold of ancient evolutionary origin. Rossmann-like domains are present in a variety of metabolic enzymes and are capable of binding diverse ligands. Discerning evolutionary relationships among these domains is challenging because of their diverse functions and ancient origin. We defined a minimal Rossmann-like structural motif (RLM), identified RLM-containing domains among known 3D structures (20%) and classified them according to their homologous relationships. New classifications were incorporated into our Evolutionary Classification of protein Domains (ECOD) database. We defined 156 homology groups (H-groups), which were further clustered into 123 possible homology groups (X-groups). Our analysis revealed that RLM-containing proteins constitute approximately 15% of the human proteome. We found that disease-causing mutations are more frequent within RLM domains than within non-RLM domains of these proteins, highlighting the importance of RLM-containing proteins for human health.

4.
Li W  Liu Z  Lai L 《Biopolymers》1999,49(6):481-495
A general problem in comparative modeling and protein design is the conformational evaluation of loops with a certain sequence in specific environmental protein frameworks. Loops of different sequences and structures on similar scaffolds are common in the Protein Data Bank (PDB). In order to explore both their structural and sequential diversity, a data base of loops connecting similar secondary structure fragments is constructed by searching the data base of families of structurally similar proteins and the PDB. A total of 84 loop families having 2-13 residues are found among the well-determined structures of resolution better than 2.5 Å. Eight alpha-alpha, 20 alpha-beta, 19 beta-alpha, and 37 beta-beta families are identified. Every family contains more than 5 loop motifs. In each family, no loops share the same sequence and all the frameworks are well superimposed. Forty-three new loop classes are distinguished in the data base. The structural variability of loops in homologous proteins is examined and shown in 44 families. Motif families are characterized with geometric parameters and sequence patterns. The conformations of loops in each family are clustered into subfamilies using the average linkage cluster analysis method. Information such as geometric properties, sequence profile, sequential and structural variability in loops, structural alignment parameters, sequence similarities, and clustering results is provided. Correlations between the conformation of loops and loop sequence, motif sequence, and global sequence of the PDB chain are examined in order to find how loop structures depend on their sequences and how they are affected by the local and global environment. Strong correlations (R > 0.75) are found in only 24 families. The best R value is 0.98. The data base is available through the Internet.

5.
Hou Y  Hsu W  Lee ML  Bystroff C 《Proteins》2004,57(3):518-530
Remote homology detection refers to the detection of structural homology in proteins when there is little or no sequence similarity. In this article, we present a remote homolog detection method called SVM-HMMSTR that overcomes the reliance on detectable sequence similarity by transforming the sequences into strings of hidden Markov states that represent local folding motif patterns. These state strings are transformed into fixed-dimension feature vectors for input to a support vector machine. Two sets of features are defined: an order-independent feature set that captures the amino acid and local structure composition; and an order-dependent feature set that captures the sequential ordering of the local structures. Tests using the Structural Classification of Proteins (SCOP) 1.53 data set show that the SVM-HMMSTR gives a significant improvement over several current methods.
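The idea of turning variable-length state strings into fixed-dimension vectors can be sketched as follows. This is only an illustration, not the SVM-HMMSTR feature definition, which the abstract does not give: the three-letter alphabet `HEL` is a hypothetical stand-in for the hidden Markov states, composition frequencies stand in for the order-independent features, and adjacent-pair frequencies stand in for the order-dependent ones.

```python
from collections import Counter
from itertools import product

# Hypothetical state alphabet, used only for illustration.
ALPHABET = "HEL"

def composition_features(states):
    """Order-independent part: relative frequency of each state."""
    counts = Counter(states)
    n = len(states) or 1  # avoid division by zero on an empty string
    return [counts[s] / n for s in ALPHABET]

def bigram_features(states):
    """Order-dependent part: relative frequency of each adjacent state pair."""
    pairs = Counter(zip(states, states[1:]))
    n = max(len(states) - 1, 1)
    return [pairs[p] / n for p in product(ALPHABET, repeat=2)]

def feature_vector(states):
    """Concatenate both parts; the length depends only on the alphabet size."""
    return composition_features(states) + bigram_features(states)

# Strings of different lengths map to vectors of the same dimension (3 + 9).
print(len(feature_vector("HHEL")), len(feature_vector("HELHELLLEH")))
```

Vectors built this way have the same length for every input string, which is what a standard support vector machine requires as input.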

6.
Discovery of local packing motifs in protein structures   Cited by: 1 (self-citations: 0, citations by others: 1)
We present a language for describing structural patterns of residues in protein structures and a method for the discovery of such patterns that recur in a set of protein structures. The patterns impose restrictions on the spatial position of each residue, their order along the amino acid chain, and which amino acids are allowed in each position. Unlike other methods for comparing sets of protein structures, our method is not based on pairwise structure comparisons, which are often time-consuming and can produce inconsistent results. Instead, the method simultaneously takes into account information from all structures in the search for conserved structure patterns, which are potential structure motifs. The method is based on describing the spatial neighborhood of each residue in each structure as a string and applying a sequence pattern discovery method to find patterns common to subsets of these strings. Finally, it is checked whether the similarities between the neighborhood strings correspond to spatially similar substructures. We apply the method to analyze sets of very disparate proteins from four different protein families: serine proteases, cuprodoxins, cysteine proteinases, and ferredoxins. The motifs found by the method correspond well to the site and motif information given in the annotation of these proteins in PDB, Swiss-Prot, and PROSITE. Furthermore, the motifs are confirmed by using the motif data to constrain the structural alignment of the proteins obtained with the program SAP. This gave the best superposition/alignment of the proteins given the motif assignment.

7.
Jin MS  Lee JO 《BMB reports》2008,41(5):353-357
LRR family proteins play important roles in a variety of physiological processes. To facilitate their production and crystallization, we have invented a novel method termed "Hybrid LRR Technique". Using this technique, the first crystal structures of three TLR family proteins could be determined. In this review, design principles and application of the technique to protein crystallization will be summarized. For crystallization of TLRs, hagfish VLR receptors were chosen as the fusion partners and the TLR and the VLR fragments were fused at the conserved LxxLxLxxN motif to minimize local structural incompatibility. TLR-VLR hybridization did not disturb structures and functions of the target TLR proteins. The Hybrid LRR Technique is a general technique that can be applied to structural studies of other LRR proteins. It may also have broader application in biochemical and medical application of LRR proteins by modifying them without compromising their structural integrity.

8.
Identifying common local segments, also called motifs, in multiple protein sequences plays an important role in establishing homology between proteins. Homology is easy to establish when sequences are similar (sharing an identity > 25%). However, for distant proteins, it is much more difficult to align motifs that are not similar in sequence but still share common structures or functions. This paper is a first attempt to align multiple protein sequences using both primary and secondary structure information. A new sequence model is proposed that assigns high probabilities not only to motifs that contain conserved amino acids but also to motifs that present common secondary structures. The proposed method is tested on the structural alignment database BAliBASE. We show that the information provided by the predicted secondary structures greatly improves motif identification. A website for this program is available at www.stat.purdue.edu/~junxie/2ndmodel/sov.html.

9.
Hu YJ 《Nucleic acids research》2002,30(17):3886-3893
Given a set of homologous or functionally related RNA sequences, the consensus motifs may represent the binding sites of RNA regulatory proteins. Unlike DNA motifs, RNA motifs are more conserved in structure than in sequence. Knowing the structural motifs can help us gain a deeper insight into the regulation activities. There have been various studies of RNA secondary structure prediction, but most of them are not focused on finding motifs from sets of functionally related sequences. Although recent research shows some new approaches to RNA motif finding, they are limited to finding relatively simple structures, e.g. stem-loops. In this paper, we propose a novel genetic programming approach to RNA secondary structure prediction. It is capable of finding more complex structures than stem-loops. To demonstrate the performance of our new approach as well as to keep our comparative study consistent, we first tested it on the same data sets previously used to verify the current prediction systems. To show the flexibility of our new approach, we also tested it on a data set that contains pseudoknot motifs, which most current systems cannot identify. A web-based user interface of the prediction system is set up at http://bioinfo.cis.nctu.edu.tw/service/gprm/.

10.
Kinjo AR  Nakamura H 《PloS one》2012,7(2):e31437
Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs that represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures.

11.
Linking similar proteins structurally is a challenging task that may help in finding novel members of a protein family. In this respect, identification of conserved sequences can facilitate understanding and classifying the exact role of proteins. However, the exact role of these conserved elements cannot be elucidated without structural and physiochemical information. In this work, we present a novel desktop application, MotViz, designed for searching and analyzing conserved sequence segments within protein structures. With MotViz, the user can extract a complete list of sequence motifs from loaded 3D structures, annotate the motifs structurally and analyze their physiochemical properties. The conservation value calculated for an individual motif can be visualized graphically. To check its efficiency, predicted motifs from the data sets of 9 protein families were analyzed, and the MotViz algorithm was more efficient than other online motif prediction tools. Furthermore, a database was integrated for storing, retrieving and performing detailed functional annotation studies. In summary, MotViz effectively predicts motifs with high sensitivity and simultaneously visualizes them in 3D structures. Moreover, MotViz is user-friendly, with optimized graphical parameters and better processing speed due to the inclusion of a database at the back end. MotViz is available at http://www.fi-pk.com/motviz.html.

12.
13.
Functional RNA regions are often related to recurrent secondary structure patterns (or motifs), which can exert their role in several different ways, particularly in dictating the interaction with RNA-binding proteins, and acting in the regulation of a large number of cellular processes. Among the available motif-finding tools, the majority focus on sequence patterns, sometimes including secondary structure as an additional constraint to improve their performance. Nonetheless, secondary structure motifs may be concurrent with their sequence counterparts or even encode a stronger functional signal. Current methods for searching structural motifs generally require long pipelines and/or high computational effort or previously aligned sequences. Here, we present BEAM (BEAr Motif finder), a novel method for structural motif discovery from a set of unaligned RNAs, taking advantage of a recently developed encoding for RNA secondary structure named BEAR (Brand nEw Alphabet for RNAs) and of evolutionary substitution rates of secondary structure elements. Tested in a varied set of scenarios, from small- to large-scale, BEAM is successful in retrieving structural motifs even in highly noisy data sets, such as those that can arise in CLIP-Seq or other high-throughput experiments.

14.
Statistical methods have been developed for finding local patterns, also called motifs, in multiple protein sequences. The aligned segments may imply functional or structural core regions. However, the existing methods often have difficulties in aligning multiple proteins when sequence residue identities are low (e.g., less than 25%). In this article, we develop a Bayesian model and Markov chain Monte Carlo (MCMC) methods for identifying subtle motifs in protein sequences. Specifically, a motif is defined not only in terms of specific sites characterized by amino acid frequency vectors, but also as a combination of secondary characteristics such as hydrophobicity, polarity, etc. MCMC methods are proposed to search for a motif pattern with high posterior probability under the new model. A special MCMC algorithm is developed, involving transitions between state spaces of different dimensions. The proposed methods were supported by a simulation study and were then tested on two real datasets: a group of helix-turn-helix proteins and one set from the CATH Protein Structure Classification Database. Statistical comparisons showed that the new approach worked better than a typical Gibbs sampling approach based only on an amino acid model.

15.
Lu CH  Lin YS  Chen YC  Yu CS  Chang SY  Hwang JK 《Proteins》2006,63(3):636-643
Identifying functional structural motifs in protein structures of unknown function has become increasingly important in recent years due to the progress of the structural genomics initiatives. Although certain structural patterns such as the Asp-His-Ser catalytic triad are easy to detect because of their conserved residues and stringently constrained geometry, it is usually more challenging to detect general structural motifs such as the ββα-metal binding motif, which has a much more variable conformation and sequence. At present, the identification of these motifs usually relies on manual procedures based on different structure and sequence analysis tools. In this study, we develop a structural alignment algorithm that combines structural and sequence information to identify local structure motifs. We applied our method to the following examples: the ββα-metal binding motif and the treble clef motif. The ββα-metal binding motif plays an important role in nonspecific DNA interactions and cleavage in host defense and apoptosis. The treble clef motif is a zinc-binding motif adaptable to diverse functions such as the binding of nucleic acids and hydrolysis of phosphodiester bonds. Our results are encouraging, indicating that we can effectively identify these structural motifs in an automatic fashion. Our method may provide a useful means for automatic functional annotation through detecting structural motifs associated with particular functions.

16.
Kosloff M  Elia N  Selinger Z 《Biochemistry》2002,41(49):14518-14523
In a family of proteins, often the three-dimensional structure has been experimentally determined only for one member or a few members of the family. Homology modeling can be used to model the structures of all other members of the family and thus allow comparison of these structures. This approach was applied to heterotrimeric G proteins that require anchorage to the plasma membrane to properly interact with membrane-bound receptors and downstream effectors. Lipid modification by palmitoylation is a fundamental contributor to this localization, but the signals leading to this modification are still unknown. In this work, homology models of all the different human G(alpha) paralogs were generated using automated homology modeling, and the electrostatic potential of these proteins was calculated and visualized. This approach identifies a basic, positively charged, structural motif in the N-termini of heterotrimeric G proteins, which is not readily discernible from sequence alone. The basic motif is much reduced in those G(alpha) subunits that also undergo myristoylation, suggesting that the basic patches and myristoylation play overlapping roles. These motifs can affect both membrane affinity and orientation and determine the palmitoylation of G(alpha) subunits in cooperation with the G(betagamma) subunits, as has been corroborated by previous experimental studies. Furthermore, other palmitoylated proteins such as GAP-43 and RGS proteins share this alpha-helical basic motif in their N-terminus. It therefore appears that this structural motif is more widely applicable as a membrane-targeting and palmitoylation-determining signal. The work presented here highlights the possibilities available for experimentalists to discover structural motifs that are not readily observed by analysis of the linear sequence.

17.
Membrane bound members of the M1 family: more than aminopeptidases   Cited by: 1 (self-citations: 0, citations by others: 1)
In mammals the M1 aminopeptidase family consists of nine different proteins, five of which are integral membrane proteins. The aminopeptidases are defined by two motifs in the catalytic domain: a zinc binding motif HEXXH-(X18)-E and an exopeptidase motif GXMEN. Aminopeptidases of this family are able to cleave a broad range of peptides down to only a single peptide. This ability to either generate or degrade active peptide hormones is the focus of this review. In addition to their capacity to degrade a range of peptides, a number of these aminopeptidases have novel functions that impact on cell signalling and will be discussed.
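The two motifs quoted in this abstract can be written as regular expressions, which gives a quick way to scan a sequence for them. The sketch below assumes that X stands for any residue and that (X18) means exactly 18 arbitrary residues; the function name `find_motifs` and the test sequence are made up for illustration and are not taken from the review.

```python
import re

# HEXXH-(X18)-E and GXMEN as regular expressions.
# Assumption: X = any residue, (X18) = exactly 18 arbitrary residues.
ZINC_BINDING = re.compile(r"HE..H.{18}E")
EXOPEPTIDASE = re.compile(r"G.MEN")

def find_motifs(sequence):
    """Return the 0-based start positions of each motif in a one-letter sequence."""
    patterns = {"zinc_binding": ZINC_BINDING, "exopeptidase": EXOPEPTIDASE}
    return {
        name: [m.start() for m in pattern.finditer(sequence)]
        for name, pattern in patterns.items()
    }

# Synthetic sequence constructed for illustration only; not a real protein.
toy = "AAGAMENAAA" + "HEAAH" + "A" * 18 + "E" + "AAA"
print(find_motifs(toy))
```

Note that `finditer` reports non-overlapping matches only, and a sequence hit alone says nothing about whether the residues are positioned to bind zinc in the folded protein.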

18.
Interresidue contacts in protein structures and at protein-protein interfaces are classically described by the amino acid types of the interacting residues, and the local structural context of the contact, if any, is described using secondary structures. In this study, we present an alternative analysis of interresidue contacts using local structures defined by the structural alphabet introduced by Camproux et al. This structural alphabet allows a 3D structure to be described as a sequence of prototype fragments, called structural letters, of 27 different types. Each residue can then be assigned to a particular local structure, even in loop regions. The analysis of interresidue contacts within protein structures, defined using Voronoi tessellations, reveals that pairwise contact specificity is greater in terms of structural letters than amino acids. Using a simple heuristic based on specificity score comparison, we find that 74% of the long-range contacts within protein structures are better described using structural letters than amino acid types. The investigation is extended to a set of protein-protein complexes, showing that similar global rules apply as for intraprotein contacts, with 64% of the interprotein contacts best described by local structures. We then present an evaluation of pairing functions integrating structural letters for decoy scoring and show that some complexes could benefit from the use of structural letter-based pairing functions.

19.
The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access.

20.