Similar articles (20 found)
1.
Following our previous work on the analysis of 'structural plasticity' associated with the beta-propeller structural motifs, we have now developed a simple method that can automatically detect all the known beta-propellers in protein tertiary structure, given a list of Protein Data Bank (PDB) codes as input to the computer program. Our beta-propeller detection (BPD) method identifies the location of beta-propellers in the protein structure and specifies the beta-propeller type, the beta-sheet associated beta-strand pattern and the structurally similar beta-propellers observed in other proteins. When tested on 21,566 proteins in the PDB, the BPD method correctly identified all 245 known beta-propellers described in the Structural Classification of Proteins (SCOP), with fewer than 0.2% false positives. Forty-one false positives were detected, corresponding to eight known protein families. When compared with some of the popular web-based programs that can automatically detect 'structural similarities' between the query and target proteins, our method has the advantage of also being capable of detecting beta-propellers associated with 'structural plasticity' and in situations where the target and query proteins differ in amino acid sequence length.

2.
Uncovering the cis-regulatory logic of developmental enhancers is critical to understanding the role of non-coding DNA in development. However, it is cumbersome to identify functional motifs within enhancers, and thus few vertebrate enhancers have had their core functional motifs revealed. Here we report a combined experimental and computational approach for discovering regulatory motifs in developmental enhancers. Making use of the zebrafish gene expression database, we computationally identified conserved non-coding elements (CNEs) likely to have a desired tissue specificity based on the expression of nearby genes. Through a high-throughput and robust enhancer assay, we tested the activity of ~100 such CNEs and efficiently uncovered developmental enhancers with desired spatial and temporal expression patterns in the zebrafish brain. Application of de novo motif prediction algorithms to a group of forebrain enhancers identified five top-ranked motifs, all of which were experimentally validated as critical for forebrain enhancer activity. These results demonstrate a systematic approach to discovering important regulatory motifs in vertebrate developmental enhancers. Moreover, this dataset provides a useful resource for further dissection of vertebrate brain development and function.

3.
Lu CH, Lin YS, Chen YC, Yu CS, Chang SY, Hwang JK. Proteins, 2006, 63(3):636-643
Identifying functional structural motifs in protein structures of unknown function has become increasingly important in recent years due to the progress of the structural genomics initiatives. Although certain structural patterns such as the Asp-His-Ser catalytic triad are easy to detect because of their conserved residues and stringently constrained geometry, it is usually more challenging to detect general structural motifs such as the ββα-metal binding motif, which has a much more variable conformation and sequence. At present, the identification of these motifs usually relies on manual procedures based on different structure and sequence analysis tools. In this study, we develop a structural alignment algorithm that combines structural and sequence information to identify local structural motifs. We applied our method to two examples: the ββα-metal binding motif and the treble clef motif. The ββα-metal binding motif plays an important role in nonspecific DNA interactions and cleavage in host defense and apoptosis. The treble clef motif is a zinc-binding motif adaptable to diverse functions such as the binding of nucleic acids and the hydrolysis of phosphodiester bonds. Our results are encouraging, indicating that we can effectively identify these structural motifs in an automatic fashion. Our method may provide a useful means for automatic functional annotation through detecting structural motifs associated with particular functions.

4.
With the recent exponential increase in protein phosphorylation sites identified by mass spectrometry, a unique opportunity has arisen to understand the motifs surrounding such sites. Here we present an algorithm designed to extract motifs from large data sets of naturally occurring phosphorylation sites. The methodology relies on the intrinsic alignment of phospho-residues and the extraction of motifs through iterative comparison to a dynamic statistical background. Results show the identification of dozens of novel and known phosphorylation motifs from recently published serine, threonine and tyrosine phosphorylation studies. When applied to a linguistic data set to test the versatility of the approach, the algorithm successfully extracted hundreds of language motifs. This method, in addition to shedding light on the consensus sequences of identified and as yet unidentified kinases and modular protein domains, may also eventually be used as a tool to determine potential phosphorylation sites in proteins of interest.
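
The "intrinsic alignment of phospho-residues" mentioned in the abstract above can be pictured as cutting a fixed-width window around each phosphorylated residue so that every site lands in the same column. The following is only a minimal sketch of that alignment step: the function name `centered_windows`, the flank width, and the `-` padding character are illustrative assumptions, and the iterative comparison against a statistical background described in the abstract is not reproduced here.

```python
def centered_windows(protein, sites, flank=6, pad="-"):
    """Cut a window of 2*flank + 1 residues centred on each phospho-site.

    `sites` holds 0-based indices into `protein`. The flank width and the
    padding character are assumptions made for this sketch.
    """
    windows = []
    for site in sites:
        residue = protein[site]
        if residue not in "STY":
            raise ValueError(f"position {site} is {residue!r}, not S, T or Y")
        # Residues before and after the site, clipped at the sequence ends.
        left = protein[max(0, site - flank):site]
        right = protein[site + 1:site + 1 + flank]
        # Pad short flanks so the phospho-residue stays in the centre column.
        windows.append(left.rjust(flank, pad) + residue + right.ljust(flank, pad))
    return windows


if __name__ == "__main__":
    # Hypothetical toy sequence with sites at S (index 3) and T (index 5).
    for window in centered_windows("MKRSPTA", [3, 5], flank=2):
        print(window)
```

Once all windows share a centre column, per-position residue counts can be compared with a background set, which is the stage where a motif-extraction procedure would take over.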

5.
Knowledge of three-dimensional structure is essential to understanding the function of a protein. Although the overall fold is determined by the whole of its sequence, a small group of residues, often called structural motifs, plays a crucial role in determining the protein fold and its stability. Identification of such structural motifs requires a sufficient number of sequence and structural homologs to define conservation and evolutionary information. Unfortunately, many structures in the protein structure databases have no homologous structures or sequences. In this work, we report an SVM method, SMpred, to identify structural motifs from a single protein structure without using sequence and structural homologs. The SMpred method was trained and tested using 132 protein domains containing 581 motifs. It achieved 78.79% accuracy with 79.06% sensitivity and 78.53% specificity. The performance of SMpred was evaluated with MegaMotifBase using 188 proteins containing 1161 motifs. Out of 1161 motifs, SMpred correctly identified 1503 structural motifs reported in MegaMotifBase. Further, we showed that SMpred is a useful approach for the length-deviant superfamilies and single-member superfamilies. This result suggests the usefulness of our approach for facilitating the identification of structural motifs in protein structure in the absence of sequence and structural homologs. The dataset and executable for the SMpred algorithm are available at http://www3.ntu.edu.sg/home/EPNSugan/index_files/SMpred.htm.
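
For readers unfamiliar with the three figures quoted in the abstract above (accuracy, sensitivity, specificity), the sketch below shows how they follow from a binary confusion matrix. The function name `confusion_metrics` and the counts in the usage example are hypothetical; they are not taken from the SMpred evaluation.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) as fractions in [0, 1].

    tp/fn count true motif residues predicted as motif / non-motif;
    tn/fp count non-motif residues predicted as non-motif / motif.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")
    accuracy = (tp + tn) / (positives + negatives)
    sensitivity = tp / positives   # fraction of true motifs recovered
    specificity = tn / negatives   # fraction of non-motifs correctly rejected
    return accuracy, sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    acc, sens, spec = confusion_metrics(tp=80, tn=70, fp=30, fn=20)
    print(f"accuracy={acc:.2%} sensitivity={sens:.2%} specificity={spec:.2%}")
```

When the positive and negative sets are the same size, accuracy is the mean of sensitivity and specificity, which is consistent with the three percentages reported above.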

6.
Recurring RNA structural motifs are important sites of tertiary interaction and as such, are integral to RNA macromolecular structure. Although numerous RNA motifs have been classified and characterized, the identification of new motifs is of great interest. In this study, we discovered four new conformationally recurring motifs: the pi-turn, the Omega-turn, the alpha-loop and the C2'-endo mediated flipped adenosine motif. Not only do they have complex and interesting structures, but they participate in contacts of high biological significance. In a first for the RNA field, new motifs were discovered by a fully automated algorithm. This algorithm, COMPADRES, utilized a reduced representation of the RNA backbone and was highly successful at discerning unique structural relationships. This study also shows that recurring RNA substructures are not necessarily accompanied by consistent primary or secondary structure.

7.
8.
The Structural Motifs of Superfamilies (SMoS) database provides information about the structural motifs of aligned protein domain superfamilies. Such motifs among structurally aligned multiple members of protein superfamilies are recognized by the conservation of amino acid preference and solvent inaccessibility and are examined for the conservation of other features like secondary structural content, hydrogen bonding, non-polar interaction and residue packing. These motifs, along with their sequence and spatial orientation, represent the conserved core structure of each superfamily and also provide the minimal requirement of sequence and structural information to retain each superfamily fold.

9.
Proteins are generally classified into four structural classes: all-alpha proteins, all-beta proteins, alpha + beta proteins, and alpha/beta proteins. In this article, a protein is expressed as a vector in 20-dimensional space, whose 20 components are defined by the composition of its 20 amino acids. Based on this, a new method, the so-called maximum component coefficient method, is proposed for predicting the structural class of a protein according to its amino acid composition. In comparison with the existing methods, the new method yields a higher overall prediction accuracy. Especially for the all-alpha proteins, the rate of correct prediction obtained by the new method is much higher than that of any of the existing methods. For instance, for the 19 all-alpha proteins investigated previously by P.Y. Chou, the rate of correct prediction by means of his method was 84.2%, whereas the rate obtained with the new method would be 100%. Furthermore, the new method is characterized by an explicable physical picture. This is reflected by the process in which the vector representing a protein to be predicted is decomposed into four component vectors, each of which corresponds to one of the norms of the four protein structural classes.
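
The 20-dimensional representation described in the abstract above is simply the amino acid composition of the sequence. A minimal sketch of building that vector follows; the function name `composition_vector` and the alphabetical ordering of the one-letter codes are assumptions of this example, and the maximum component coefficient method itself (the decomposition into four class vectors) is not implemented here.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code; this ordering is an
# arbitrary choice made for the sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(sequence):
    """Return the 20 amino acid frequencies of `sequence` as a list.

    Characters outside the 20 standard residues are ignored, so the
    components always sum to 1.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Hypothetical four-residue peptide, used only to show the output shape.
    vector = composition_vector("AACD")
    print(dict(zip(AMINO_ACIDS, vector)))
```

Each protein then becomes a point in the same 20-dimensional space, which is what allows it to be compared with reference vectors for the four structural classes.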

10.
11.
A systematic approach to the analysis of protein phosphorylation   Cited 29 times (0 self-citations, 29 by others)
Reversible protein phosphorylation has been known for some time to control a wide range of biological functions and activities. Thus determination of the site(s) of protein phosphorylation has been an essential step in the analysis of the control of many biological systems. However, direct determination of individual phosphorylation sites occurring on phosphoproteins in vivo has been difficult to date, typically requiring the purification to homogeneity of the phosphoprotein of interest before analysis. Thus, there has been a substantial need for a more rapid and general method for the analysis of protein phosphorylation in complex protein mixtures. Here we describe such an approach to protein phosphorylation analysis. It consists of three steps: (1) selective phosphopeptide isolation from a peptide mixture via a sequence of chemical reactions, (2) phosphopeptide analysis by automated liquid chromatography-tandem mass spectrometry (LC-MS/MS), and (3) identification of the phosphoprotein and the phosphorylated residue(s) by correlation of tandem mass spectrometric data with sequence databases. By utilizing various phosphoprotein standards and a whole yeast cell lysate, we demonstrate that the method is equally applicable to serine-, threonine- and tyrosine-phosphorylated proteins, and is capable of selectively isolating and identifying phosphopeptides present in a highly complex peptide mixture.

12.
A systematic method has been developed for comparing the backbone conformations of proteins (Remington & Matthews, 1978). Two proteins are compared by successively optimizing the agreement between all possible segments of a chosen length from one protein, and all possible segments of the same length from the other protein. The method reveals any similarities between the two proteins, and provides an estimate of the statistical significance of any given structure agreement that is obtained. The method has been tested in a number of cases, including comparisons of the dehydrogenases and of the pancreatic and bacterial serine proteases. These examples were chosen to test the ability of the comparison method to detect structural similarities in the presence of large insertions and deletions. The results suggest that the detection of the "nucleotide binding fold" in the dehydrogenases is at the limit of the capability of the comparison technique in its original form, although it may be possible to generalize the method to allow for insertions and deletions in proteins. The results of many protein comparisons, made with different probe lengths, are summarized. For medium and long probe lengths, the average value of the structural agreement does not depend very much on the type of protein being compared. The average value of the structure agreement increases with the square root of the probe length, but for probe lengths above about 40 residues, the standard deviation is independent of probe length. From these observations it is possible to construct a generalized probability diagram to evaluate the significance of any structure agreement that might be obtained in comparing two proteins.

13.
As the number of available three-dimensional coordinates of proteins increases, it is now recognized that proteins from different families and topologies are constructed from independent motifs. Detection of specific structural motifs within proteins aids in understanding their role and the mechanism of their operation. To aid in the identification and use of these motifs, it has become necessary to develop efficient methods for systematic scanning of structural databases. To date, methods of structural protein comparison suffer from at least one of the following limitations: (1) are not fully automated (require human intervention), (2) are limited to relatively similar structures, (3) are constrained to linear alignments of the structures, (4) are sensitive to insertions, deletions or gaps in the sequences or (5) are very time consuming. We present a method to overcome the above limitations. The method discovers and ranks every piece of structural similarity between the structures compared, thus allowing the simultaneous detection of real 3-D motifs in different domains, between domains, in active sites, surfaces etc. The method uses the Geometric Hashing Paradigm, which is an efficient technique originally developed for Computer Vision. The algorithm exploits the geometrical constraints of rigid objects; it is especially geared towards recognition of partial structures in rigid objects belonging to large databases and is straightforwardly parallelizable. Computer Vision techniques are for the first time applied to molecular structure comparison, resulting in an efficient, fully automated tool. The method has been tested in a number of cases, including comparisons of the haemoglobins, immunoglobulins, serine proteinases, calcium binding proteins, DNA binding proteins and others. In all examples our results were equivalent to the published results from previous methods and in some cases additional structural information was obtained by our method.

14.
The thermodynamic stability of a protein provides an experimental metric for the relationship of protein sequence and native structure. We have investigated an approach based on an analysis of the structural database for stability engineering of an immunoglobulin variable domain. The most frequently occurring residues in specific positions of beta-turn motifs were predicted to increase the folding stability of mutants that were constructed by site-directed mutagenesis. Even in positions in which different residues are conserved in immunoglobulin sequences, the predictions were confirmed. Frequently, mutants with increased beta-turn propensities display increased folding cooperativities, suggesting pronounced effects on the unfolded state independent of the expected effect on conformational entropy. We conclude that structural motifs with predominantly local interactions can serve as templates with which patterns of sequence preferences can be extracted from the database of protein structures. Such preferences can predict the stability effects of mutations for protein engineering and design.

15.
An approach to the systematic analysis of urinary steroids   Cited 2 times (1 self-citation, 1 by others)
1. Human urine, its extracts, extracts of urine pretreated with enzyme preparations containing β-glucuronidase and steroid sulphatase or β-glucuronidase alone, and products derived from the specific solvolysis of urinary steroid sulphates, were submitted to the following sequence of operations: reduction with borohydride; oxidation with a glycol-cleaving agent (bismuthate or periodate); separation of the products into ketones and others; oxidation of each fraction with tert.-butyl chromate; resolution of the end products by means of paper chromatography or gas-liquid chromatography or both. 2. Qualitative experiments indicated the kind of information the method and some of its modifications can provide. Quantitative experiments were restricted to the direct treatment of urine by the basic procedure outlined. It was partly shown and partly argued that the quantitative results were probably as informative about the composition of the major neutral urinary steroids (and certainly about their presumptive secretory precursors) as those obtained by a number of established analytical procedures. 3. A possible extension of the scope of the reported method was indicated. 4. A simple technique was introduced for the quantitative deposition of a solid sample on to a gas-liquid-chromatographic column.

16.
Crystallization has recently emerged as a suitable process for the manufacture of biocatalysts in the form of cross-linked enzyme crystals (CLECs) or for the recovery of proteins from fermentation broths. In both instances it is essential to define conditions that control crystal size and habit, and that yield a reliable recovery of the active protein. Experiments to define the crystallization conditions usually depend on a factorial design (either incomplete or sparse matrix) or reverse screening techniques. In this work, we describe a simple procedure that allows three factors, for example protein concentration, precipitant concentration and pH, to be varied simultaneously and smoothly over a wide range. The results are mapped onto a simple triangular diagram where a 'window of crystallization' is immediately apparent, and that conveniently describes variations either in the crystal features, such as their yield, size, and habit, or in the recovery of biological activity. The approach is illustrated with two enzymes, yeast alcohol dehydrogenase (ADH I) and Candida rugosa lipase. For ADH the formation of two crystal habits (rod and hexagonal) could be controlled as a function of pH (6.5-10) and temperature (4-25 °C). At pH 7, in 10 to 16% w/v polyethylene glycol (PEG) 4000, only rod-shaped crystals formed whereas at pH 8, in 10 to 14% w/v PEG, only hexagonal crystals existed. For both enzymes, catalyst recovery was greatest at high crystallization agent concentrations and low protein concentration. For ADH, the greatest activity recovery was 87% whereas for the lipase crystals, by using 45% v/v 2-methyl-2,4-pentanediol (MPD) as the crystallization agent, a crystal recovery of 250 crystals per μl was obtained. For the lipase system, the use of crystal seeding was also shown to increase the crystal recovery by up to a factor of four.
From the crystallization windows, the original conditions based on literature precedent (35% v/v MPD, 1 mM CaCl2, 1.8 mg protein/ml) were altered to 47.5% v/v MPD, 2 mM CaCl2 and 3 mg protein/ml. This led to an improved recovery of the lipase under conditions that scale reliably from 0.5 ml to 500 ml with no change in size, shape or recovery of the crystals themselves. Finally, these crystals were crosslinked with 5% v/v glutaraldehyde, and mass and activity balances were calculated for the entire process of CLEC production. Up to 35% of the lipase activity present in the crude solid was finally recovered in the lipase CLECs after propan-2-ol fractionation, crystallization, and crosslinking.

17.
We present an update of our method for systematic detection and evaluation of potential helix-turn-helix DNA-binding motifs in protein sequences [Dodd, I. and Egan, J. B. (1987) J. Mol. Biol. 194, 557-564]. The new method is considerably more powerful, detecting approximately 50% more likely helix-turn-helix sequences without an increase in false predictions. This improvement is due almost entirely to the use of a much larger reference set of 91 presumed helix-turn-helix sequences. The scoring matrix derived from this reference set has been calibrated against a large protein sequence database so that the score obtained by a sequence can be used to give a practical estimation of the probability that the sequence is a helix-turn-helix motif.
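
The idea of a scoring matrix derived from a reference set of aligned motif sequences, as mentioned in the abstract above, can be illustrated with a generic log-odds position weight matrix. This is a sketch of that general technique only, not the published matrix or its calibration: the function names, the pseudocount, the toy reference set and the uniform background are all assumptions made for the example.

```python
import math
from collections import Counter


def build_log_odds_matrix(reference_seqs, background, pseudocount=1.0):
    """Build one {residue: log-odds score} column per aligned position."""
    width = len(reference_seqs[0])
    if any(len(seq) != width for seq in reference_seqs):
        raise ValueError("reference sequences must all have the same length")
    alphabet = sorted(background)
    denom = len(reference_seqs) + pseudocount * len(alphabet)
    matrix = []
    for pos in range(width):
        counts = Counter(seq[pos] for seq in reference_seqs)
        # Smoothed frequency at this position divided by background frequency.
        matrix.append({
            res: math.log(((counts[res] + pseudocount) / denom) / background[res])
            for res in alphabet
        })
    return matrix


def best_window(sequence, matrix):
    """Return (start, score) of the highest-scoring window in `sequence`."""
    width = len(matrix)
    best = None
    for start in range(len(sequence) - width + 1):
        # Residues missing from the matrix alphabet contribute a neutral 0.
        score = sum(matrix[i].get(sequence[start + i], 0.0) for i in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best


if __name__ == "__main__":
    # Hypothetical three-residue reference set over a five-letter toy alphabet.
    background = {res: 0.2 for res in "AGQST"}
    matrix = build_log_odds_matrix(["QTA", "QTA", "QSA"], background)
    print(best_window("GGQTAGG", matrix))
```

Turning a raw window score into a probability, as the abstract describes, would additionally require the distribution of scores over a large sequence database, which this sketch does not attempt.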

18.
19.
A specific treatment of recurrent structural motifs that represent local bias information has proven to be an important ingredient in de novo protein structure prediction. The large majority of methods for local structure prediction are based on building blocks, which still suffer from their inherently discrete nature. Instead of using building blocks, this work presents a new protocol framework for predicting local structural motifs, based on direct location along the protein sequence and probabilistic sampling in a continuous (φ, ψ) space. The protein sequence was first scanned by a sliding-window algorithm with a variable window length of 7 to 19 residues, to match local segments to one of 82 motif patterns in the fragment library. Identified segments were then labeled, and the correlations of their backbone torsion angles were modeled with a mixture of bivariate cosine distributions in continuous (φ, ψ) space. 3D conformations of the corresponding segments were finally sampled by applying a backtrack algorithm to the hidden Markov model with a single output of (φ, ψ). For local motifs in the 50 proteins of the testing set, about 62% of the eight-residue segments located with a high confidence value were predicted within 1.5 Å of their native structures by the method. The majority of local structural motifs were identified and sampled, which indicates that the proposed protocol may at least serve as a foundation for obtaining better protein tertiary structure prediction.
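
The variable-length scan described in the abstract above (windows of 7 to 19 residues slid along the sequence) can be sketched as a simple generator. Only the enumeration of candidate segments is shown; the function name `variable_windows` is hypothetical, and the matching of each segment against the 82 library patterns, along with the (φ, ψ) sampling, is outside the scope of this example.

```python
def variable_windows(sequence, min_len=7, max_len=19):
    """Yield (start, length, segment) for every window of each allowed length.

    Lengths run from min_len to max_len inclusive, following the 7 to 19
    residue range quoted in the abstract; lengths longer than the sequence
    simply produce no windows.
    """
    for length in range(min_len, max_len + 1):
        for start in range(len(sequence) - length + 1):
            yield start, length, sequence[start:start + length]


if __name__ == "__main__":
    # Hypothetical eight-residue sequence: two windows of length 7, one of 8.
    for start, length, segment in variable_windows("ABCDEFGH"):
        print(start, length, segment)
```

For a sequence of N residues this enumerates on the order of 13 × N candidate segments, so any per-segment matching step is the part that dominates the cost.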

20.