Found 20 similar articles (search time: 78 ms)
1.
We present here a simple approach to identify domain boundaries in proteins of unknown three-dimensional structure. Our method is based on the hypothesis that a high side-chain entropy of a region in a protein chain must be compensated by a high residue-interaction energy within the region, which could correlate with a well-structured part of the globule, that is, with a domain unit. For protein domains, this means that the domain boundary is conditioned by amino acid residues with a small value of side-chain entropy, which correlates with the side-chain size. On the one hand, relatively high Ala and Gly content at the domain boundary results in high conformational entropy of the backbone chain between the domains. On the other hand, the presence of Pro residues leads to the formation of hinges for the relative orientation of domains. The method was applied to 646 proteins with two contiguous domains extracted from the SCOP database, with a success rate of 63%. We also report the prediction of domain boundaries for CASP5 targets obtained with the same method.
2.
Successful prediction of protein domain boundaries provides valuable information not only for the computational structure prediction of multidomain proteins but also for experimental structure determination. Since protein sequences of multiple domains may contain much information regarding evolutionary processes such as gene-exon shuffling, this information can be detected by analyzing the position-specific scoring matrix (PSSM) generated by PSI-BLAST. We present a method, PPRODO (Prediction of PROtein DOmain boundaries), that predicts domain boundaries of proteins from sequence information using a neural network. The network is trained and tested using the values obtained from the PSSM generated by PSI-BLAST. A 10-fold cross-validation technique is used to obtain the parameters of the neural networks, using a nonredundant set of 522 proteins containing 2 contiguous domains. PPRODO provides good and consistent results for the prediction of domain boundaries, with an accuracy of about 66% using the ±20 residue criterion. The PPRODO source code, as well as all data sets used in this work, are available from http://gene.kias.re.kr/~jlee/pprodo/.
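As an illustration of the ±20 residue criterion mentioned above, the sketch below assumes the criterion means that a prediction counts as correct when the predicted boundary lies within 20 residues of the annotated one. The function name and the example positions are hypothetical and are not taken from the PPRODO code.

```python
def boundary_accuracy(predicted, actual, tolerance=20):
    """Fraction of proteins whose predicted domain boundary lies within
    `tolerance` residues of the annotated boundary (assumed criterion)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one prediction is required")
    # A prediction is a hit when |predicted - actual| <= tolerance.
    hits = sum(abs(p - a) <= tolerance for p, a in zip(predicted, actual))
    return hits / len(predicted)


if __name__ == "__main__":
    # Made-up boundary positions for three two-domain proteins.
    print(boundary_accuracy([100, 150, 210], [110, 180, 190]))
```

Changing `tolerance` to 40 gives the looser criterion that some of the other abstracts in this list report alongside the 20-residue one.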
3.
Metal ions are crucial for protein function. They participate in enzyme catalysis, play regulatory roles, and help maintain protein structure. Current tools for predicting metal-protein interactions are based on proteins crystallized with their metal ions present (holo forms). However, a majority of resolved structures are free of metal ions (apo forms). Moreover, metal binding is a dynamic process, often involving conformational rearrangement of the binding pocket. Thus, effective predictions need to be based on the structure of the apo state. Here, we report an approach that identifies transition metal-binding sites in apo forms with a resulting selectivity >95%. Applying the approach to apo forms in the Protein Data Bank and structural genomics initiative identifies a large number of previously unknown, putative metal-binding sites, and their amino acid residues, in some cases providing a first clue to the function of the protein.
4.
5.
Using a recently developed protein folding algorithm, a prediction of the tertiary structure of the KIX domain of the CREB binding protein is described. The method incorporates predicted secondary and tertiary restraints derived from multiple sequence alignments in a reduced protein model whose conformational space is explored by Monte Carlo dynamics. Secondary structure restraints are provided by the PHD secondary structure prediction algorithm that was modified for the presence of predicted U-turns, i.e., regions where the chain reverses global direction. Tertiary restraints are obtained via a two-step process: First, seed side-chain contacts are identified from a correlated mutation analysis, and then, a threading-based algorithm expands the number of these seed contacts. Blind predictions indicate that the KIX domain is a putative three-helix bundle, although the chirality of the bundle could not be uniquely determined. The expected root-mean-square deviation for the correct chirality of the KIX domain is between 5.0 and 6.2 Å. This is to be compared with the estimate of 12.9 Å that would be expected by a random prediction, using the model of F. Cohen and M. Sternberg (J. Mol. Biol. 138:321–333, 1980). Proteins 30:287–294, 1998. © 1998 Wiley-Liss, Inc.
6.
For many years it has been accepted that the sequence of a protein can specify its three-dimensional structure. However, there has been limited progress in explaining how the sequence dictates its fold, and no attempt to do this computationally without the use of specific structural data has ever succeeded for any protein larger than 100 residues. We describe a method that can predict complex folds of up to almost 200 residues using only basic principles that do not include any elements of sequence homology. The method does not simulate the folding chain but generates many thousands of models based on an idealized representation of structure. Each rough model is scored and the best are refined. On a set of five proteins, the correct fold scored well, and when tested on a set of larger proteins, the correct fold was ranked highest for some proteins of more than 150 residues, with others being close topological variants. All other methods that approach this level of success rely on the use of templates or fragments of known structures. Our method is unique in using a database of ideal models based on general packing rules that, in spirit, is closer to an ab initio approach.
7.
O. V. Galzitskaya, N. V. Dovidchenko, M. Yu. Lobanov, S. O. Garbuzynskiy 《Molecular Biology》2006,40(1):96-106
A database of 452 two-domain proteins with less than 25% homology was constructed. One half of the database was used to obtain statistics on the appearance of amino acid residues at domain boundaries. Small and hydrophilic residues (proline, glycine, asparagine, glutamic acid, arginine, etc.) occurred more often at domain boundaries than in total proteins. Hydrophobic residues (tryptophan, methionine, phenylalanine, etc.) were rarer at domain boundaries than in total proteins. Probability scales of amino acid appearance in boundary-flanking regions were constructed with these statistics and used to predict the domain boundaries in proteins of the other half of the database. The probability scale obtained by averaging the appearance of amino acids over an 8-residue region (±4 residues from the real domain boundaries) yielded the best results: domain boundaries were predicted within 40 residues of the real boundary in 57% of proteins and within 20 residues of the real boundary in 41% of proteins. The probability scale was used to predict the domain boundaries in proteins with unknown structures (CASP6).
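One way to picture how such a boundary scale might be built is sketched below. It assumes the scale is the ratio of an amino acid's frequency within ±4 residues of a known boundary to its frequency over all sequences; the abstract does not give the exact formula, so this ratio, the function name, and the toy sequence are all illustrative assumptions.

```python
from collections import Counter


def boundary_propensity(proteins, half_window=4):
    """Build a hypothetical boundary scale.

    proteins: iterable of (sequence, boundary_index) pairs.
    Returns {amino_acid: frequency near boundary / overall frequency}.
    """
    near, total = Counter(), Counter()
    for seq, boundary in proteins:
        total.update(seq)
        # Residues within +/- half_window of the boundary (8 residues by default).
        start = max(0, boundary - half_window)
        near.update(seq[start:boundary + half_window])
    n_near, n_total = sum(near.values()), sum(total.values())
    if n_near == 0 or n_total == 0:
        raise ValueError("no residues to count")
    return {aa: (near[aa] / n_near) / (total[aa] / n_total) for aa in total}


if __name__ == "__main__":
    # Toy two-domain "protein" with a Pro/Gly-rich linker around position 8.
    scale = boundary_propensity([("WWWWPGPGPGPGWWWW", 8)])
    print(scale)
```

Values above 1 mark residues that are enriched near boundaries in the training half; values below 1 mark depleted residues.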
8.
Jen Tsi Yang 《Journal of Protein Chemistry》1996,15(2):185-191
The conformational parameters $P_k$ for each amino acid species ($j = 1$–$20$) of sequential peptides in proteins are presented as the product of $P_{i,k}$, where $i$ is the number of the sequential residues in the $k$th conformational state ($k$ = α-helix, β-sheet, β-turn, or unordered structure). Since the average parameter for an $n$-residue segment is related to the average probability of finding the segment in the $k$th state, it becomes a geometric mean, $(P_k)_{av} = \left(\prod_{i=1}^{n} P_{i,k}\right)^{1/n}$, with amino acid residue $i$ increasing from 1 to $n$. We then used $\ln (P_k)_{av}$ to convert a multiplicative process to a summation, i.e., $\ln (P_k)_{av} = (1/n) \sum_{i=1}^{n} \ln P_{i,k}$, for ease of operation. However, this is unlike the popular Chou-Fasman algorithm, which has the flaw of using the arithmetic mean for relative probabilities. The Chou-Fasman algorithm happens to be close to our calculations in many cases, mainly because the difference between their $P_k$ and our $\ln P_k$ is nearly constant for about one-half of the 20 amino acids. When stronger conformation formers and breakers exist, the difference becomes larger and the prediction at the N- and C-terminal α-helix or β-sheet could differ. If the average conformational parameters of the overlapping segments of any two states are too close for a unique solution, our calculations could lead to a different prediction.
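The contrast between the two averages can be made concrete with a short sketch. The parameter values below are made up for illustration and are not taken from the paper or from the Chou-Fasman tables; the function names are likewise hypothetical.

```python
import math


def log_geometric_mean(params):
    """ln (P_k)_av = (1/n) * sum of ln P_{i,k} over the n residues of a segment."""
    if not params:
        raise ValueError("segment must contain at least one residue")
    return sum(math.log(p) for p in params) / len(params)


def arithmetic_mean(params):
    """Plain arithmetic mean of the per-residue parameters."""
    if not params:
        raise ValueError("segment must contain at least one residue")
    return sum(params) / len(params)


if __name__ == "__main__":
    # Hypothetical per-residue parameters: two strong formers, two breakers.
    segment = [1.4, 1.2, 0.6, 0.5]
    print("geometric mean :", math.exp(log_geometric_mean(segment)))
    print("arithmetic mean:", arithmetic_mean(segment))
```

Because the geometric mean never exceeds the arithmetic mean, the two measures drift apart when a segment mixes strong formers with strong breakers, which is the situation in which the abstract says the predictions could differ.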
9.
Prediction of structures of multidomain proteins from structures of the individual domains
Wollacott AM, Zanghellini A, Murphy P, Baker D 《Protein science : a publication of the Protein Society》2007,16(2):165-175
We describe the development of a method for assembling structures of multidomain proteins from structures of isolated domains. The method consists of an initial low-resolution search in which the conformational space of the domain linker is explored using the Rosetta de novo structure prediction method, followed by a high-resolution search in which all atoms are treated explicitly and backbone and side chain degrees of freedom are simultaneously optimized. The method recapitulates, often with very high accuracy, the structures of existing multidomain proteins.
10.
Alexander Bujotzek, James Dunbar, Florian Lipsmeier, Wolfgang Schäfer, Iris Antes, Charlotte M. Deane, Guy Georges 《Proteins》2015,83(4):681-695
The antigen-binding site of antibodies forms at the interface of their two variable domains, VH and VL, making VH–VL domain orientation a factor that codetermines antibody specificity and affinity. Preserving VH–VL domain orientation in the process of antibody engineering is important in order to retain the original antibody properties, and predicting the correct VH–VL orientation has also been recognized as an important factor in antibody homology modeling. In this article, we present a fast sequence-based predictor that predicts VH–VL domain orientation with Q2 values ranging from 0.54 to 0.73 on the evaluation set. We describe VH–VL orientation in terms of the six absolute ABangle parameters that have recently been proposed as a means to separate the different degrees of freedom of VH–VL domain orientation. In order to assess the impact of adjusting VH–VL orientation according to our predictions, we use the set of antibody structures of the recently published Antibody Modeling Assessment (AMA) II study. In comparison to the original AMAII homology models, we find an improvement in the accuracy of VH–VL orientation modeling, which also translates into an improvement in the average root-mean-square deviation with regard to the crystal structures. Proteins 2015; 83:681–695. © 2015 Wiley Periodicals, Inc.
11.
The delineation of domain boundaries of a given sequence in the absence of known 3D structures or detectable sequence homology to known domains benefits many areas in protein science, such as protein engineering, protein 3D structure determination and protein structure prediction. With the exponential growth of newly determined sequences, our ability to predict domain boundaries rapidly and accurately from sequence information alone is both essential and critical from the viewpoint of gene function annotation. Anyone attempting to predict domain boundaries for a single protein sequence is invariably confronted with a plethora of databases that contain boundary information available from the internet and a variety of methods for domain boundary prediction. How are these derived and how well do they work? What definition of 'domain' do they use? We will first clarify the different definitions of protein domains, and then describe the available public databases with domain boundary information. Finally, we will review existing domain boundary prediction methods and discuss their strengths and weaknesses.
12.
13.
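Recombinant expression and protein structure prediction of the G13 domain of granulysin (cited by: 1 total; self-citations: 0; citations by others: 1)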
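Recombinant expression by genetic engineering is a relatively low-cost way to obtain antimicrobial peptides. In this study, the DNA sequence encoding the G13 domain was synthesized, amplified by PCR, and ligated into the pBAD/TOPO ThioFusion expression vector by T-A cloning. Correct recombinant plasmids were identified by PCR screening, and the target protein was expressed in Escherichia coli Top10. The engineered E. coli was sampled after arabinose induction, expression was checked by SDS-PAGE, and the structural features of the expressed protein were modeled and analyzed with bioinformatics methods. The results show that the target protein was expressed efficiently in the prokaryotic system, with an expression level above 67%, mainly in the form of inclusion bodies. Protein structure prediction indicates that the original α-helical active structure of the target protein was unchanged, which provides an effective and reliable research route for the efficient production of antimicrobial peptides.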
14.
15.
The overall function of a multi-domain protein is determined by the functional and structural interplay of its constituent domains. Traditional sequence alignment-based methods commonly utilize domain-level information and provide classification only at the level of domains. Such methods cannot take into account the contributions of other domains in the protein or of domain-linker regions, and therefore cannot classify multi-domain proteins. An alignment-free protein sequence comparison tool, CLAP (CLAssification of Proteins), was previously developed in our laboratory specifically to handle multi-domain protein sequences without requiring domain boundaries or the sequential order of domains to be defined. Through this method we aim to achieve a biologically meaningful classification scheme for multi-domain protein sequences. In this article, CLAP-based classification is explored on 5 datasets of multi-domain proteins, and we present a detailed analysis for proteins containing (1) the tyrosine phosphatase domain and (2) the SH3 domain. At the domain level, the CLAP-based classification scheme resulted in a clustering similar to that obtained from an alignment-based method. CLAP-based clusters obtained for full-length datasets were shown to comprise proteins with similar functions and domain architectures. Our study demonstrates that multi-domain proteins can be classified effectively by considering full-length sequences, without requiring identification of the domains in the sequence.
16.
Knowledge of protein and domain interactions provides crucial insights into their function within a cell. Several computational methods have been proposed to detect interactions between proteins and their constitutive domains. In this work, we focus on approaches based on correlated evolution (coevolution) of sequences of interacting proteins. In this type of approach, often referred to as the mirrortree method, a high correlation of evolutionary histories of two proteins is used as an indicator to predict protein interactions. Recently, it has been observed that subtracting the underlying speciation process, by separating coevolution due to common speciation divergence from that due to common function of interacting pairs, greatly improves the predictive power of the mirrortree approach. In this article, we investigate possible improvements and limitations of this method. In particular, we demonstrate that the performance of the mirrortree method can be further improved by restricting the coevolution analysis to the relatively conserved regions in the protein domain sequences (disregarding highly divergent regions). We provide a theoretical validation of our results, leading to new insights into the interplay between coevolution and speciation of interacting proteins.
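The core of a mirrortree-style score can be sketched as a Pearson correlation between the pairwise evolutionary distances of two protein families measured over the same set of species. The sketch below shows only that basic correlation step; it does not include the speciation correction or the restriction to conserved regions discussed in the abstract, and the distance matrices and function names are hypothetical.

```python
import math


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / (sx * sy)


def mirrortree_score(dist_a, dist_b):
    """Correlation between two distance matrices built over the same species order."""
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


if __name__ == "__main__":
    # Made-up distances for two protein families across three species.
    family_a = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
    family_b = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]
    print(mirrortree_score(family_a, family_b))
```

Both matrices must list the species in the same order; otherwise the correlation compares unrelated pairs.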
17.
18.
Pokarowski P, Kloczkowski A, Jernigan RL, Kothari NS, Pokarowska M, Kolinski A 《Proteins》2005,59(1):49-57
We have analyzed 29 different published matrices of protein pairwise contact potentials (CPs) between amino acids derived from different sets of proteins, either crystallographic structures taken from the Protein Data Bank (PDB) or computer-generated decoys. Each of the CPs is similar to 1 of the 2 matrices derived in the work of Miyazawa and Jernigan (Proteins 1999;34:49-68). The CP matrices of the first class can be approximated with a correlation of order 0.9 by the formula e(ij) = h(i) + h(j), 1
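The additive form e(ij) = h(i) + h(j) quoted in this entry can be illustrated with a small sketch that estimates the per-residue terms h from a symmetric matrix. The estimator used here (row mean minus half the grand mean) is an assumption chosen for simplicity, not the fitting procedure of the paper; it recovers h exactly when the matrix is exactly additive. The function name and the three-residue example are hypothetical.

```python
def fit_additive(e):
    """Estimate h such that e[i][j] is approximately h[i] + h[j].

    For an exactly additive matrix, row_mean(i) = h[i] + mean(h) and the
    grand mean equals 2 * mean(h), so h[i] = row_mean(i) - grand_mean / 2.
    """
    n = len(e)
    if n == 0 or any(len(row) != n for row in e):
        raise ValueError("e must be a non-empty square matrix")
    row_means = [sum(row) / n for row in e]
    grand_mean = sum(row_means) / n
    return [m - grand_mean / 2 for m in row_means]


if __name__ == "__main__":
    # Build an exactly additive toy matrix from hypothetical h values.
    h_true = [0.5, -1.0, 2.0]
    e = [[a + b for b in h_true] for a in h_true]
    print(fit_additive(e))
```

For a real contact-potential matrix the fit is only approximate, and the quality of the approximation can be checked by correlating e(ij) with the fitted h(i) + h(j).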
19.
20.
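cDNA sequence analysis and higher-order protein structure prediction of alkaline phosphatase from the Japanese flounder (cited by: 1 total; self-citations: 0; citations by others: 1)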
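To study the role of alkaline phosphatase (EC 3.1.3.1; ALP) in the development and metamorphosis of the Japanese flounder (Paralichthys olivaceus), the full-length cDNA of the flounder ALP gene was cloned by RACE, the nucleotide sequence was analyzed with bioinformatics methods, and the protein structure was predicted. The results show that the flounder ALP cDNA is 1,811 bp long and encodes a protein of 476 amino acids with a molecular weight of 52,293.1 and an isoelectric point of 7.67. The GC content of the coding region varies considerably among ALP homologs and is markedly higher in vertebrates than in invertebrates and bacteria. Molecular phylogenetic analysis shows that flounder ALP is highly homologous to the tissue-nonspecific ALPs of the green spotted pufferfish (Tetraodon nigroviridis) and the zebrafish (Danio rerio), and that the molecular phylogenetic tree agrees with the species tree. Several important functional sites in the protein sequence, including metal ion binding sites, N-glycosylation sites, and serine phosphorylation sites, are highly conserved. Flounder ALP and human placental ALP (PALP) share 43% similarity in protein sequence, and their 3D structures are very close. Comparison of the spatial positions of the amino acids shows that the cysteines at positions 141 and 203 of flounder ALP correspond to the cysteines at positions 121 and 183 of human PALP and are inferred to form a disulfide bond. In the active centers of the two enzymes, the amino acid residues that bind the three metal ions are highly conserved: 2 of the 9 amino acids around the Zn ions differ, and only 2 of the 7 amino acids around the Mg ion differ, including a similar pair, serine 155 and threonine 175.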