Ahmad S  Gromiha MM  Sarai A 《Proteins》2003,50(4):629-635
The solvent accessibility of amino acid residues has been predicted in the past by classifying them into exposure states with varying thresholds. This classification provides a wide range of values for the accessible surface area (ASA) within which a residue may fall. Thus far, no attempt has been made to predict real values of ASA from the sequence information without a priori classification into exposure states. Here, we present a new method with which to predict real value ASAs for residues, based on neighborhood information. Our real value prediction neural network could estimate the ASA for four different nonhomologous, nonredundant data sets of varying size, with 18.0-19.5% mean absolute error, defined as per residue absolute difference between the predicted and experimental values of relative ASA. Correlation between the predicted and experimental values ranged from 0.47 to 0.50. It was observed that the ASA of a residue could be predicted within a 23.7% mean absolute error, even when no information about its neighbors is included. Prediction of real values answers the issue of arbitrary choice of ASA state thresholds, and carries more information than category prediction. Prediction error for each residue type strongly correlates with the variability in its experimental ASA values.
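
The two evaluation measures named in the abstract above (per-residue mean absolute error of relative ASA, and the correlation between predicted and experimental values) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the toy values are hypothetical, and the correlation is assumed to be the ordinary Pearson coefficient.

```python
from math import sqrt


def mean_absolute_error(predicted, observed):
    """Mean per-residue absolute difference between predicted and observed relative ASA (%)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


def pearson_correlation(x, y):
    """Pearson correlation coefficient (assumed here; the abstract only says 'correlation')."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical relative ASA values (%), for illustration only.
predicted = [10.0, 40.0, 75.0, 20.0]
observed = [5.0, 55.0, 60.0, 30.0]

mae = mean_absolute_error(predicted, observed)
r = pearson_correlation(predicted, observed)
print(f"MAE = {mae:.2f}%, r = {r:.2f}")
```

Because the error is computed on real values, no exposure-state threshold has to be chosen before the prediction is scored.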

Variations of the shape and polarity of the DNA grooves caused by changes of the DNA conformation play an important role in the DNA readout. Despite the fact that non-canonical trans and gauche- conformations of the DNA backbone angle γ (O5′–C5′–C4′–C3′) are frequently found in the DNA crystal structures, their possible role in the DNA recognition has not been studied systematically. In order to fill this gap, we analyze the available high-resolution crystal structures of the naked and complexed DNA. The analysis shows that the non-canonical γ angle conformations are present both in the naked and bound DNA, more often in the bound vs. naked DNA, and in the nucleotides with the A-like vs. the B-like sugar pucker. The alternative angle γ torsions are more frequently observed in the purines with the A-like sugar pucker and in the pyrimidines with the B-like sugar conformation. The minor groove of the nucleotides with non-canonical γ angle conformation is more polar, while the major groove is more hydrophobic than in the nucleotides with the classical γ torsions due to variations in exposure of the polar and hydrophobic groups of the DNA backbone. The propensity of the nucleotides with different γ angle conformations to participate in the protein–nucleic acid contacts in the minor and major grooves is connected with their sugar pucker and is sequence-specific. Our findings imply that the angle γ transitions contribute to the process of the protein–DNA recognition due to modification of the polar/hydrophobic profile of the DNA grooves.

Hamelryck T 《Proteins》2005,59(1):38-48
The concept of amino acid solvent exposure is crucial for understanding and predicting various aspects of protein structure and function. The traditional measures of solvent exposure, however, suffer from various shortcomings, such as the inability to distinguish exposed, partly exposed, buried, and deeply buried residues. This article introduces a new measure of solvent exposure called Half-Sphere Exposure that addresses many of the shortcomings of other methods. The new measure outperforms other measures with respect to correlation with protein stability, conservation among fold homologs, amino acid-type dependency and interpretation. The measure consists of the number of Cα atoms in two half spheres around a residue's Cα atom. Conceptually, one of the half spheres corresponds to the side chain's neighborhood, the other half sphere being in the opposite direction. We show here that the two half spheres correspond to two regions around an amino acid that are surprisingly distinct in terms of geometry and energy. This aspect of protein structure introduced here forms the basis of the Half-Sphere Exposure measure. The results strongly suggest that in many respects, a 2D measure is inherently much better suited to describe solvent exposure than the traditional 1D measures. Importantly, Half-Sphere Exposure can be calculated from the Cα atom coordinates only, which abolishes the need for a full-atom model to calculate solvent exposure. Hence, the measure can be used in protein structure prediction methods that are based on various simplified models. Half-Sphere Exposure has great potential for use in protein structure prediction and analysis.
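
The counting idea described above can be sketched from Cα coordinates alone. The sketch below is an illustration under stated assumptions, not the published implementation: the side-chain direction is approximated as pointing away from the two chain neighbours (the negated sum of the unit vectors to the previous and next Cα), the 13 Å radius is an assumed default, and the function name and toy coordinates are hypothetical.

```python
from math import sqrt


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _unit(v):
    norm = sqrt(_dot(v, v))
    return tuple(x / norm for x in v)


def half_sphere_exposure(ca_coords, index, radius=13.0):
    """Return (up, down): Calpha counts in the side-chain half sphere and the opposite one.

    Assumption: the side-chain direction is approximated from the two
    flanking Calpha atoms, so the residue must not be a chain terminus.
    If the three atoms are collinear the direction is undefined.
    """
    if index <= 0 or index >= len(ca_coords) - 1:
        raise ValueError("residue needs a neighbour on both sides")
    center = ca_coords[index]
    to_prev = _unit(_sub(ca_coords[index - 1], center))
    to_next = _unit(_sub(ca_coords[index + 1], center))
    # Assumed side-chain direction: away from both chain neighbours.
    side = tuple(-(p + n) for p, n in zip(to_prev, to_next))

    up = down = 0
    for j, other in enumerate(ca_coords):
        if j == index:
            continue
        d = _sub(other, center)
        if _dot(d, d) > radius * radius:
            continue  # outside the sphere
        if _dot(d, side) >= 0:
            up += 1
        else:
            down += 1
    return up, down


# Hypothetical toy coordinates (in Å), for illustration only.
coords = [(-1, -1, 0), (0, 0, 0), (1, -1, 0), (0, 5, 0), (0, -5, 0), (0, 20, 0)]
print(half_sphere_exposure(coords, 1))
```

The pair of counts is what makes the measure two-dimensional: two residues with the same total number of neighbours can still differ in how those neighbours are split between the two half spheres.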

Faraggi E  Xue B  Zhou Y 《Proteins》2009,74(4):847-856
This article attempts to increase the prediction accuracy of residue solvent accessibility and real-value backbone torsion angles of proteins through improved learning. Most methods developed for improving the backpropagation algorithm of artificial neural networks are limited to small neural networks. Here, we introduce a guided-learning method suitable for networks of any size. The method employs a part of the weights for guiding and the other part for training and optimization. We demonstrate this technique by predicting residue solvent accessibility and real-value backbone torsion angles of proteins. In this application, the guiding factor is designed to satisfy the intuitive condition that for most residues, the contribution of a residue to the structural properties of another residue is smaller for greater separation in the protein-sequence distance between the two residues. We show that the guided-learning method makes a 2-4% reduction in 10-fold cross-validated mean absolute errors (MAE) for predicting residue solvent accessibility and backbone torsion angles, regardless of the size of database, the number of hidden layers and the size of input windows. This, together with the introduction of a two-layer neural network with a bipolar activation function, leads to a new method that has a MAE of 0.11 for residue solvent accessibility, 36 degrees for psi, and 22 degrees for phi. The method is available as a Real-SPINE 3.0 server at http://sparks.informatics.iupui.edu.

The buried surface area (BSA), which measures the size of the interface in a protein–protein complex, may differ from the accessible surface area (ASA) lost upon association (which we call DSA), if conformation changes take place. To evaluate the DSA, we measure the ASA of the interface atoms in the bound and unbound states of the components of 144 protein–protein complexes taken from the Protein–Protein Interaction Affinity Database of Kastritis et al. (2011). We observe differences exceeding 20%, and a systematic bias in the distribution. On average, the ASA calculated in the bound state of the components is 3.3% greater than in their unbound state, and the BSA, 7% greater than the DSA. The bias is observed even in complexes where the conformation changes are small. An examination of the bound and unbound structures points to a possible origin: local movements optimize contacts with the other component at the cost of internal contacts, and presumably also the binding free energy.
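
The distinction between BSA and DSA can be made concrete with a short arithmetic sketch. It assumes the conventional definitions, which the abstract implies but does not spell out: BSA uses the ASA of each component taken in its bound conformation, while DSA uses the ASA of each component in its unbound structure, and both subtract the ASA of the complex. The function names and all numbers are hypothetical.

```python
def buried_surface_area(asa_a_bound, asa_b_bound, asa_complex):
    """BSA: ASA of the separated components in their bound conformations minus ASA of the complex."""
    return asa_a_bound + asa_b_bound - asa_complex


def surface_area_lost(asa_a_unbound, asa_b_unbound, asa_complex):
    """DSA: ASA of the components in their unbound structures minus ASA of the complex."""
    return asa_a_unbound + asa_b_unbound - asa_complex


# Hypothetical ASA values in Å², for illustration only.
asa_complex = 9000.0
bsa = buried_surface_area(5100.0, 5700.0, asa_complex)
dsa = surface_area_lost(5000.0, 5600.0, asa_complex)

# Relative excess of BSA over DSA, in percent.
excess = 100.0 * (bsa - dsa) / dsa
print(f"BSA = {bsa:.0f} Å², DSA = {dsa:.0f} Å², BSA exceeds DSA by {excess:.1f}%")
```

With no conformation change the bound and unbound component ASAs coincide and the two quantities are equal; any difference between them comes entirely from the change in component ASA on binding.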

We present an improved version of RosettaHoles, a methodology for quantitative and visual characterization of protein core packing. RosettaHoles2 features a packing measure more rapidly computable, accurate and physically transparent, as well as a new validation score intended for structures submitted to the Protein Data Bank. The differential packing measure is parameterized to maximize the gap between computationally generated and experimentally determined X‐ray structures, and can be used in refinement of protein structure models. The parameters of the model provide insight into components missing in current force fields, and the validation score gives an upper bound on the X‐ray resolution of Protein Data Bank structures; a crystal structure should have a validation score as good as or better than its resolution.

This review focuses, in a non-exhaustive manner, on the essential structural and conformational features of protein–carbohydrate interactions and on some applications of NMR spectroscopy to deal with this topic from different levels of complexity.

Evaluation of Surface Complementarity, Hydrogen bonding, and Electrostatic interaction in molecular Recognition (ESCHER) is a new docking procedure consisting of three modules that work in series. The first module evaluates the geometric complementarity and produces a set of rough solutions for the docking problem. The second module identifies molecular collisions within those solutions, and the third evaluates their electrostatic complementarity. We describe the algorithm and its application to the docking of cocrystallized protein domains and unbound components of protein-protein complexes. Furthermore, ESCHER has been applied to the reassociation of secondary and supersecondary structure elements. The possibility of applying a docking method to the problem of protein structure prediction is discussed. Proteins 28:556–567, 1997. © 1997 Wiley-Liss, Inc.

Post‐translational modifications (PTMs) represent an important regulatory layer influencing the structure and function of proteins. With broader availability of experimental information on the occurrences of different PTM types, the investigation of a potential “crosstalk” between different PTM types and combinatorial effects have moved into the research focus. Hypothesizing that relevant interferences between different PTM types and sites may become apparent when investigating their mutual physical distances, we performed a systematic survey of pairwise homo‐ and heterotypic distances of seven frequent PTM types considering their sequence and spatial distances in resolved protein structures. We found that actual PTM site distance distributions differ from random distributions with most PTM type pairs exhibiting larger than expected distances with the exception of homotypic phosphorylation site distances and distances between phosphorylation and ubiquitination sites that were found to be closer than expected by chance. Random reference distributions considering canonical acceptor amino acid residues only were found to be shifted to larger distances compared to distances between any amino acid residue type indicating an underlying tendency of PTM‐amenable residue types to be further apart than randomly expected. Distance distributions based on sequence separations were found largely consistent with their spatial counterparts suggesting a primary role of sequence‐based pairwise PTM‐location encoding rather than folding‐mediated effects. Our analysis provides a systematic and comprehensive overview of the characteristics of pairwise PTM site distances on proteins and reveals that, predominantly, PTM sites tend to avoid close proximity with the potential implication that an independent attachment or removal of PTMs remains possible. Proteins 2016; 85:78–92. © 2016 Wiley Periodicals, Inc.

An easy and uncomplicated method to predict the solvent accessibility state of a site in a multiple protein sequence alignment is described. The approach is based on amino acid exchange and compositional preference matrices for each of three accessibility states: buried, exposed, and intermediate. Calculations utilized a modified version of the 3D_ali databank, a collection of multiple sequence alignments anchored through protein tertiary structural superpositions. The technique achieves the same accuracy as much more complex methods and thus provides such advantages as computational affordability, facile updating, and easily understood residue substitution patterns useful to biochemists involved in protein engineering, design, and structural prediction. The program is available from the authors; and, due to its simplicity, the algorithm can be readily implemented on any system. For a given alignment site, a hand calculation can yield a comparative prediction. Proteins 32:190–199, 1998. © 1998 Wiley-Liss, Inc.

Adamian L  Nanda V  DeGrado WF  Liang J 《Proteins》2005,59(3):496-509
Characterizing the interactions between amino acid residues and lipid molecules is important for understanding the assembly of transmembrane helices and for studying membrane protein folding. In this study we develop TMLIP (TransMembrane helix-LIPid), an empirically derived propensity of individual residue types to face lipid membrane based on statistical analysis of high-resolution structures of membrane proteins. Lipid accessibilities of amino acid residues within the transmembrane (TM) region of 29 structures of helical membrane proteins are studied with a spherical probe of radius 1.9 Å. Our results show that there are characteristic preferences for residues to face the headgroup region and the hydrocarbon core region of lipid membrane. Amino acid residues Lys, Arg, Trp, Phe, and Leu are often found exposed at the headgroup regions of the membrane, where they have high propensity to face phospholipid headgroups and glycerol backbones. In the hydrocarbon core region, the strongest preference for interacting with lipids is observed for Ile, Leu, Phe and Val. Small and polar amino acid residues are usually buried inside helical bundles and are strongly lipophobic. There is a strong correlation between various hydrophobicity scales and the propensity of a given residue to face the lipids in the hydrocarbon region of the bilayer. Our data suggest a possibly significant contribution of the lipophobic effect to the folding of membrane proteins. This study shows that membrane proteins have exceedingly apolar exteriors rather than highly polar interiors. Prediction of lipid-facing surfaces of boundary helices using TMLIP1 results in a 54% accuracy, which is significantly better than random (25% accuracy). We also compare performance of TMLIP with another lipid propensity scale, kPROT, and with several hydrophobicity scales using hydrophobic moment analysis.

Yuan Z  Huang B 《Proteins》2004,57(3):558-564
A novel support vector regression (SVR) approach is proposed to predict protein accessible surface areas (ASAs) from their primary structures. In this work, we predict the real values of ASA in squared angstroms for residues instead of relative solvent accessibility. Based on protein residues, the mean and median absolute errors are 26.0 Å² and 18.87 Å², respectively. The correlation coefficient between the predicted and observed ASAs is 0.66. Cysteine is the best predicted amino acid (mean absolute error is 13.8 Å² and median absolute error is 8.37 Å²), while arginine is the worst predicted amino acid (mean absolute error is 42.7 Å² and median absolute error is 36.31 Å²). Our work suggests that the SVR approach can be directly applied to the ASA prediction where data preclassification has been used.

Protein folding rates vary by several orders of magnitude and they depend on the topology of the fold and the size and composition of the sequence. Although recent works show that the rates can be predicted from the sequence, allowing for high‐throughput annotations, they consider only the sequence and its predicted secondary structure. We propose a novel sequence‐based predictor, PFR‐AF, which utilizes solvent accessibility and residue flexibility predicted from the sequence, to improve predictions and provide insights into the folding process. The predictor includes three linear regressions for proteins with two‐state, multistate, and unknown (mixed‐state) folding kinetics. PFR‐AF on average outperforms current methods when tested on three datasets. The proposed approach provides high‐quality predictions in the absence of similarity between the predicted and the training sequences. The PFR‐AF's predictions are characterized by high (between 0.71 and 0.95, depending on the dataset) correlation and the lowest (between 0.75 and 0.9) mean absolute errors with respect to the experimental rates, as measured using out‐of‐sample tests. Our models reveal that for the two‐state chains inclusion of solvent‐exposed Ala may accelerate the folding, while increased content of Ile may reduce the folding speed. We also demonstrate that increased flexibility of coils facilitates faster folding and that proteins with larger content of solvent‐exposed strands may fold at a slower pace. The increased flexibility of the solvent‐exposed residues is shown to elongate folding, which also holds, with a lower correlation, for buried residues. Two case studies are included to support our findings. Proteins 2010. © 2010 Wiley‐Liss, Inc.

A strategy is developed to use database-derived ϕ-ψ constraints during simulated annealing procedures for protein solution structure determination in order to improve the Ramachandran plot statistics, while maintaining the agreement with the experimental constraints as the sole criterion for the selection of the family. The procedure, fully automated, consists of two consecutive simulated annealing runs. In the first run, the database-derived ϕ-ψ constraints are enforced for all amino acids (except prolines and glycines). A family of structures is then selected on the basis of the lowest violations of the experimental constraints only, and the ϕ-ψ values for each residue are examined. In the second and final run, the database-derived ϕ-ψ constraints are enforced only for those residues which in the first run have ended in one and the same favored ϕ-ψ region. For residues which are either spread over different favored regions or concentrated in disallowed regions, the constraints are not enforced. The final family is then selected, after the second run, again only based on the agreement with the experimental constraints. This automated approach was implemented in DYANA and was tested on as many as 12 proteins, including some containing paramagnetic metals, whose structures had been previously solved in our laboratory. The quality of the structures, and of Ramachandran plot statistics in particular, was notably improved while preserving the agreement with the experimental constraints.

Minai R  Matsuo Y  Onuki H  Hirota H 《Proteins》2008,72(1):367-381
Many drugs, even ones that are designed to act selectively on a target protein, bind unintended proteins. These unintended bindings can explain side effects or indicate additional mechanisms for a drug's medicinal properties. Structural similarity between binding sites is one of the reasons for binding to multiple targets. We developed a method for the structural alignment of atoms in the solvent-accessible surface of proteins that uses similarities in the local atomic environment, and carried out all-against-all structural comparisons for 48,347 potential ligand-binding regions from a nonredundant protein structure subset (nrPDB, provided by NCBI). The relationships between the similarity of ligand-binding regions and the similarity of the global structures of the proteins containing the binding regions were examined. We found 10,403 known ligand-binding region pairs whose structures were similar despite having different global folds. Of these, we detected 281 region pairs that had similar ligands with similar binding modes. These proteins are good examples of convergent evolution. In addition, we found a significant correlation between Z-score of structural similarity and true positive rate of "active" entries in the PubChem BioAssay database. Moreover, we confirmed the interaction between ibuprofen and a new target, porcine pancreatic elastase, by NMR experiment. Finally, we used this method to predict new drug-target protein interactions. We obtained 540 predictions for 105 drugs (e.g., captopril, lovastatin, flurbiprofen, metyrapone, and salicylic acid), and calculated the binding affinities using AutoDock simulation. The results of these structural comparisons are available at http://www.tsurumi.yokohama-cu.ac.jp/fold/database.html.

In the study of the protein inverse folding problem, one goal is to find a simple and efficient potential to evaluate the compatibility between a structure and a given sequence. We present here a novel empirical mean force potential to address the importance of electrostatic interactions in protein inverse folding studies. It is based on the protein main chain polar fraction and constructed in a way similar to Sippl's from a database of 64 known independent three-dimensional protein structures. This potential was applied to recognize the protein native conformations among a conformation pool. Calculated results show that this potential is powerful in picking out native conformations; in addition, it can also find structural similarity between proteins with low sequence similarity. The success of this new potential clearly shows the importance of electrostatic factors in protein inverse folding studies. © 1995 Wiley-Liss, Inc.

A new topological method to measure protein structure similarity
A method for the quantitative evaluation of structural similarity between protein pairs is developed that makes use of a Delaunay-based topological mapping. The result of the mapping is a three-dimensional array which is representative of the global structural topology and whose elements can be used to construe an integral scoring scheme. This scoring scheme was tested for its dependence on the protein length difference in a pairwise comparison, its ability to provide a reasonable means for structural similarity comparison within a family of structural neighbors of similar length, and its sensitivity to the differences in protein conformation. It is shown that such a topological evaluation of similarity is capable of providing insight into these points of interest. Protein structure comparison using the method is computationally efficient and the topological scores, although providing different information about protein similarity, correlate well with the distance root-mean-square deviation values calculated by rigid-body structural alignment.

Solvent accessibility can be used to evaluate protein structural models, identify binding sites, and characterize protein conformational changes. The differential modification of amino acids at specific sites enables the accessible surface residues to be identified by mass spectrometry. Tryptophan residues within proteins can be differentially labeled with halocompounds by a photochemical reaction. In this study, tryptophan residues of carbonic anhydrase are reacted with chloroform, 2,2,2-trichloroethanol (TCE), 2,2,2-trichloroacetate (TCA), or 3-bromo-1-propanol (BP) under UV irradiation at 280 nm. The light-driven reactions with chloroform, TCE, TCA, and BP attach a formyl, hydroxyethanone, carboxylic acid, and propanol group, respectively, onto the indole ring of tryptophan. Trypsin and chymotrypsin digests of the modified carbonic anhydrase are used to map accessible tryptophan residues using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). Tryptophan reactivity is determined by identifying peptides with tryptophan residues modified with the appropriate label. The reactivity is calculated from the frequency that the modification is identified and a semiquantitative measure of the amount of products formed. Both of these measures of tryptophan reactivity correlate significantly with the accessible surface area of tryptophan residues in carbonic anhydrase determined from the X-ray crystal structure. Therefore the photochemical reaction of halocompounds with tryptophan residues in carbonic anhydrase indicates the degree of solvent accessibility of these residues.

The conformational changes in well-characterized model proteins [bovine ribonuclease A (RNase A), horseradish peroxidase, sperm-whale myoglobin, human hemoglobin, and bovine serum albumin (BSA)] upon adsorption on ultrafine polystyrene (PS) particles have been studied using circular dichroism (CD) spectroscopy. These proteins were chosen with special attention to molecular flexibility. The ultrafine PS particles were negatively charged and had average diameters of 20 or 30 nm. Utilization of these ultrafine PS particles makes it possible to apply the CD technique to determine the secondary structure of proteins adsorbed on the PS surface. Effects of protein properties and adsorption conditions on the extent of the changes in the secondary structure of protein molecules upon adsorption on ultrafine PS particles were studied. The CD spectrum changes upon adsorption were significant in the "soft" protein molecules (myoglobin, hemoglobin, and BSA), while they were insignificant in the "rigid" proteins (RNase A and peroxidase). The soft proteins sustained a marked decrease in alpha-helix content upon adsorption. Moreover, the native alpha-helix content, which is given as the percentage of the alpha-helix content in the free proteins, of adsorbed BSA was found to decrease with decreasing pH and increase with increasing adsorbed amount. These observations confirm some well-known hypotheses for the conformational changes in protein molecules upon adsorption. © 1992 John Wiley & Sons, Inc.

