Similar articles
Found 20 similar articles (search time: 546 ms)
1.
Ishida T  Nakamura S  Shimizu K 《Proteins》2006,64(4):940-947
We developed a novel knowledge-based residue environment potential for assessing the quality of protein structures in protein structure prediction. The potential uses the contact number of residues in a protein structure and the absolute contact number of residues predicted from its amino acid sequence using a new prediction method based on support vector regression (SVR). The contact number of an amino acid residue in a protein structure is defined as the number of residues around a given residue. First, the contact number of each residue is predicted using SVR from the amino acid sequence of a target protein. Then, the potential of the protein structure is calculated from the probability distribution of the native contact numbers corresponding to the predicted ones. The performance of this potential is compared with that of other score functions on decoy structures, in distinguishing both native structures from other structures and near-native structures from nonnative structures. This potential improves not only the ability to identify native structures among other structures but also the ability to discriminate near-native structures from nonnative structures.

2.
We introduce a statistical method for evaluating atomic level 3D interaction patterns of protein-ligand contacts. Such patterns can be used for fast separation of likely ligand and ligand binding site combinations out of all those that are geometrically possible. The practical purpose of this probabilistic method is for molecular docking and scoring, as an essential part of a scoring function. Probabilities of interaction patterns are calculated conditional on structural X-ray data and a predefined chemical classification of molecular fragment types. Spatial coordinates of atoms are modeled using a Bayesian statistical framework with parametric 3D probability densities. The parameters are given distributions a priori, which provides the possibility to update the densities of model parameters with new structural data and use the parameter estimates to create a contact hierarchy. The contact preferences can be defined for any spatial area around a specified type of fragment. We compared calculated contact point hierarchies with the number of contact atoms found near the contact point in a reference set of X-ray data, and found that these were in general in close agreement. Additionally, using the substrate binding site of catechol-O-methyltransferase and 27 small potential binder molecules, it was demonstrated that these probabilities together with auxiliary parameters separate ligands from decoys well (true positive rate 0.75, false positive rate 0). A particularly useful feature of the proposed Bayesian framework is that it also characterizes predictive uncertainty in terms of probabilities, which have an intuitive interpretation from the applied perspective.
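For readers unfamiliar with the two rates quoted in this abstract, the following minimal sketch shows how a true positive rate and a false positive rate are computed from confusion-matrix counts. The function name `rates` and the counts passed to it are hypothetical, chosen only to illustrate the arithmetic; the abstract does not report how the 27 molecules split into ligands and decoys.

```python
def rates(tp, fn, fp, tn):
    """Return (true positive rate, false positive rate) from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # fraction of true ligands that were recognized
    fpr = fp / (fp + tn)  # fraction of decoys wrongly accepted as ligands
    return tpr, fpr


# Hypothetical counts for illustration only; they are not taken from the study.
tpr, fpr = rates(tp=3, fn=1, fp=0, tn=5)
print(tpr, fpr)
```

A false positive rate of 0 simply means that no decoy was scored as a ligand at the chosen threshold.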

3.
Lin CP  Huang SW  Lai YL  Yen SC  Shih CH  Lu CH  Huang CC  Hwang JK 《Proteins》2008,72(3):929-935
It has recently been shown that in proteins the atomic mean-square displacement (or B-factor) can be related to the number of neighboring atoms (or protein contact number), and that this relationship allows one to compute B-factor profiles directly from the protein contact number. This method, referred to as the protein contact model, is appealing, since it requires neither trajectory integration nor matrix diagonalization. As a result, the protein contact model can be applied to very large proteins and can be implemented as a high-throughput computational tool to compute atomic fluctuations in proteins. Here, we show that this relationship can be further refined to that between the atomic mean-square displacement and the weighted protein contact number, the weight being the square of the reciprocal distance between the contacting pair. In addition, we show that this relationship can be utilized to compute the cross-correlation of atomic motion (the B-factor is essentially the auto-correlation of atomic motion). For a nonhomologous dataset comprising 972 high-resolution X-ray protein structures (resolution <2.0 Å and sequence identity <25%), the mean correlation coefficient between the X-ray and computed B-factors based on the weighted protein contact-number model is 0.61, which is better than those of the original contact-number model (0.51) and other methods. We also show that the computed correlation maps based on the weighted contact-number model are globally similar to those computed through normal mode analysis for some selected cases. Our results underscore the relationship between protein dynamics and protein packing. We believe that our method will be useful in the study of the protein structure-dynamics relationship.
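The weighting described in this abstract (each contacting pair weighted by the square of the reciprocal distance) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `weighted_contact_numbers`, the choice to sum over all other atoms without a cutoff, and the toy coordinates are assumptions.

```python
import math


def weighted_contact_numbers(coords):
    """For each atom i, sum 1 / r_ij**2 over all other atoms j.

    Assumes no two atoms share the same coordinates (r_ij > 0).
    """
    wcn = []
    for i, a in enumerate(coords):
        total = 0.0
        for j, b in enumerate(coords):
            if i != j:
                total += 1.0 / math.dist(a, b) ** 2
        wcn.append(total)
    return wcn


# Three hypothetical atoms on a line, 2 Å apart.
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(weighted_contact_numbers(coords))
```

The central atom receives the largest value because both of its neighbors are close, which is the sense in which the weighted contact number measures local packing.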

4.

Background  

Protein tertiary structure can be partly characterized via each amino acid's contact number, which measures how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of Cβ atoms in other residues within a sphere around the Cβ atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence.
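The definition of contact number given above can be written down directly. The sketch below is illustrative only: the function name `contact_numbers`, the default radius of 10 Å, and the toy coordinates are assumptions, since the abstract does not state the sphere radius it uses.

```python
import math


def contact_numbers(cb_coords, radius=10.0):
    """Count, for each residue, the other residues whose Cβ atom lies within `radius`.

    `cb_coords` holds one (x, y, z) Cβ position per residue, in ångströms.
    """
    counts = []
    for i, a in enumerate(cb_coords):
        n = 0
        for j, b in enumerate(cb_coords):
            if i != j and math.dist(a, b) <= radius:
                n += 1
        counts.append(n)
    return counts


# Four hypothetical Cβ positions; the last residue is far from the others.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(contact_numbers(coords, radius=5.0))  # [1, 2, 1, 0]
```

A low count indicates an exposed residue and a high count a buried one, which is why the quantity serves as a proxy for the local environment.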

5.
The sizes of atomic groups are a fundamental aspect of protein structure. They are usually expressed in terms of standard sets of radii for atomic groups and of volumes for both these groups and whole residues. Atomic groups, which subsume a heavy atom and its covalently attached hydrogen atoms into one moiety, are used because the positions of hydrogen atoms in protein structures are generally not known. We have calculated new values for the radii of atomic groups and for the volumes of atomic groups. These values should prove useful in the analysis of protein packing, protein recognition and ligand design. Our radii for atomic groups were derived from intermolecular distance calculations on a large number (approximately 30,000) of crystal structures of small organic compounds that contain the same atomic groups as those found in proteins. Our radii differ significantly from previously reported values. We also use this new radii set to determine the packing efficiency in different regions of the protein interior. This analysis shows that, if the surface water molecules are included in the calculations, the overall packing efficiency throughout the protein interior is high and fairly uniform. However, if the water structure is removed, the packing efficiency in peripheral regions of the protein interior is underestimated by approximately 3.5%.

6.
7.
A method is described to dock a ligand into a binding site in a protein on the basis of the complementarity of the inter-molecular atomic contacts. Docking is performed by maximization of a complementarity function that is dependent on atomic contact surface area and the chemical properties of the contacting atoms. The generality and simplicity of the complementarity function ensure that a wide range of chemical structures can be handled. The ligand and the protein are treated as rigid bodies, but displacement of a small number of residues lining the ligand binding site can be taken into account. The method can assist in the design of improved ligands by indicating what changes in complementarity may occur as a result of the substitution of an atom in the ligand. The capabilities of the method are demonstrated by application to 14 protein–ligand complexes of known crystal structure. © 1996 Wiley Liss, Inc.

8.
Knowledge‐based methods for analyzing protein structures, such as statistical potentials, primarily consider the distances between pairs of bodies (atoms or groups of atoms). Considerations of several bodies simultaneously are generally used to characterize bonded structural elements or those in close contact with each other, but historically do not consider atoms that are not in direct contact with each other. In this report, we introduce an information‐theoretic method for detecting and quantifying distance‐dependent through‐space multibody relationships between the sidechains of three residues. The technique introduced is capable of producing convergent and consistent results when applied to a sufficiently large database of randomly chosen, experimentally solved protein structures. The results of our study can be shown to reproduce established physico‐chemical properties of residues as well as more recently discovered properties and interactions. These results offer insight into the numerous roles that residues play in protein structure, as well as relationships between residue function, protein structure, and evolution. The techniques and insights presented in this work should be useful in the future development of novel knowledge‐based tools for the evaluation of protein structure. Proteins 2014; 82:3450–3465. © 2014 Wiley Periodicals, Inc.

9.
Residues that are crucial to protein function or structure are usually evolutionarily conserved. To identify the important residues in a protein, sequence conservation is estimated, and current methods rely upon the unbiased collection of homologous sequences. Surprisingly, our previous studies have shown that sequence conservation is closely correlated with the weighted contact number (WCN), a measure of packing density for a residue's structural environment, calculated based only on the Cα positions of a protein structure. Moreover, studies have shown that sequence conservation is correlated with environment‐related structural properties calculated based on different protein substructures, such as a protein's all atoms, backbone atoms, side‐chain atoms, or side‐chain centroid. To determine whether the Cα atomic positions are adequate to show the relationship between residue environment and sequence conservation, here we compared Cα atoms with other substructures in their contributions to sequence conservation. Our results show that Cα positions are substantially equivalent to the other substructures in calculations of various measures of residue environment. As a result, the overlapping contributions between Cα atoms and the other substructures are high, yielding similar structure–conservation relationships. Taking the WCN as an example, the average overlapping contribution to sequence conservation is 87% between Cα and all‐atom substructures. These results indicate that the Cα atoms of a protein structure alone can reflect sequence conservation at the residue level.

10.
This study aims to show that considering only nonlocal interactions (interactions of two atoms with a sequence separation larger than five amino acids) extracted using Delaunay tessellation is sufficient and accurate for protein fold recognition. An atomic knowledge‐based potential was extracted based on a Delaunay tessellation with 167 atom types from a sample of native structures, and the normalized energy was calculated for only the nonlocal interactions in each structure. The performance of this method was tested on several decoy sets and compared to a method considering all interactions extracted by Delaunay tessellation and to three other popular scoring functions. Features such as the contents of different types of interactions and the atoms with the highest number of interactions were also studied. The results suggest that the Delaunay tessellation of a protein structure, restricted to nonlocal interactions, is a discrete structure that captures deep properties of the three‐dimensional protein data. Proteins 2014; 82:415–423. © 2013 Wiley Periodicals, Inc.
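The nonlocal criterion stated in this abstract (sequence separation larger than five amino acids) amounts to a simple filter over contact pairs. The sketch below assumes the contacts have already been obtained, for example as edges of a Delaunay tessellation, and are given as pairs of residue indices; the function name `nonlocal_contacts` and the sample pairs are hypothetical.

```python
def nonlocal_contacts(contacts, min_separation=5):
    """Keep contacts whose two residue indices differ by more than `min_separation`."""
    return [(a, b) for a, b in contacts if abs(a - b) > min_separation]


# Hypothetical contacts, each given as the residue indices of the two atoms involved.
contacts = [(1, 2), (1, 6), (1, 7), (10, 3), (20, 40)]
print(nonlocal_contacts(contacts))  # [(1, 7), (10, 3), (20, 40)]
```

Note that the pair (1, 6) is dropped because its separation is exactly five, not larger than five.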

11.
Statistical potentials based on pairwise interactions between C alpha atoms are commonly used in protein threading/fold-recognition attempts. Inclusion of higher-order interactions is a possible means of improving the specificity of these potentials. Delaunay tessellation of the C alpha-atom representation of protein structure has been suggested as a means of defining multi-body interactions. A large number of parameters are required to define all four-body interactions of 20 amino acid types (20^4 = 160,000). Assuming that residue order within a four-body contact is irrelevant reduces this to a manageable 8,855 parameters, using a nonredundant dataset of 608 protein structures. Three lines of evidence support the significance and utility of the four-body potential for sequence-structure matching. First, compared to the four-body model, all lower-order interaction models (three-body, two-body, one-body) are found statistically inadequate to explain the frequency distribution of residue contacts. Second, coherent patterns of interaction are seen in a graphic presentation of the four-body potential. Many patterns have plausible biophysical explanations and are consistent across sets of residues sharing certain properties (e.g., size, hydrophobicity, or charge). Third, the utility of the multi-body potential is tested on a test set of 12 same-length pairs of proteins of known structure for two protocols: sequence-recognizes-structure, where a query sequence is threaded (without gaps) through the native and a non-native structure; and structure-recognizes-sequence, where a query structure is threaded by its native and another non-native sequence. Using cross-validated training, protein sequences correctly recognized their native structure in all 24 cases. Conversely, structures recognized the native sequence in 23 of 24 cases. Further, the score differences between correct and decoy structures increased significantly using the three- or four-body potential compared to potentials of lower order.
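The two parameter counts quoted in this abstract can be checked with a short calculation. Ordered quadruplets of 20 amino acid types number 20^4, while ignoring residue order leaves the number of multisets of size 4 drawn from 20 types, C(20 + 4 - 1, 4). The variable names below are illustrative.

```python
from itertools import combinations_with_replacement
from math import comb

n_types = 20  # amino acid types
k = 4         # residues per four-body contact

ordered = n_types ** k                # order matters: 160000
unordered = comb(n_types + k - 1, k)  # order ignored: C(23, 4) = 8855

# Cross-check the closed form by explicit enumeration of unordered quadruplets.
enumerated = sum(1 for _ in combinations_with_replacement(range(n_types), k))

print(ordered, unordered, enumerated)  # 160000 8855 8855
```

The roughly eighteen-fold reduction is what makes the four-body potential estimable from a dataset of a few hundred structures.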

12.
13.
In the absence of an experimentally determined protein structure, many biological questions can be addressed using computational structural models. However, the utility of protein structural models depends on their quality. Therefore, the estimation of the quality of predicted structures is an important problem. One of the approaches to this problem is the use of knowledge‐based statistical potentials. Such methods typically rely on the statistics of distances and angles of residue‐residue or atom‐atom interactions collected from experimentally determined structures. Here, we present VoroMQA (Voronoi tessellation‐based Model Quality Assessment), a new method for the estimation of protein structure quality. Our method combines the idea of statistical potentials with the use of interatomic contact areas instead of distances. Contact areas, derived using Voronoi tessellation of protein structure, are used to describe and seamlessly integrate both explicit interactions between protein atoms and implicit interactions of protein atoms with solvent. VoroMQA produces scores at atomic, residue, and global levels, all in the fixed range from 0 to 1. The method was tested on the CASP data and compared to several other single‐model quality assessment methods. VoroMQA showed strong performance in the recognition of the native structure and in the structural model selection tests, thus demonstrating the efficacy of interatomic contact areas in estimating protein structure quality. The software implementation of VoroMQA is freely available as a standalone application and as a web server at http://bioinformatics.lt/software/voromqa . Proteins 2017; 85:1131–1145. © 2017 Wiley Periodicals, Inc.

14.
Structural features of protein-nucleic acid recognition sites   Cited by: 3 (self-citations: 0; cited by others: 3)
Nadassy K  Wodak SJ  Janin J 《Biochemistry》1999,38(7):1999-2017

15.
A comprehensive statistical analysis of residue-residue contacts and residue environment in protein 3-D structures is presented. In the present work the range of interresidue interactions (effective radius of influence) in tertiary structures of proteins is examined and found to be 10 Å. This result is obtained by correlating the average number of residues within a spherical volume of different radii (contact numbers) with hydrophobicity. Best correlations are obtained with a radius of 10 Å. The same result is obtained when (i) only long-range interactions are considered and (ii) representative side chain atoms are used to indicate the tertiary structure instead of the usual representation of Cα atoms. Residue environment has been investigated using similar methods. Environmental hydrophobicity varies within only a small range of all residue types. Other physicochemical properties also exhibit similar trends of variation, and only five hydrophobic residues (Leu, Val, Met, Phe and Ile) produce a decrement of around 10% from the expected mean of the physicochemical distance between a residue type and its average environment. An information theory approach is proposed to compare domains, which takes into account the effective radius of influence of residues and sequence similarity.

16.
Standard volumes for atoms in double-stranded B-DNA are derived using high resolution crystal structures from the Nucleic Acid Database (NDB) and compared with corresponding values derived from crystal structures of small organic compounds in the Cambridge Structural Database (CSD). Two different methods are used to compute these volumes: the classical Voronoi method, which does not depend on the size of atoms, and the related Radical Planes method, which does. Results show that atomic groups buried in the interior of double-stranded DNA are, on average, more tightly packed than in related small molecules in the CSD. The packing efficiency of DNA atoms at the interfaces of 25 high resolution protein-DNA complexes is determined by computing the ratios between the volumes of interfacial DNA atoms and the corresponding standard volumes. These ratios are found to be close to unity, indicating that the DNA atoms at protein-DNA interfaces are as closely packed as in crystals of B-DNA. Analogous volume ratios, computed for buried protein atoms, are also near unity, confirming our earlier conclusions that the packing efficiency of these atoms is similar to that in the protein interior. In addition, we examine the number, volume and solvent occupation of cavities located at the protein-DNA interfaces and compare them with those in the protein interior. Cavities are found to be ubiquitous in the interfaces as well as inside the protein moieties. The frequency of solvent occupation of cavities is, however, higher in the interfaces, indicating that those are more hydrated than protein interiors. Lastly, we compare our results with those obtained using two different measures of shape complementarity of the analysed interfaces, and find that the correlation between our volume ratios and these measures, as well as between the measures themselves, is weak. Our results indicate that a tightly packed environment made up of DNA, protein and solvent atoms plays a significant role in protein-DNA recognition.

17.
The purpose of this article is to introduce a novel model for discriminating correctly folded proteins from well-designed decoy structures using mechanical interatomic forces. In our model, we consider a protein as a collection of springs, and the force imposed on each atom is calculated. A potential function is obtained from statistical contact preferences within known protein structures. Combining this function with the spring equation, the interatomic forces are calculated. Finally, we consider a structure and define a score function on the 3D structure of a protein. We compare the force imposed on each atom of a protein with that on the corresponding atom in the other structures. We then assign larger scores to those atoms with lower forces. The total score is the sum of the partial scores of the atoms. The optimal structure is assumed to be the one with the highest score in the data set. To evaluate the performance of our model, we apply it to several decoy sets. Proteins 2009. © 2009 Wiley‐Liss, Inc.

18.
Interresidue contacts in protein structures and at protein-protein interfaces are classically described by the amino acid types of the interacting residues, and the local structural context of the contact, if any, is described using secondary structures. In this study, we present an alternative analysis of interresidue contacts using local structures defined by the structural alphabet introduced by Camproux et al. This structural alphabet allows a 3D structure to be described as a sequence of prototype fragments called structural letters, of 27 different types. Each residue can then be assigned to a particular local structure, even in loop regions. The analysis of interresidue contacts within protein structures defined using Voronoi tessellations reveals that pairwise contact specificity is greater in terms of structural letters than amino acids. Using a simple heuristic based on specificity score comparison, we find that 74% of the long-range contacts within protein structures are better described using structural letters than amino acid types. The investigation is extended to a set of protein-protein complexes, showing that similar global rules apply as for intraprotein contacts, with 64% of the interprotein contacts best described by local structures. We then present an evaluation of pairing functions integrating structural letters into decoy scoring and show that some complexes could benefit from the use of structural letter-based pairing functions.

19.
Important properties of globular proteins, such as the stability of their folded state, depend sensitively on interactions with solvent molecules. Existing methods for estimating these interactions, such as the geometrical surface model, are either physically misleading or too time consuming to be applied routinely in energy calculations. As an alternative, we derive here a simple model for the interactions between protein atoms and solvent atoms in the first hydration layer, the solvent contact model, based on the conservation of the total number of atomic contacts, a consequence of the excluded-volume effect. The model has the conceptual advantage that protein-protein contacts and protein-solvent contacts are treated in the same language and the technical advantage that the solvent term becomes a particularly simple function of interatomic distances. The model allows rapid calculation of any physical property that depends only on the number and type of protein-solvent nearest-neighbor contacts. We propose use of the method in the calculation of protein solvation energies, conformational energy calculations, and molecular dynamics simulations.

20.
About 6000 contact regions (patches) of helix-to-helix packing from 300 well-resolved non-homologous protein structures were considered. The patches were defined by the spatial helical neighbors and were estimated in atomic detail using a variable distance criterion. The following questions are addressed. (1) Are the amino acid preferences and atomic composition of distinct types of helical patches indicative of the type of their neighbor? Distributions of size, atomic composition and packing density are compared for different types of helical interfaces. Thereby contact preferences are derived for parts of secondary structures adjoining each other or pointing towards the solvent. (2) Is it possible to cluster helical patches according to their structural similarity? For this purpose the patches were classified with an automatic sequence-independent superposition procedure which yields a distinctively reduced set of representative interfaces. On this basis, the methodology for finding exchangeable patches in different proteins is demonstrated.
