Similar articles
20 similar articles found (search time: 31 ms)
1.
Hyungrae Kim, Daisuke Kihara. Proteins 2014, 82(12): 3255-3272
We developed a new representation of local amino acid environments in protein structures called the Side-chain Depth Environment (SDE). An SDE defines the local structural environment of a residue from the coordinates and the depth of the amino acids located in the vicinity of the residue's side-chain centroid. SDEs are general enough that similar SDEs are found in protein structures with globally different folds. Using SDEs, we developed a procedure called PRESCO (Protein Residue Environment SCOre) for selecting native or near-native models from a pool of computational models. The procedure searches for the residue environments observed in a query model against a set of representative native protein structures to quantify how native-like the SDEs in the model are. When benchmarked on commonly used computational model datasets, PRESCO compared favorably with existing scoring functions in selecting native and near-native models. © 2014 Wiley Periodicals, Inc.

2.
Simplified force fields play an important role in protein structure prediction and de novo protein design because they require less computational effort than detailed atomistic potentials. A side chain centroid based, distance dependent pairwise interaction potential has been developed. A linear programming based formulation was used in which non-native "decoy" conformers are forced to take a higher energy than the corresponding native structure. This model was trained on an enhanced and diverse protein set. High quality decoy structures were generated for approximately 1400 nonhomologous proteins using torsion angle dynamics along with restricted variations of the hydrophobic cores of the native structure. The resulting decoy set was used to train the model, yielding two side chain centroid based force fields that differ in the way distance dependence is used to calculate energy parameters. These force fields were tested on an independent set of 148 test proteins with 500 decoy structures for each protein. The side chain centroid force fields correctly identified approximately 86% of the native structures. The Z-scores produced by the proposed centroid-centroid distance dependent force fields improved on those of other distance dependent Cα-Cα or side chain based force fields.

3.
Structural uniqueness is characteristic of native proteins and is essential for expressing their biological functions. The major factors that bring about this uniqueness are specific interactions between hydrophobic residues and their unique packing in the protein core. To find the origin of the uniqueness in the amino acid sequences, we analyzed the distribution of the side chain rotational isomers (rotamers) of hydrophobic amino acids in protein tertiary structures and derived ΔS(contact), the conformational-entropy change of side chains caused by residue-residue contacts in each secondary structure. The ΔS(contact) values indicate distinct tendencies of the residue pairs to restrict side chain conformation through inter-residue contacts. Of the hydrophobic residues in alpha-helices, aliphatic residues (Leu, Val, Ile) strongly restrict each other's side chain conformations. In beta-sheets, Met is most strongly restricted by contact with Ile, whereas Leu, Val and Ile are less affected by other residues in contact than they are in alpha-helices. In designed and native protein variants, ΔS(contact) was found to correlate with the folding-unfolding cooperativity. Thus, it can be used as a specificity parameter for designing artificial proteins with a unique structure.

4.
Predicting protein structure from primary sequence is one of the ultimate challenges in computational biology. Given the large amount of available sequence data, the analysis of co-evolution, i.e., statistical dependency, between columns in multiple alignments of protein domain sequences remains one of the most promising avenues for predicting residues that are contacting in the structure. A key impediment to this approach is that strong statistical dependencies are also observed for many residue pairs that are distal in the structure. Using a comprehensive analysis of protein domains with available three-dimensional structures we show that co-evolving contacts very commonly form chains that percolate through the protein structure, inducing indirect statistical dependencies between many distal pairs of residues. We characterize the distributions of length and spatial distance traveled by these co-evolving contact chains and show that they explain a large fraction of observed statistical dependencies between structurally distal pairs. We adapt a recently developed Bayesian network model into a rigorous procedure for disentangling direct from indirect statistical dependencies, and we demonstrate that this method not only successfully accomplishes this task, but also allows contacts with weak statistical dependency to be detected. To illustrate how additional information can be incorporated into our method, we incorporate a phylogenetic correction, and we develop an informative prior that takes into account that the probability for a pair of residues to contact depends strongly on their primary-sequence distance and the amount of conservation that the corresponding columns in the multiple alignment exhibit. We show that our model including these extensions dramatically improves the accuracy of contact prediction from multiple sequence alignments.
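The "statistical dependency between columns" that this abstract starts from can be illustrated with plain mutual information. The sketch below is only that baseline: it is not the Bayesian network model described in the abstract and does not separate direct from indirect dependencies. The function name and the toy alignment are assumptions made for illustration.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length sequences (hypothetical input format).
    """
    n = len(alignment)
    pair_counts = Counter((seq[i], seq[j]) for seq in alignment)
    col_i_counts = Counter(seq[i] for seq in alignment)
    col_j_counts = Counter(seq[j] for seq in alignment)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        ratio = p_ab * n * n / (col_i_counts[a] * col_j_counts[b])
        mi += p_ab * math.log2(ratio)
    return mi


# Toy alignment (made up): columns 0 and 1 co-vary, columns 0 and 3 do not.
toy_alignment = ["AKLE", "AKLD", "GRLE", "GRLD"]
print(column_mutual_information(toy_alignment, 0, 1))
print(column_mutual_information(toy_alignment, 0, 3))
```

A high score between two columns does not by itself imply a contact, which is exactly the impediment the abstract addresses.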

5.
We consider the problem of identifying common three-dimensional substructures between proteins. Our method is based on comparing the shape of the alpha-carbon backbone structures of the proteins in order to find three-dimensional (3D) rigid motions that bring portions of the geometric structures into correspondence. We propose a geometric representation of protein backbone chains that is compact yet allows for similarity measures that are robust against noise and outliers. This representation encodes the structure of the backbone as a sequence of unit vectors, defined by each adjacent pair of alpha-carbons. We then define a measure of the similarity of two protein structures based on the root mean squared (RMS) distance between corresponding orientation vectors of the two proteins. Our measure has several advantages over measures that are commonly used for comparing protein shapes, such as the minimum RMS distance between the 3D positions of corresponding atoms in two proteins. A key advantage is that this new measure behaves well for identifying common substructures, in contrast with position-based measures where the nonmatching portions of the structure dominate the measure. At the same time, it avoids the quadratic space and computational difficulties associated with methods based on distance matrices and contact maps. We show applications of our approach to detecting common contiguous substructures in pairs of proteins, as well as the more difficult problem of identifying common protein domains (i.e., larger substructures that are not necessarily contiguous along the protein chain).
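The unit-vector representation and the RMS distance between orientation vectors can be sketched as follows. This is a minimal illustration under stated assumptions: both chains have the same number of alpha-carbons, residues correspond one to one, and no search over rigid motions is performed (the abstract's method does search for such motions). Function names and coordinates are hypothetical.

```python
import math


def orientation_vectors(ca_coords):
    """Unit vectors between each adjacent pair of alpha-carbon positions."""
    vectors = []
    for (x1, y1, z1), (x2, y2, z2) in zip(ca_coords, ca_coords[1:]):
        dx, dy, dz = x2 - x1, y2 - y1, z2 - z1
        length = math.sqrt(dx * dx + dy * dy + dz * dz)
        vectors.append((dx / length, dy / length, dz / length))
    return vectors


def orientation_rms(ca_a, ca_b):
    """RMS distance between corresponding orientation vectors of two chains."""
    va, vb = orientation_vectors(ca_a), orientation_vectors(ca_b)
    if len(va) != len(vb) or not va:
        raise ValueError("chains must have the same length (at least 2 atoms)")
    squared = sum((p - q) ** 2 for u, v in zip(va, vb) for p, q in zip(u, v))
    return math.sqrt(squared / len(va))


# Made-up coordinates: a straight chain, a translated copy, and a rotated copy.
chain_x = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain_x_shifted = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in chain_x]
chain_y = [(0.0, 0.0, 0.0), (0.0, 3.8, 0.0), (0.0, 7.6, 0.0)]

print(orientation_rms(chain_x, chain_x_shifted))  # translation has no effect
print(orientation_rms(chain_x, chain_y))          # perpendicular chains
```

Because only directions are compared, the measure is unaffected by translation, which is one reason nonmatching regions do not dominate it the way they do for position-based RMS.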

6.
4-Amino-4-deoxychorismate lyase (ADCL) is a member of the fold-type IV PLP-dependent enzymes and converts 4-amino-4-deoxychorismate (ADC) to p-aminobenzoate and pyruvate. The crystal structure of ADCL from Escherichia coli has been solved using MIR phases in combination with density modification. The structure has been refined to an R-factor of 20.6% at 2.2 Å resolution. The enzyme is a homodimer with a crystallographic twofold axis, and the polypeptide chain is folded into small and large domains with an interdomain loop. The coenzyme, pyridoxal 5'-phosphate, resides at the domain interface, with its re-face toward the protein. Although the main chain folding of the active site is homologous to those of D-amino acid and L-branched-chain amino acid aminotransferases, no residues in the active site are conserved among them except for Arg59, Lys159, and Glu193, which directly interact with the coenzyme and play critical roles in the catalytic functions. ADC was modeled into the active site of the unliganded enzyme on the basis of the X-ray structures of the unliganded and liganded forms of the D-amino acid and L-branched-chain amino acid aminotransferases. According to this model, the carboxylates of ADC are recognized by Asn256, Arg107, and Lys97, and the cyclohexadiene moiety makes van der Waals contact with the side chain of Leu258. ADC forms a Schiff base with PLP to release the catalytic residue Lys159, which forms a hydrogen bond with Thr38. The neutral amino group of Lys159 eliminates the α-proton of ADC to give a quinonoid intermediate, which releases pyruvate in accord with the proton transfer from Thr38 to the olefin moiety of ADC.

7.
8.
Magnetic dipolar interactions between pairs of solvent-exposed nitroxide side chains separated by approximately one to four turns along an alpha-helix in T4 lysozyme are investigated. The interactions are analyzed both in frozen solution (rigid lattice conditions) and at room temperature as a function of solvent viscosity. At room temperature, a novel side chain with hindered internal motion is used, along with a more commonly employed nitroxide side chain. The results suggest that methods developed for rigid lattice conditions can be used to analyze dipolar interactions between nitroxides even in the presence of motion of the individual spins, provided the rotational correlation time of the interspin vector is sufficiently long. The distribution of distances observed for the various spin pairs is consistent with rotameric equilibria in the nitroxide side chain, as observed in crystal structures. The existence of such distance distributions places important constraints on the interpretation of internitroxide distances in terms of protein structure and structural changes.

9.
Protein structure alignment is an important tool in many biological applications, such as protein evolution studies, protein structure modeling, and structure-based, computer-aided drug design. Protein structure alignment is also one of the most challenging problems in computational molecular biology, due to an infinite number of possible spatial orientations of any two protein structures. We study one of the most commonly used measures of pairwise protein structure similarity, defined as the number of pairs of atoms in two proteins that can be superimposed under a predefined distance cutoff. We prove that the expected running time of a recently published algorithm for optimizing this (and some other, derived measures of protein structure similarity) is polynomial.

10.
We introduce a new algorithm, IRECS (Iterative REduction of Conformational Space), for identifying ensembles of most probable side-chain conformations for homology modeling. On the basis of a given rotamer library, IRECS ranks all side-chain rotamers of a protein according to the probability with which each side chain adopts the respective rotamer conformation. This ranking enables the user to select small rotamer sets that are most likely to contain a near-native rotamer for each side chain. IRECS can therefore act as a fast heuristic alternative to the Dead-End-Elimination algorithm (DEE). In contrast to DEE, IRECS allows for the selection of rotamer subsets of arbitrary size, thus being able to define structure ensembles for a protein. We show that the selection of more than one rotamer per side chain is generally meaningful, since the selected rotamers represent the conformational space of flexible side chains. A knowledge-based statistical potential ROTA was constructed for the IRECS algorithm. The potential was optimized to discriminate between side-chain conformations of native and rotameric decoys of protein structures. By restricting the number of rotamers per side chain to one, IRECS can optimize side chains for a single conformation model. The average accuracy of IRECS for the chi1 and chi1+2 dihedral angles amounts to 84.7% and 71.6%, respectively, using a 40° cutoff. When we compared IRECS with SCWRL and SCAP, the performance of IRECS was comparable to that of both methods. IRECS and the ROTA potential are available for download from the URL http://irecs.bioinf.mpi-inf.mpg.de.
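The chi1 and chi1+2 accuracies quoted above are commonly computed by checking whether predicted dihedral angles fall within a cutoff of the native ones. The sketch below shows that bookkeeping with a 40° cutoff; it is not the IRECS code. Assumptions: angles are in degrees, a chi1+2 hit requires both chi1 and chi2 to be within the cutoff, the comparison is inclusive, and symmetric side chains (for example a flipped Phe ring) get no special treatment. All names and angle values are hypothetical.

```python
def angle_difference(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def chi_accuracy(predicted, native, cutoff=40.0):
    """Return (chi1 accuracy, chi1+2 accuracy) as fractions.

    Each residue is a tuple of chi angles: (chi1,) or (chi1, chi2, ...).
    """
    n = chi1_ok = chi12_ok = chi12_total = 0
    for pred, nat in zip(predicted, native):
        n += 1
        ok1 = angle_difference(pred[0], nat[0]) <= cutoff
        chi1_ok += ok1
        if len(pred) > 1 and len(nat) > 1:
            chi12_total += 1
            # chi1+2 counts only when both angles are within the cutoff
            chi12_ok += ok1 and angle_difference(pred[1], nat[1]) <= cutoff
    chi1_acc = chi1_ok / n if n else 0.0
    chi12_acc = chi12_ok / chi12_total if chi12_total else 0.0
    return chi1_acc, chi12_acc


# Made-up angles for three residues; the second has only a chi1 angle.
predicted = [(-60.0, 180.0), (175.0,), (60.0, 90.0)]
native = [(-65.0, -170.0), (-178.0,), (-60.0, 80.0)]
print(chi_accuracy(predicted, native))
```

The wrap-around in `angle_difference` matters: 180° and -170° differ by 10°, not 350°.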

11.
We develop a computationally efficient method to simulate the transition of a protein between two conformations. Our method is based on a coarse-grained elastic network model in which distances between spatially proximal amino acids are interpolated between the values specified by the two end conformations. The computational speed of this method depends strongly on the choice of cutoff distance used to define interactions as measured by the density of entries of the constant linking/contact matrix. To circumvent this problem we introduce the concept of using a cutoff based on a maximum number of nearest neighbors. This generates linking matrices that are both sparse and uniform, hence allowing for efficient computations that are independent of the arbitrariness of cutoff distance choices. Simulation results demonstrate that the method developed here reliably generates feasible intermediate conformations, because our method observes steric constraints and produces monotonic changes in virtual bond and torsion angles. Applications are readily made to large proteins, and we demonstrate our method on lactate dehydrogenase, citrate synthase, and lactoferrin. We also illustrate how this framework can be used to complement experimental techniques that partially observe protein motions.
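The idea of replacing a distance cutoff with a cap on the number of nearest neighbors can be sketched as a k-nearest-neighbor linking matrix. This is an illustrative reading of the abstract, not the authors' implementation. The symmetrization step is an assumption: a node picked by many others can end up with more than k links. The function name and coordinates are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math


def knn_linking_matrix(coords, k):
    """Symmetric 0/1 linking matrix connecting each node to its k nearest neighbors."""
    n = len(coords)
    linked = [[0] * n for _ in range(n)]
    for i in range(n):
        # Sort all other nodes by distance from node i
        by_distance = sorted(
            (math.dist(coords[i], coords[j]), j) for j in range(n) if j != i
        )
        for _, j in by_distance[:k]:
            linked[i][j] = 1
            linked[j][i] = 1  # assumption: links are made symmetric
    return linked


# Made-up one-residue-per-node coordinates along a line.
nodes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (10.0, 0.0, 0.0)]
for row in knn_linking_matrix(nodes, k=1):
    print(row)
```

Note that the isolated node at x = 10 still receives a link, whereas a fixed distance cutoff of, say, 2 would leave it disconnected.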

12.
In the native folded conformation of a globular protein, amino acid residues distant along the polypeptide chain come together to form the compact structure. In this spatial structure, most of the polar residues are on the surface in contact with the solvent medium, and the nonpolar residues are buried in the interior in contact with similar nonpolar side chains. This cooperativity and mutual interaction among the randomly aligned amino acid residues suggest that each type of residue may prefer a specific environment. To gain more insight into this aspect of residue-residue cooperativity, a detailed analysis of the preferred environment associated with each of the 20 different amino acid residues in a number of protein crystals has been carried out. The variation of nonpolar nature computed for different sizes of spheres shows that the spatial region between radii of 6 and 8 Å is more favored for hydrophobic interactions and indicates that the influence of each residue over the surrounding medium extends predominantly up to a distance of 8 Å. The analysis of the surrounding amino acid residues associated with each type of residue shows that there is a definite tendency for each type of residue to associate with specific residues. The variation in environment is found even within the polar group as well as within the nonpolar group of residues. The surrounding residues associated with isoleucine, leucine, and valine are purely nonpolar. Proline, a nonpolar residue, is often surrounded by polar residues. The surrounding nonpolar nature of the tryptophan and tyrosine residues implies that even a single polar atom in a nonpolar side chain is sufficient to reduce their hydrophobic environment.
There exists a high degree of mutual residue-residue cooperativity between the pairs glutamic acid-lysine, methionine-arginine, asparagine-tryptophan, and glutamine-proline, and the mutual residue-residue noncooperativity is high for the pairs methionine-aspartic acid, cysteine-glutamic acid, histidine-glutamine, and leucine-asparagine. The formation of secondary and tertiary structures is discussed in terms of the preferred environment and mutual cooperativity among various types of amino acid residues.

13.
Side chain prediction is an integral component of computational antibody design and structure prediction. Current antibody modelling tools use backbone-dependent rotamer libraries with conformations taken from general proteins. Here we present our antibody-specific rotamer library, where rotamers are binned according to their immunogenetics (IMGT) position, rather than their local backbone geometry. We find that for some amino acid types at certain positions, only a restricted number of side chain conformations are ever observed. Using this information, we are able to reduce the breadth of the rotamer sampling space. Based on our rotamer library, we built a side chain predictor, position-dependent antibody rotamer swapper (PEARS). On a blind test set of 95 antibody model structures, PEARS had the highest average χ1 and χ1+2 accuracy (78.7% and 64.8%) compared to three leading backbone-dependent side chain predictors. Our use of IMGT position, rather than backbone ϕ/ψ, meant that PEARS was more robust to errors in the backbone of the model structure. PEARS also achieved the lowest number of side chain-side chain clashes. PEARS is freely available as a web application at http://opig.stats.ox.ac.uk/webapps/pears .

14.
The relations of the binding free energies in a dataset of 69 protein complexes to the numbers of interfacial atom pairs, and to the atomic distances of those pairs, are analyzed. It is found that the interfacial main-chain atom pairs contribute more to the correlation than the interfacial side chain atom pairs do, and that the polar atom pairs contribute more than the non-polar atom pairs do. Interfacial atom pairs with an atomic distance in the range of 6-12 Å are the most important for explaining the differences in binding free energies in the dataset.
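Counting interfacial atom pairs within a distance window is the basic quantity behind this analysis. The sketch below counts pairs between two chains whose separation lies in the 6 to 12 Å range. The half-open interval, the function name, and the coordinates are assumptions for illustration; the abstract does not state how boundaries or atom types were handled.

```python
import math


def count_interfacial_pairs(atoms_a, atoms_b, d_min=6.0, d_max=12.0):
    """Count atom pairs (one atom from each chain) with d_min <= distance < d_max."""
    count = 0
    for a in atoms_a:
        for b in atoms_b:
            if d_min <= math.dist(a, b) < d_max:
                count += 1
    return count


# Made-up coordinates (in Å) for atoms of two binding partners.
chain_a_atoms = [(0.0, 0.0, 0.0), (0.0, 0.0, 3.0)]
chain_b_atoms = [(0.0, 0.0, 8.0), (0.0, 0.0, 10.0), (0.0, 0.0, 20.0)]
print(count_interfacial_pairs(chain_a_atoms, chain_b_atoms))
```

The double loop is quadratic in the number of atoms; for whole complexes a spatial grid or k-d tree would normally be used instead.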

15.
The similarity between the average distance maps (ADMs; Kikuchi et al., 1988a), that is, the predicted contact maps, of two proteins with homologous tertiary structures is examined. The shapes of ADMs are compared by superposing the ADMs of two homologous proteins. We also compare the shapes of the actual contact maps for each pair of proteins. We search for the optimal superposition of each pair of maps, the one in which the two proteins appear most similar. It is concluded that two ADMs are also similar when the actual tertiary structures of the two proteins are similar. A criterion for similarity of maps is also proposed. The possibility of applying this method to detect weak homology between protein structures is discussed.

16.
We present a solvable model that predicts the folding kinetics of two-state proteins from their native structures. The model is based on conditional chain entropies. It assumes that folding processes are dominated by small-loop closure events that can be inferred from native structures. For CI2, the src SH3 domain, TNfn3, and protein L, the model reproduces two-state kinetics, and it predicts well the average Φ-values for secondary structures. The barrier to folding is the formation of predominantly local structures such as helices and hairpins, which are needed to bring nonlocal pairs of amino acids into contact.

17.
Sink H, Rehm EJ, Richstone L, Bulls YM, Goodman CS. Cell 2001, 105(1): 57-67
At specific choice points in the periphery, subsets of motor axons defasciculate from other axons in the motor nerves and steer into their muscle target regions. Using a large-scale genetic screen in Drosophila, we identified the sidestep (side) gene as essential for motor axons to leave the motor nerves and enter their muscle targets. side encodes a target-derived transmembrane protein (Side) that is a novel member of the immunoglobulin superfamily (IgSF). Side is expressed on embryonic muscles during the period when motor axons leave their nerves and extend onto these muscles. In side mutant embryos, motor axons fail to extend onto muscles and instead continue to extend along their motor nerves. Ectopic expression of Side results in extensive and prolonged motor axon contact with inappropriate tissues expressing Side.

18.
Li X, Hu C, Liang J. Proteins 2003, 53(4): 792-805
Protein representation and potential function are two important ingredients for studying protein folding, equilibrium thermodynamics, and sequence design. We introduce a novel geometric representation of protein contact interactions using the edge simplices from the alpha shape of the protein structure. This representation can eliminate implausible neighbors that are not in physical contact, and can avoid spurious contact between two residues when a third residue is between them. We developed a statistical alpha contact potential using an odds-ratio model. A studentized bootstrap method was then introduced to assess the 95% confidence intervals for each of the 210 propensity parameters. We found, with confidence, that there is significant long-range propensity (>30 residues apart) for hydrophobic interactions. We tested the alpha contact potential for native structure discrimination using several sets of decoy structures, and found that it often performs comparably with atom-based potentials requiring many more parameters. We also show that accurate geometric representation is important, and that the alpha contact potential performs better than a potential defined by a cutoff distance between geometric centers of side chains. Hierarchical clustering of alpha contact potentials reveals natural grouping of residues. To explore the relationship between shape and physicochemical representations, we tested the minimum alphabet size necessary for native structure discrimination. We found that there is no significant difference in discrimination performance when alphabet size varies from 7 to 20, if geometry is represented accurately by alpha simplicial edges. This result suggests that the geometry of packing plays an important role, but the specific residue types are often interchangeable.

19.
A parameterized algorithm for protein structure alignment. (Citations: 2 total, 0 self-citations, 2 by others)
This paper proposes a parameterized polynomial time approximation scheme (PTAS) for aligning two protein structures, in the case where one protein structure is represented by a contact map graph and the other by a contact map graph or a distance matrix. If the sequential order of alignment is not required, the time complexity is polynomial in the protein size and exponential with respect to two parameters D(u)/D(l) and D(c)/D(l), which usually can be treated as constants. In particular, D(u) is the distance threshold determining if two residues are in contact or not, D(c) is the maximally allowed distance between two matched residues after two proteins are superimposed, and D(l) is the minimum inter-residue distance in a typical protein. This result clearly demonstrates that the computational hardness of the contact map based protein structure alignment problem is related not to protein size but to several parameters modeling the problem. The result is achieved by decomposing the protein structure using tree decomposition and discretizing the rigid-body transformation space. Preliminary experimental results indicate that on a Linux PC, it takes from ten minutes to one hour to align two proteins with approximately 100 residues.

20.
Statistical potentials for fold assessment (Citations: 3 total, 0 self-citations, 3 by others)
A protein structure model generally needs to be evaluated to assess whether or not it has the correct fold. To improve fold assessment, four types of a residue-level statistical potential were optimized, including distance-dependent, contact, Phi/Psi dihedral angle, and accessible surface statistical potentials. Approximately 10,000 test models with the correct and incorrect folds were built by automated comparative modeling of protein sequences of known structure. The criterion used to discriminate between the correct and incorrect models was the Z-score of the model energy. The performance of a Z-score was determined as a function of many variables in the derivation and use of the corresponding statistical potential. The performance was measured by the fractions of the correctly and incorrectly assessed test models. The most discriminating combination of any one of the four tested potentials is the sum of the normalized distance-dependent and accessible surface potentials. The distance-dependent potential that is optimal for assessing models of all sizes uses both Cα and Cβ atoms as interaction centers, distinguishes between all 20 standard residue types, has a distance range of 30 Å, and is derived and used by taking into account the sequence separation of the interacting atom pairs. The terms for the sequentially local interactions are significantly less informative than those for the sequentially nonlocal interactions. The accessible surface potential that is optimal for assessing models of all sizes uses Cβ atoms as interaction centers and distinguishes between all 20 standard residue types. The performance of the tested statistical potentials is not likely to improve significantly with an increase in the number of known protein structures used in their derivation.
The parameters of fold assessment whose optimal values vary significantly with model size include the size of the known protein structures used to derive the potential and the distance range of the accessible surface potential. Fold assessment by statistical potentials is most difficult for the very small models. This difficulty presents a challenge to fold assessment in large-scale comparative modeling, which produces many small and incomplete models. The results described in this study provide a basis for an optimal use of statistical potentials in fold assessment.
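The Z-score criterion mentioned in this abstract can be written down in a few lines. The sketch below standardizes a model's energy against a set of reference energies. How that reference distribution is produced is not stated in the abstract, so the reference list here is a made-up placeholder, and the use of the population standard deviation is likewise an assumption.

```python
import statistics


def energy_z_score(model_energy, reference_energies):
    """Z-score of a model energy relative to a reference energy distribution."""
    mean = statistics.mean(reference_energies)
    spread = statistics.pstdev(reference_energies)  # population standard deviation
    if spread == 0:
        raise ValueError("reference energies have zero spread")
    return (model_energy - mean) / spread


# Hypothetical energies in arbitrary units; lower means more favorable.
reference = [-10.0, -12.0, -8.0, -11.0, -9.0]
print(energy_z_score(-16.0, reference))
```

A strongly negative Z-score means the model's energy is well below the reference distribution; a cutoff on this value is then what separates models assessed as correct from those assessed as incorrect.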
