Found 20 similar articles (search time: 15 ms)
1.
We describe an effective procedure for modeling the structures of simple transmembrane helix homo-oligomers. The method differs from many previous approaches in that the only structural constraint we use to help select the correct model is the oligomerization state of the protein. The method involves the following steps: (1) perform 100-250 independent Monte Carlo energy minimizations of helix pairs to produce a large collection of well-packed structures; (2) filter the minimized structures to find those that are consistent with the expected symmetry of the oligomer; (3) cluster the structures that pass the symmetry filter; and (4) select a representative of the most populous cluster as the final prediction. We applied the method to the transmembrane helices of five proteins and compared our results to the available experimental data. Our predictions of glycophorin A, neu, the M2 channel and phospholamban resulted in a single model for each protein that agreed with the experimental results. In the case of erbB-2, however, we obtained three structurally distinct clusters of approximately equal sizes, so it was not possible to identify a clearly favored structure. This may reflect a real heterogeneity of packing modes for erbB-2, which is known to interact with different receptor subunits. Our method should be useful for obtaining structural models of transmembrane domains, improving our understanding of structure/function relationships for particular membrane proteins.
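Steps (2) to (4) of the procedure in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: each minimized helix pair is reduced to a hypothetical tuple `(crossing_angle, rotation_a, rotation_b)` in degrees, the symmetry filter simply requires the two rotation angles to be similar, clustering is a greedy leader scheme, and the representative is the cluster medoid. Angle wrap-around is ignored for brevity.

```python
def passes_symmetry(structure, tol=20.0):
    """Hypothetical symmetry filter: both helices must show nearly the same face."""
    _, rot_a, rot_b = structure
    return abs(rot_a - rot_b) <= tol


def distance(s, t):
    """Euclidean distance between two descriptor tuples (no angle wrap-around)."""
    return sum((a - b) ** 2 for a, b in zip(s, t)) ** 0.5


def cluster(structures, cutoff=15.0):
    """Greedy leader clustering: join the first cluster whose leader is within cutoff."""
    clusters = []
    for s in structures:
        for members in clusters:
            if distance(s, members[0]) <= cutoff:
                members.append(s)
                break
        else:
            clusters.append([s])
    return clusters


def representative(members):
    """Medoid: the member with the smallest summed distance to all other members."""
    return min(members, key=lambda s: sum(distance(s, t) for t in members))


def predict(structures):
    """Filter by symmetry, cluster, and return a representative of the largest cluster."""
    kept = [s for s in structures if passes_symmetry(s)]
    if not kept:
        return None
    largest = max(cluster(kept), key=len)
    return representative(largest)


# Toy input standing in for the Monte Carlo minimized helix pairs.
minimized = [(-40, 10, 12), (-42, 8, 11), (-38, 12, 9), (20, 100, 95), (22, 90, 170)]
prediction = predict(minimized)
```

If several clusters have similar sizes, as reported for erbB-2, `max(..., key=len)` would still return one of them, so a real workflow would need to inspect the cluster size distribution before trusting a single model.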
2.
Pairs of helices in transmembrane (TM) proteins are often tightly packed. We present a scoring function and a computational methodology for predicting the tertiary fold of a pair of alpha-helices such that its chances of being tightly packed are maximized. Since the number of TM protein structures solved to date is small, it seems unlikely that a reliable scoring function derived statistically from the known set of TM protein structures will be available in the near future. We therefore constructed a scoring function based on the qualitative insights gained in the past two decades from the solved structures of TM and soluble proteins. In brief, we reward the formation of contacts between small amino acid residues such as Gly, Cys, and Ser, which are known to promote dimerization of helices, and penalize the burial of large amino acid residues such as Arg and Trp. As a case study, we show that our method predicts the native structure of the TM homodimer glycophorin A (GpA) to be, in essence, at the global score optimum. In addition, by correlating our results with empirical point mutations on this homodimer, we demonstrate that our method can be a helpful adjunct to mutation analysis. We present a data set of canonical alpha-helices from the solved structures of TM proteins and provide a set of programs for analyzing it (http://ashtoret.tau.ac.il/~sarel). From this data set we derived 11 helix pairs, and conducted searches around their native states as a further test of our method. Approximately 73% of our predictions showed a reasonable fit (RMS deviation <2 Å) with the native structures, compared to the success rate of 8% expected by chance. The search method we employ is less effective for helix pairs that are connected via short loops (<20 amino acid residues), indicating that short loops may play an important role in determining the conformation of alpha-helices in TM proteins.
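The qualitative scoring idea in this abstract (reward small-residue contacts, penalize burial of large residues) can be sketched as below. This is only an illustration under stated assumptions: the residue sets contain just the examples named in the abstract, the unit `reward` and `penalty` weights are hypothetical, and a contact is rewarded only when both partners are small. The published function is not reproduced here.

```python
# Residue classes limited to the examples named in the abstract (one-letter codes).
SMALL = {"G", "C", "S"}  # Gly, Cys, Ser
LARGE = {"R", "W"}       # Arg, Trp


def pair_score(contacts, buried, reward=1.0, penalty=1.0):
    """Score a helix-pair conformation; higher means better packed.

    contacts: iterable of (residue, residue) pairs in contact across the interface
    buried:   iterable of residues buried in the interface
    reward, penalty: hypothetical unit weights, not values from the paper
    """
    score = 0.0
    for a, b in contacts:
        if a in SMALL and b in SMALL:
            score += reward  # small-small contact across the interface
    for residue in buried:
        if residue in LARGE:
            score -= penalty  # bulky residue buried between the helices
    return score


# Two small-small contacts, one irrelevant contact, one buried Trp.
example = pair_score([("G", "G"), ("G", "S"), ("L", "V")], ["W", "L"])
```

A conformational search would evaluate `pair_score` for each candidate helix-pair geometry and keep the highest-scoring one.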
3.
The transmembrane (TM) domains of most membrane proteins consist of helix bundles. The seemingly simple task of TM helix bundle assembly has turned out to be extremely difficult. This is true even for simple TM helix bundle proteins, i.e., those that have the simple form of compact TM helix bundles. Herein, we present a computational method that is capable of generating native-like structural models for simple TM helix bundle proteins having modest numbers of TM helices based on sequence conservation patterns. Thus, the only requirement for our method is the presence of more than 30 homologous sequences for an accurate extraction of sequence conservation patterns. The prediction method first computes a number of representative well-packed conformations for each pair of contacting TM helices, and then a library of tertiary folds is generated by overlaying overlapping TM helices of the representative conformations. This library is scored using sequence conservation patterns, and a subsequent clustering analysis yields five final models. Assuming that neighboring TM helices in the sequence contact each other (but not that TM helices A and G contact each other), the method produced structural models of Cα atom root-mean-square deviation (CA RMSD) of 3-5 Å from corresponding crystal structures for bacteriorhodopsin, halorhodopsin, sensory rhodopsin II, and rhodopsin. In blind predictions, this type of contact knowledge is not available. Mimicking this, predictions were made for the rotor of the V-type Na(+)-adenosine triphosphatase without such knowledge. The CA RMSD between the best model and its crystal structure is only 3.4 Å, and its contact accuracy reaches 55%. Furthermore, the model correctly identifies the binding pocket for sodium ion. These results demonstrate that the method can be readily applied to ab initio structure prediction of simple TM helix bundle proteins having modest numbers of TM helices.
4.
The prediction of the secondary structure of proteins from their amino acid sequences remains a key component of many approaches to the protein folding problem. The most abundant form of regular secondary structure in proteins is the alpha-helix, in which specific residue preferences exist at the N-terminal locations. Propensities derived from these observed amino acid frequencies in the Protein Data Bank (PDB) database correlate well with experimental free energies measured for residues at different N-terminal positions in alanine-based peptides. We report a novel method to exploit these data to improve protein secondary structure prediction through identification of the correct N-terminal sequences in alpha-helices, based on existing popular methods for secondary structure prediction. With this algorithm, the number of correctly predicted alpha-helix start positions was improved from 30% to 38%, while the overall prediction accuracy (Q3) remained the same, using cross-validated testing. Although the algorithm was developed and tested on multiple sequence alignment-based secondary structure predictions, it was also able to improve the predictions of start locations by methods that use single sequences to make their predictions. Furthermore, the residue frequencies at N-terminal positions of the improved predictions better reflect those seen at the N-terminal positions of alpha-helices in proteins. This has implications for areas such as comparative modeling, where a more accurate prediction of the N-terminal regions of alpha-helices should benefit attempts to model adjacent loop regions. The algorithm is available as a Web tool, located at http://rocky.bms.umist.ac.uk/elephant.
5.
6.
Transmembrane proteins (TMPs) are important drug targets because they are essential for signaling, regulation, and transport. Despite important breakthroughs, experimental structure determination remains challenging for TMPs. Various methods have bridged the gap by predicting transmembrane helices (TMHs), but room for improvement remains. Here, we present TMSEG, a novel method identifying TMPs and accurately predicting their TMHs and their topology. The method combines machine learning with empirical filters. Testing it on a non‐redundant dataset of 41 TMPs and 285 soluble proteins, and applying strict performance measures, TMSEG outperformed the state‐of‐the‐art in our hands. TMSEG correctly distinguished helical TMPs from other proteins with a sensitivity of 98 ± 2% and a false positive rate as low as 3 ± 1%. Individual TMHs were predicted with a precision of 87 ± 3% and recall of 84 ± 3%. Furthermore, in 63 ± 6% of helical TMPs the placement of all TMHs and their inside/outside topology was correctly predicted. There are two main features that distinguish TMSEG from other methods. First, the errors in finding all helical TMPs in an organism are significantly reduced. For example, in human this leads to 200 and 1600 fewer misclassifications compared to the second and third best method available, and 4400 fewer mistakes than by a simple hydrophobicity‐based method. Second, TMSEG provides an add‐on improvement for any existing method to benefit from. Proteins 2016; 84:1706–1716. © 2016 Wiley Periodicals, Inc.
7.
Gottschalk KE Adams PD Brunger AT Kessler H 《Protein science : a publication of the Protein Society》2002,11(7):1800-1812
Integrins are composed of noncovalently bound dimers of an alpha- and a beta-subunit. They play an important role in cell-matrix adhesion and signal transduction through the cell membrane. Signal transduction can be initiated by the binding of intracellular proteins to the integrin. Binding leads to a major conformational change. The change is passed on to the extracellular domain through the membrane. The affinity of the extracellular domain to certain ligands increases; thus at least two states exist, a low-affinity and a high-affinity state. The conformations and conformational changes of the transmembrane (TM) domain are the focus of our interest. We show by a global search of helix-helix interactions that the TM sections of the integrin family are capable of adopting a structure similar to the structure of the homodimeric TM protein Glycophorin A. For the alpha(IIb)beta(3) integrin, this structural motif represents the high-affinity state. A second conformation of the TM domain of alpha(IIb)beta(3) is identified as the low-affinity state by known mutational and nuclear magnetic resonance (NMR) studies. A transition between these two states was determined by molecular dynamics (MD) calculations. On the basis of these calculations, we propose a three-state mechanism.
8.
Dimerization models of c-erbB2 transmembrane domains (Leu651-Ile675) are studied by molecular mechanics and molecular dynamics simulations. Both wild-type and Glu-mutated transmembrane helices exhibit the same relative orientation for favorable associations and dimerize preferentially in left-handed coiled-coil structures. The mutation point 659 belongs to the interfacing residues, and in the transforming domain, symmetric hydrogen bonds between Glu carboxylic groups stabilize the dimeric structure. The same helix packing found for the wild-type dimers, apart from the side-chain to side-chain hydrogen bonds, suggests that the transmembrane domains dimerize by a similar process. Structural and energetic characterizations of the models are presented. © 1997 John Wiley & Sons, Inc. Biopoly 42: 157–168, 1997
9.
Optimizing weighting factors for a linear combination of terms in a scoring function is a crucial step for success in developing a threading algorithm. Usually weighting factors are optimized to yield the highest success rate on a training dataset, and the determined constant values for the weighting factors are used for any target sequence. Here we explore completely different approaches to handle weighting factors for a scoring function of threading. Throughout this study we use a model system of gapless threading using a scoring function with two terms combined by a weighting factor, a main chain angle potential and a residue contact potential. First, we demonstrate that the optimal weighting factor for recognizing the native structure differs from target sequence to target sequence. Then, we present three novel threading methods which circumvent training dataset-based weighting factor optimization. The basic idea of the three methods is to employ different weighting factor values and finally select a template structure for a target sequence by examining characteristics of the distribution of scores computed by using the different weighting factor values. Interestingly, the success rate of our approaches is comparable to the conventional threading method where the weighting factor is optimized based on a training dataset. Moreover, when the size of the training set available for the conventional threading method is small, our approach often performs better. In addition, we predict a target-specific weighting factor optimal for a target sequence by an artificial neural network from features of the target sequence. Finally, we show that our novel methods can be used to assess the confidence of prediction of a conventional threading with an optimized constant weighting factor by considering consensus prediction between them. Implications for the underlying energy landscape of protein folding are discussed.
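The general idea of scanning several weighting factors instead of fixing one can be sketched as follows. The abstract does not spell out the three methods, so the selection rule used here (pick the template that ranks first under the largest number of weights) is an illustrative assumption and should not be read as one of the authors' methods. The combined form `angle + w * contact`, the template names, and the energy values are likewise hypothetical; lower scores are treated as better.

```python
from collections import Counter


def combined_score(angle_term, contact_term, w):
    """Two-term threading score combined by one weighting factor (lower is better)."""
    return angle_term + w * contact_term


def select_template(templates, weights):
    """Return the template that ranks first under the most weighting factors.

    templates: dict mapping a template name to (angle_term, contact_term)
    weights:   iterable of weighting-factor values to scan
    """
    votes = Counter()
    for w in weights:
        best = min(templates, key=lambda name: combined_score(*templates[name], w))
        votes[best] += 1
    return votes.most_common(1)[0][0]


# Hypothetical scores of one target sequence threaded onto three templates.
templates = {"A": (-1.0, -3.0), "B": (-2.0, -1.0), "C": (-0.5, -0.5)}
choice = select_template(templates, [0.0, 0.25, 1.0, 2.0, 4.0])
```

In this toy case template B wins for the two smallest weights and template A for the three largest, so the vote returns A without any weight having been trained on a dataset.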
10.
Lazaridis T 《Proteins》2003,52(2):176-192
A simple extension of the EEF1 energy function to heterogeneous membrane-aqueous media is proposed. The extension consists of (a) development of solvation parameters for a nonpolar phase using experimental data for the transfer of amino acid side-chains from water to cyclohexane, (b) introduction of a heterogeneous membrane-aqueous system by making the reference solvation free energy of each atom dependent on the vertical coordinate, (c) a modification of the distance-dependent dielectric model to account for reduced screening of electrostatic interactions in the membrane, and (d) an adjustment of the EEF1 aqueous model in light of recent calculations of the potential of mean force between amino acid side-chains in water. The electrostatic model is adjusted to match experimental observations for polyalanine, polyleucine, and the glycophorin A dimer. The resulting energy function (IMM1) reproduces the preference of Trp and Tyr for the membrane interface, gives reasonable energies of insertion into or adsorption onto a membrane, and allows stable 1-ns MD simulations of the glycophorin A dimer. We find that the lowest-energy orientation of melittin in bilayers varies, depending on the thickness of the hydrocarbon layer.
11.
Gang Xu Tianqi Ma Qinghua Wang Jianpeng Ma 《Protein science : a publication of the Protein Society》2019,28(6):1157-1162
We introduce a side‐chain‐inclusive scoring function, named OPUS‐SSF, for ranking protein structural models. The method builds a scoring function based on the native distributions of the coordinate components of certain anchoring points in a local molecular system for peptide segments of 5, 7, 9, and 11 residues in length. Differing from our previous OPUS‐CSF [Xu et al., Protein Sci. 2018; 27: 286–292], which exclusively uses main chain information, OPUS‐SSF employs anchoring points on side chains so that the effect of side chains is taken into account. The performance of OPUS‐SSF was tested on 15 decoy sets containing a total of 603 proteins, and 571 of them had their native structures recognized from their decoys. Similar to OPUS‐CSF, OPUS‐SSF does not employ the Boltzmann formula in constructing scoring functions. The results indicate that OPUS‐SSF has achieved a significant improvement in decoy recognition and it should be a very useful tool for protein structural prediction and modeling.
12.
Using an efficient iterative method, we have developed a distance-dependent knowledge-based scoring function to predict protein-protein interactions. The function, referred to as ITScore-PP, was derived using the crystal structures of a training set of 851 protein-protein dimeric complexes containing true biological interfaces. The key idea of the iterative method for deriving ITScore-PP is to improve the interatomic pair potentials by iteration, until the pair potentials can distinguish true binding modes from decoy modes for the protein-protein complexes in the training set. The iterative method circumvents the challenging reference state problem in deriving knowledge-based potentials. The derived scoring function was used to evaluate the ligand orientations generated by ZDOCK 2.1 and the native ligand structures on a diverse set of 91 protein-protein complexes. For the bound test cases, ITScore-PP yielded a success rate of 98.9% if the top 10 ranked orientations were considered. For the more realistic unbound test cases, the corresponding success rate was 40.7%. Furthermore, for faster orientational sampling, several residue-level knowledge-based scoring functions were also derived following a similar iterative procedure. Among them, the scoring function that uses the side-chain center of mass (SCM) to represent a residue, referred to as ITScore-PP(SCM), showed the best performance and yielded success rates of 71.4% and 30.8% for the bound and unbound cases, respectively, when the top 10 orientations were considered. ITScore-PP was further tested using two other published protein-protein docking decoy sets, the ZDOCK decoy set and the RosettaDock decoy set. In addition to binding mode prediction, the binding scores predicted by ITScore-PP also correlated well with the experimentally determined binding affinities, yielding a correlation coefficient of R = 0.71 on a test set of 74 protein-protein complexes with known affinities.
ITScore-PP is computationally efficient. The average run time for ITScore-PP was about 0.03 second per orientation (including optimization) on a personal computer with a 3.2 GHz Pentium IV CPU and 3.0 GB RAM. ITScore-PP(SCM) is about an order of magnitude faster than ITScore-PP. ITScore-PP and/or ITScore-PP(SCM) can be combined with efficient protein docking software to study protein-protein recognition.
13.
Transmembrane beta-barrel (TMB) proteins are embedded in the outer membrane of gram-negative bacteria, mitochondria, and chloroplasts. Despite their importance, very few nonhomologous TMB structures have been determined by X-ray diffraction because of the experimental difficulty encountered in crystallizing transmembrane proteins. We introduce the program partiFold to investigate the folding landscape of TMBs. By computing the Boltzmann partition function, partiFold estimates inter-beta-strand residue interaction probabilities, predicts contacts and per-residue X-ray crystal structure B-values, and samples conformations from the Boltzmann low energy ensemble. This broad range of predictive capabilities is achieved using a single, parameterizable grammatical model to describe potential beta-barrel supersecondary structures, combined with a novel energy function of stacked amino acid pair statistical potentials. PartiFold outperforms existing programs for inter-beta-strand residue contact prediction on TMB proteins, offering both higher average predictive accuracy and more consistent results. Moreover, the integration of these contact probabilities inside a stochastic contact map can be used to infer a more meaningful picture of the TMB folding landscape, which cannot be achieved with other methods. partiFold's predictions of B-values are competitive with recent methods specifically designed for this problem. Finally, we show that sampling TMBs from the Boltzmann ensemble matches the X-ray crystal structure better than single structure prediction methods. A webserver running partiFold is available at http://partiFold.csail.mit.edu/.
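How a Boltzmann partition function yields residue contact probabilities can be shown on a toy ensemble. This sketch enumerates a handful of conformations explicitly, which is an assumption made for clarity; the abstract describes a grammatical model, and nothing here reproduces partiFold's actual algorithm or energy function. The conformations, energies, and `kT = 1.0` are hypothetical.

```python
from math import exp


def contact_probabilities(conformations, kT=1.0):
    """Return (Z, probabilities) for a small, explicitly enumerated ensemble.

    conformations: list of (energy, contacts), where contacts is a list of
                   (i, j) residue index pairs present in that conformation
    kT:            thermal energy in the same units as the energies (assumed 1.0)
    """
    weights = [exp(-energy / kT) for energy, _ in conformations]
    z = sum(weights)  # partition function Z = sum over states of exp(-E / kT)
    probabilities = {}
    for weight, (_, contacts) in zip(weights, conformations):
        for contact in contacts:
            # P(contact) = summed Boltzmann weight of states containing it, divided by Z
            probabilities[contact] = probabilities.get(contact, 0.0) + weight / z
    return z, probabilities


# Two equal-energy toy conformations sharing one strand-strand contact.
ensemble = [(0.0, [(1, 10)]), (0.0, [(1, 10), (2, 9)])]
z, probs = contact_probabilities(ensemble)
```

Sampling a conformation with probability `weight / z` would give the Boltzmann-ensemble sampling mentioned in the abstract, again only for ensembles small enough to enumerate.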
14.
The protein structures of six comparative modeling targets were predicted in a procedure that relied on improved energy minimization, without empirical rules, to position all new atoms. The structures of human nucleoside diphosphate kinase NM23-H2, HPr from Mycoplasma capricolum, 2Fe-2S ferredoxin from Haloarcula marismortui, eosinophil-derived neurotoxin (EDN), mouse cellular retinoic acid protein I (CRABP1), and P450eryf were predicted with root mean square deviations on Cα atoms of 0.69, 0.73, 1.11, 1.48, 1.69, and 1.73 Å, respectively, compared to the target crystal structures. These differences increased as the sequence similarity between the target and parent proteins decreased from about 60 to 20% identity. More residues were predicted than form the common region shared by the two crystal structures. In most cases insertions or deletions between the target and the related protein of known structure were not correctly positioned. One two residue insertion in CRABP1 was predicted in the correct conformation, while a nine residue insertion in EDN was predicted in the correct spatial region, although not in the correct conformation. The positions of common cofactors and their binding sites were predicted correctly, even when overall sequence similarity was low. © 1995 Wiley-Liss, Inc.
15.
Metal ions are crucial for protein function. They participate in enzyme catalysis, play regulatory roles, and help maintain protein structure. Current tools for predicting metal-protein interactions are based on proteins crystallized with their metal ions present (holo forms). However, a majority of resolved structures are free of metal ions (apo forms). Moreover, metal binding is a dynamic process, often involving conformational rearrangement of the binding pocket. Thus, effective predictions need to be based on the structure of the apo state. Here, we report an approach that identifies transition metal-binding sites in apo forms with a resulting selectivity >95%. Applying the approach to apo forms in the Protein Data Bank and structural genomics initiative identifies a large number of previously unknown, putative metal-binding sites, and their amino acid residues, in some cases providing a first clue to the function of the protein.
16.
Tristan I. Croll Massimo D. Sammito Andriy Kryshtafovych Randy J. Read 《Proteins》2019,87(12):1113-1127
Performance in the template-based modeling (TBM) category of CASP13 is assessed here, using a variety of metrics. Performance of the predictor groups that participated is ranked using the primary ranking score that was developed by the assessors for CASP12. This reveals that the best results are obtained by groups that include contact predictions or inter-residue distance predictions derived from deep multiple sequence alignments. In cases where there is a good homolog in the wwPDB (TBM-easy category), the best results are obtained by modifying a template. However, for cases with poorer homologs (TBM-hard), very good results can be obtained without using an explicit template, by deep learning algorithms trained on the wwPDB. Alternative metrics are introduced, to allow testing of aspects of structural models that are not addressed by traditional CASP metrics. These include comparisons to the main-chain and side-chain torsion angles of the target, and the utility of models for solving crystal structures by the molecular replacement method. The alternative metrics are poorly correlated with the traditional metrics, and it is proposed that modeling has reached a sufficient level of maturity that the best models should be expected to satisfy this wider range of criteria.
17.
A tertiary structure model of the Abl-SH3 domain is predicted by using homology modeling techniques coupled to molecular dynamics simulations. Two template proteins were used, Fyn-SH3 and Spc-SH3. The refined model was extensively checked for errors using criteria based on stereochemistry, packing, solvation free-energy, accessible surface areas, and contact analyses. The different checking methods do not totally agree, as each one evaluates a different characteristic of protein structures. Several zones of the protein are more susceptible to incorporating errors. These include residues 13, 15, 35, 39, 45, 46, 50, and 60. An interesting finding is that the measurement of the Cα chirality correlated well with the rest of the criteria, suggesting that this parameter might be a good indicator of correct local conformation. Deviations of more than 4 degrees may be indicative of poor local structure. © 1994 Wiley-Liss, Inc.
18.
We perform a systematic examination of the ability of several different high-resolution, atomic-detail scoring functions to discriminate native conformations of loops in membrane proteins from non-native but physically reasonable, or "decoy," conformations. Decoys constructed from changing a loop conformation while keeping the remainder of the protein fixed are a challenging test of energy function accuracy. Nevertheless, the best of the energy functions we examined recognized the native structure as lowest in energy around half the time, and consistently chose it as a low-energy structure. This suggests that the best of present energy functions, even without a representation of the lipid bilayer, are of sufficient accuracy to give reasonable confidence in predictions of membrane protein structure. We also constructed homology models for each structure, using other known structures in the same protein family as templates. Homology models were constructed using several scoring functions and modeling programs, but with a comparable sampling effort for each procedure. Our results indicate that the quality of sequence alignment is probably the most important factor in model accuracy for sequence identity from 20-40%; one can expect a reasonably accurate model for membrane proteins when sequence identity is greater than 30%, in agreement with previous studies. Most errors are localized in loop regions, which tend to be found outside the lipid bilayer. For the most discriminative energy functions, it appears that errors are most likely due to lack of sufficient sampling, although it should be stressed that present energy functions are still far from perfectly reliable.
19.
The use of classical molecular dynamics simulations, performed in explicit water, for the refinement of structural models of proteins generated ab initio or based on homology has been investigated. The study involved a test set of 15 proteins that were previously used by Baker and coworkers to assess the efficiency of the ROSETTA method for ab initio protein structure prediction. For each protein, four models generated using the ROSETTA procedure were simulated for periods of between 5 and 400 ns in explicit solvent, under identical conditions. In addition, the experimentally determined structure and the experimentally derived structure in which the side chains of all residues had been deleted and then regenerated using the WHATIF program were simulated and used as controls. A significant improvement in the deviation of the model structures from the experimentally determined structures was observed in several cases. In addition, it was found that in certain cases in which the experimental structure deviated rapidly from the initial structure in the simulations, indicating internal strain, the structures were more stable after regenerating the side-chain positions. Overall, the results indicate that molecular dynamics simulations on a tens to hundreds of nanoseconds time scale are useful for the refinement of homology or ab initio models of small to medium-size proteins.
20.
We developed a method for structure characterization of assembly components by iterative comparative protein structure modeling and fitting into cryo-electron microscopy (cryoEM) density maps. Specifically, we calculate a comparative model of a given component by considering many alternative alignments between the target sequence and a related template structure while optimizing the fit of a model into the corresponding density map. The method relies on the previously developed Moulder protocol that iterates over alignment, model building, and model assessment. The protocol was benchmarked using 20 varied target-template pairs of known structures with less than 30% sequence identity and corresponding simulated density maps at resolutions from 5 Å to 25 Å. Relative to the models based on the best existing sequence profile alignment methods, the percentage of Cα atoms that are within 5 Å of the corresponding Cα atoms in the superposed native structure increases on average from 52% to 66%, which is half-way between the starting models and the models from the best possible alignments (82%). The test also reveals that despite the improvements in the accuracy of the fitness function, this function is still the bottleneck in reducing the remaining errors. To demonstrate the usefulness of the protocol, we applied it to the upper domain of the P8 capsid protein of rice dwarf virus that has been studied by cryoEM at 6.8 Å. The Cα root-mean-square deviation of the model based on the remotely related template, bluetongue virus VP7, improved from 8.7 Å to 6.0 Å, while the best possible model has a Cα RMSD value of 5.3 Å. Moreover, the resulting model fits better into the cryoEM density map than the initial template structure. The method is being implemented in our program MODELLER for protein structure modeling by satisfaction of spatial restraints and will be applicable to the rapidly increasing number of cryoEM density maps of macromolecular assemblies.
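The two accuracy measures quoted in this abstract, the fraction of Cα atoms within 5 Å of their native counterparts and the Cα RMSD, can be computed as in the sketch below. It assumes the model has already been superposed on the native structure and that both are given as equal-length lists of (x, y, z) coordinates in Å, in matching residue order; the superposition step itself is not shown, and the inclusive cutoff is an assumption.

```python
from math import dist, sqrt


def fraction_within(model, native, cutoff=5.0):
    """Fraction of atoms lying within `cutoff` Å (inclusive) of their native counterpart."""
    close = sum(1 for a, b in zip(model, native) if dist(a, b) <= cutoff)
    return close / len(native)


def rmsd(model, native):
    """Root-mean-square deviation over paired, pre-superposed coordinates."""
    squared = sum(dist(a, b) ** 2 for a, b in zip(model, native))
    return sqrt(squared / len(native))


# Toy coordinates: deviations of 0, 5, 10 and 6 Å from a native placed at the origin.
native = [(0.0, 0.0, 0.0)] * 4
model = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (10.0, 0.0, 0.0), (0.0, 0.0, 6.0)]
coverage = fraction_within(model, native)
deviation = rmsd(model, native)
```

The toy values show why both measures are reported: a single atom displaced by 10 Å dominates the RMSD, while the within-5 Å fraction only counts it as one miss.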