Similar articles
Found 20 similar articles (search time: 15 ms)
1.
Huang SY, Zou X. Proteins, 2011, 79(9): 2648-2661
In this study, we have developed a statistical mechanics-based iterative method to extract statistical atomic interaction potentials from known, nonredundant protein structures. Our method circumvents the long-standing reference state problem in deriving traditional knowledge-based scoring functions by using rapid iterations through a physical, global convergence function. The rapid convergence of this physics-based method, unlike other parameter optimization methods, makes it feasible to derive distance-dependent, all-atom statistical potentials while preserving scoring accuracy. The derived potentials, referred to as ITScore/Pro, have been validated using three diverse benchmarks: the high-resolution decoy set, the AMBER benchmark decoy set, and the CASP8 decoy set. Significant improvement in performance has been achieved. Finally, comparisons between the potentials of our model and potentials of a knowledge-based scoring function with a randomized reference state have revealed the reason for the better performance of our scoring function, which could provide useful insight into the development of other physical scoring functions. The potentials developed in this study are generally applicable for structural selection in protein structure prediction.
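The iterative idea in this abstract can be illustrated with a toy sketch. The code below is not ITScore/Pro: it assumes a single pre-binned distance distribution, a uniform reference used only as a starting point, and an update of the form u <- u + step * kT * (predicted - observed). The function names, bin values, and step size are hypothetical. In the published method the predicted distribution would come from an ensemble of protein structures, not from the closed-form Boltzmann reweighting used here.

```python
import math


def predicted_distribution(potential, reference, kT=1.0):
    """Boltzmann-reweight a reference distribution by a per-bin potential."""
    weights = [ref * math.exp(-u / kT) for ref, u in zip(reference, potential)]
    total = sum(weights)
    return [w / total for w in weights]


def iterate_potential(observed, reference, step=1.0, kT=1.0, n_iter=5000):
    """Raise the potential where a bin is over-predicted, lower it where under-predicted."""
    potential = [0.0] * len(observed)
    for _ in range(n_iter):
        predicted = predicted_distribution(potential, reference, kT)
        potential = [u + step * kT * (p - o)
                     for u, p, o in zip(potential, predicted, observed)]
    return potential


# Hypothetical binned pair-distance frequencies (illustration only, not real data).
observed = [0.05, 0.30, 0.25, 0.20, 0.20]
reference = [0.20, 0.20, 0.20, 0.20, 0.20]

potential = iterate_potential(observed, reference)
final = predicted_distribution(potential, reference)
max_error = max(abs(p - o) for p, o in zip(final, observed))
```

In this toy setting the iteration stops changing only when the predicted and observed distributions agree, so the choice of reference affects the starting point but not the fixed point.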

2.
We present the results of applying a novel knowledge-based method (FILM) to the prediction of small membrane protein structures. The basis of the method is the addition of a membrane potential to the energy terms (pairwise, solvation, steric, and hydrogen bonding) of a previously developed ab initio technique for the prediction of tertiary structure of globular proteins (FRAGFOLD). The method is based on the assembly of supersecondary structural fragments taken from a library of highly resolved protein structures using a standard simulated annealing algorithm. The membrane potential has been derived by the statistical analysis of a data set made of 640 transmembrane helices with experimentally defined topology and belonging to 133 proteins extracted from the SWISS-PROT database. Results obtained by applying the method to small membrane proteins of known 3D structure show that the method is able to predict, at a reasonable accuracy level, both the helix topology and the conformations of these proteins.

3.
The folding specificity of proteins can be simulated using simplified structural models and knowledge-based pair-potentials. However, when the same models are used to simulate systems that contain many proteins, large aggregates tend to form. In other words, these models cannot account for the fact that folded, globular proteins are soluble. Here we show that knowledge-based pair-potentials, which include explicitly calculated energy terms between the solvent and each amino acid, enable the simulation of proteins that are much less aggregation-prone in the folded state. Our analysis clarifies why including a solvent term improves the foldability. The aggregation for potentials without water is due to the unrealistically attractive interactions between polar residues, causing artificial clustering. When a water-based potential is used instead, polar residues prefer to interact with water; this leads to designed protein surfaces rich in polar residues and well-defined hydrophobic cores, as observed in real protein structures. We developed a simple knowledge-based method to calculate interactions between the solvent and amino acids. The method provides a starting point for modeling the folding and aggregation of soluble proteins. Analysis of our simple model suggests that inclusion of these solvent terms may also improve off-lattice potentials for protein simulation, design, and structure prediction.

4.
Many proteins are composed of several domains that pack together into a complex tertiary structure. Multidomain proteins can be challenging for protein structure modeling, particularly those for which templates can be found for individual domains but not for the entire sequence. In such cases, homology modeling can generate high-quality models of the domains but not of the orientations between domains. Small-angle X-ray scattering (SAXS) reports the structural properties of entire proteins and has the potential for guiding homology modeling of multidomain proteins. In this article, we describe a novel multidomain protein assembly modeling method, SAXSDom, which integrates experimental knowledge from SAXS with a probabilistic Input-Output Hidden Markov model to assemble the structures of individual domains. Four SAXS-based scoring functions were developed and tested, and the method was evaluated on multidomain proteins from two public datasets. Incorporation of SAXS information improved the accuracy of domain assembly for 40 out of 46 Critical Assessment of protein Structure Prediction (CASP) multidomain protein targets and 45 out of 73 multidomain protein targets from the ab initio domain assembly dataset. The results demonstrate that SAXS data can provide useful information to improve the accuracy of domain-domain assembly. The source code and tool packages are available at https://github.com/jianlin-cheng/SAXSDom.

5.
6.
Information on relative solvent accessibility (RSA) of amino acid residues in proteins provides valuable clues to the prediction of protein structure and function. A two-stage approach with support vector machines (SVMs) is proposed, where an SVM predictor is introduced to the output of the single-stage SVM approach to take into account the contextual relationships among solvent accessibilities for the prediction. By using the position-specific scoring matrices (PSSMs) generated by PSI-BLAST, the two-stage SVM approach achieves accuracies up to 90.4% and 90.2% on the Manesh data set of 215 protein structures and the RS126 data set of 126 nonhomologous globular proteins, respectively, which are better than the highest published scores on both data sets to date. A Web server for protein RSA prediction using a two-stage SVM method has been developed and is available (http://birc.ntu.edu.sg/~pas0186457/rsa.html).

7.
A refinement protocol based on physics-based techniques established for water-soluble proteins is tested for membrane protein structures. Initial structures were generated by homology modeling and sampled via molecular dynamics simulations in explicit lipid bilayer and aqueous solvent systems. Snapshots from the simulations were selected based on scoring with either knowledge-based or implicit membrane-based scoring functions and averaged to obtain refined models. The protocol resulted in consistent and significant refinement of the membrane protein structures, similar to the performance of refinement methods for soluble proteins. Refinement success was similar between sampling in the presence of lipid bilayers and aqueous solvent, but the presence of lipid bilayers may benefit the improvement of lipid-facing residues. Scoring with knowledge-based functions (DFIRE and RWplus) was found to be as good as scoring using implicit membrane-based scoring functions, suggesting that differences in internal packing are more important than orientations relative to the membrane during the refinement of membrane protein homology models.

8.
MOTIVATION: As protein structure databases expand, protein loop modeling remains an important and yet challenging problem. Knowledge-based protein loop prediction methods have met with two challenges in methodology development: (1) loop boundaries in protein structures are frequently problematic in constructing length-dependent loop databases for protein loop predictions; (2) knowledge-based modeling of loops of unknown structure requires both aligning a query loop sequence to loop templates and ranking the loop sequence-template matches. RESULTS: We developed a knowledge-based loop prediction method that circumvents the need to construct hierarchically clustered length-dependent loop libraries. The method first predicts local structural fragments of a query loop sequence and then structurally aligns the predicted structural fragments to a set of non-redundant loop structural templates regardless of the loop length. The sequence-template alignments are then quantitatively evaluated with an artificial neural network model trained on a set of predictions with known outcomes. Prediction accuracy benchmarks indicated that the novel procedure provided an alternative approach overcoming the challenges of knowledge-based loop prediction. AVAILABILITY: http://cmb.genomics.sinica.edu.tw

9.
In this study, we investigate the extent to which techniques for homology modeling that were developed for water-soluble proteins are appropriate for membrane proteins as well. To this end we present an assessment of current strategies for homology modeling of membrane proteins and introduce a benchmark data set of homologous membrane protein structures, called HOMEP. First, we use HOMEP to reveal the relationship between sequence identity and structural similarity in membrane proteins. This analysis indicates that homology modeling is at least as applicable to membrane proteins as it is to water-soluble proteins and that acceptable models (with Cα-RMSD values to the native of 2 Å or less in the transmembrane regions) may be obtained for template sequence identities of 30% or higher if an accurate alignment of the sequences is used. Second, we show that secondary-structure prediction algorithms that were developed for water-soluble proteins perform approximately as well for membrane proteins. Third, we provide a comparison of a set of commonly used sequence alignment algorithms as applied to membrane proteins. We find that high-accuracy alignments of membrane protein sequences can be obtained using state-of-the-art profile-to-profile methods that were developed for water-soluble proteins. Improvements are observed when weights derived from the secondary structure of the query and the template are used in the scoring of the alignment, a result which relies on the accuracy of the secondary-structure prediction of the query sequence. The most accurate alignments were obtained using template profiles constructed with the aid of structural alignments. In contrast, a simple sequence-to-sequence alignment algorithm, using a membrane protein-specific substitution matrix, shows no improvement in alignment accuracy. We suggest that profile-to-profile alignment methods should be adopted to maximize the accuracy of homology models of membrane proteins.

10.
Punta M, Maritan A. Proteins, 2003, 50(1): 114-121
In this article, a membrane-propensity scale for amino acids is derived using only two ingredients: (i) a set of transmembrane helix segments from membrane protein crystal structures and (ii) the requirement that each component of the set has a free energy lower than that of a typical soluble protein sequence of the same length. Although the most widely used hydropathy scales satisfy this requirement, we use an optimization procedure that allows for extraction of an optimal scale, which correlates equally well with those scales. We show that, if the choice of the sequence database is accurate, significant knowledge-based scales, which are robust with respect to changes in the learning set, can be easily derived. The obtained scales can be used for transmembrane helix prediction. The predictive power of one of these scales is tested on membrane protein, soluble protein, and signal peptide databases, finding that its performance is comparable with that of the hydropathy scales.

11.

Background  

The use of knowledge-based potential functions is a powerful method for protein structure evaluation. A variety of formulations that evaluate single or multiple structural features of proteins have been developed and studied. The performance of functions is often evaluated by discrimination ability using decoy structures of target proteins. A function that can evaluate coarse-grained structures is advantageous in many respects, such as relatively easy generation and manipulation of model structures; however, the reduction of structural representation is often accompanied by degradation of the structure discrimination performance.

12.
13.
A suite of FORTRAN programs, PREF, is described for calculating preference functions from the database of known protein structures and for comparing smoothed profiles of sequence-dependent preferences in proteins of unknown structure. Amino acid preferences for a secondary structure are considered as functions of a sequence environment. The sequence environment of an amino acid residue in a protein is defined as an average over some physical, chemical, or statistical property of its primary structure neighbors. The frequency distribution of sequence environments in the database of soluble protein structures is approximately normal for each amino acid type of known secondary conformation. An analytical expression for the dependence of preferences on sequence environment is obtained after each frequency distribution is replaced by the corresponding Gaussian function. The preference for the α-helical conformation increases for each amino acid type with the increase of sequence environment of buried solvent-accessible surface areas. We show that a set of preference functions based on buried surface area is useful for predicting folding motifs in α-class proteins and in integral membrane proteins. The prediction accuracy for helical residues is 79% for 5 integral membrane proteins and 74% for 11 α-class soluble proteins. Most residues found in transmembrane segments of membrane proteins with known α-helical structure are predicted to be indeed in the helical conformation because of very high middle helix preferences. Both extramembrane and transmembrane helices in the photosynthetic reaction center M and L subunits are correctly predicted. We point out in the discussion that our method of conformational preference functions can identify what physical properties of the amino acids are important in the formation of particular secondary structure elements. © 1993 John Wiley & Sons, Inc.

14.
In spite of the overwhelming numbers and critical biological functions of membrane proteins, only a few have been characterized by high-resolution structural techniques. From the structures that are known, it is seen that their transmembrane (TM) segments tend to fold most often into alpha-helices. To evaluate systematically the features of these TM segments, we have taken two approaches: (1) using the experimentally measured residence behavior of specifically designed hydrophobic peptides in RP-HPLC, a scale was derived based directly on the properties of individual amino acids incorporated into membrane-interactive helices; and (2) the relative alpha-helical propensity of each of the 20 amino acids was measured in the organic non-polar environment of n-butanol. By combining the resulting hydrophobicity and helical propensity data, in conjunction with consideration of the 'threshold hydrophobicity' required for spontaneous membrane integration of protein segments, an approach was developed for prediction of TM segments wherein each must fulfill the dual requirements of hydrophobicity and helicity. Evaluated against the available high-resolution structural data on membrane proteins, the present combining method is shown to provide accurate predictions for the locations of TM helices. In contrast, no segment in soluble proteins was predicted as a 'TM helix'.
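The dual requirement described in this abstract (a segment must be both hydrophobic enough and helical enough) can be sketched as a sliding-window filter. This is an illustration under stated assumptions, not the authors' procedure: the window width, both thresholds, the function names, and the three-letter toy alphabet with its values are all made up, and the measured RP-HPLC and n-butanol scales from the paper are not reproduced here.

```python
def window_means(values, width):
    """Mean of every contiguous window of the given width."""
    return [sum(values[i:i + width]) / width
            for i in range(len(values) - width + 1)]


def predict_tm_windows(sequence, hydrophobicity, helicity, width,
                       hydro_threshold, helix_threshold):
    """Return start indices of windows that pass BOTH thresholds."""
    hydro = window_means([hydrophobicity[r] for r in sequence], width)
    helix = window_means([helicity[r] for r in sequence], width)
    return [i for i, (h, p) in enumerate(zip(hydro, helix))
            if h >= hydro_threshold and p >= helix_threshold]


# Toy alphabet with made-up values (NOT measured scales):
# 'h' = hydrophobic helix former, 'b' = hydrophobic helix breaker, 'p' = polar.
TOY_HYDROPHOBICITY = {"h": 1.0, "b": 1.0, "p": -1.0}
TOY_HELICITY = {"h": 1.0, "b": -1.0, "p": 0.0}

helical_hits = predict_tm_windows("ppphhhhhppp", TOY_HYDROPHOBICITY,
                                  TOY_HELICITY, 5, 0.9, 0.5)
breaker_hits = predict_tm_windows("pppbbbbbppp", TOY_HYDROPHOBICITY,
                                  TOY_HELICITY, 5, 0.9, 0.5)
```

The second toy sequence is just as hydrophobic as the first but fails the helicity test, which is the case a hydrophobicity-only predictor would miss.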

15.
Progress and challenges in protein structure prediction   (total citations: 2; self-citations: 0; citations by others: 2)
Depending on whether similar structures are found in the PDB library, protein structure prediction can be categorized into template-based modeling and free modeling. Although threading is an efficient tool to detect structural analogs, the advancements in methodology development have come to a steady state. Encouraging progress is observed in structure refinement, which aims at drawing template structures closer to the native; this has been mainly driven by the use of multiple structure templates and the development of hybrid knowledge-based and physics-based force fields. For free modeling, exciting examples have been witnessed in folding small proteins to atomic resolution. However, predicting structures for proteins larger than 150 residues still remains a challenge, with bottlenecks from both force field and conformational search.

16.
The structural genomics initiatives have begun with the aim to create a so-called "basic set library" of protein folds that will be used to improve protein prediction methods. Such a library is thought to require the determination of up to 10,000 new structures, including representative structures of several sequence variants from each protein fold. To meet this goal in a reasonable time frame and cost, automated systems must be utilized to clone and to identify the soluble recombinant proteins contained in multiple genomes. This paper presents such a system, developed using the genome of Caenorhabditis elegans (19,099 genes) as a model eukaryotic organism for structural genomics. This system successfully automates nearly all aspects of recombinant protein expression analysis including subcloning, bacterial growth, recombinant protein expression, protein purification, and scoring protein solubility.

17.
Given the known high-resolution structures of alpha-helical transmembrane domains, we show that there are statistically distinct classes of transmembrane interfaces which relate to the folding and oligomerization of transmembrane domains. Distinct types of interfaces have been categorized and refer to those between: the same polypeptide chain, different polypeptide chains, helices that are sequential neighbors, and those that are nonsequential. These different interfaces may reflect different phases in the mechanism of transmembrane domain folding and are consistent with the current experimental evidence pertaining to the folding and oligomerization of transmembrane domains. The classes of helix-helix interfaces have been identified in terms of the numbers and different types of pairwise amino acid interactions. The specific measures used are interaction entropy, the information content of interacting partners compared to a random set of contacts, the amino acid composition of the classes and the abundances of specific amino acid pairs in close contact. Knowledge of the clear differences in the types of helix-helix contacts helps with the derivation of knowledge-based constraints which until now have focused on only the interiors of transmembrane domains as compared to the exterior. Taken together, an in vivo model for membrane protein folding is presented, which is distinct from the familiar two-stage model. The model takes into account the different interfaces of membrane helices defined herein, and the available data regarding folding in the translocation channel.

18.
Pairs of helices in transmembrane (TM) proteins are often tightly packed. We present a scoring function and a computational methodology for predicting the tertiary fold of a pair of alpha-helices such that its chances of being tightly packed are maximized. Since the number of TM protein structures solved to date is small, it seems unlikely that a reliable scoring function derived statistically from the known set of TM protein structures will be available in the near future. We therefore constructed a scoring function based on the qualitative insights gained in the past two decades from the solved structures of TM and soluble proteins. In brief, we reward the formation of contacts between small amino acid residues such as Gly, Cys, and Ser, that are known to promote dimerization of helices, and penalize the burial of large amino acid residues such as Arg and Trp. As a case study, we show that our method predicts the native structure of the TM homodimer glycophorin A (GpA) to be, in essence, at the global score optimum. In addition, by correlating our results with empirical point mutations on this homodimer, we demonstrate that our method can be a helpful adjunct to mutation analysis. We present a data set of canonical alpha-helices from the solved structures of TM proteins and provide a set of programs for analyzing it (http://ashtoret.tau.ac.il/~sarel). From this data set we derived 11 helix pairs, and conducted searches around their native states as a further test of our method. Approximately 73% of our predictions showed a reasonable fit (RMS deviation <2 Å) with the native structures compared to the success rate of 8% expected by chance. The search method we employ is less effective for helix pairs that are connected via short loops (<20 amino acid residues), indicating that short loops may play an important role in determining the conformation of alpha-helices in TM proteins.
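The qualitative rule in this abstract (reward contacts between small residues, penalize burial of large ones) can be sketched as a simple count-based score. This is only an illustration: the residue sets contain just the examples the abstract names, and the function name, unit weights, and input format (lists of three-letter residue codes) are assumptions, not the published scoring function.

```python
# Residues the abstract gives as examples; the real function may cover more.
SMALL_RESIDUES = {"GLY", "CYS", "SER"}   # contacts between these are rewarded
LARGE_RESIDUES = {"ARG", "TRP"}          # burial of these is penalized


def packing_score(contacts, buried, contact_reward=1.0, burial_penalty=1.0):
    """Score a helix-pair conformation; higher means more favorable packing.

    contacts: iterable of (residue, residue) pairs in contact across the helices.
    buried:   iterable of residues buried at the interface.
    The unit weights are hypothetical placeholders.
    """
    small_contacts = sum(1 for a, b in contacts
                         if a in SMALL_RESIDUES and b in SMALL_RESIDUES)
    buried_large = sum(1 for r in buried if r in LARGE_RESIDUES)
    return contact_reward * small_contacts - burial_penalty * buried_large


# Two made-up conformations of the same helix pair.
tight = packing_score([("GLY", "GLY"), ("SER", "GLY"), ("LEU", "ILE")], ["LEU"])
loose = packing_score([("LEU", "ILE")], ["ARG", "TRP"])
```

A conformational search would evaluate a score of this kind over many candidate orientations and keep the best one; the search itself is outside this sketch.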

19.

Background

Although Transmembrane Proteins (TMPs) are highly important in various biological processes and pharmaceutical developments, general prediction of TMP structures is still far from satisfactory. Because TMPs have significantly different physicochemical properties from soluble proteins, current protein structure prediction tools for soluble proteins may not work well for TMPs. With the increasing number of experimental TMP structures available, template-based methods have the potential to become broadly applicable for TMP structure prediction. However, the current fold recognition methods for TMPs are not as well developed as they are for soluble proteins.

Methodology

We developed a novel TMP Fold Recognition method, TMFR, to recognize TMP folds based on sequence-to-structure pairwise alignment. The method utilizes topology-based features in alignment together with sequence profile and solvent accessibility. It also incorporates a gap penalty that depends on predicted topology structure segments. Given the differences between α-helical transmembrane proteins (αTMPs) and β-strand transmembrane proteins (βTMPs), the parameters of the scoring functions are trained separately for the two protein categories using 58 αTMPs and 17 βTMPs in a non-redundant training dataset.

Results

We compared our method with HHalign, a leading alignment tool, using a non-redundant testing dataset including 72 αTMPs and 30 βTMPs. Our method achieved 10% and 9% better accuracies than HHalign in αTMPs and βTMPs, respectively. The raw score generated by TMFR is negatively correlated with the structure similarity between the target and the template, which indicates its effectiveness for fold recognition. The result demonstrates that TMFR provides an effective TMP-specific fold recognition and alignment method.

20.