A computational method for NMR-constrained protein threading.   Total citations: 2, self-citations: 0, citations by others: 2
Protein threading provides an effective method for fold recognition and backbone structure prediction, but its application is currently limited by its prediction accuracy and scope of applicability. One way to significantly improve its usefulness is through the incorporation of underconstrained (or partial) NMR data. It is well known that the NMR method for protein structure determination applies only to small proteins and that its effectiveness decreases rapidly as the protein mass increases beyond about 30 kDa. We present, in this paper, a computational framework for applying underconstrained NMR data (that alone are insufficient for structure determination) as constraints in protein threading and also in all-atom model construction. In this study, we consider both secondary structure assignments from chemical shifts and NOE distance restraints. Our results show that both secondary structure assignments and a small number of long-range NOEs can significantly improve the threading quality in both fold recognition and threading-alignment accuracy, and can possibly extend threading's scope of applicability from homologs to analogs. An accurate backbone structure generated by NMR-constrained threading can then provide a great amount of structural information, equivalent to that provided by a large amount of NMR data, and hence can help reduce the amount of NMR data typically required for an accurate structure determination. This new technique can potentially accelerate current NMR structure determination processes and possibly expand NMR's capability to larger proteins.

Protein threading using PROSPECT: design and evaluation   Total citations: 14, self-citations: 0, citations by others: 14
Xu Y, Xu D. Proteins, 2000, 40(3):343-354
The computer system PROSPECT for protein fold recognition using the threading method is described and evaluated in this article. For a given target protein sequence and a template structure, PROSPECT guarantees to find a globally optimal threading alignment between the two. The scoring function for a threading alignment employed in PROSPECT consists of four additive terms: i) a mutation term, ii) a singleton fitness term, iii) a pairwise-contact potential term, and iv) alignment gap penalties. The current version of PROSPECT considers pair contacts only between core (alpha-helix or beta-strand) residues and alignment gaps only in loop regions. PROSPECT finds a globally optimal threading efficiently when pairwise contacts are considered only between residues that are spatially close (7 Å or less between the Cβ atoms in the current implementation). On a test set consisting of 137 pairs of target-template proteins, each pair being from the same superfamily and having sequence identity …

Yang YD, Park C, Kihara D. Proteins, 2008, 73(3):581-596
Optimizing weighting factors for a linear combination of terms in a scoring function is a crucial step in developing a threading algorithm. Usually weighting factors are optimized to yield the highest success rate on a training dataset, and the resulting constant values are used for any target sequence. Here we explore completely different approaches to handling weighting factors in a threading scoring function. Throughout this study we use a model system of gapless threading with a scoring function of two terms, a main chain angle potential and a residue contact potential, combined by a weighting factor. First, we demonstrate that the optimal weighting factor for recognizing the native structure differs from target sequence to target sequence. Then, we present three novel threading methods which circumvent training dataset-based weighting factor optimization. The basic idea of the three methods is to employ different weighting factor values and finally select a template structure for a target sequence by examining characteristics of the distribution of scores computed with the different weighting factor values. Interestingly, the success rate of our approaches is comparable to that of the conventional threading method, where the weighting factor is optimized on a training dataset. Moreover, when the training set available for the conventional threading method is small, our approach often performs better. In addition, we predict a target-specific optimal weighting factor with an artificial neural network from features of the target sequence. Finally, we show that our novel methods can be used to assess the confidence of a prediction made by conventional threading with an optimized constant weighting factor, by considering the consensus prediction between them. Implications for the underlying energy landscape of protein folding are discussed.
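
The dependence of the selected template on the weighting factor can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the mixing form `(1 - w) * angle + w * contact`, the template names, and the energy values are hypothetical, and the majority vote over weights is a simple stand-in, not one of the three methods described in the abstract.

```python
from collections import Counter

def combined_score(angle, contact, w):
    """Two-term score mixed by weighting factor w (assumed form; lower is better)."""
    return (1.0 - w) * angle + w * contact

def best_template(terms, w):
    """Return the template name with the lowest combined score at weight w."""
    return min(terms, key=lambda name: combined_score(*terms[name], w))

def consensus_over_weights(terms, weights):
    """Scan several weights and report the most frequently selected template.

    Illustrative stand-in only; the abstract's methods examine the score
    distribution in ways not specified there.
    """
    winners = Counter(best_template(terms, w) for w in weights)
    return winners.most_common(1)[0][0], winners

# Hypothetical (angle, contact) energies for three templates.
terms = {"tA": (-5.0, -1.0), "tB": (-2.0, -6.0), "tC": (-3.0, -3.0)}
weights = [0.0, 0.25, 0.5, 0.75, 1.0]
choice, counts = consensus_over_weights(terms, weights)
```

With these toy numbers the winner switches from `tA` to `tB` as `w` grows, which is the situation in which a single trained constant weight can be a poor fit for a particular target.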

Betancourt MR. Proteins, 2003, 53(4):889-907
A protein model that is simple enough to be used in protein-folding simulations but accurate enough to identify a protein native fold is described. Its geometry consists of describing the residues by one, two, or three pseudoatoms, depending on the residue size. Its energy is given by a pairwise, knowledge-based potential obtained for all the pseudoatoms as a function of their relative distance. The pseudoatomic potential is also a function of the primary chain separation and residue order. The model is tested by gapless threading on a large, representative set of known protein and decoy structures obtained from the "Decoys 'R' Us" database. It is also tested by threading on gapped decoys generated for proteins with many homologs. The gapless threading tests show nearly 98% native-structure recognition as the lowest energy structure and almost 100% as one of the three lowest energy structures for over 2200 test proteins. In decoy threading tests, the model recognized the majority of the native structures. It is also able to recognize native structures among gapped decoys, in spite of close structural similarities. The results indicate that the pseudoatomic model has native recognition ability similar to comparable atomic-based models but much better than equivalent residue-based models.

Prediction of the disulfide-bonding state of cysteine in proteins   Total citations: 5, self-citations: 0, citations by others: 5
The bonding states of cysteine play important functional and structural roles in proteins. In particular, disulfide bond formation is one of the most important factors influencing the three-dimensional fold of proteins. Proteins of known structure were used to teach computer-simulated neural networks rules for predicting the disulfide-bonding state of a cysteine given only its flanking amino acid sequence. Resulting networks make accurate predictions on sequences different from those used in training, suggesting that local sequence greatly influences cysteines in disulfide bond formation. The average prediction rate after seven independent network experiments is 81.4% for disulfide-bonded and 80.0% for non-disulfide-bonded scenarios. Predictive accuracy is related to the strength of network output activities. Network weights reveal interesting position-dependent amino acid preferences and provide a physical basis for understanding the correlation between the flanking sequence and a cysteine's disulfide-bonding state. Network predictions may be used to increase or decrease the stability of existing disulfide bonds or to aid the search for potential sites to introduce new disulfide bonds.

Multibody potentials have been of much interest recently because they take into account three-dimensional interactions related to residue packing and capture the cooperativity of these interactions in protein structures. Our goal was to combine long range multibody potentials and short range potentials to improve recognition of native structure among misfolded decoys. We optimized the weights for four-body nonsequential, four-body sequential, and short range potentials to obtain optimal model ranking results for threading and have compared these data against results obtained with other potentials (26 different coarse-grained potentials from the Potentials 'R'Us web server have been used). Our optimized multibody potentials outperform all other contact potentials in the recognition of the native structure among decoys, both for models from homology template-based modeling and from template-free modeling in CASP8 decoy sets. We have compared the results obtained for these optimized coarse-grained potentials, where each residue is represented by a single point, with results obtained by using the DFIRE potential, which takes into account atomic level information of proteins. We found that for all proteins larger than 80 amino acids our optimized coarse-grained potentials yield results comparable to those obtained with the atomic DFIRE potential.

The burial of native disulfide bonds, formed within stable structure in the regeneration of multi-disulfide-containing proteins from their fully reduced states, is a key step in the folding process, as the burial greatly accelerates the oxidative folding rate of the protein by sequestering the native disulfide bonds from thiol-disulfide exchange reactions. Nevertheless, several proteins retain solvent-exposed disulfide bonds in their native structures. Here, we have examined the impact of an easily reducible native disulfide bond on the oxidative folding rate of a protein. Our studies reveal that the susceptibility of the (40-95) disulfide bond of Y92G bovine pancreatic ribonuclease A (RNase A) to reduction results in a reduced rate of oxidative regeneration, compared with wild-type RNase A. In the native state of RNase A, Tyr 92 lies atop its (40-95) disulfide bond, effectively shielding this bond from the reducing agent, thereby promoting protein oxidative regeneration. Our work sheds light on the unique contribution of a local structural element in promoting the oxidative folding of a multi-disulfide-containing protein.

We investigate the possibility that atomic burials, as measured by their distances from the structural geometrical center, contain sufficient information to determine the tertiary structure of globular proteins. We report Monte Carlo simulated annealing results of all-atom hard-sphere models in continuous space for four small proteins: the all-beta WW-domain 1E0L, the alpha/beta protein-G 1IGD, the all-alpha engrailed homeo-domain 1ENH, and the alpha + beta engineered monomeric form of the Cro protein 1ORC. We used as energy function the sum over all atoms, labeled by $i$, of $|R_i - R_i^{*}|$, where $R_i$ is the atomic distance from the center of coordinates, or central distance, and $R_i^{*}$ is the "ideal" central distance obtained from the native structure. Hydrogen bonds were taken into consideration by the assignment of two ideal distances for backbone atoms forming hydrogen bonds in the native structure depending on the formation of a geometrically defined bond, independently of bond partner. Lowest energy final conformations turned out to be very similar to the native structure for the four proteins under investigation and a strong correlation was observed between energy and distance root mean square deviation (DRMS) from the native in the case of all-beta 1E0L and alpha/beta 1IGD. For all-alpha 1ENH and alpha + beta 1ORC the overall correlation between energy and DRMS among final conformations was not as high because some trajectories resulted in high DRMS but low energy final conformations in which alpha-helices adopted a non-native mutual orientation. Comparison between central distances and actual accessible surface areas corroborated the implicit assumption of correlation between these two quantities. The Z-score obtained with this native-centric potential in the discrimination of native 1ORC from a set of random compact structures confirmed that it contains a much smaller amount of native information when compared to a traditional contact Go potential but indicated that simple sequence-dependent burial potentials still need some improvement in order to attain a similar discriminability. Taken together, our results suggest that central distances, in conjunction with physically motivated hydrogen bond constraints, contain sufficient information to determine the native conformation of these small proteins and that a solution to the folding problem for globular proteins could arise from sufficiently accurate burial predictions from sequence followed by minimization of a burial-dependent energy function.
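
The burial energy described in this abstract, a sum over atoms of the absolute difference between the central distance and its ideal value, can be written as a short sketch. The function names and the four-atom coordinates are hypothetical, and the sketch deliberately omits the hard-sphere model and the two-distance treatment of hydrogen-bonded backbone atoms mentioned in the abstract.

```python
import math

def central_distances(coords):
    """Distance of each atom from the geometric center of all coordinates."""
    n = len(coords)
    center = tuple(sum(atom[k] for atom in coords) / n for k in range(3))
    return [math.dist(atom, center) for atom in coords]

def burial_energy(coords, ideal_distances):
    """Sum over atoms i of |R_i - R_i*| (no hydrogen-bond term in this sketch)."""
    return sum(abs(r - r_star)
               for r, r_star in zip(central_distances(coords), ideal_distances))

# Hypothetical "native" coordinates define the ideal central distances.
native = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)]
ideal = central_distances(native)

# A perturbed conformation: one atom moved away from the center.
perturbed = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)]

e_native = burial_energy(native, ideal)
e_perturbed = burial_energy(perturbed, ideal)
```

By construction the native conformation has zero energy, and any conformation whose central distances deviate from the ideal ones scores higher, which is what a Monte Carlo search over this function would try to minimize.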

We present an analysis of 10 blind predictions prepared for a recent conference, “Critical Assessment of Techniques for Protein Structure Prediction.” The sequences of these proteins are not detectably similar to those of any protein in the structure database then available, but we attempted, by a threading method, to recognize similarity to known domain folds. Four of the 10 proteins, as we subsequently learned, do indeed show significant similarity to then-known structures. For 2 of these proteins the predictions were accurate, in the sense that a similar structure was at or near the top of the list of threading scores, and the threading alignment agreed well with the corresponding structural alignment. For the best predicted model, the mean alignment error relative to the optimal structural alignment was 2.7 residues, arising entirely from small “register shifts” of strands or helices. In the analysis we attempt to identify factors responsible for these successes and failures. Since our threading method does not use gap penalties, we may readily distinguish between errors arising from our prior definition of the “cores” of known structures and errors arising from inherent limitations in the threading potential. It would appear from the results that successful substructure recognition depends most critically on accurate definition of the “fold” of a database protein. This definition must correctly delineate substructures that are, and are not, likely to be conserved during protein evolution. © 1995 Wiley-Liss, Inc.

Kim D, Xu D, Guo JT, Ellrott K, Xu Y. Protein Engineering, 2003, 16(9):641-650
A new method for fold recognition is developed and added to the general protein structure prediction package PROSPECT (http://compbio.ornl.gov/PROSPECT/). The new method (PROSPECT II) has four key features. (i) We have developed an efficient way to utilize the evolutionary information for evaluating the threading potentials including singleton and pairwise energies. (ii) We have developed a two-stage threading strategy: (a) threading using dynamic programming without considering the pairwise energy and (b) fold recognition considering all the energy terms, including the pairwise energy calculated from the dynamic programming threading alignments. (iii) We have developed a combined z-score scheme for fold recognition, which takes into consideration the z-scores of each energy term. (iv) Based on the z-scores, we have developed a confidence index, which measures the reliability of a prediction and a possible structure-function relationship based on a statistical analysis of a large data set consisting of threadings of 600 query proteins against the entire FSSP templates. Tests on several benchmark sets indicate that the evolutionary information and other new features of PROSPECT II greatly improve the alignment accuracy. We also demonstrate that the performance of PROSPECT II on fold recognition is significantly better than any other method available at all levels of similarity. Improvement in the sensitivity of the fold recognition, especially at the superfamily and fold levels, makes PROSPECT II a reliable and fully automated protein structure and function prediction program for genome-scale applications.

There are several knowledge-based energy functions that can distinguish the native fold from a pool of grossly misfolded decoys for a given sequence of amino acids. These decoys, which are typically generated by mounting, or “threading”, the sequence onto the backbones of unrelated protein structures, tend to be non-compact and quite different from the native structure: the root-mean-squared (RMS) deviations from the native are commonly in the range of 15 to 20 Å. Effective energy functions should also demonstrate a similar recognition capability when presented with compact decoys that depart only slightly in conformation from the correct structure (i.e. those with RMS deviations of ∼5 Å or less). Recently, we developed a simple yet powerful method for native fold recognition based on the tendency for native folds to form hydrophobic cores. Our energy measure, which we call the hydrophobic fitness score, is challenged to recognize the native fold from 2000 near-native structures generated for each of five small monomeric proteins. First, 1000 conformations for each protein were generated by molecular dynamics simulation at room temperature. The average RMS deviation of this set of 5000 was 1.5 Å. A total of 323 decoys had energies lower than native; however, none of these had RMS deviations greater than 2 Å. Another 1000 structures were generated for each at high temperature, in which a greater range of conformational space was explored (4.3 Å average RMS deviation). Out of this set, only seven decoys were misrecognized. The hydrophobic fitness energy of a conformation is strongly dependent upon the RMS deviation. On average our potential yields energy values which are lowest for the population of structures generated at room temperature, intermediate for those produced at high temperature and highest for those constructed by threading methods. In general, the lowest energy decoy conformations have backbones very close to native structure. The possible utility of our method for screening backbone candidates for the purpose of modelling by side-chain packing optimization is discussed.

Protein structure prediction is limited by the inaccuracy of the simplified energy functions necessary for efficient sorting over many conformations. It was recently suggested (Finkelstein, Phys Rev Lett 1998;80:4823-4825) that these errors can be reduced by energy averaging over a set of homologous sequences. This conclusion is confirmed in this study by testing protein structure recognition in gapless threading. The accuracy of recognition was estimated by the Z-score values obtained in gapless threading tests. For threading, we used 20 target proteins, each having from 20 to 70 homologs taken from the HSSP sequence base. The energy of the native structures was compared with the energy of 34 to 75 thousand alternative structures generated by threading. The energy calculations were done with our recently developed Cα atom-based phenomenological potentials. We show that averaging of protein energies over homologs reduces the Z-score from approximately -6.1 (average Z-score for individual chains) to approximately -8.1. This means that a correct fold can be found among 3 × 10^9 random folds in the first case and among 3 × 10^15 in the second. Such an increase in selectivity is important for recognition of protein folds.
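
A minimal sketch of the Z-score calculation and of averaging energies over homologs follows. The definition of Z with a population standard deviation, the toy energies, and the Gaussian-tail reading of "one correct fold among N random folds" are assumptions made here for illustration; the abstract does not state how its fold counts were derived.

```python
import math
from statistics import mean, pstdev

def z_score(native_energy, alternative_energies):
    """Z = (E_native - mean(E_alt)) / std(E_alt); more negative means better separation."""
    return (native_energy - mean(alternative_energies)) / pstdev(alternative_energies)

def homolog_averaged_z(native_by_homolog, alternatives_by_homolog):
    """Average energies over homologs structure by structure, then compute Z.

    alternatives_by_homolog[h][s] is the energy of homolog h on alternative structure s.
    """
    avg_native = mean(native_by_homolog)
    avg_alternatives = [mean(column) for column in zip(*alternatives_by_homolog)]
    return z_score(avg_native, avg_alternatives)

def gaussian_tail(z):
    """P(X <= z) for a standard normal variable (assumed energy distribution)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Hypothetical energies: two homologs, four alternative structures each.
native = [-10.0, -10.0]
alternatives = [[3.0, -3.0, 3.0, -3.0], [-3.0, 3.0, -1.0, 1.0]]

single = [z_score(n, a) for n, a in zip(native, alternatives)]
averaged = homolog_averaged_z(native, alternatives)
```

In this toy data the errors of the two homologs partly cancel, so the averaged Z-score is more negative than either individual one. Under the Gaussian assumption, `1 / gaussian_tail(-6.1)` and `1 / gaussian_tail(-8.1)` fall in the same orders of magnitude as the fold counts quoted in the abstract.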

The structures of two species of potato carboxypeptidase inhibitor with nonnative disulfide bonds were determined by molecular dynamics simulations in explicit solvent using disulfide bond constraints that have been shown to work for the native species. Ten structures were determined: five for scrambled A (disulfide bonds between Cys8-Cys27, Cys12-Cys18, and Cys24-Cys34) and five for scrambled C (disulfide bonds Cys8-Cys24, Cys12-Cys18, and Cys27-Cys34). The two scrambled species were both more solvent exposed than the native structure; the scrambled C species was more solvent exposed and less compact than the scrambled A species. Analysis of the loop regions indicates that certain loops in scrambled C are more nativelike than in scrambled A. These factors, combined with the fact that scrambled C has one native disulfide bond, may contribute to the observed faster conversion to the native structure from scrambled C than from scrambled A. Results from the PROCHECK program using the standard parameter database and a database specially constructed for small, disulfide-rich proteins indicate that the 10 scrambled structures have correct stereochemistry. Further, the results show that a characteristic feature of small, disulfide-rich proteins is that they score poorly using the standard PROCHECK parameter database. Proteins 2000;40:482-493.

Granulins (GRNs) are a family of small (~6 kDa) proteins generated by the proteolytic processing of their precursor, progranulin (PGRN), in many cell types. Both PGRN and GRNs are implicated in a plethora of biological functions, often in opposing roles to each other. Lately, GRNs have generated significant attention due to their implicated roles in neurodegenerative disorders. Despite their physiological and pathological significance, the structure-function relationships of GRNs are poorly defined. GRNs contain 12 conserved cysteines forming six intramolecular disulfide bonds, making them rather exceptional, even among the few proteins with high disulfide bond density. Solution NMR investigations in the past have revealed a unique structure containing putative interdigitated disulfide bonds for several GRNs, but GRN-3 was unsolvable due to its heterogeneity and disorder. In our previous report, we showed that abrogation of disulfide bonds in GRN-3 renders the protein completely disordered (Ghag et al., Prot Eng Des Sel 2016). In this study, we report the cellular expression and biophysical analysis of fully oxidized, native GRN-3. Our results indicate that both E. coli and human embryonic kidney (HEK) cells do not exclusively make GRN-3 with homogeneous disulfide bonds, likely due to the high cysteine density within the protein. Biophysical analysis suggests that GRN-3 structure is dominated by irregular loops held together only by disulfide bonds, which confer remarkable thermal stability on the protein despite the lack of regular secondary structure. This unusual handshake between disulfide bonds and disorder within GRN-3 could suggest a unique adaptation of intrinsically disordered proteins towards structural stability.

We have revisited the protein coarse-grained optimized potential for efficient structure prediction (OPEP). The training and validation sets consist of 13 and 16 protein targets. Because optimization depends on details of how the ensemble of decoys is sampled, trial conformations are generated by molecular dynamics, threading, greedy, and Monte Carlo simulations, or taken from publicly available databases. The OPEP parameters are varied by a genetic algorithm using a scoring function which requires that the native structure has the lowest energy, and the native-like structures have energy higher than the native structure but lower than the remote conformations. Overall, we find that OPEP correctly identifies 24 native or native-like states for 29 targets and has very similar capability to the all-atom discrete optimized protein energy model (DOPE), found recently to outperform five currently used energy models.

Shan Y, Wang G, Zhou HX. Proteins, 2001, 42(1):23-37
A homology-based structure prediction method ideally gives both a correct fold assignment and an accurate query-template alignment. In this article we show that the combination of two existing methods, PSI-BLAST and threading, leads to significant enhancement in the success rate of fold recognition. The combined approach, termed COBLATH, also yields much higher alignment accuracy than found in previous studies. It consists of two-way searches both by PSI-BLAST and by threading. In the PSI-BLAST portion, a query is used to search for hits in a library of potential templates and, conversely, each potential template is used to search for hits in a library of queries. In the threading portion, the scoring function is the sum of a sequence profile and a 6 × 6 substitution matrix between predicted query and known template secondary structure and solvent exposure. "Two-way" in threading means that the query's sequence profile is used to match the sequences of all potential templates and the sequence profiles of all potential templates are used to match the query's sequence. When tested on a set of 533 nonhomologous proteins, COBLATH was able to assign folds for 390 (73%). Among these 390 queries, 265 (68%) had root-mean-square deviations (RMSDs) of less than 8 Å between predicted and actual structures. Such a high success rate and accuracy make COBLATH an ideal tool for structural genomics.

The formation of protein disulfide bonds in the Escherichia coli periplasm by the enzyme DsbA is an inaccurate process. Many eukaryotic proteins with nonconsecutive disulfide bonds expressed in E. coli require an additional protein for proper folding, the disulfide bond isomerase DsbC. Here we report studies on a native E. coli periplasmic acid phosphatase, phytase (AppA), which contains three consecutive and one nonconsecutive disulfide bonds. We show that AppA requires DsbC for its folding. However, the activity of an AppA mutant lacking its nonconsecutive disulfide bond is DsbC-independent. An AppA homolog, Agp, a periplasmic acid phosphatase with similar structure, lacks the nonconsecutive disulfide bond but has the three consecutive disulfide bonds found in AppA. The consecutively disulfide-bonded Agp is not dependent on DsbC but is rendered dependent by engineering into it the conserved nonconsecutive disulfide bond of AppA. Taken together, these results provide support for the proposal that proteins with nonconsecutive disulfide bonds require DsbC for full activity and that disulfide bonds are formed predominantly during translocation across the cytoplasmic membrane.

Chen W, Mirny L, Shakhnovich EI. Proteins, 2003, 51(4):531-543
Here we present a simplified form of threading that uses only a 20 × 20 two-body residue-based potential and a restricted number of gaps. Despite its simplicity and transparency, the Monte Carlo-based threading algorithm performs very well in a rigorous test of fold recognition. The results suggest that by simplifying and constraining the decoy space, one can achieve better fold recognition. Fold recognition results are compared with and supplemented by a PSI-BLAST search. The statistical significance of threading results is rigorously evaluated from statistics of extremes by comparison with optimal alignments of a large set of randomly shuffled sequences. The statistical theory, based on the Random Energy Model, yields a cumulative statistical parameter, epsilon, that attests to the likelihood of correct fold recognition. A large epsilon indicates a significant energy gap between the optimal alignment and decoy alignments and, consequently, a high probability that the fold is correctly recognized. For a particular number of gaps, the epsilon parameter reaches its maximal value, and the fold is recognized. As the number of gaps further increases, the likelihood of correct fold recognition drops off. This is because the decoy space is small when gaps are restricted to a small number, but the native alignment is still well approximated, whereas unrestricted increase of the number of gaps leads to rapid growth of the number of decoys and their statistical dominance over the correct alignment. It is shown that best results are obtained when a combination of one-, two-, and three-gap threading is used. To this end, use of the epsilon parameter is crucial for rigorous comparison of results across the different decoy spaces belonging to a different number of gaps.

Disulfide bonds play diverse structural and functional roles in proteins. In tear lipocalin (TL), the conserved sole disulfide bond regulates stability and ligand binding. Probing protein structure often involves thiol selective labeling, for which removal of the disulfide bonds may be necessary. Loss of the disulfide bond may destabilize the protein, so strategies to retain the native state are needed. Several approaches were tested to regain the native conformational state in the disulfide-less protein. These included the addition of trimethylamine N-oxide (TMAO) and the substitution of the Cys residues of the disulfide bond with residues that can either form a potential salt bridge or create a hydrophobic interaction. TMAO stabilized the protein relaxed by removal of the disulfide bond. In the disulfide-less mutants of TL, 1.0 M TMAO increased the free energy change (ΔG°) significantly from 2.1 to 3.8 kcal/mol. Moderate recovery was observed for the ligand binding tested with NBD-cholesterol. Because the disulfide bond of TL is solvent exposed, the substitution of the disulfide bond with a potential salt bridge or hydrophobic interaction did not stabilize the protein. This approach should work for buried disulfide bonds. However, for proteins with solvent exposed disulfide bonds, the use of TMAO may be an excellent strategy to restore the native conformational states in disulfide-less analogs of the proteins.
