Similar literature
1.
The question of how best to compare and classify the (three‐dimensional) structures of proteins is one of the most important unsolved problems in computational biology. To help tackle this problem, we have developed a novel shape‐density superposition algorithm called 3D‐Blast which represents and superposes the shapes of protein backbone folds using the spherical polar Fourier correlation technique originally developed by us for protein docking. The utility of this approach is compared with several well‐known protein structure alignment algorithms using receiver‐operator‐characteristic plots of queries against the “gold standard” CATH database. Despite being completely independent of protein sequences and using no information about the internal geometry of proteins, our results from searching the CATH database show that 3D‐Blast is highly competitive compared to current state‐of‐the‐art protein structure alignment algorithms. A novel and potentially very useful feature of our approach is that it allows an average or “consensus” fold to be calculated easily for a given group of protein structures. We find that using consensus shapes to represent entire fold families also gives very good database query performance. We propose that using the notion of consensus fold shapes could provide a powerful new way to index existing protein structure databases, and that it offers an objective way to cluster and classify all of the currently known folds in the protein universe. Proteins 2012. © 2011 Wiley Periodicals, Inc.

2.
M. F. Thorpe  S. Banu Ozkan 《Proteins》2015,83(12):2279-2292
The most successful protein structure prediction methods to date have been template‐based modeling (TBM) or homology modeling, which predicts protein structure based on experimental structures. These high accuracy predictions sometimes retain structural errors due to incorrect templates or a lack of accurate templates in the case of low sequence similarity, making these structures inadequate in drug‐design studies or molecular dynamics simulations. We have developed a new physics based approach to the protein refinement problem by mimicking the mechanism of chaperones that rehabilitate misfolded proteins. The template structure is unfolded by selective (targeted) pulling on different portions of the protein using the geometric based technique FRODA, and then refolded using hierarchically restrained replica exchange molecular dynamics simulations (hr‐REMD). FRODA unfolding is used to create a diverse set of topologies for surveying near native‐like structures from a template and to provide a set of persistent contacts to be employed during re‐folding. We have tested our approach on 13 previous CASP targets and observed that this method of folding an ensemble of partially unfolded structures, through the hierarchical addition of contact restraints (that is, first local and then nonlocal interactions), leads to a refolding of the structure along with refinement in most cases (12/13). Although this approach yields refined models through advancement in sampling, the task of blind selection of the best refined models still needs to be solved. Overall, the method can be useful for improved sampling for low resolution models where certain portions of the structure are incorrectly modeled. Proteins 2015; 83:2279–2292. © 2015 Wiley Periodicals, Inc.

3.
Xu D  Zhang Y 《Proteins》2012,80(7):1715-1735
Ab initio protein folding is one of the major unsolved problems in computational biology owing to the difficulties in force field design and conformational search. We developed a novel program, QUARK, for template-free protein structure prediction. Query sequences are first broken into fragments of 1-20 residues, and multiple fragment structures are retrieved at each position from unrelated experimental structures. Full-length structure models are then assembled from fragments using replica-exchange Monte Carlo simulations, which are guided by a composite knowledge-based force field. A number of novel energy terms and Monte Carlo movements are introduced, and their particular contributions to enhancing the efficiency of both the force field and the search engine are analyzed in detail. The QUARK prediction procedure is described and tested on the structure modeling of 145 nonhomologous proteins. Although no global templates are used and all fragments from experimental structures with template modeling score >0.5 are excluded, QUARK can successfully construct 3D models of correct folds in one-third of cases for short proteins up to 100 residues. In the ninth community-wide Critical Assessment of protein Structure Prediction experiment, the QUARK server outperformed the second and third best servers by 18 and 47% based on the cumulative Z-score of global distance test-total scores in the FM category. Although ab initio protein folding remains a significant challenge, these data demonstrate new progress toward the solution of the most important problem in the field.
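The fragment step in this abstract (windows of 1-20 residues taken at every sequence position) can be illustrated with a minimal sketch. The function name `fragment_windows` and its parameters are hypothetical and are not taken from QUARK; the sketch only enumerates sequence windows and does not retrieve fragment structures.

```python
def fragment_windows(sequence, min_len=1, max_len=20):
    """Yield (start, length, fragment) for every window of min_len..max_len residues.

    Hypothetical helper for illustration; the length range 1-20 follows the abstract.
    """
    n = len(sequence)
    for start in range(n):
        for length in range(min_len, max_len + 1):
            if start + length > n:
                break  # longer windows would run past the end of the sequence
            yield start, length, sequence[start:start + length]


if __name__ == "__main__":
    for start, length, frag in fragment_windows("ACDE", max_len=3):
        print(start, length, frag)
```

In a real pipeline each window would presumably be used as a query against a structure library; that lookup is outside the scope of this sketch.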

4.
Ishida T  Nakamura S  Shimizu K 《Proteins》2006,64(4):940-947
We developed a novel knowledge-based residue environment potential for assessing the quality of protein structures in protein structure prediction. The potential uses the contact number of residues in a protein structure and the absolute contact number of residues predicted from its amino acid sequence using a new prediction method based on support vector regression (SVR). The contact number of an amino acid residue in a protein structure is defined as the number of residues around a given residue. First, the contact number of each residue is predicted using SVR from the amino acid sequence of a target protein. Then, the potential of the protein structure is calculated from the probability distribution of the native contact numbers corresponding to the predicted ones. The performance of this potential is compared with other score functions using decoy structures, both to identify native structures among other structures and to discriminate near-native structures from nonnative structures. This potential improves not only the ability to identify native structures from other structures but also the ability to discriminate near-native structures from nonnative structures.
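The contact number defined in this abstract (the number of residues around a given residue) can be sketched as a simple distance count. The abstract gives neither the cutoff nor the representative atom, so the 10.0 Å default and the use of one coordinate per residue are assumptions made only for illustration.

```python
import math


def contact_numbers(coords, cutoff=10.0):
    """Return, for each residue, how many other residues lie within `cutoff`.

    coords: list of (x, y, z) tuples, one representative point per residue.
    The cutoff value is an assumption; the abstract does not state one.
    """
    counts = [0] * len(coords)
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                counts[i] += 1
                counts[j] += 1
    return counts


if __name__ == "__main__":
    toy = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
    print(contact_numbers(toy, cutoff=5.0))
```

The double loop is quadratic in chain length, which is adequate for single proteins; a spatial grid would be a natural replacement for large structures.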

5.
6.
The principal bottleneck in protein structure prediction is the refinement of models from lower accuracies to the resolution observed by experiment. We developed a novel constraints‐based refinement method that identifies a high number of accurate input constraints from initial models and rebuilds them using restrained torsion angle dynamics (rTAD). We previously created a Bayesian statistics‐based residue‐specific all‐atom probability discriminatory function (RAPDF) to discriminate native‐like models by measuring the probability of accuracy for atom type distances within a given model. Here, we exploit RAPDF to score (i.e., filter) constraints from initial predictions that may or may not be close to a native‐like state, obtain consensus of top scoring constraints amongst five initial models, and compile sets with no redundant residue pair constraints. We find that this method consistently produces a large and highly accurate set of distance constraints from which to build refinement models. We further optimize the balance between accuracy and coverage of constraints by producing multiple structure sets using different constraint distance cutoffs, and note that the cutoff governs spatially near versus distant effects in model generation. This complete procedure of deriving distance constraints for rTAD simulations improves the quality of initial predictions significantly in all cases evaluated by us. Our procedure represents a significant step in solving the protein structure prediction and refinement problem, by enabling the use of consensus constraints, RAPDF, and rTAD for protein structure modeling and refinement. Proteins 2009. © 2009 Wiley‐Liss, Inc.

7.
We have developed a new approach for the analysis of interacting interfaces in protein complexes and protein quaternary structure based on cross-linking in the solid state. Protein complexes are freeze-dried under vacuum, and cross-links are introduced in the solid phase by dehydrating the protein in a nonaqueous solvent creating peptide bonds between amino and carboxyl groups of the interacting peptides. Cross-linked proteins are digested into peptides with trypsin in both H₂¹⁶O and H₂¹⁸O and then readily distinguished in mass spectra by characteristic 8 atomic mass unit (amu) shifts reflecting incorporation of two ¹⁸O atoms into each C terminus of proteolytic peptides. Computer analysis of mass spectrometry (MS) and MS/MS data is used to identify the cross-linked peptides. We demonstrated specificity and reproducibility of our method by cross-linking homo-oligomeric protein complexes of glutathione-S-transferase (GST) from Schistosoma japonicum alone or in a mixture of many other proteins. Identified cross-links were predominantly of amide origin, but six esters and thioesters were also found. The cross-linked peptides were validated against the GST monomer and dimer X-ray structures and by experimental (MS/MS) analyses. Some of the identified cross-links matched interacting peptides in the native 3D structure of GST, indicating that the structure of GST and its oligomeric complex remained primarily intact after freeze-drying. The pattern of oligomeric GST obtained in solid state was the same as that obtained in solution by Ru(II)(bpy)₃²⁺ catalyzed, oxidative "zero-length" cross-linking, confirming that it is feasible to use our strategy for analyzing the molecular interfaces of interacting proteins or peptides.

8.
9.
Newly determined protein structures are classified as belonging to a new fold if the structures are sufficiently dissimilar from all other protein structures known so far. To analyze structural similarities of proteins, structure alignment tools are used. We demonstrate that the usage of nonsequential structure alignment tools, which neglect the polypeptide chain connectivity, can yield structure alignments with significant similarities between proteins of known three-dimensional structure and newly determined protein structures that possess a new fold. The recently introduced protein structure alignment tool, GANGSTA, is specialized to perform nonsequential alignments with proper assignment of the secondary structure types by focusing on helices and strands only. In the new version, GANGSTA+, the underlying algorithms were completely redesigned, yielding enhanced quality of structure alignments, offering alignment against a larger database of protein structures, and being more efficient. We applied DaliLite, TM-align, and GANGSTA+ to three protein crystal structures considered to be novel folds. Applying GANGSTA+ to these novel folds, we find proteins in the ASTRAL40 database that possess significant structural similarities, although the alignments are nonsequential and in some cases involve secondary structure elements aligned in reverse orientation. A web server is available at http://agknapp.chemie.fu-berlin.de/gplus for pairwise alignment, visualization, and database comparison.

10.
Recent advances in modeling protein structures at the atomic level have made it possible to tackle "de novo" computational protein design. Most procedures are based on combinatorial optimization using a scoring function that estimates the folding free energy of a protein sequence on a given main-chain structure. However, the computation of the conformational entropy in the folded state is generally an intractable problem, and its contribution to the free energy is not properly evaluated. In this article, we propose a new automated protein design methodology that incorporates such conformational entropy based on statistical mechanics principles. We define the free energy of a protein sequence by the corresponding partition function over rotamer states. The free energy is written in variational form in a pairwise approximation and minimized using the Belief Propagation algorithm. In this way, a free energy is associated with each amino acid sequence: we use this insight to rescore the results obtained with a standard minimization method, with the energy as the cost function. Then, we set up a design method that directly uses the free energy as a cost function in combination with a stochastic search in the sequence space. We validate the methods on the design of three superficial sites of a small SH3 domain, and then apply them to the complete redesign of 27 proteins. Our results indicate that accounting for entropic contribution in the score function affects the outcome in a highly nontrivial way, and might improve current computational design techniques based on protein stability.
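The relation between a partition function over rotamer states and a free energy can be sketched with the textbook formula F = -kT ln Σ exp(-E/kT). The example below treats a single site with independent rotamer energies, which is a simplifying assumption: it does not implement the pairwise variational form or the Belief Propagation minimization described in the abstract. The default kT of 0.593 kcal/mol (roughly room temperature) is also an assumption.

```python
import math


def free_energy(energies, kT=0.593):
    """Free energy F = -kT * ln(sum(exp(-E/kT))) over a list of state energies.

    The minimum energy is factored out before exponentiating so that
    large-magnitude energies do not overflow or underflow.
    """
    e_min = min(energies)
    z_shifted = sum(math.exp(-(e - e_min) / kT) for e in energies)
    return e_min - kT * math.log(z_shifted)


if __name__ == "__main__":
    rotamer_energies = [0.0, 0.5, 2.0]  # hypothetical values in kcal/mol
    print(free_energy(rotamer_energies))
```

Because every additional accessible state increases the sum, the free energy is never higher than the lowest single-state energy; the difference is the entropic contribution that an energy-only score ignores.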

11.
Membrane protein production for structural studies is often hindered by the formation of non-specific aggregates from which the protein has to be denatured and then refolded to a functional state. We developed a new approach, which uses microfluidic channels, to refold protein correctly in quantities sufficient for structural studies. Green fluorescent protein (GFP), a soluble protein, and bacteriorhodopsin (BR), a transmembrane protein, were used to demonstrate the efficiency of the process. Urea-denatured GFP refolded as the urea diffused away from the protein, forming in the channel a uniform fluorescent band when observed by confocal microscopy. Sodium dodecyl sulphate-denatured BR refolded within the channel on mixing with detergent–lipid mixed micelles. The refolding, monitored by absorbance spectroscopy, was found to be flow rate dependent. The potential of microfluidic reactors for screening protein-folding conditions and producing protein makes them particularly amenable to the high-throughput applications required in structural genomics.

12.
It is generally believed that loop regions in globular proteins, and particularly hypervariable loops in immunoglobulins, can accommodate a wide variety of sequence changes without jeopardizing protein structure or stability. We show here, however, that novel sequences introduced within complementarity determining regions (CDRs) 1 and 3 of the immunoglobulin variable domain REI VL can significantly diminish the stability of the native state of this protein. Besides their implications for the general role of loops in the stability of globular proteins, these results suggest previously unrecognized stability constraints on the variability of CDRs that may impact efforts to engineer new and improved activities into antibodies.

13.
14.
Repeating motifs in protein structures are thought to act as modular building blocks that allow an economical way of constructing complex proteins. In this work, novel wavelet transform analysis techniques are used to detect and characterize repeating motifs in protein sequence and structure data, where the Kyte-Doolittle hydrophobicity scale (HΦ) and relative accessible surface area (rASA) data provide residue information about the protein sequence and structure, respectively. We analyze a variety of repeating protein motifs: TIM barrels, propeller blades, coiled coils, and leucine-rich repeat structures. Detection and characterization of these motifs is performed using techniques based on the continuous wavelet transform (CWT). Results indicate that the wavelet transform techniques developed herein are a promising approach for the detection and characterization of repeating motifs for both structural and, in some instances, sequence data.
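The sequence signal this abstract feeds into the wavelet analysis is a per-residue Kyte-Doolittle hydrophobicity value. A minimal sketch of building that signal is shown below; the scale values are the standard published Kyte-Doolittle hydropathy indices, while the function names and the optional moving-average smoothing are assumptions added for illustration. The continuous wavelet transform itself is not implemented here.

```python
# Standard Kyte-Doolittle hydropathy indices (one-letter amino acid codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_signal(sequence):
    """Map a protein sequence to its per-residue hydropathy values."""
    return [KYTE_DOOLITTLE[res] for res in sequence.upper()]


def moving_average(values, window):
    """Smooth a signal with a sliding window (hypothetical preprocessing step)."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    signal = hydropathy_signal("MKVLAAGIVG")
    print(moving_average(signal, 5))
```

A repeating motif would show up in such a signal as a roughly periodic pattern, which is the kind of feature a wavelet transform is suited to localize.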

15.
Beta‐amyloid peptide (Aβ) is the major protein constituent found in senile plaques in Alzheimer's disease (AD). It is believed that Aβ plays a role in neurodegeneration associated with AD and that its toxicity is related to its structure or aggregation state. In this study, an approach based on chemical modification of primary amines and mass spectrometric (MS) detection was used to identify residues on Aβ peptide that were exposed or buried upon changes in peptide structure associated with aggregation. Results indicate that the N terminus was the most accessible primary amine in the fibril, followed by lysine 28, then lysine 16. A kinetic analysis of the data was then performed to quantify differences in accessibility between these modification sites. We estimated apparent equilibrium unfolding constants for each modified site of the peptide, and determined that the unfolding constant for the N terminus was approximately 100 times greater than that for K28, which was about six times greater than that for K16. Understanding Aβ peptide structure at the residue level is a first step in designing novel therapies for prevention of Aβ structural transitions and/or cell interactions associated with neurotoxicity in Alzheimer's disease. Biotechnol. Bioeng. 2009; 104: 181–192 © 2009 Wiley Periodicals, Inc.

16.
Deng H  Jia Y  Wei Y  Zhang Y 《Proteins》2012,80(9):2311-2322
Many statistical potentials have been developed in the last two decades for protein folding and protein structure recognition. The major difference among these potentials lies in the selection of reference states to offset sampling bias. However, since these potentials used different databases and parameter cutoffs, it is difficult to judge what the best reference states are by examining the original programs. In this study, we aim to address this issue and evaluate the reference states using a unified database and programming environment. We constructed distance-specific atomic potentials using six widely used reference states based on 1022 high-resolution protein structures, which are applied to rank models in six sets of structure decoys. The reference state based on a random-walk chain outperforms the others in three decoy sets, while those using ideal-gas, quasi-chemical approximation and averaging sample each stand out in one set. Nevertheless, the performance of the potentials depends on how the decoys were generated, and no reference state clearly outperforms the others in all decoy sets. Further analysis reveals that the statistical potentials have a contradiction between universality and pertinence, and optimal reference states should be extracted based on specific application environments and decoy spaces.

17.
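Structure prediction for membrane proteins is currently rather difficult. In this paper, an established pattern recognition method is used to predict the secondary structures of three typical membrane proteins, RC, BR, and RH. The agreement between the predictions and experimental data is comparable to that obtained when the method is applied to globular proteins, so the predictions can be considered successful. The paper further refines the pattern recognition method for predicting protein secondary structure. A multi-class classification method is established for secondary structure prediction of globular proteins, with a prediction accuracy above 60%. The results show that this is a good method for structure prediction. Because studies that apply pattern recognition to structure prediction are still rare both in China and abroad, we intend to develop and refine this method further.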

18.
Li X  Hu C  Liang J 《Proteins》2003,53(4):792-805
Protein representation and potential function are two important ingredients for studying protein folding, equilibrium thermodynamics, and sequence design. We introduce a novel geometric representation of protein contact interactions using the edge simplices from the alpha shape of the protein structure. This representation can eliminate implausible neighbors that are not in physical contact, and can avoid spurious contact between two residues when a third residue is between them. We developed a statistical alpha contact potential using an odds-ratio model. A studentized bootstrap method was then introduced to assess the 95% confidence intervals for each of the 210 propensity parameters. We found, with confidence, that there is significant long-range propensity (>30 residues apart) for hydrophobic interactions. We tested the alpha contact potential for native structure discrimination using several sets of decoy structures, and found that it often performs comparably with atom-based potentials requiring many more parameters. We also show that accurate geometric representation is important, and that the alpha contact potential performs better than a potential defined by a cutoff distance between geometric centers of side chains. Hierarchical clustering of alpha contact potentials reveals natural grouping of residues. To explore the relationship between shape and physicochemical representations, we tested the minimum alphabet size necessary for native structure discrimination. We found that there is no significant difference in performance of discrimination when alphabet size varies from 7 to 20, if geometry is represented accurately by alpha simplicial edges. This result suggests that the geometry of packing plays an important role, but the specific residue types are often interchangeable.
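The figure of 210 propensity parameters in this abstract matches the number of unordered pairs (including self-pairs) among the 20 amino acid types. The sketch below checks that count and shows one generic way to compute a log-odds contact propensity from a list of observed contacts. The function `contact_propensities` and its independence-based expectation are assumptions for illustration; the abstract does not give the exact form of the authors' odds-ratio model, and contact detection via alpha shapes is not implemented.

```python
import math
from collections import Counter
from itertools import combinations_with_replacement

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Unordered residue-type pairs, self-pairs included: 20 * 21 / 2 = 210.
PAIR_TYPES = list(combinations_with_replacement(AMINO_ACIDS, 2))


def contact_propensities(contacts):
    """Return log(observed / expected) for each observed residue-type pair.

    contacts: iterable of (res_a, res_b) one-letter codes that are in contact.
    Expected frequencies assume the two ends of a contact are independent,
    which is a generic choice and not necessarily the published model.
    """
    pair_counts = Counter(tuple(sorted(pair)) for pair in contacts)
    total_pairs = sum(pair_counts.values())
    end_counts = Counter()
    for (a, b), n in pair_counts.items():
        end_counts[a] += n
        end_counts[b] += n
    total_ends = 2 * total_pairs
    propensities = {}
    for (a, b), n in pair_counts.items():
        fa = end_counts[a] / total_ends
        fb = end_counts[b] / total_ends
        expected = fa * fb if a == b else 2 * fa * fb
        propensities[(a, b)] = math.log((n / total_pairs) / expected)
    return propensities


if __name__ == "__main__":
    print(len(PAIR_TYPES))
    print(contact_propensities([("L", "V"), ("L", "L"), ("K", "E"), ("V", "L")]))
```

A positive value means the pair occurs in contact more often than expected from composition alone; a potential would typically take the negative of this quantity.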

19.
20.
De novo protein design offers templates for engineering tailor‐made protein functions and orthogonal protein interaction networks for synthetic biology research. Various computational methods have been developed to introduce functional sites in known protein structures. De novo designed protein scaffolds provide further opportunities for functional protein design. Here we demonstrate the rational design of novel tumor necrosis factor alpha (TNFα) binding proteins using a home‐made grafting program AutoMatch. We grafted three key residues from a virus 2L protein to a de novo designed small protein, DS119, with consideration of backbone flexibility. The designed proteins bind to TNFα with micromolar affinities. We further optimized the interface residues with RosettaDesign and significantly improved the binding capacity of one protein Tbab1‐4. These designed proteins inhibit the activity of TNFα in cellular luciferase assays. Our work illustrates the potential application of the de novo designed protein DS119 in protein engineering, biomedical research, and protein sequence‐structure‐function studies.
