Similar literature
 20 similar articles found; search took 609 ms
1.
Macromolecular assemblies play an important role in all cellular processes. While there has recently been significant progress in protein structure prediction based on deep learning, large protein complexes cannot be predicted with these approaches. The integrative structure modeling approach characterizes multi-subunit complexes by computational integration of data from fast and accessible experimental techniques. Crosslinking mass spectrometry is one such technique that provides spatial information about the proximity of crosslinked residues. One of the challenges in interpreting crosslinking datasets is designing a scoring function that, given a structure, can quantify how well it fits the data. Most approaches set an upper bound on the distance between Cα atoms of crosslinked residues and calculate the fraction of satisfied crosslinks. However, the distance spanned by the crosslinker greatly depends on the neighborhood of the crosslinked residues. Here, we design a deep learning model for predicting the optimal distance range for a crosslinked residue pair based on the structures of their neighborhoods. We find that our model can predict the distance range with an area under the receiver operating characteristic curve of 0.86 and 0.7 for intra- and inter-protein crosslinks, respectively. Our deep scoring function can be used in a range of structure modeling applications.
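
The conventional scoring scheme mentioned in this abstract (an upper bound on the Cα-Cα distance and the fraction of satisfied crosslinks) can be sketched as follows. This is a minimal illustration, not the authors' deep scoring function; the function names, the residue identifiers, the toy coordinates, and the 30 Å default cutoff are all assumptions made for the example.

```python
import math

def ca_distance(a, b):
    """Euclidean distance between two C-alpha coordinates (x, y, z) in angstroms."""
    return math.dist(a, b)

def fraction_satisfied(ca_coords, crosslinks, max_dist=30.0):
    """Fraction of crosslinks whose C-alpha distance is at most max_dist.

    ca_coords:  dict mapping a residue id to its (x, y, z) coordinate
    crosslinks: list of (residue_id_1, residue_id_2) pairs
    max_dist:   hypothetical cutoff in angstroms; depends on the crosslinker
    """
    if not crosslinks:
        return 0.0
    satisfied = sum(
        1 for i, j in crosslinks
        if ca_distance(ca_coords[i], ca_coords[j]) <= max_dist
    )
    return satisfied / len(crosslinks)

# Toy structure: one crosslink spans 12 angstroms, the other 40 angstroms.
coords = {"A:10": (0.0, 0.0, 0.0), "A:25": (12.0, 0.0, 0.0), "B:7": (0.0, 40.0, 0.0)}
links = [("A:10", "A:25"), ("A:10", "B:7")]
print(fraction_satisfied(coords, links))  # 0.5
```

A single fixed cutoff ignores the local environment of each residue pair, which is the limitation the abstract sets out to address.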

2.
Characterizing the three-dimensional structure of macromolecules is central to understanding their function. Traditionally, structures of proteins and their complexes have been determined using experimental techniques such as X-ray crystallography, NMR, or cryo-electron microscopy, applied individually or in an integrative manner. Meanwhile, however, computational methods for protein structure prediction have been improving their accuracy, gradually, then suddenly, with the breakthrough advance by AlphaFold2, whose models of monomeric proteins are often as accurate as experimental structures. This breakthrough foreshadows a new era of computational methods that can build accurate models for most monomeric proteins. Here, we envision how such accurate modeling methods can combine with experimental structural biology techniques, enhancing integrative structural biology. We highlight the challenges that arise when considering multiple structural conformations, protein complexes, and polymorphic assemblies. These challenges will motivate further developments, both in modeling programs and in methods to solve experimental structures, towards better and quicker investigation of structure–function relationships.

3.
MOTIVATION: Despite the continuing advance in the experimental determination of protein structures, the gap between the number of known protein sequences and structures continues to increase. Prediction methods can bridge this sequence-structure gap only partially. Better predictions of non-local contacts between residues could improve comparative modeling and fold recognition, and could assist in experimental structure determination. RESULTS: Here, we introduced PROFcon, a novel contact prediction method that combines information from alignments, from predictions of secondary structure and solvent accessibility, from the region between two residues and from the average properties of the entire protein. In contrast to some other methods, PROFcon predicted short and long proteins at similar levels of accuracy. As expected, PROFcon was clearly less accurate when tested on sparse evolutionary profiles, that is, on families with few homologs. Prediction accuracy was highest for proteins belonging to the SCOP alpha/beta class. PROFcon compared favorably with state-of-the-art prediction methods at the CASP6 meeting. While the performance may still be perceived as low, our method clearly pushed the mark higher. Furthermore, predictions are already accurate enough to seed predictions of global features of protein structure.

4.
Protein structure prediction methods such as Rosetta search for the lowest energy conformation of the polypeptide chain. However, the experimentally observed native state is at a minimum of the free energy, rather than the energy. The neglect of the missing configurational entropy contribution to the free energy can be partially justified by the assumption that the entropies of alternative folded states, while very much less than those of unfolded states, are not too different from one another, and hence can be to a first approximation neglected when searching for the lowest free energy state. The shortcomings of current structure prediction methods may be due in part to the breakdown of this assumption. Particularly problematic are proteins with significant disordered regions which do not populate single low energy conformations even in the native state. We describe two approaches within the Rosetta structure modeling methodology for treating such regions. The first does not require advance knowledge of the regions likely to be disordered; instead these are identified by minimizing a simple free energy function used previously to model protein folding landscapes and transition states. In this model, residues can be either completely ordered or completely disordered; they are considered disordered if the gain in entropy outweighs the loss of favorable energetic interactions with the rest of the protein chain. The second approach requires identification in advance of the disordered regions, either from sequence alone using, for example, the DISOPRED server, or from experimental data such as NMR chemical shifts. During Rosetta structure prediction calculations the disordered regions make only unfavorable repulsive contributions to the total energy. We find that the second approach has greater practical utility and illustrate this with examples from de novo structure prediction, NMR structure calculation, and comparative modeling.

5.
The lysine-specific crosslinker 3,3'-dithiobis(sulfosuccinimidylpropionate) (DTSSP) is commonly used in the structural characterization of proteins by chemical crosslinking and mass spectrometry and we here describe an efficient two-step LC-MALDI-TOF/TOF procedure to detect crosslinked peptides. First, MS data are acquired, and the properties of isotope-labeled DTSSP are used in data analysis to identify candidate crosslinks. MS/MS data are then acquired for a restricted number of precursor ions per spot for final crosslink identification. We show that the thiol-catalyzed exchange between crosslinked peptides, which is due to the disulfide bond in DTSSP and known to possibly obscure data, can be precisely quantified using isotope-labeled DTSSP. Crosslinked peptides are recognized as 8 Da doublet peaks and a new isotopic peak with twice the intensity appears in the middle of the doublet as a consequence of the thiol-exchange. False-positive crosslinks, formed exclusively by thiol-exchange, yield a 1:2:1 isotope pattern, whereas true crosslinks, formed by two lysine residues within crosslinkable distance in the native protein structure, yield a 1:0:1 isotope pattern. Peaks with a 1:X:1 isotope pattern, where 0 < X < 2, can be trusted as true crosslinks, with a defined proportion of the signal [2X/(2 + X)] being noise from the thiol-exchange. The thiol-exchange was correlated with the protein cysteine content and was minimized by shortening the trypsin incubation time, and for two molecular chaperone proteins with known structure all crosslinks fitted well to the structure data. The thiol-exchange can thus be controlled and isotope-labeled DTSSP safely used to detect true crosslinks between lysine residues in proteins.
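
The noise proportion 2X/(2 + X) quoted in this abstract can be checked with a few lines of code. The sketch below is only an illustration: the function names are hypothetical, and estimating X by dividing the middle peak by the mean of the two outer peaks is an assumption of this example, not a procedure stated in the abstract.

```python
def thiol_exchange_fraction(x):
    """Fraction of a 1:X:1 peak's total signal attributed to thiol-exchange.

    x is the middle-peak intensity relative to each outer peak.
    Implements 2X / (2 + X): 0 for a 1:0:1 pattern, 1 for a 1:2:1 pattern.
    """
    if not 0 <= x <= 2:
        raise ValueError("x must lie between 0 (true crosslink) and 2 (pure exchange)")
    return 2 * x / (2 + x)

def middle_ratio(light, middle, heavy):
    """Estimate X from three intensities (assumed normalisation: mean outer peak)."""
    outer = (light + heavy) / 2
    return middle / outer

x = middle_ratio(light=100.0, middle=50.0, heavy=100.0)  # a 1:0.5:1 pattern
print(x, thiol_exchange_fraction(x))  # 0.5 0.4
```

The formula follows from splitting the signal: exchange products contribute X/2 : X : X/2, that is 2X in total, out of an overall signal of 2 + X.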

6.
Crystallography and NMR system (CNS) is currently a widely used method for fragment-free ab initio protein folding from inter-residue distance or contact maps. Despite its widespread use in protein structure prediction, CNS is a decade-old macromolecular structure determination system that was originally developed for solving macromolecular geometry from experimental restraints as opposed to predictive modeling driven by interaction map data. As such, the adaptation of the CNS experimental structure determination protocol for ab initio protein folding is intrinsically anomalous, which may undermine the folding accuracy of computational protein structure prediction. In this paper, we propose a new CNS-free hierarchical structure modeling method called DConStruct for folding both soluble and membrane proteins driven by distance and contact information. Rigorous experimental validation shows that DConStruct attains much better reconstruction accuracy than CNS when tested with the same input contact map at varying contact thresholds. The hierarchical modeling with iterative self-correction employed in DConStruct scales at a much higher degree of folding accuracy than CNS with the increase in contact thresholds, ultimately approaching near-optimal reconstruction accuracy at higher-thresholded contact maps. The folding accuracy of DConStruct can be further improved by exploiting distance-based hybrid interaction maps at tri-level thresholding, as demonstrated by the better performance of our method in folding free modeling targets from the 12th and 13th rounds of the Critical Assessment of techniques for protein Structure Prediction (CASP) experiments compared to popular CNS- and fragment-based approaches and energy-minimization protocols, some of which even use much finer-grained distance maps than ours. Additional large-scale benchmarking shows that DConStruct can significantly improve the folding accuracy of membrane proteins compared to a CNS-based approach. These results collectively demonstrate the feasibility of greatly improving the accuracy of ab initio protein folding by optimally exploiting the information encoded in inter-residue interaction maps beyond what is possible by CNS.
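
To make the notion of a contact map at varying thresholds concrete, the sketch below derives binary contact maps from coordinates at several cutoffs. It only illustrates the input representation; it is not DConStruct or CNS, and the function name, the toy coordinates, and the cutoff values are assumptions chosen for the example.

```python
import math

def contact_map(coords, threshold=8.0):
    """Binary contact map: 1 where two different residues lie within threshold angstroms."""
    n = len(coords)
    return [
        [1 if i != j and math.dist(coords[i], coords[j]) <= threshold else 0
         for j in range(n)]
        for i in range(n)
    ]

# Four residues on a line, 3.8 angstroms apart (toy geometry).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
for t in (6.0, 8.0, 12.0):
    cmap = contact_map(coords, t)
    # The map is symmetric, so halve the sum to count residue pairs.
    print(t, sum(map(sum, cmap)) // 2)  # prints 3, 5 and 6 pairs respectively
```

Raising the threshold adds longer-range pairs to the map, which is the extra information a reconstruction method can exploit.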

7.
Integrative structural biology attempts to model the structures of protein complexes that are challenging or intractable by classical structural methods (due to size, dynamics, or heterogeneity) by combining computational structural modeling with data from experimental methods. One such experimental method is chemical crosslinking mass spectrometry (XL‐MS), in which protein complexes are crosslinked and characterized using liquid chromatography‐mass spectrometry to pinpoint specific amino acid residues in close structural proximity. The commonly used lysine‐reactive N‐hydroxysuccinimide ester reagents disuccinimidylsuberate (DSS) and bis(sulfosuccinimidyl)suberate (BS3) have a linker arm that is 11.4 Å long when fully extended, allowing Cα (alpha carbon of protein backbone) atoms of crosslinked lysine residues to be up to ~24 Å apart. However, XL‐MS studies on proteins of known structure frequently report crosslinks that exceed this distance. Typically, a tolerance of ~3 Å is added to the theoretical maximum to account for this observation, with limited justification for the chosen value. We used the Dynameomics database, a repository of high‐quality molecular dynamics simulations of 807 proteins representative of diverse protein folds, to investigate the relationship between lysine–lysine distances in experimental starting structures and in simulation ensembles. We conclude that for DSS/BS3, a distance constraint of 26–30 Å between Cα atoms is appropriate. This analysis provides a theoretical basis for the widespread practice of adding a tolerance to the crosslinker length when comparing XL‐MS results to structures or in modeling. We also discuss the comparison of XL‐MS results to MD simulations and known structures as a means to test and validate experimental XL‐MS methods.
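
The distance limits quoted in this abstract translate directly into a three-way check on a measured Cα-Cα distance. The sketch below is an illustration only: the constant names, the function name, and the category labels are hypothetical, and 30 Å is simply the upper end of the 26-30 Å range given above.

```python
THEORETICAL_MAX_A = 24.0   # approximate fully extended reach for DSS/BS3 (from the abstract)
RECOMMENDED_MAX_A = 30.0   # upper end of the recommended 26-30 angstrom constraint

def classify_crosslink(ca_distance_a):
    """Label a C-alpha distance (angstroms) against the two limits above."""
    if ca_distance_a <= THEORETICAL_MAX_A:
        return "within linker reach"
    if ca_distance_a <= RECOMMENDED_MAX_A:
        return "within tolerance"
    return "violation"

for d in (20.0, 27.0, 35.0):
    print(d, classify_crosslink(d))
```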

8.
Protein docking is essential for structural characterization of protein interactions. Besides providing the structure of protein complexes, modeling of proteins and their complexes is important for understanding the fundamental principles and specific aspects of protein interactions. The accuracy of protein modeling, in general, is still less than that of the experimental approaches. Thus, it is important to investigate the applicability of docking techniques to modeled proteins. We present new comprehensive benchmark sets of protein models for the development and validation of protein docking, as well as a systematic assessment of free and template-based docking techniques on these sets. As opposed to previous studies, the benchmark sets reflect the real case modeling/docking scenario where the accuracy of the models is assessed by the modeling procedure, without reference to the native structure (which would be unknown in practical applications). We also expanded the analysis to include docking of protein pairs where proteins have different structural accuracy. The results show that, in general, the template-based docking is less sensitive to the structural inaccuracies of the models than the free docking. The near-native docking poses generated by the template-based approach, typically, also have higher ranks than those produced by the free docking (although the free docking is indispensable in modeling the multiplicity of protein interactions in a crowded cellular environment). The results show that docking techniques are applicable to protein models in a broad range of modeling accuracy. The study provides clear guidelines for practical applications of docking to protein models.

9.
Photo- and chemical crosslinking of proteins have offered various avenues for studying protein structure and protein interactions with biomolecules. Conventional photoactivatable groups generally lack reaction selectivity toward amino acid residues. New photoactivatable groups reacting with selected residues have emerged recently, increasing crosslinking efficiency and facilitating crosslink identification. Traditional chemical crosslinking usually employs highly reactive functional groups, while recent advances have produced latent reactive groups with reactivity triggered by proximity, which reduce spurious crosslinks and improve biocompatibility. The employment of these residue selective chemical functional groups, activated by light or proximity, in small molecule crosslinkers and in genetically encoded unnatural amino acids is summarized. Together with new software development in identifying protein crosslinks, residue selective crosslinking has enhanced the research of elusive protein-protein interactions in vitro, in cell lysate, and in live cells. Residue selective crosslinking is expected to expand to other methods for the investigation of various protein–biomolecule interactions.

10.
Many proteins are composed of several domains that pack together into a complex tertiary structure. Multidomain proteins can be challenging for protein structure modeling, particularly those for which templates can be found for individual domains but not for the entire sequence. In such cases, homology modeling can generate high quality models of the domains but not for the orientations between domains. Small-angle X-ray scattering (SAXS) reports the structural properties of entire proteins and has the potential for guiding homology modeling of multidomain proteins. In this article, we describe a novel multidomain protein assembly modeling method, SAXSDom, that integrates experimental knowledge from SAXS with a probabilistic Input-Output Hidden Markov Model to assemble the structures of individual domains together. Four SAXS-based scoring functions were developed and tested, and the method was evaluated on multidomain proteins from two public datasets. Incorporation of SAXS information improved the accuracy of domain assembly for 40 out of 46 Critical Assessment of protein Structure Prediction (CASP) multidomain protein targets and 45 out of 73 multidomain protein targets from the ab initio domain assembly dataset. The results demonstrate that SAXS data can provide useful information to improve the accuracy of domain-domain assembly. The source code and tool packages are available at https://github.com/jianlin-cheng/SAXSDom .

11.
Protein chemical shifts encode detailed structural information that is difficult and computationally costly to describe at a fundamental level. Statistical and machine learning approaches have been used to infer correlations between chemical shifts and secondary structure from experimental chemical shifts. These methods range from simple statistics such as the chemical shift index to complex methods using neural networks. Notwithstanding their higher accuracy, more complex approaches tend to obscure the relationship between secondary structure and chemical shift and often involve many parameters that need to be trained. We present hidden Markov models (HMMs) with Gaussian emission probabilities to model the dependence between protein chemical shifts and secondary structure. The continuous emission probabilities are modeled as conditional probabilities for a given amino acid and secondary structure type. Using these distributions as outputs of first‐ and second‐order HMMs, we achieve a prediction accuracy of 82.3%, which is competitive with existing methods for predicting secondary structure from protein chemical shifts. Incorporation of sequence‐based secondary structure prediction into our HMM improves the prediction accuracy to 84.0%. Our findings suggest that an HMM with correlated Gaussian distributions conditioned on the secondary structure provides an adequate generative model of chemical shifts. Proteins 2013; © 2012 Wiley Periodicals, Inc.
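
The idea of an HMM with Gaussian emissions can be illustrated with a small Viterbi decoder. This is a deliberately simplified sketch and not the published model: it is first-order, uses one univariate Gaussian per state rather than emissions conditioned on amino acid type, and every parameter value (means, standard deviations, transition probabilities, and the example shifts) is invented for the illustration.

```python
import math

def log_gauss(x, mean, sd):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)

def viterbi(obs, states, log_start, log_trans, emit):
    """Most probable state path; emit[state] is a (mean, sd) tuple."""
    # best[s] holds the best log-probability of any path ending in state s.
    best = {s: log_start[s] + log_gauss(obs[0], *emit[s]) for s in states}
    back = []
    for x in obs[1:]:
        prev, best, pointers = best, {}, {}
        for s in states:
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            best[s] = prev[p] + log_trans[p][s] + log_gauss(x, *emit[s])
            pointers[s] = p
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

states = ["H", "E", "C"]  # helix, strand, coil
log_start = {s: math.log(1 / 3) for s in states}
log_trans = {r: {s: math.log(0.8 if r == s else 0.1) for s in states} for r in states}
emit = {"H": (3.0, 1.0), "E": (-1.5, 1.0), "C": (0.0, 1.0)}  # hypothetical shift statistics
shifts = [3.1, 2.8, 3.3, -1.4, -1.6, -1.2]
print(viterbi(shifts, states, log_start, log_trans, emit))
# ['H', 'H', 'H', 'E', 'E', 'E']
```

Working in log space avoids numerical underflow on long sequences, and the high self-transition probability encodes the expectation that secondary structure elements span several consecutive residues.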

12.
CASP (critical assessment of structure prediction) assesses the state of the art in modeling protein structure from amino acid sequence. The most recent experiment (CASP13 held in 2018) saw dramatic progress in structure modeling without use of structural templates (historically “ab initio” modeling). Progress was driven by the successful application of deep learning techniques to predict inter-residue distances. In turn, these results drove dramatic improvements in three-dimensional structure accuracy: With the proviso that there are an adequate number of sequences known for the protein family, the new methods essentially solve the long-standing problem of predicting the fold topology of monomeric proteins. Further, the number of sequences required in the alignment has fallen substantially. There is also substantial improvement in the accuracy of template-based models. Other areas (model refinement, accuracy estimation, and the structure of protein assemblies) have again yielded interesting results. CASP13 placed increased emphasis on the use of sparse data together with modeling, and chemical crosslinking, SAXS, and NMR all yielded more mature results. This paper summarizes the key outcomes of CASP13. The special issue of PROTEINS contains papers describing the CASP13 assessments in each modeling category and contributions from the participants.

13.
Accurate protein structure prediction remains an active objective of research in bioinformatics. Membrane proteins comprise approximately 20% of most genomes. They are, however, poorly tractable targets of experimental structure determination. Their analysis using bioinformatics thus makes an important contribution to their on-going study. Using a method based on Bayesian Networks, which provides a flexible and powerful framework for statistical inference, we have addressed the alignment-free discrimination of membrane from non-membrane proteins. The method successfully identifies prokaryotic and eukaryotic alpha-helical membrane proteins at 94.4% accuracy, beta-barrel proteins at 72.4% accuracy, and distinguishes assorted non-membranous proteins with 85.9% accuracy. The method here is an important potential advance in the computational analysis of membrane protein structure. It represents a useful tool for the characterisation of membrane proteins with a wide variety of potential applications.

14.
Biophysical Journal 2022, 121(18): 3508-3519
Site-directed spin-labeling electron paramagnetic resonance spectroscopy is a powerful technique for the investigation of protein structure and dynamics. Accurate spin-label modeling methods are essential to make full quantitative use of site-directed spin-labeling electron paramagnetic resonance data for protein modeling and model validation. Using a set of double electron-electron resonance data from seven different site pairs on maltodextrin/maltose-binding protein under two different conditions using five different spin labels, we compare the ability of two widely used spin-label modeling methods, based on accessible volume sampling and rotamer libraries, to predict experimental distance distributions. We present a spin-label modeling approach inspired by canonical side-chain modeling methods and compare modeling accuracy with the established methods.

15.
The functional characterization of proteins represents a daily challenge for biochemical, medical and computational sciences. Although finally proved on the bench, the function of a protein can be successfully predicted by computational approaches that drive the further experimental assays. Current methods for comparative modeling allow the construction of accurate 3D models for proteins of unknown structure, provided that a crystal structure of a homologous protein is available. Binding regions can be proposed by using binding site predictors, data inferred from homologous crystal structures, and data provided from a careful interpretation of the multiple sequence alignment of the investigated protein and its homologs. Once the location of a binding site has been proposed, chemical ligands that have a high likelihood of binding can be identified by using ligand docking and structure-based virtual screening of chemical libraries. Most docking algorithms allow building a list, sorted by energy, of the lowest energy docking configuration for each ligand of the library. In this review the state-of-the-art of computational approaches in 3D protein comparative modeling and in the study of protein–ligand interactions is provided. Furthermore, a possible combined/concerted multistep strategy for protein function prediction, based on multiple sequence alignment, comparative modeling, binding region prediction, and structure-based virtual screening of chemical libraries, is described by using suitable examples. As practical examples, Abl-kinase molecular modeling studies, HPV-E6 protein multiple sequence alignment analysis, and some other model docking-based characterization reports are briefly described to highlight the importance of computational approaches in protein function prediction.
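
The ranked list described in this abstract (the lowest energy docking configuration per ligand, sorted by energy) can be sketched in a few lines. The function name, the ligand names, and the energy values are hypothetical, and no particular docking program's output format is assumed.

```python
def rank_ligands(poses):
    """Keep the lowest-energy pose per ligand and sort ligands by that energy.

    poses: iterable of (ligand_name, energy) pairs, one per docking pose;
           lower energy is taken to mean a more favourable pose.
    """
    best = {}
    for ligand, energy in poses:
        if ligand not in best or energy < best[ligand]:
            best[ligand] = energy
    return sorted(best.items(), key=lambda item: item[1])

poses = [("lig_A", -7.2), ("lig_B", -9.1), ("lig_A", -8.4), ("lig_C", -6.0), ("lig_B", -5.5)]
print(rank_ligands(poses))  # [('lig_B', -9.1), ('lig_A', -8.4), ('lig_C', -6.0)]
```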

16.
In an attempt to identify endogenous chemicals producing DNA-protein crosslinks, we have studied in vitro crosslinking potential of malondialdehyde, a bifunctional chemical that is ubiquitously formed as a product of lipid peroxidation of polyunsaturated fatty acids. We have found that malondialdehyde readily forms crosslinks between DNA and histones under physiological ionic and pH conditions. Formation of DNA-protein crosslinks was limited to proteins that were able to bind to DNA. Malondialdehyde failed to form DNA-protein crosslinks when histone binding to DNA was prevented by elevated ionic strength or when bovine serum albumin was used in the reaction mixture. Malondialdehyde-produced DNA-histone crosslinks were relatively stable at 37 °C with t1/2 = 13.4 days. Crosslinking of histones to DNA proceeds through the initial formation of a protein adduct followed by reaction with DNA. Modification of DNA by malondialdehyde does not lead to a subsequent crosslinking of proteins. Significant formation of DNA-protein crosslinks was also registered in isolated kidney and liver nuclei treated with malondialdehyde. Based on its reactivity and stability of the resulting crosslinks, it is suggested that malondialdehyde could be one of the significant sources of endogenous DNA-protein crosslinks.
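
The reported half-life can be turned into an estimate of how much crosslink survives after a given time. The sketch below assumes simple first-order decay, which the abstract does not state explicitly; the function and constant names are hypothetical.

```python
HALF_LIFE_DAYS = 13.4  # t1/2 at 37 degrees Celsius, as reported in the abstract

def fraction_remaining(t_days, half_life=HALF_LIFE_DAYS):
    """Fraction of crosslinks left after t_days, assuming first-order decay."""
    return 0.5 ** (t_days / half_life)

for t in (0.0, 13.4, 26.8):
    print(t, fraction_remaining(t))
```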

17.
The accuracy of sequence-based tertiary contact predictions was assessed in a blind prediction experiment at the CASP13 meeting. After 4 years of significant improvements in prediction accuracy, another dramatic advance has taken place since CASP12 was held 2 years ago. The precision of predicting the top L/5 contacts in the free modeling category, where L is the corresponding length of the protein in residues, has exceeded 70%. As a comparison, the best-performing group at CASP12 with a 47% precision would have finished below the top 1/3 of the CASP13 groups. Extensively trained deep neural network approaches dominate the top performing algorithms, which appear to efficiently integrate information on coevolving residues and interacting fragments or possibly utilize memories of sequence similarities and sometimes can deliver accurate results even in the absence of virtually any target specific evolutionary information. If the current performance is evaluated by F-score on L contacts, it stands around 24% right now, which, despite the tremendous impact and advance in improving its utility for structure modeling, also suggests that there is much room left for further improvement.
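
The top L/5 precision mentioned in this abstract can be computed as sketched below. This is a generic illustration rather than the official CASP evaluation code: the function name, the input format, and the toy data are assumptions, and details such as sequence-separation filters are omitted.

```python
def top_k_precision(predicted, true_contacts, length, divisor=5):
    """Precision of the top length // divisor predicted contacts.

    predicted:     list of (i, j, score) tuples, higher score = more confident
    true_contacts: set of (i, j) residue pairs with i < j
    length:        protein length L in residues
    """
    k = max(1, length // divisor)
    ranked = sorted(predicted, key=lambda c: c[2], reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for i, j, _ in ranked if (min(i, j), max(i, j)) in true_contacts)
    return hits / len(ranked)

# Toy protein with L = 10, so the top L/5 = 2 predictions are evaluated.
predicted = [(1, 8, 0.9), (2, 9, 0.8), (3, 7, 0.4)]
true_contacts = {(1, 8), (3, 7)}
print(top_k_precision(predicted, true_contacts, length=10))  # 0.5
```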

18.
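Template-based and template-free protein structure prediction are the two computational approaches to predicting the three-dimensional structure of proteins; the former is widely used because it is fast and relatively accurate. Template-based structure prediction searches for experimentally determined structures whose sequences are similar to that of the target protein, uses them as templates, and then builds a structural model of the target sequence. This article reviews in detail the steps and key stages of template-based structure prediction methods and, concerning what influences structure prediction...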

19.
Crosslinking mass spectrometry captures protein structures in solution. The crosslinks reveal spatial proximities as distance restraints, but do not easily reveal which of these restraints derive from the same protein conformation. This superposition can be reduced by photo-crosslinking, and adding information from protein structure models or from quantitative crosslinking reveals conformation-specific crosslinks. As a consequence, crosslinking MS has proven useful already in the context of multiple dynamic protein systems. We foresee a breakthrough in the resolution and scale of studying protein dynamics when crosslinks are used to guide deep-learning-based protein modelling. Advances in crosslinking MS, such as photoactivatable crosslinking and in-situ crosslinking, will then reveal protein conformation dynamics in the cellular context, at a pseudo-atomic resolution, and plausibly in a time-resolved manner.

20.
Chemical crosslinking‐mass spectrometry (XL‐MS) is a valuable technique for gaining insights into protein structure and the organization of macromolecular complexes. XL‐MS data yield inter‐residue restraints that can be compared with high‐resolution structural data. Distances greater than the crosslinker spacer‐arm can reveal lowly populated “excited” states of proteins/protein assemblies, or crosslinks can be used as restraints to generate structural models in the absence of structural data. Despite increasing uptake of XL‐MS, there are few tools to enable rapid and facile mapping of XL‐MS data onto high‐resolution structures or structural models. PyXlinkViewer is a user‐friendly plugin for PyMOL v2 that maps intra‐protein, inter‐protein, and dead‐end crosslinks onto protein structures/models and automates the calculation of inter‐residue distances for the detected crosslinks. This enables rapid visualization of XL‐MS data, assessment of whether a set of detected crosslinks is congruent with structural data, and easy production of high‐quality images for publication.
