Similar articles
1.
2.
Jun Gao, Zhijun Li. Biopolymers 2010, 93(4): 340-347
It is widely accepted that a protein's sequence determines its structure. The surprising finding that proteins of distant sequence can adopt similar 3D structures has raised interesting questions regarding underlying conserved properties that are essential for protein folding and stability. Uncovering the conserved properties may shed light on the folding mechanism of proteins and help with the development of computational tools for protein structure prediction. We compiled and analyzed a structure pair dataset of 66 high‐resolution and low sequence identity (16–38%) soluble proteins. Structure deviation for each pair was confirmed by calculating its Cα SiMax value and comparing its potential energy per residue. Analysis of favorable inter‐residue interactions for each structure pair indicated that the average number of inter‐residue interactions within each structure represents a conserved feature of homologous structures of distant sequence. Detailed comparison of individual types of interactions showed that the average number of either hydrophobic or hydrogen bonding interactions remains unchanged for each structure pair. These findings should help improve the quality of homology models based on templates of low sequence identity, thus broadening the application of homology modeling techniques for protein studies. © 2009 Wiley Periodicals, Inc. Biopolymers 93: 340–347, 2010. This article was originally published online as an accepted preprint. The “Published Online” date corresponds to the preprint version. You can request a copy of the preprint by emailing the Biopolymers editorial office at biopolymers@wiley.com

3.
4.
The principal bottleneck in protein structure prediction is the refinement of models from lower accuracies to the resolution observed by experiment. We developed a novel constraints‐based refinement method that identifies a high number of accurate input constraints from initial models and rebuilds them using restrained torsion angle dynamics (rTAD). We previously created a Bayesian statistics‐based residue‐specific all‐atom probability discriminatory function (RAPDF) to discriminate native‐like models by measuring the probability of accuracy for atom type distances within a given model. Here, we exploit RAPDF to score (i.e., filter) constraints from initial predictions that may or may not be close to a native‐like state, obtain consensus of top scoring constraints amongst five initial models, and compile sets with no redundant residue pair constraints. We find that this method consistently produces a large and highly accurate set of distance constraints from which to build refinement models. We further optimize the balance between accuracy and coverage of constraints by producing multiple structure sets using different constraint distance cutoffs, and note that the cutoff governs spatially near versus distant effects in model generation. This complete procedure of deriving distance constraints for rTAD simulations improves the quality of initial predictions significantly in all cases evaluated by us. Our procedure represents a significant step in solving the protein structure prediction and refinement problem, by enabling the use of consensus constraints, RAPDF, and rTAD for protein structure modeling and refinement. Proteins 2009. © 2009 Wiley‐Liss, Inc.

5.
Many statistical measures and algorithmic techniques have been proposed for studying residue coupling in protein families. Generally speaking, two residue positions are considered coupled if, in the sequence record, some of their amino acid type combinations are significantly more common than others. While the proposed approaches have proven useful in finding and describing coupling, a significant missing component is a formal probabilistic model that explicates and compactly represents the coupling, integrates information about sequence, structure, and function, and supports inferential procedures for analysis, diagnosis, and prediction. We present an approach to learning and using probabilistic graphical models of residue coupling. These models capture significant conservation and coupling constraints observable in a multiply-aligned set of sequences. Our approach can place a structural prior on considered couplings, so that all identified relationships have direct mechanistic explanations. It can also incorporate information about functional classes, and thereby learn a differential graphical model that distinguishes constraints common to all classes from those unique to individual classes. Such differential models separately account for class-specific conservation and family-wide coupling, two different sources of sequence covariation. They are then able to perform interpretable functional classification of new sequences, explaining classification decisions in terms of the underlying conservation and coupling constraints. We apply our approach in studies of both G protein-coupled receptors and PDZ domains, identifying and analyzing family-wide and class-specific constraints, and performing functional classification. The results demonstrate that graphical models of residue coupling provide a powerful tool for uncovering, representing, and utilizing significant sequence-structure-function relationships in protein families.
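The notion of coupling described in this abstract (some amino acid combinations at two positions being more common than expected) can be illustrated with mutual information between two alignment columns. This is only a minimal sketch of one widely used coupling statistic, not the graphical-model method of the paper; the function name and the toy alignment are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(column_i, column_j):
    """Mutual information (in bits) between two alignment columns."""
    if len(column_i) != len(column_j):
        raise ValueError("columns must cover the same sequences")
    n = len(column_i)
    count_i = Counter(column_i)
    count_j = Counter(column_j)
    count_ij = Counter(zip(column_i, column_j))
    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        p_ab = n_ab / n
        # Ratio of the observed joint frequency to the product of marginals.
        mi += p_ab * log2(p_ab * n * n / (count_i[a] * count_j[b]))
    return mi


# Hypothetical toy alignment: positions 0 and 1 co-vary, position 2 does not.
alignment = ["AKL", "AKV", "DEL", "DEV"]
columns = list(zip(*alignment))
print(mutual_information(columns[0], columns[1]))  # 1.0 bit: fully coupled
print(mutual_information(columns[0], columns[2]))  # 0.0 bits: independent
```

Mutual information alone does not separate conservation from coupling or direct from indirect effects, which is part of the motivation the abstract gives for a formal probabilistic model.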

6.
This paper develops an approach for designing protein variants by sampling sequences that satisfy residue constraints encoded in an undirected probabilistic graphical model. Due to evolutionary pressures on proteins to maintain structure and function, the sequence record of a protein family contains valuable information regarding position-specific residue conservation and coupling (or covariation) constraints. Representing these constraints with a graphical model provides two key benefits for protein design: a probabilistic semantics enabling evaluation of possible sequences for consistency with the constraints, and an explicit factorization of residue dependence and independence supporting efficient exploration of the constrained sequence space. We leverage these benefits in developing two complementary MCMC algorithms for protein design: constrained shuffling mixes wild-type sequences positionwise and evaluates graphical model likelihood, while component sampling directly generates sequences by sampling clique values and propagating to other cliques. We apply our methods to design WW domains. We demonstrate that likelihood under a model of wild-type WWs is highly predictive of foldedness of new WWs. We then show both theoretical and rapid empirical convergence of our algorithms in generating high-likelihood, diverse new sequences. We further show that these sequences capture the original sequence constraints, yielding a model as predictive of foldedness as the original one.
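The general idea of MCMC sampling of sequences under a pairwise constraint model can be sketched with a plain Metropolis sampler. This is a generic illustration only: it is neither the constrained shuffling nor the component sampling algorithm named in the abstract, and the scoring function, the weights, and the alphabet below are hypothetical.

```python
import math
import random


def log_score(sequence, pair_weights):
    """Unnormalized log-likelihood: sum of weights of satisfied pair constraints."""
    return sum(
        weight
        for (i, a, j, b), weight in pair_weights.items()
        if sequence[i] == a and sequence[j] == b
    )


def metropolis_sample(start, alphabet, pair_weights, steps, rng):
    """Run `steps` single-site Metropolis moves and return the final sequence."""
    current = list(start)
    current_score = log_score(current, pair_weights)
    for _ in range(steps):
        # Symmetric proposal: pick one position and one residue uniformly.
        proposal = list(current)
        proposal[rng.randrange(len(proposal))] = rng.choice(alphabet)
        proposal_score = log_score(proposal, pair_weights)
        # Accept improvements always, otherwise with probability exp(delta).
        if rng.random() < math.exp(min(0.0, proposal_score - current_score)):
            current, current_score = proposal, proposal_score
    return "".join(current)


# Hypothetical constraints: (position, residue, position, residue) -> weight.
pair_weights = {(0, "W", 3, "P"): 2.0, (1, "E", 2, "Y"): 1.5}
sample = metropolis_sample("AAAA", "AEPWY", pair_weights, 500, random.Random(0))
print(sample, log_score(sample, pair_weights))
```

Passing a seeded `random.Random` instance keeps a run reproducible; a real design workflow would collect many samples after a burn-in period rather than only the final state.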

7.
In addition to the well‐established sense‐antisense complementarity abundantly present in the nucleic acid world and serving as a basic principle of the specific double‐helical structure of DNA, production of mRNA, and genetic code‐based biosynthesis of proteins, sense‐antisense complementarity is also present in proteins, where sense and antisense peptides were shown to interact with each other with increased probability. In nucleic acids, sense‐antisense complementarity is achieved via the Watson‐Crick complementarity of the base pairs or nucleotide pairing. In proteins, the complementarity between sense and antisense peptides depends on a specific hydropathic pattern, where codons for hydrophilic and hydrophobic amino acids in a sense peptide are complemented by the codons for hydrophobic and hydrophilic amino acids in its antisense counterpart. We show here that, in addition to this pattern of complementary hydrophobicity, sense and antisense peptides are characterized by complementary order‐disorder patterns and show complementarity in the sequence distribution of their disorder‐based interaction sites. We also discuss how this order‐disorder complementarity can be related to protein evolution.

8.
Designing a protein sequence that will fold into a predefined structure is of both practical and fundamental interest. Many successful computational designs in the last decade resulted from improved understanding of hydrophobic and polar interactions between side chains of amino acid residues in stabilizing protein tertiary structures. However, the coupling between main‐chain backbone structure and local sequence has yet to be fully addressed. Here, we attempt to account for such coupling by using a sequence profile derived from the sequences of five‐residue fragments in a fragment library that are structurally matched to the five‐residue segments contained in a target structure. We further introduced a term to reduce low complexity regions of designed sequences. These two terms, together with optimized reference states for amino‐acid residues, were implemented in the RosettaDesign program. The new method, called RosettaDesign‐SR, yields a 12% increase (from 34% to 46%) in the fraction of proteins whose designed sequences are more than 35% identical to wild‐type sequences. Meanwhile, it reduces by 8% (from 22% to 14%) the number of designed sequences that are not homologous to any known protein sequences according to PSI‐BLAST. More importantly, the sequences designed by RosettaDesign‐SR have 2–3% more polar residues at the surface and core regions of proteins, and these surface and core polar residues have about 4% higher sequence identity to wild‐type sequences than those designed by RosettaDesign. Thus, the proteins designed by RosettaDesign‐SR should be less likely to aggregate and more likely to have unique structures due to more specific polar interactions. Proteins 2010. © 2010 Wiley‐Liss, Inc.

9.
Protein‐protein interactions control a large range of biological processes and their identification is essential to understand the underlying biological mechanisms. To complement experimental approaches, in silico methods are available to investigate protein‐protein interactions. Cross‐docking methods, in particular, can be used to predict protein binding sites. However, proteins can interact with numerous partners and can present multiple binding sites on their surface, which may alter the binding site prediction quality. We evaluate the binding site predictions obtained using complete cross‐docking simulations of 358 proteins with 2 different scoring schemes accounting for multiple binding sites. Despite overall good binding site prediction performances, 68 cases were still associated with very low prediction quality, presenting individual area under the specificity‐sensitivity ROC curve (AUC) values below the random AUC threshold of 0.5, since cross‐docking calculations can lead to the identification of alternate protein binding sites (that are different from the reference experimental sites). For the large majority of these proteins, we show that the predicted alternate binding sites correspond to interaction sites with hidden partners, that is, partners not included in the original cross‐docking dataset. Among those new partners, we find proteins, but also nucleic acid molecules. Finally, for proteins with multiple binding sites on their surface, we investigated the structural determinants associated with the binding sites most targeted by the docking partners.
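To make the AUC threshold mentioned in this abstract concrete, the sketch below computes the area under the ROC curve with the rank-based (Mann-Whitney) formulation. It is an illustration under assumptions: the per-residue scores and labels are invented, and the function is not taken from the study. An AUC below 0.5 arises when the residues scored highest lie outside the reference site, as happens when docking favors an alternate site.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: label 1 marks residues in the reference experimental site.
labels = [1, 1, 0, 0, 0, 0]
reference_site_scores = [0.9, 0.8, 0.3, 0.2, 0.1, 0.4]
alternate_site_scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.4]
print(roc_auc(reference_site_scores, labels))  # 1.0: reference site ranked first
print(roc_auc(alternate_site_scores, labels))  # 0.0: a different site ranked first
```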

10.
11.
Post‐translational modifications (PTM) of proteins can control complex and dynamic cellular processes via regulating interactions between key proteins. To understand these regulatory mechanisms, it is critical that we can profile the PTM‐dependent protein–protein interactions. However, identifying these interactions can be very difficult using available approaches, as PTMs can be dynamic and often mediate relatively weak protein–protein interactions. We have recently developed CLASPI (cross‐linking‐assisted and stable isotope labeling in cell culture‐based protein identification), a chemical proteomics approach to examine protein–protein interactions mediated by methylation in human cell lysates. Here, we report three extensions of the CLASPI approach. First, we show that CLASPI can be used to analyze methylation‐dependent protein–protein interactions in lysates of fission yeast, a genetically tractable model organism. For these studies, we examined trimethylated histone H3 lysine‐9 (H3K9Me3)‐dependent protein–protein interactions. Second, we demonstrate that CLASPI can be used to examine phosphorylation‐dependent protein–protein interactions. In particular, we profile proteins recognizing phosphorylated histone H3 threonine‐3 (H3T3‐Phos), a mitotic histone “mark” appearing exclusively during cell division. Our approach identified survivin, the only known H3T3‐Phos‐binding protein, as well as other proteins, such as MCAK and KIF2A, that are likely to be involved in weak but selective interactions with this histone phosphorylation “mark”. Finally, we demonstrate that the CLASPI approach can be used to study the interplay between histone H3T3‐Phos and trimethylation on the adjacent residue lysine 4 (H3K4Me3). Together, our findings indicate the CLASPI approach can be broadly applied to profile protein–protein interactions mediated by PTMs.

12.
Jun Gao, Zhijun Li. Biopolymers 2009, 91(7): 547-556
Studying inter‐residue interactions provides insight into the folding and stability of both soluble and membrane proteins and is essential for developing computational tools for protein structure prediction. As the first step, various approaches for elucidating such interactions within protein structures have been proposed and proven useful. Since different approaches may grasp different aspects of protein structural folds, it is of interest to systematically compare them. In this work, we applied four approaches for determining inter‐residue interactions to the analysis of three distinct structure datasets of helical membrane proteins and compared their correlation to the three individual quality measures of structures in these datasets. These datasets included one of 35 structures of rhodopsin receptors and bacterial rhodopsins determined at various resolutions, one derived from the HOMEP benchmark dataset previously reported, and one comprising 139 homology models. It was found that the correlation between the average number of inter‐residue interactions obtained by applying the four approaches and the available structure quality measures varied quite significantly among them. The best correlation was achieved by the approach focusing exclusively on favorable inter‐residue interactions. These results provide interesting insight for the development of an objective quality measure for the structure prediction of helical membrane proteins. © 2009 Wiley Periodicals, Inc. Biopolymers 91: 547–556, 2009.

13.
Despite the important role of the carboxyl‐terminus (Ct) of the activated brain cannabinoid receptor one (CB1) in the regulation of G protein signaling, a structural understanding of interactions with G proteins is lacking. This is largely due to the highly flexible nature of the CB1 Ct that dynamically adapts its conformation to the presence of G proteins. In the present study, we explored how the CB1 Ct can interact with the G protein by building on our prior modeling of the CB1‐Gi complex (Shim, Ahn, and Kendall, The Journal of Biological Chemistry 2013;288:32449–32465) to incorporate a complete CB1 Ct (Glu416Ct–Leu472Ct). Based on the structural constraints from NMR studies, we employed ROSETTA to predict tertiary folds, ZDOCK to predict docking orientation, and molecular dynamics (MD) simulations to obtain two distinct plausible models of CB1 Ct in the CB1‐Gi complex. The resulting models were consistent with the NMR‐determined helical structure (H9) in the middle region of the CB1 Ct. The CB1 Ct directly interacted with both Gα and Gβ and stabilized the receptor at the Gi interface. The results of site‐directed mutagenesis studies of Glu416Ct, Asp423Ct, Asp428Ct, and Arg444Ct of CB1 Ct suggested that the CB1 Ct can influence receptor‐G protein coupling by stabilizing the receptor at the Gi interface. This research provided, for the first time, models of the CB1 Ct in contact with the G protein. Proteins 2016; 84:532–543. © 2016 Wiley Periodicals, Inc.

14.
Hafumi Nishi, Motonori Ota. Proteins 2010, 78(6): 1563-1574
Despite similarities in their sequence and structure, there are a number of homologous proteins that adopt various oligomeric states. Comparisons of these homologous protein pairs, in terms of residue substitutions at the protein–protein interfaces, have provided fundamental characteristics that describe how proteins interact with each other. We have prepared a dataset composed of pairs of related proteins with different homo‐oligomeric states. Using the protein complexes, the interface residues were identified, and using structural alignments, the shadow‐interface residues have been defined as the surface residues that align with the interface residues. Subsequently, we investigated residue substitutions between the interfaces and the shadow interfaces. Based on the degree of the contributions to the interactions, the aligned sites of the interfaces and shadow interfaces were divided into primary and secondary sites; the primary sites are the focus of this work. The primary sites were further classified into two groups (i.e. exposed and buried) based on the degree to which the residue is buried within the shadow interfaces. Using these classifications, two simple mechanisms that mediate the oligomeric states were identified. In the primary‐exposed sites, the residues on the shadow interfaces are replaced by more hydrophobic or aromatic residues, which are physicochemically favored at protein–protein interfaces. In the primary‐buried sites, the residues on the shadow interfaces are replaced by larger residues that protrude into other proteins. These simple rules are satisfied in 23 out of 25 Structural Classification of Proteins (SCOP) families with a different‐oligomeric‐state pair, and thus represent a basic strategy for modulating protein associations and dissociations. Proteins 2010. © 2009 Wiley‐Liss, Inc.

15.
We are developing a rapid, time‐resolved method using laser‐activated cross‐linking to capture protein‐peptide interactions as a means to interrogate the interaction of serum proteins as delivery systems for peptides and other molecules. A model system was established to investigate the interactions between bovine serum albumin (BSA) and 2 peptides, the tridecapeptide budding‐yeast mating pheromone (α‐factor) and the decapeptide human gonadotropin‐releasing hormone (GnRH). Cross‐linking of α‐factor, using a biotinylated, photoactivatable p‐benzoyl‐L‐phenylalanine (Bpa)–modified analog, was energy‐dependent and achieved within seconds of laser irradiation. Protein blotting with an avidin probe was used to detect biotinylated species in the BSA‐peptide complex. The cross‐linked complex was trypsinized and then interrogated with nano‐LC–MS/MS to identify the peptide cross‐links. Cross‐linking was greatly facilitated by Bpa in the peptide, but some cross‐linking occurred at higher laser powers and high concentrations of a non‐Bpa–modified α‐factor. This was supported by experiments using GnRH, a peptide with sequence homology to α‐factor, which was likewise found to be cross‐linked to BSA by laser irradiation. Analysis of peptides in the mass spectra showed that the binding site for both α‐factor and GnRH was in the BSA pocket defined previously as the site for fatty acid binding. This model system validates the use of laser‐activation to facilitate cross‐linking of Bpa‐containing molecules to proteins. The rapid cross‐linking procedure and high performance of MS/MS to identify cross‐links provide a method to interrogate protein‐peptide interactions in a living cell in a time‐resolved manner.

16.
Biological processes are commonly controlled by precise protein‐protein interactions. These connections rely on specific amino acids at the binding interfaces. Here we predict the binding residues of such interprotein complexes. We have developed a suite of methods, i‐Patch, which predict the interprotein contact sites by considering the two proteins as a network, with residues as nodes and contacts as edges. i‐Patch starts with two proteins, A and B, which are assumed to interact, but for which the structure of the complex is not available. However, we assume that for each protein, we have a reference structure and a multiple sequence alignment of homologues. i‐Patch then uses the propensities of patches of residues to interact, to predict interprotein contact sites. i‐Patch outperforms several other tested algorithms for prediction of interprotein contact sites. It gives 59% precision with 20% recall on a blind test set of 31 protein pairs. Combining the i‐Patch scores with an existing correlated mutation algorithm, McBASC, using a logistic model gave little improvement. Results from a case study, on bacterial chemotaxis protein complexes, demonstrate that our predictions can identify contact residues, as well as suggesting unknown interfaces in multiprotein complexes. Proteins 2010. © 2010 Wiley‐Liss, Inc.
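The precision and recall figures quoted in this abstract follow the standard definitions, which the short sketch below spells out for sets of predicted and true contact residues. The residue numbers are hypothetical and the helper is not part of i‐Patch; it only shows how a predictor can be precise (most predictions correct) while recovering a small share of the true interface.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for predicted versus true contact residues."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Precision: share of predictions that are real contacts.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of real contacts that were predicted.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical residue numbers: 10 true interface residues, 4 predictions.
actual_contacts = set(range(1, 11))
predicted_contacts = {2, 5, 9, 42}
precision, recall = precision_recall(predicted_contacts, actual_contacts)
print(precision, recall)  # 0.75 0.3
```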

17.
Site‐specific chemical cross‐linking in combination with mass spectrometry analysis has emerged as a powerful proteomic approach for studying the three‐dimensional structure of protein complexes and for mapping protein–protein interactions (PPIs). Building on the success of MS analysis of in vitro cross‐linked proteins, which has been widely used to investigate specific interactions of bait proteins and their targets in various organisms, we report a workflow for in vivo chemical cross‐linking and MS analysis in a multicellular eukaryote. This approach optimizes the in vivo protein cross‐linking conditions in Arabidopsis thaliana, establishes a MudPIT procedure for the enrichment of cross‐linked peptides, and develops an integrated software program, exhaustive cross‐linked peptides identification tool (ECL), to identify the MS spectra of in planta chemical cross‐linked peptides. In total, two pairs of in vivo cross‐linked peptides of high confidence have been identified from two independent biological replicates. This work marks the beginning of an alternative proteomic approach in the study of in vivo protein tertiary structure and PPIs in multicellular eukaryotes.

18.
Haipeng Gong. Proteins 2017, 85(12): 2162-2169
Helix‐helix interactions are crucial in the structure assembly, stability and function of helix‐rich proteins including many membrane proteins. In spite of remarkable progress over the past decades, the accuracy of predicting protein structures from their amino acid sequences is still far from satisfactory. In this work, we focused on a simpler problem, the prediction of helix‐helix interactions, the results of which could facilitate practical protein structure prediction by constraining the sampling space. Specifically, we started from the noisy 2D residue contact maps derived from correlated residue mutations, and utilized ridge detection to identify the characteristic residue contact patterns for helix‐helix interactions. The ridge information as well as a few additional features were then fed into a machine learning model, HHConPred, to predict interactions between helix pairs. In an independent test, our method achieved an F‐measure of ~60% for predicting helix‐helix interactions. Moreover, although the model was trained mainly using soluble proteins, it could be extended to membrane proteins with at least comparable performance relative to previous approaches that were generated purely using membrane proteins. All data and source codes are available at http://166.111.152.91/Downloads.html or https://github.com/dpxiong/HHConPred .

19.
Understanding the conformational propensities of proteins is key to solving many problems in structural biology and biophysics. The co‐variation of pairs of mutations contained in multiple sequence alignments of protein families can be used to build a Potts Hamiltonian model of the sequence patterns which accurately predicts structural contacts. This observation paves the way to develop deeper connections between evolutionary fitness landscapes of entire protein families and the corresponding free energy landscapes which determine the conformational propensities of individual proteins. Using statistical energies determined from the Potts model and an alignment of 2896 PDB structures, we predict the propensity for particular kinase family proteins to assume a “DFG‐out” conformation implicated in the susceptibility of some kinases to type‐II inhibitors, and validate the predictions by comparison with the observed structural propensities of the corresponding proteins and experimental binding affinity data. We decompose the statistical energies to investigate which interactions contribute the most to the conformational preference for particular sequences and the corresponding proteins. We find that interactions involving the activation loop and the C‐helix and HRD motif are primarily responsible for stabilizing the DFG‐in state. This work illustrates how structural free energy landscapes and fitness landscapes of proteins can be used in an integrated way, and in the context of kinase family proteins, can potentially impact therapeutic design strategies.
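A Potts model assigns each sequence a statistical energy built from single-site fields and pairwise couplings. The sketch below evaluates the common form E(s) = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j). The sign convention, the dictionary-based parameter layout, and the toy values are assumptions for illustration; the study's fitted parameters and its conformational analysis are not reproduced here.

```python
def potts_energy(sequence, fields, couplings):
    """Statistical energy E(s) = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j).

    fields[i] maps a residue to h_i; couplings[(i, j)] maps a residue pair
    to J_ij. Missing entries are treated as zero.
    """
    energy = 0.0
    length = len(sequence)
    for i in range(length):
        energy -= fields[i].get(sequence[i], 0.0)
        for j in range(i + 1, length):
            pair = (sequence[i], sequence[j])
            energy -= couplings.get((i, j), {}).get(pair, 0.0)
    return energy


# Hypothetical parameters for a three-position toy family.
fields = [{"D": 0.5}, {"F": 1.0}, {"G": 0.25}]
couplings = {(0, 2): {("D", "G"): 2.0}}
print(potts_energy("DFG", fields, couplings))  # -3.75
print(potts_energy("AFG", fields, couplings))  # -1.25
```

Because the energy is a sum over positions and pairs, it can be broken down term by term, which is the kind of decomposition the abstract uses to ask which interactions contribute most.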

20.
Correlated mutation analyses (CMA) on multiple sequence alignments are widely used for the prediction of the function of amino acids. The accuracy of CMA‐based predictions is mainly determined by the number of sequences, by their evolutionary distances, and by the quality of the alignments. These criteria are best met in structure‐based sequence alignments of large super‐families. So far, CMA‐techniques have mainly been employed to study receptor interactions. The present work shows how a novel CMA tool, called Comulator, can be used to determine networks of functionally related residues in enzymes. These analyses provide leads for protein engineering studies that are directed towards modification of enzyme specificity or activity. As proof of concept, Comulator has been applied to four enzyme super‐families: the isocitrate lyase/phosphoenol‐pyruvate mutase super‐family, the hexokinase super‐family, the RmlC‐like cupin super‐family, and the FAD‐linked oxidases super‐family. In each of those cases, networks of functionally related residue positions were discovered that upon mutation influenced enzyme specificity and/or activity as predicted. We conclude that CMA is a powerful tool for redesigning enzyme activity and selectivity. Proteins 2009. © 2009 Wiley‐Liss, Inc.
